Topic: [~1000 GH/sec] BTC Guild - 0% Fee Pool, LP, SSL, Full Precision, and More - page 105. (Read 379078 times)

newbie
Activity: 42
Merit: 0
TECHNICAL EXPLANATION:
The problem was related to the cross-server synchronization during block calculations.   Block calculations are done by a script which checks for new blocks once per minute.  At one point tonight, two blocks were solved between consecutive runs of the script because of a 40-second block.  This caused a problem in the synchronization code, causing it to exit.

When the calculations are performed, the code sets a "lock" in the database telling it that a reward is being calculated, so that if it were to get stuck due to interserver connection problems, it would not have another instance of the script run and potentially duplicate the work.  Since the code exited due to a synchronization problem, this lock did not deactivate, and the automatic block allocations were halted until I manually restarted them.

This is a problem I will work on fixing this weekend.  It's a result of the code becoming much larger due to the integration of 3 servers with the pool, something it was not originally designed for. 

Magic, got it.
legendary
Activity: 1750
Merit: 1007
Fixing block allocations now, I knew the change to how the backups are made would cause me some trouble before going to sleep.


UPDATE:
  Blocks from the night have been fixed.  Time stamps are a little off on them, but share counts and allocation are good.  Part of the housekeeping for the optimization involved a heavy modification to how the worker share counts are stored from now on.  As such, the historical DB of _WORKER_ share counts has been reset prior to block 424.  The share counts for USERS are still present and visible in the block listing.

TECHNICAL EXPLANATION:
The problem was related to the cross-server synchronization during block calculations.   Block calculations are done by a script which checks for new blocks once per minute.  At one point tonight, two blocks were solved between consecutive runs of the script because of a 40-second block.  This caused a problem in the synchronization code, causing it to exit.

When the calculations are performed, the code sets a "lock" in the database telling it that a reward is being calculated, so that if it were to get stuck due to interserver connection problems, it would not have another instance of the script run and potentially duplicate the work.  Since the code exited due to a synchronization problem, this lock did not deactivate, and the automatic block allocations were halted until I manually restarted them.

This is a problem I will work on fixing this weekend.  It's a result of the code becoming much larger due to the integration of 3 servers with the pool, something it was not originally designed for. 
member
Activity: 70
Merit: 11
A 4th server is being prepared in advance to keep all the servers well under their maximum capacity.

Has anyone ever told you that you're a real cool cat?
legendary
Activity: 1750
Merit: 1007
@eleuthria

Hey, I just did a cash out and reset my workers.  I use the API to calculate and benchmark earnings between PCs, and suddenly reset_shares and reset_stales are in the negatives?

Maybe combining 3 servers broke something in the process?

Part of my optimization to the servers involved deleting old data from the slave servers once they were confirmed to be sync'd with the master server.  It may have caused some funny reset share/stale numbers.  Resetting again should solve this problem.
legendary
Activity: 1750
Merit: 1007
I applied a fix to the US East and EU servers.  Since they're not running on SSDs, the backups from RAM to the HD were causing short locks on the table.  Since applying the fix, the idles on my miner pointed at US East have halted.

A 4th server is being prepared in advance to keep all the servers well under their maximum capacity.
full member
Activity: 121
Merit: 100
@eleuthria

Hey, I just did a cash out and reset my workers.  I use the API to calculate and benchmark earnings between PCs, and suddenly reset_shares and reset_stales are in the negatives?

Maybe combining 3 servers broke something in the process?
member
Activity: 98
Merit: 10
Servers overloaded.  That didn't take long  :-\
member
Activity: 98
Merit: 10
why are there so many Invalid blocks lately?
from block 350 - 405, 3 blocks found were invalid:
http://www.btcguild.com/blocks.php

I thought invalid ones are supposed to be very rare ???

It is because the entire Bitcoin network has enormous hashing power, which means split-second timing is all the difference between two submissions of a block solution; after that, it is a matter of confirmation which clears up who was the victor.  Also, even when something is statistically rare, that doesn't mean occurrences don't group together from time to time; in fact, groups like that are common.  Gotta take a mathematical course in stats :)  Frankly I have forgotten most of it since college ... about 15 years ago (the degree was designed as a five-year degree, which is odd, and ended up taking six because I dropped a spring quarter to work so I could finish, and save my car ... TMI).  So maybe I am dating myself a bit :)
hero member
Activity: 602
Merit: 500
Anyone else getting lots of stales lately? I'm on USWest (and in the USwest) and I'm getting about 7.5% stales on each of my rigs.

No idles though which is good.
sr. member
Activity: 280
Merit: 250
Firstbits: 12pqwk
why are there so many Invalid blocks lately?
from block 350 - 405, 3 blocks found were invalid:
http://www.btcguild.com/blocks.php

I thought invalid ones are supposed to be very rare ???
member
Activity: 98
Merit: 10
Idles are getting longer and my GPU usage is dropping and staying down for longer now, however, not long enough to affect temperature in any significant way.

Here is a snapshot with times CDT (-5):

Quote
09/06/2011 23:40:59, 734c0bba, accepted
09/06/2011 23:41:00, cd396f8d, accepted
09/06/2011 23:41:08, warning: job finished, miner is idle
09/06/2011 23:41:12, 796a7295, accepted
09/06/2011 23:41:14, 7345c4ac, accepted
09/06/2011 23:41:19, e4fbfa05, accepted
09/06/2011 23:41:25, c474e253, accepted
09/06/2011 23:41:29, 30ed12da, accepted
09/06/2011 23:41:40, 52f771b9, accepted
09/06/2011 23:41:49, e7a1be8c, accepted
09/06/2011 23:42:00, 2936c855, accepted
09/06/2011 23:42:08, warning: job finished, miner is idle
09/06/2011 23:42:12, long poll: new block 0000074eb14149bb
09/06/2011 23:42:25, aad1b16d, accepted
09/06/2011 23:42:33, f334faef, accepted
09/06/2011 23:42:44, 9596f0d8, accepted
09/06/2011 23:42:50, ea74deb0, accepted
09/06/2011 23:42:52, d531547c, accepted
09/06/2011 23:42:54, f769e620, accepted
09/06/2011 23:43:06, warning: job finished, miner is idle
09/06/2011 23:43:11, e65bb1b6, accepted

poclbm also was showing an RPC communication error for a couple of seconds during one of the idles. 
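
For anyone who wants to put numbers on a log like the one quoted above, here is a rough sketch that counts accepted shares and measures how long it took after each idle warning until the next share was accepted. It is only an illustration: the function name `summarize` is made up, the line format is assumed to match the quoted snapshot exactly, and the date is assumed to be day/month/year (the order does not change the gaps as long as the line parses). The gap is the time until the next accepted share, so it is an upper bound on the idle time, not the idle time itself.

```python
from datetime import datetime


def summarize(lines):
    """Return (accepted_count, gaps) for miner log lines.

    gaps holds the seconds between each idle warning and the
    next accepted share that follows it.
    """
    accepted = 0
    gaps = []
    idle_since = None
    for line in lines:
        # Assumed format: "DD/MM/YYYY HH:MM:SS, <message>"
        stamp, _, rest = line.strip().partition(", ")
        when = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
        if rest.endswith("accepted"):
            accepted += 1
            if idle_since is not None:
                gaps.append((when - idle_since).total_seconds())
                idle_since = None
        elif rest.endswith("miner is idle"):
            if idle_since is None:
                idle_since = when
        # Other lines (for example long poll notices) are ignored.
    return accepted, gaps


if __name__ == "__main__":
    # First five lines of the snapshot quoted above.
    sample = [
        "09/06/2011 23:40:59, 734c0bba, accepted",
        "09/06/2011 23:41:00, cd396f8d, accepted",
        "09/06/2011 23:41:08, warning: job finished, miner is idle",
        "09/06/2011 23:41:12, 796a7295, accepted",
        "09/06/2011 23:41:14, 7345c4ac, accepted",
    ]
    print(summarize(sample))
```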

No packet loss that I am seeing on my route to useast.btcguild.com.  The worst variance was on one packet of the lot, and from a traceroute it looks to be at the Level3 hop in Tampa.  Doesn't look too significant though.  I have to believe the communication stalls are from your server itself.

Quote
    Packets: Sent = 25, Received = 25, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 59ms, Maximum = 87ms, Average = 61ms

  8    89 ms    66 ms    63 ms  ae-1-12.bar2.Tampa1.Level3.net [4.69.137.117]

Not bad all things considered.  That is the worst hop with all the rest tight and quick.
member
Activity: 98
Merit: 10
Starting to see miner idles now too.  They seem to group together over a period of say 15 minutes each lasting perhaps a second.  No sign of even a blip on the temperature and GPU usage graphs, so they are minor so far.  I am connected to useast specifically due to issues with the route to uswest for me.  I plan to jump back to btcguild.com for the load balanced DNS round robin tomorrow evening or Saturday morning if the route looks clean [compared to the others] ... in fact, even if it doesn't, I will try and see how it goes.
member
Activity: 98
Merit: 10
Gratz to BTC Guild!!

We hit 1THash/s!!!

Thanks for all your hard work Eleuthria :D

Congrats .. and running awesome for me!!!  I came here hoping to be the first to yell 1TH/s, but I was beat soundly  ;D

Yeah and it seems like just a week or so ago we were at like 400+  8)

A week of what ... 3 days ago :)
hero member
Activity: 626
Merit: 500
Mining since May 2011.
Gratz to BTC Guild!!

We hit 1THash/s!!!

Thanks for all your hard work Eleuthria :D

Congrats .. and running awesome for me!!!  I came here hoping to be the first to yell 1TH/s, but I was beat soundly  ;D

Yeah and it seems like just a week or so ago we were at like 400+  8)
member
Activity: 98
Merit: 10
Gratz to BTC Guild!!

We hit 1THash/s!!!

Thanks for all your hard work Eleuthria :D

Congrats .. and running awesome for me!!!  I came here hoping to be the first to yell 1TH/s, but I was beat soundly  ;D
full member
Activity: 336
Merit: 100
Gratz to BTC Guild!!

We hit 1THash/s!!!

Thanks for all your hard work Eleuthria :D
member
Activity: 98
Merit: 10
EU server has been working great for me at least. :)

Awesome work on this pool, it's nice to have a server close by. Only 1% stales now too.

Edit: Uh oh, just now I started getting random disconnects on some of my miners.

BTC Guild has become extremely popular.  I have a hunch that another server will be required before long (hint: Minneapolis is one of the best-connected cities in the United States thanks to the University of Minnesota's involvement in it and its predecessor network going back 40+ years).  Chicago is known to be a good central location as well.  If not these two cities, and you still want to stay central, Dallas has shown to have well-connected, reliable hosting as well.  Just thinking out loud ;)
member
Activity: 98
Merit: 10
still getting LOTS of miner idles. ~20% of all my blocks are idle. Works perfect if I use something else (bitcoins.lc)

anyone know why?

Do some network analysis between you and the pool you are connected to.  I have posted in the forums (probably the deepbit thread) how to do some basic analysis if you don't know how.  Perhaps one of the pool servers is giving you trouble because of network problems between you and it; if so, you can point to one of the other servers for a little while if that works better for you.
hero member
Activity: 626
Merit: 500
Mining since May 2011.
I talked to another user on IRC that had some similar network issues. It turns out that when he dropped his Phoenix AGGRESSION from 14 to 12 it fixed it. Myself, I found that my miners (all 5830's) won't even connect to any network if aggression is > 12. No idea why. On my Linux computers I get better performance with aggression=10 vs. anything higher anyway. Just something to try if you run Phoenix.

This fixed my problem too. I don't know about idles on poclbm though; I stopped using it because with Phoenix/phatk I got 10-20 more Mhash/s. Do that times 12 GPUs and it's like adding a free GPU.  :D

Using:
-k phatk PLATFORM=0 DEVICE=0 VECTORS BFI_INT FASTLOOP=false WORKSIZE=128 AGGRESSION=12
newbie
Activity: 23
Merit: 0