Topic: Could BTCGuild be cheating its miners? - page 3. (Read 5822 times)

sr. member
Activity: 448
Merit: 250
August 08, 2011, 10:09:14 PM
#41
A better analogy is this one:

A certain town has 500 people. You suspect that maybe some of them are psychic. So you ask each person to try to correctly identify the suits of some cards they cannot see. Of the 500 people, you find 3 that did extremely well in the test. So you say that you suspect those 3 are psychic.

But, of course, they could have succeeded merely by luck. So you decide to do a new study. You'll test those 3 people. But wait, someone already did that. So you'll just check the previous results. Lo and behold, the data shows that those 3 people succeed way beyond what you'd expect by mere chance, perhaps the chances were only 3 in 500 that they could do that well by chance.

You cannot use the same data both to decide which pool to accuse of cheating and to validate that same accusation. Some pool has to have the worst luck, so bad that it's hard to believe looking only at that pool that its luck was that bad due to mere chance.

That was the point I was bringing up with mentioning Ars. For every bad luck streak that someone is having, generally, someone else is killing it. Probability always works itself out in the end.

OP, you ever been to Vegas? Do you claim the casino is cheating when the guy next to you wins big while you lose the shirt off your back?

Come to Arsbitcoin; SMPPS and a super-legit and responsive pool op. That way you don't have to worry about the big bad BTCGuild breathing down your neck.
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
August 08, 2011, 09:51:12 PM
#40
A better analogy is this one:

A certain town has 500 people. You suspect that maybe some of them are psychic. So you ask each person to try to correctly identify the suits of some cards they cannot see. Of the 500 people, you find 3 that did extremely well in the test. So you say that you suspect those 3 are psychic. (However, you would expect about 3 out of 500 to do that well by chance. So these are just the people you suspect.)

So, you'll continue your analysis. You'll test those 3 people. But wait, someone already did that. So you'll just check the previous results. Lo and behold, the data shows that those 3 people succeed way beyond what you'd expect by mere chance, perhaps the chances were only 3 in 500 that they could do that well by chance.

You cannot use the same data both to decide which pool to accuse of cheating and to validate that same accusation. Some pool has to have the worst luck, so bad that it's hard to believe looking only at that pool that its luck was that bad due to mere chance.
legendary
Activity: 1190
Merit: 1000
August 08, 2011, 09:32:37 PM
#39
If the pool is stealing from miners, it's not like sometimes they're going to cheat and sometimes they're going to sneak in bonuses.

I think this is an excellent point. Pretend you are a pool operator out to cheat. Having 5 visibly bad days seems like a bad way to steal. It would be both easier and less visible to skim off a small amount on a constant basis rather than inserting code that robs a large amount on a set schedule.
And either way, the best way to catch it would be to look at the average over the longest term possible. You can always find bad days. Half the weeks will be below average. And if you get to pick the start date and the finish date specifically to bracket a run of bad luck, you can always find improbable events.

Also keep in mind, the original poster has been examining all the pools for statistical clusters of bad luck (as evidence of malfeasance). When the OP presented the odds of finding the cluster, he failed to mention the other 2 major pools he examined (deepbit and slush) and did not include them in his search space. The OP presented the odds of his finding occurring as if he had only examined 1 pool instead of 3. Had he found bad luck in slush or deepbit, he would be talking about them instead of btcguild in exactly the same way.

It is as if he flipped a coin 1000 times, dropped 700 of the results completely and then chose a run of NINE tails over a 10 flip period. I would absolutely expect him to find a run of "NINE" tails out of 10 under those circumstances.  Roll Eyes

I would also expect him to try to convince me of how "unlikely" the existence of that event was.
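For anyone who wants to put a number on the coin example, here is a rough sketch. It assumes a fair coin, treats the 300 kept flips as one consecutive sequence (my assumption, the analogy doesn't say), and the function name is made up for illustration. It computes the exact chance that some 10-flip window holds nine or more tails when you are free to pick the window after the fact.

```python
from collections import defaultdict


def prob_some_window(n_flips, window=10, min_tails=9):
    """Exact chance that at least one run of `window` consecutive fair
    flips contains `min_tails` or more tails."""
    keep = window - 1                 # number of past flips we must remember
    keep_mask = (1 << keep) - 1
    # alive[(seen, bits)] = probability of being in this state with no
    # qualifying window so far; bits holds the last `seen` flips (1 = tails)
    alive = {(0, 0): 1.0}
    for _ in range(n_flips):
        nxt = defaultdict(float)
        for (seen, bits), p in alive.items():
            for flip in (0, 1):
                full = (bits << 1) | flip
                if seen == keep and bin(full).count("1") >= min_tails:
                    continue          # a qualifying window just completed
                nxt[(min(seen + 1, keep), full & keep_mask)] += p / 2
        alive = nxt
    return 1.0 - sum(alive.values())


single = prob_some_window(10)        # one window fixed in advance
kept = prob_some_window(300)         # free choice of window in 300 flips
everything = prob_some_window(1000)  # free choice of window in 1000 flips
print(f"fixed window: {single:.4%}, 300 flips: {kept:.1%}, 1000 flips: {everything:.1%}")
```

The fixed-window figure is the "unlikely" number that gets quoted; the other two are what you are actually up against once the window is chosen after looking at the data.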

Vladimir,

Thanks for doing that. I guess I was doing it wrong using the binomial distribution when I should have been using Poisson. My understanding is that the binomial distribution is based on a fixed set of trials, whereas Poisson is based on independent events occurring over time, which is the correct thing to use for finding bitcoin blocks.

So my 0.0036 (277:1) figure was incorrect; it should have been 0.00695 (144:1), as you said.

Quote
And either way, the best way to catch it would be to look at the average over the longest term possible. You can always find bad days. Half the weeks will be below average. And if you get to pick the start date and the finish date specifically to bracket a run of bad luck, you can always find improbable events.
I want to know if the pool is being bounced up and down on purpose (with the down days likely being a little more down than the up days are up) to try to make it more difficult to see what is going on.

So eleuthria says that during difficulty 1563027 there was a 90% chance that the number of blocks found would have been higher, so a 10% chance they were this low by chance. That's because the +70% day and the other high-luck days that followed the low days made up for a lot of it.

If some more positive luck days are added to the pool it could be made to look just fine without any missing blocks problems at all. But the low days and high days will remain there as evidence that manipulation -- such as stealing and then a cover up -- took place.

Not very many bitcoins were stolen overall so far. In fact, the estimated amount of bitcoins stolen could go to 0 in the future if more positive luck days are added to make up for it.

Both Vladimir and Mad7 are conflating two calculations of odds:
finding this behavior in the next 5 days for btcguild
finding this behavior in any 5 day period for any pool

Vladimir correctly calculated the odds of finding this behavior in the next 5 days of btcguild. However, that is not relevant. We need to calculate the odds of searching every possible 5 day period for every pool (or even just the 3 biggest) and finding this sort of luck. The OP set out to find the luckiest pool to mine in and to find the unluckiest pool to accuse of cheating.
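To make the difference between the two calculations concrete, here is a back-of-the-envelope sketch. The 0.00695 figure is the one quoted in this thread; the number of windows per pool is purely hypothetical, and treating the looks as independent is an assumption (overlapping windows are not independent, so this is only a rough guide).

```python
def chance_of_any_hit(p_single, looks):
    """Chance that at least one of `looks` independent searches turns up
    an event whose individual probability is p_single."""
    return 1.0 - (1.0 - p_single) ** looks


p_single = 0.00695          # the 144:1 figure quoted in this thread
pools = 3                   # btcguild, deepbit, slush
windows_per_pool = 12       # hypothetical: 60 days cut into 5-day blocks

p_any = chance_of_any_hit(p_single, pools * windows_per_pool)
print(f"one window, one pool: {p_single:.2%}")
print(f"any of {pools * windows_per_pool} windows: {p_any:.1%}")
```

Letting the 5-day window slide one day at a time adds even more places to look, so the real search space is larger than this toy count.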
sr. member
Activity: 373
Merit: 262
August 08, 2011, 09:30:13 PM
#38
Vladimir,

Thanks for doing that. I guess I was doing it wrong using the binomial distribution when I should have been using Poisson. My understanding is that the binomial distribution is based on a fixed set of trials, whereas Poisson is based on independent events occurring over time, which is the correct thing to use for finding bitcoin blocks.

So my 0.0036 (277:1) figure was incorrect; it should have been 0.00695 (144:1), as you said.

Quote
And either way, the best way to catch it would be to look at the average over the longest term possible. You can always find bad days. Half the weeks will be below average. And if you get to pick the start date and the finish date specifically to bracket a run of bad luck, you can always find improbable events.
I want to know if the pool is being bounced up and down on purpose (with the down days likely being a little more down than the up days are up) to try to make it more difficult to see what is going on.

So eleuthria says that during difficulty 1563027 there was a 90% chance that the number of blocks found would have been higher, so a 10% chance they were this low by chance. That's because the +70% day and the other high-luck days that followed the low days made up for a lot of it.

If some more positive luck days are added to the pool it could be made to look just fine without any missing blocks problems at all. But the low days and high days will remain there as evidence that manipulation -- such as stealing and then a cover up -- took place.

Not very many bitcoins were stolen overall so far. In fact, the estimated amount of bitcoins stolen could go to 0 in the future if more positive luck days are added to make up for it.
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
August 08, 2011, 08:59:31 PM
#37
If the pool is stealing from miners, it's not like sometimes they're going to cheat and sometimes they're going to sneak in bonuses.

I think this is an excellent point. Pretend you are a pool operator out to cheat. Having 5 visibly bad days seems like a bad way to steal. It would be both easier and less visible to skim off a small amount on a constant basis rather than inserting code that robs a large amount on a set schedule.
And either way, the best way to catch it would be to look at the average over the longest term possible. You can always find bad days. Half the weeks will be below average. And if you get to pick the start date and the finish date specifically to bracket a run of bad luck, you can always find improbable events.
legendary
Activity: 1190
Merit: 1000
August 08, 2011, 06:02:42 PM
#36
Isn't there a simple way to find out? How about a couple of suspecting miners decide to use a slightly modded miner that would inform the user when it finds a block.

It'd be sufficient for this group to be able to present only 1 block that was stolen by the operator.

Of course they would have to keep their identities secret, otherwise the pool operator can just always leave their blocks untouched and choose blocks of other miners for stealing.

This would at least create a lot of danger for the pool operator. The pool operator getting caught stealing would certainly destroy the pool, so even a slight danger of that happening should keep him from stealing, right?

Am I overlooking something?

This has been possible from the beginning.  Miners know if they have the winning share, and the pool reports who found a block.  Unfortunately it's the only way I know of to truly audit a pool you suspect of withholding blocks, and it would require a large number of users to provide reasonable assurance one way or another.

That is the case if you want to prove a negative. We do not have to prove that you have not been cheating. All you need do is point out the fact that no proof has yet been presented and leave it at that. The OP got his odds wrong, as people who are not familiar with math have been doing for decades in Vegas.
legendary
Activity: 1750
Merit: 1007
August 08, 2011, 05:58:14 PM
#35
Isn't there a simple way to find out? How about a couple of suspecting miners decide to use a slightly modded miner that would inform the user when it finds a block.

It'd be sufficient for this group to be able to present only 1 block that was stolen by the operator.

Of course they would have to keep their identities secret, otherwise the pool operator can just always leave their blocks untouched and choose blocks of other miners for stealing.

This would at least create a lot of danger for the pool operator. The pool operator getting caught stealing would certainly destroy the pool, so even a slight danger of that happening should keep him from stealing, right?

Am I overlooking something?

This has been possible from the beginning.  Miners know if they have the winning share, and the pool reports who found a block.  Unfortunately it's the only way I know of to truly audit a pool you suspect of withholding blocks, and it would require a large number of users to provide reasonable assurance one way or another.
donator
Activity: 2772
Merit: 1019
August 08, 2011, 05:49:57 PM
#34
Isn't there a simple way to find out? How about a couple of suspecting miners decide to use a slightly modded miner that would inform the user when it finds a block.

It'd be sufficient for this group to be able to present only 1 block that was stolen by the operator.

Of course they would have to keep their identities secret, otherwise the pool operator can just always leave their blocks untouched and choose blocks of other miners for stealing.

This would at least create a lot of danger for the pool operator. The pool operator getting caught stealing would certainly destroy the pool, so even a slight danger of that happening should keep him from stealing, right?

Am I overlooking something?
legendary
Activity: 1190
Merit: 1000
August 08, 2011, 05:26:12 PM
#33
Thank You for the data.

So basically, we have 21.4221% probability of 'at most' 1462 blocks found while 1493.16 blocks are expected. This is well in the realm of possibility.



Don't feel bad for taking MadScientist75 at his word, once. From this point forward, you should double check all of his starting assumptions, all of his data, all of his calculations and all of his conclusions. Those of us that know MadScientist75, know to do this.

I think the person who mentioned standard deviation earlier in the thread was doing so as a slight to MadScientist75. It is a #btcguild inside joke that stems from this little gem:
Jul 26 20:40:25     Mad7Scientist: Are you using standard deviation anywhere in your equations?
Jul 26 20:40:43 sonicrules1234, what is standard deviation?

Hilarity followed. You can grep the online logs for #btcguild if you want the whole conversation.  Cheesy

Anyway, eleuthria already put this thread to bed with actual math and actual data. So I don't get to have fun with it.  Angry
hero member
Activity: 868
Merit: 1002
August 08, 2011, 05:22:32 PM
#32
I'm sure Mad7Trollface will be posting his apology and retraction any second now.
 Roll Eyes Roll Eyes Roll Eyes Roll Eyes
hero member
Activity: 812
Merit: 1001
-
August 08, 2011, 05:12:46 PM
#31
Thank You for the data.

So basically, we have 21.4221% probability of 'at most' 1462 blocks found while 1493.16 blocks are expected. This is well in the realm of possibility. Just like tossing a coin 2986 times and getting 1462 heads.

If we take only the last two 'unlucky' periods, then the expected number of blocks is 649.32 and the actual number is 619, with a probability of 'at most' 619 blocks found of 12.0440%.

For whatever it's worth.


legendary
Activity: 1750
Merit: 1007
August 08, 2011, 05:05:18 PM
#30
Since Vladimir likes his Poisson distribution, here's the hard data for the completed difficulties that have been tracked by the pool:

Code:
577,129,642 shares submitted during difficulty 1690906.  Number of blocks found: 334.  Expected blocks from that many shares: 341.13.

Odds of
Exactly 334       2.0246%
More Than 334    63.7237%
Less Than 334    34.2517%

481,715,969 shares submitted during difficulty 1563027.  Number of blocks found: 285.  Expected blocks from that many shares: 308.19.

Odds of
Exactly 285      0.9651%
More Than 285   90.3068%
Less Than 285    8.7281%

599,334,163 shares submitted during difficulty 1379223.  Number of blocks found: 433.  Expected blocks from that many shares: 434.54.

Odds of
Exactly 433      1.9116%
More Than 433   51.6715%
Less Than 433   46.4169%

358,943,796 shares submitted during difficulty 876954.  Number of blocks found: 410.  Expected blocks from that many shares: 409.30.

Odds of
Exactly 410     1.9687%
More Than 410  47.3083%
Less Than 410  50.7230%


Grand total (Hopefully it's safe to do this, I'm sure Vladimir will verbally assault me though if I'm wrong):
Number of blocks found: 1462
Expected blocks found: 1493.16

Odds of
Exactly 1462      0.7520%
More Than 1462   78.5779%
Less Than 1462   20.6701%

It's worth noting that there were _MANY_ restarts of pushpool instances/server moves during 867k - 1.56m difficulties.  Some of them were definitely duplicating work during that time meaning the share counts were inflated on some rounds due to the same work being issued twice.  However, I would expect the combined duplication of work was less than 1%, which is immaterial given the numbers we're working with.

I'll leave interpretation up to Vladimir, I don't want to start ranting interpretations of these numbers that may not be accurate.
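For anyone who wants to re-check the table without an online calculator, here is a minimal sketch that recomputes the "Exactly / More Than / Less Than" odds from the posted block counts and expected values. The function names are mine, not anything the pool uses. On the grand total: a sum of independent Poisson counts is itself Poisson with the summed mean, so adding the periods together should be fine as long as the periods can be treated as independent.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable, computed in log space so large
    means do not overflow or underflow."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def odds_table(found, expected):
    """Return (exactly, more_than, less_than) as probabilities."""
    exactly = poisson_pmf(found, expected)
    less_than = math.fsum(poisson_pmf(k, expected) for k in range(found))
    more_than = 1.0 - exactly - less_than
    return exactly, more_than, less_than


# (blocks found, expected blocks) as posted for each completed difficulty
periods = [(334, 341.13), (285, 308.19), (433, 434.54), (410, 409.30)]

total_found = sum(f for f, _ in periods)
total_expected = sum(e for _, e in periods)

for found, expected in periods + [(total_found, total_expected)]:
    exactly, more, less = odds_table(found, expected)
    print(f"{found:5d} found, {expected:8.2f} expected: "
          f"exactly {exactly:.4%}, more {more:.4%}, less {less:.4%}")
```

The inputs here are the expected values as posted, not values recomputed from the share counts, so any rounding in the posted figures carries through.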
hero member
Activity: 812
Merit: 1001
-
August 08, 2011, 04:30:18 PM
#29
Why do you people talk about all kinds of irrelevant stuff (from math point of view)?

Fact 1.

Quote
There are only 3 variables needed to determine probability of any 'at most N blocks found' outcome.

1. Difficulty.
2. Number of diff1 shares.
3. Number of blocks found.

That is it! The total hashrate of the network is not important, and the total hashrate of somebody's mining rig or pool is not important. There are not even any standard deviations involved, even though some math geniuses above imply that without them everything is lost.

Basically, everything but the above 3 variables is utterly irrelevant.

Fact 2.

Quote
The Poisson distribution is the only relevant distribution here, not the standard normal or any other. Again, read http://en.wikipedia.org/wiki/Poisson_distribution.

If any of that is above your head, here is simpler way to think of it.

If, during some period of time, D is the difficulty and N1 is the number of diff1 shares submitted to a pool, then the expected number of blocks solved is B1 = N1/D.

The number of blocks found, B2, is known.

Plug B1 and B2 into the Poisson formula and you get the probability. Again, this calculator http://www.sbrforum.com/betting-tools/poisson-calculator/ will give you all the probabilities you want based on B1 and B2.

Think of every D diff1 shares accepted by the pool as two coin tosses, where heads means a block is solved and tails means it is not. On average there should be one block solved for every D shares. As simple as that.

Now... if we have 12 mil shares without a solved block and difficulty is 2 millions (or 6 mil shares with difficulty of 1 million) it is essentially the same as tossing a coin 12 times in a row and getting only tails. Not impossible, but what are the odds? (about a quarter of one percent actually)
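A quick sketch of that calculation, using only difficulty, diff1 shares and blocks found (the function names are made up for illustration). It works out the dry-spell example with the Poisson formula and, for comparison, the literal twelve-tails coin version. The Poisson figure, e^-6, is the one that comes out at about a quarter of one percent; twelve fair tosses in a row would be rarer, so the coin picture is best taken as a loose analogy.

```python
import math


def expected_blocks(diff1_shares, difficulty):
    """B1 = N1 / D: one block expected per D difficulty-1 shares."""
    return diff1_shares / difficulty


def prob_at_most(blocks_found, diff1_shares, difficulty):
    """Poisson probability of finding blocks_found or fewer blocks."""
    mean = expected_blocks(diff1_shares, difficulty)
    terms = (math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))
             for k in range(blocks_found + 1))
    return math.fsum(terms)


dry_spell = prob_at_most(0, 12_000_000, 2_000_000)   # 6 blocks expected
same_ratio = prob_at_most(0, 6_000_000, 1_000_000)   # also 6 expected
twelve_tails = 0.5 ** 12                             # literal coin version

print(f"Poisson, no blocks with 6 expected: {dry_spell:.4%}")
print(f"Twelve tails in a row:              {twelve_tails:.4%}")
```

Only the ratio of shares to difficulty matters, which is why the 12 mil / 2 mil and 6 mil / 1 mil cases give the same answer.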

Large pools should really be dead on target over any given ~2 week constant difficulty period (so far at least).

Fact 3

Quote
Probability of a pool finding at most 105 blocks when 133 is expected is 0.6952%.

Having said that, the expectation of finding 133 blocks is stated by the OP. I have no idea where he got this number from, and since he did not derive it from difficulty and number of diff1 shares but by some other method, which I do not understand, it is a highly suspect number.




newbie
Activity: 52
Merit: 0
August 08, 2011, 04:23:28 PM
#28
...So having it shoot up to +70% increases the chance that there is manipulation going on, just as -40% would, although it is hard to figure out where the pool got those extra solved blocks from if it was manipulation...
What?
sr. member
Activity: 448
Merit: 250
August 08, 2011, 04:16:29 PM
#27
online poker is rigged!

I am not very good with math, but I would like to see someone run these numbers on BurningToad's Arsbitcoin SMPPS pool. We are at a positive buffer of ~800btc, which is +16 blocks found over the probability curve, and have held it for almost a month. BurningToad also came forward with three (or maybe it was just two) blocks that were found but never registered by his code due to a bug concerning a block found while the previous found block's payout calcs were running (iirc). He could easily have kept them and none would be the wiser, especially with the crazy positive buffer. How many operators ARE keeping them?

Anyways, beyond the hidden block thing, what are the odds that we would end up with this crazy buffer for so long? It seems radically unlikely, just like the string of bad luck the OP is quoting. With the rate of distribution being relatively constant, one pool's up is another pool's down...


Assuming the network as a whole is constant isn't entirely accurate though.  Looking at the charts at bitcoinwatch can show you the network can vary quite a lot.  One pool's up/down does not mean another pool is having the opposite.  However, the odds of consistently solving blocks faster than expected at a difficulty are the same as consistently taking longer.

I spent the last week going over the changes I had made to pushpool, trying to find some simple change I made that would've caused it to not push valid shares upstream to bitcoind, and I've come up blank.  I replaced the pools with stock pushpool [outside of db-mysql.c and a long-poll disable bit] the other day and it changed nothing.  Other pools are running JoelKatz's patches so I have no reason to believe that is causing any issues either.

Software side issues are ruled out.  The only thing itching in the back of my mind is the way the servers have been split up to run against their own bitcoind + wallet, but that shouldn't mean anything, especially when we've been doing the same thing for a long time and have shown ups/downs regularly in luck.  Splitting miners [and making sure they come back to the same server with results] across 6 smaller pools should have no difference in variance from all using one large pool as long as each pool is running different headers [receiving addresses] for the hashes.

I thought the charts/hashrates recorded by bitcoinwatch were extrapolated from the difficulty and rate of solution of blocks. If that is the case, it is much more constant than it appears, as probability is responsible for the wiggle vs. actual computing power entering and leaving the network.

I have no opinion in either direction on this, I just wanted to bring up the Ars pool as an example of statistics not playing out as they should for a loooong stretch...and we're back to the whole random number generator thing...
sr. member
Activity: 448
Merit: 250
August 08, 2011, 04:10:03 PM
#26
The arsbitcoin.com pool's 800 BTC buffer is interesting. It looks like that pool is only 1/6th the size of BTCGuild.com.

Is there a way to find out over what length of time those 800 BTC were accumulated?

We switched to SMPPS on 7-06-11.
sr. member
Activity: 373
Merit: 262
August 08, 2011, 04:04:53 PM
#25
The arsbitcoin.com pool's 800 BTC buffer is interesting. It looks like that pool is only 1/6th the size of BTCGuild.com.

Is there a way to find out over what length of time those 800 BTC were accumulated, edit: or that they actually have 800 BTC?

edit:
If you consider the possibility that Ars Bitcoin and BTCGuild are working together, that 800 BTC buffer could explain where the missing blocks from BTCGuild went. Then at some point someone will keep the buffer. If there is negative manipulation of the pool, those blocks that are solved will show up somewhere and someone may see that a mystery mining pool has appeared. Transferring them to another pool would be a great way to hide.

I see that Ars Bitcoin keeps track of who solved each block. It would be possible to give the person who is appointed to win a share that is guaranteed to win, wouldn't it?
legendary
Activity: 1750
Merit: 1007
August 08, 2011, 03:50:47 PM
#24
online poker is rigged!

I am not very good with math, but I would like to see someone run these numbers on BurningToad's Arsbitcoin SMPPS pool. We are at a positive buffer of ~800btc, which is +16 blocks found over the probability curve, and have held it for almost a month. BurningToad also came forward with three (or maybe it was just two) blocks that were found but never registered by his code due to a bug concerning a block found while the previous found block's payout calcs were running (iirc). He could easily have kept them and none would be the wiser, especially with the crazy positive buffer. How many operators ARE keeping them?

Anyways, beyond the hidden block thing, what are the odds that we would end up with this crazy buffer for so long? It seems radically unlikely, just like the string of bad luck the OP is quoting. With the rate of distribution being relatively constant, one pool's up is another pool's down...


Assuming the network as a whole is constant isn't entirely accurate though.  Looking at the charts at bitcoinwatch can show you the network can vary quite a lot.  One pool's up/down does not mean another pool is having the opposite.  However, the odds of consistently solving blocks faster than expected at a difficulty are the same as consistently taking longer.

I spent the last week going over the changes I had made to pushpool, trying to find some simple change I made that would've caused it to not push valid shares upstream to bitcoind, and I've come up blank.  I replaced the pools with stock pushpool [outside of db-mysql.c and a long-poll disable bit] the other day and it changed nothing.  Other pools are running JoelKatz's patches so I have no reason to believe that is causing any issues either.

Software side issues are ruled out.  The only thing itching in the back of my mind is the way the servers have been split up to run against their own bitcoind + wallet, but that shouldn't mean anything, especially when we've been doing the same thing for a long time and have shown ups/downs regularly in luck.  Splitting miners [and making sure they come back to the same server with results] across 6 smaller pools should have no difference in variance from all using one large pool as long as each pool is running different headers [receiving addresses] for the hashes.
sr. member
Activity: 373
Merit: 262
August 08, 2011, 03:49:23 PM
#23
If it goes to -100% one day then +100% the next day there is manipulation going on, even though in that particular case there is no indication of stealing. If there is manipulation going on, it's very important to know that.

If there is manipulation, then most likely there is also stealing going on. Why would there be manipulation of the luck in the pool without stealing? Can anyone explain this?

I'm trying to show that there is a high probability of manipulation of the pool going on, not that there is stealing going on. Once we determine that there is manipulation, it will be easy to conclude that there is stealing, unless the long term output of the pool is above the expected amount.

So having it shoot up to +70% increases the chance that there is manipulation going on, just as -40% would, although it is hard to figure out where the pool got those extra solved blocks from if it was manipulation. Why? maybe somebody is trying to divert attention away from the low luck days, and if so it worked fairly well because people on the forum who started talking about the bad luck quit talking about it when luck shot way up.

And the probability of the 5 day period is 0.0036 (277:1), not 0.01.

If someone wants to find out what the expected(?) deviation for positive and negative luck is on the pool that would be nice. Those are the numbers you would get if you took all the positive and then negative values over a very long time on a normal non manipulated pool and averaged them.
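As a rough starting point for that question, and assuming block finding follows a Poisson distribution as discussed elsewhere in the thread: a Poisson count with mean B has a standard deviation of sqrt(B), so the typical relative swing in luck is about 1/sqrt(B). The block counts below are hypothetical round numbers, not BTCGuild data, and the standard deviation is only a proxy for the "average of the ups and downs" asked about (for a roughly bell-shaped spread, the average absolute deviation is about 0.8 of it).

```python
import math


def typical_luck_swing(expected_blocks):
    """Relative standard deviation of a Poisson count: sqrt(mean) / mean."""
    return 1.0 / math.sqrt(expected_blocks)


# Hypothetical expected block counts for a day, a week and a ~2 week period
for label, mean in [("1 day", 25), ("1 week", 175), ("2 weeks", 350)]:
    print(f"{label:8s} {mean:4d} expected: about +/-{typical_luck_swing(mean):.1%}")
```

The swing shrinks only with the square root of the block count, so single days look far noisier than a full difficulty period even on an honest pool.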
sr. member
Activity: 448
Merit: 250
August 08, 2011, 03:32:43 PM
#22
online poker is rigged!

I am not very good with math, but I would like to see someone run these numbers on BurningToad's Arsbitcoin SMPPS pool. We are at a positive buffer of ~800btc, which is +16 blocks found over the probability curve, and have held it for almost a month. BurningToad also came forward with three (or maybe it was just two) blocks that were found but never registered by his code due to a bug concerning a block found while the previous found block's payout calcs were running (iirc). He could easily have kept them and none would be the wiser, especially with the crazy positive buffer. How many operators ARE keeping them?

Anyways, beyond the hidden block thing, what are the odds that we would end up with this crazy buffer for so long? It seems radically unlikely, just like the string of bad luck the OP is quoting. With the rate of distribution being relatively constant, one pool's up is another pool's down...
