Pages:
Author

Topic: Optimal pool abuse strategy. Proofs and countermeasures - page 3. (Read 30541 times)

sr. member
Activity: 373
Merit: 250
This was also an issue back when doublec ran his pool using puddinpop's server and client programs (before slush started his) which used a slightly different mining method but the end result was similar.  It was originally in "connected" mode and then changed to "contributed" due to user feedback, but there was much debate between the two.  Because it took days for the group to find a block, this was a large problem since it would actively dictate the behavior of potential users and whether they'd join the pool or not.  My first post on the forums in fact was my thoughts on the matter, in which I suggested something similar to what slush is currently doing with the time-weighting.  
full member
Activity: 238
Merit: 100
"However, if you stop mining before the block is found, there is an extra “unearned” income."

I can't seem to find an explanation of what this "unearned income" is. The only possibility I can think of is that it becomes more difficult to find hashes which have not been found before, as work progresses on a block, thereby making the earlier shares easier to find. If this is not the case, then I cannot see where the share based system would introduce unfairness.

When you submit a share, there is 1/difficulty chance it is a winning hash. But all the non-winning shares do not help finding the winning hash. So if you stop contributing, all your previous effort has zero value to the pool and pool should pay you zero for them. If it pays something, it is "unearned" income.


Quote
"Therefore, switching between pools in connected mode and between such pools and individual mining does not change your payoff."

This statement seems to directly contradict the hypothesis that cheating is in fact possible...

"Connected" mode is defined in previous paragraphs of the paper. A truly connected mode is not practical because it means miners are sending all the hashes (not just shares) to server. In connected mode, if you do not sending hashes this very moment, you get zero payment if the block is found because it couldn't have been you that found the block. A connected mode can be approximated by taking into account only shares contributed during, say, last minute or like in current slush's pool, by wighting recent shares much more than the older ones. In such mode cheating is mathematically possible, but the benefit is so small that it is note worth the effort and switching costs (it always costs some hashes to redirect ones mining effort).

Slush's old pool (before anti-cheating measures) was in "contributed" mode. All your past shares counted. It was prone to cheating described in this paper.
newbie
Activity: 1
Merit: 0
Sorry, I'm failing to understand the nature of the "unearned income", discussed in the paper. The critical section appears to be the first two paragraphs of section 2 "Pool cheating". In particular, the sentence:

"However, if you stop mining before the block is found, there is an extra “unearned” income."

I can't seem to find an explanation of what this "unearned income" is. The only possibility I can think of is that it becomes more difficult to find hashes which have not been found before, as work progresses on a block, thereby making the earlier shares easier to find. If this is not the case, then I cannot see where the share based system would introduce unfairness.

Also, the paragraph starts with the sentence:

"Therefore, switching between pools in connected mode and between such pools and individual mining does not change your payoff."

This statement seems to directly contradict the hypothesis that cheating is in fact possible...

I expect all the conclusions made in the paper and the adjustments made on its basis are sound, but the paper does seem to lack at least one critical point, so some elaboration would be appreciated.
sr. member
Activity: 416
Merit: 277
Good catch to spot this pool exploit! I completely missed this in spite of thinking about pooled mining quite a lot.

If you have any doubts about the reality of this exploit then consider the following thought experiment:
Suppose there are ten miners hashing at equal rates. After they each submit one work-unit to the pool, if a block is then found they each get 1/10th of the total profit. If one of the miners leaves and is replaced by a new miner at that stage and each miner then submits another work-unit then, if a block is then found the new miner gets 1/20th of the profit having done the same amount of work as the miner whose place he took. It's only fair if the share of the profit is proportional to the work done.

One solution to the problem is to make the payout for the block proportional to the time since the last block. The mining pool would have to have a slush fund to cover the cases where the block generation rate falls below the long-term average.

ByteCoin
sr. member
Activity: 294
Merit: 273
I think you are all missing the point a little.  The concern is not cheating but undetectable cheating. 

Bad guy can mask his behaviour pretty easily

Really?  My guess is that any of these strategies would differ drastically from the typical miner.  Your statistics may show differently, but I'm guessing most people who bother to use a mining pool leave their clients running pretty steadily in at least large chunks (if not 24/7), rather than connecting and disconnecting all the time.  All of the cheating strategies require disconnecting early, a behaviour that can be distributed but not masked by using multiple anonymous accounts--at any meaningful degree of exploitation, a significant proportion of those clients have to be in at the beginning of each round, with that same client disconnecting consistently before the end.  The only portion of the attack that can be masked through multiple accounts is the obtaining of statistics on the beginning of each round, which is not behaviourally necessary to identify the cheating signature.  Detection code consists of the following:

1.  Record connected/disconnected periods using getwork requests, not submitted shares (presumably already done based on your leaderboard's online/offline capability)
2.  Implement minimum send threshold of .5 bitcoins, or similar amount of value as the bitcoin exchange rate and network hashrate both grow.
3.  On send/payout of any account, calculate the average in and out times across blocks worked on by each worker, compared to the overall average.
4.  If times are notably shifted towards the cheating strategy once, send the following email:
Quote
Dear user, the server has detected that your worker account "X" is frequently connecting and disconnecting at odd times.  Perhaps there is a problem with your client software or internet connection?  Since it exists to reduce the variance of the mining process for a large group of people, pooled mining works best when your client is connected for longer periods at a time, or consistently during regular periods of the day and week.  If you are unable to connect your client consistently, perhaps pooled mining is not for you.  Should the problem persist, the next payout from that worker may be held by the server until you have an opportunity to resolve it.  Thank you for your cooperation!
If it happens again, do what you said but be willing to give someone one more chance if they can come up with a reasonable explanation for their mining client's bizarre behaviour.

...and then, because the only way to get around this method of cheat detection is to use accounts once and then burn them at an incredible rate....
5.  Implement captcha on account and worker registration.

Problem solved!

Keep in mind that for any of these attacks to work the cheater has to *consistently* be in at the beginning of each round (and I mean the very beginning, because even a short delay or randomisation will significantly impact the profits through loss of short blocks) and avoid the really long ones at all costs (because going all the way to the end with any kind of consistency, again, wipes out that edge very fast).  And I fail to see how this behaviour can be masked across any combination of accounts, because no legitimate user is going to randomly be connected at the precise beginning of every single round and then never stay connected to the end of a long block.

For any given detection code in step 4 above you can calculate both the chance of accidentally catching someone who's not trying to cheat and the maximum margin of cheating possible--you'll see what I mean pretty quickly.  As an example, if you're up for it Raulo or Ryo, try calculating these for detection code that flags workers whose average time logging in a worker for the first time (as opposed to just continuing from the end of a previous block) is < 5 minutes after a round begins, and average time out for good on a round is < 25 minutes later in it.  You'll notice that the cheating edge is blown away by trying to beat the detection filter, yet the probability of a user randomly obtaining these stats by accident is redonkulous.  Am I drastically missing a strategy here?

Sincerely,
eMansipater
legendary
Activity: 1386
Merit: 1097
For those that like to see, say estimated performance, what if you included: Number of shares in the last 2 hours?

If you read yesterday's posts here and in pool thread carefully, I mentioned this many times.

But things changed, looks like there will be all stats back soon.
sr. member
Activity: 247
Merit: 252
Hiding number of my shares seems weird. I can see them in my miner. Why hide information that came from me in the first place?

You see your shares, but you don't know to which block it fits. But shares on profile page were cleared when block was found. That's the difference.

edit: sorry for not reading thread carefully, but as davidonpda mentioned, any kind of stats would be nice. Your site is very handy for monitoring miners performance.
legendary
Activity: 1386
Merit: 1097
Hiding number of my shares seems weird. I can see them in my miner. Why hide information that came from me in the first place?

You see your shares, but you don't know to which block it fits. But shares on profile page were cleared when block was found. That's the difference.
sr. member
Activity: 247
Merit: 252
Hiding number of my shares seems weird. I can see them in my miner. Why hide information that came from me in the first place?
hero member
Activity: 726
Merit: 500
Slush, I miss the number of shares counters.  It was a good way for me to tell how my machines were doing relative to one another.  Is it possible to at least show a percentage or bar graph comparing each of a user's miners for the current block and maybe overall mining history?  It does not need to be compared to the pool as a whole.
legendary
Activity: 1386
Merit: 1097
I think you are all missing the point a little.  The concern is not cheating but undetectable cheating. 

Bad guy can mask his behaviour pretty easily, so with anonymous registrations it is still a problem. And I'm also trying to keep the system to be fully automatic, with clear rules. I don't like to disconnecting users because I 'think' the worker is 'maybe' cheating.
full member
Activity: 238
Merit: 100
I think you are all missing the point a little.  The concern is not cheating but undetectable cheating.  Every method of cheating the pool proposed thus far, and indeed virtually any statistically significant exploit(by definition), is easily detectable by the pool operator. 

With a little effort by the bad guy, slush will have to have an investigative team to find the problem. Switching users, switching IPs, being honest for some time or cheating with only a part of ones hashrate, all it make harder for the pool operator to detect. There is a lot of normal variance and it masks quite a bit. And it will be even harder, if there are many pools. Sure, if you cheat with, say, 25% of the pool hashrate, you are exposing yourself but with up to 5%, not that much.
sr. member
Activity: 294
Merit: 273
I think you are all missing the point a little.  The concern is not cheating but undetectable cheating.  Every method of cheating the pool proposed thus far, and indeed virtually any statistically significant exploit(by definition), is easily detectable by the pool operator.  Simply detect, warn, and remove.  Case closed.  Although I applaud Slush's efforts to keep abreast of this issue!  I just think his/her valuable time could be better spent elsewhere.
full member
Activity: 238
Merit: 100
Below is a script for calculating cheater's income in a pool that keeps only some number of most recent shares. It makes the switching from the pool much less profitable.

Code:
#!/usr/bin/awk -f
BEGIN{srand();lambda=0.07 # when to stop mining for the pool as a ratio of difficulty.
payratio=0.05 # how many last shares pool counts. Here 5% of difficulty
print "Fractions of income from"
print "Pool     solo     total"
for (i=1;i<=100000000;i++)
{

  r=rand()
  j=-log(r)


  if (j>lambda)
  {
    payolaind+=(j-lambda) # independent payoff
    payola+=(j<=payratio ? lambda :  (lambda-(j-payratio) > 0 ? (lambda-(j-payratio))/payratio : 0 )) # pool payoff counting only last payratio shares
  }
  else
    payola++  #full payoff. Cheater stayed in the pool for the whole round.

  if (i%1000000==0) print payola/i,payolaind/i,(payola+payolaind)/i
}
}


The above code calculates payoff for a pool that keeps 5% of the difficulty number of shares and the cheater exits from the pool after 7% of the difficulty (this value has to be much lower, lower than original 43%). The calculated value is about 2% which should be low enough to discourage this behavior as switching miners has some overhead (and the pool also has some latency overhead compared to solo mining). Keeping 10% difficulty, results in about 4% gain.  
full member
Activity: 238
Merit: 100
I believe having the value of contributed shares decrease over time would also fix the cheating problem, ensuring that the expected gain of joining is the same at all times. And you'd still be able to display the statistics like before.

This will not help since all old shares are in essence worth zero. Only the winning one is worth 50 BTC. I(λ) function is always positive. No matter how low is your payout after switching out of the pool, you still get "unearned" gains.

It may be possible to make cheating impractical but at the expense of making the pool more like individual mining. Because switching between pool and solo mining costs you some hashes (even if you do it as cleverly as possible), if the payout from cheating is low enough, it may be not worth cheating. I did Monte Carlo simulations  that show that counting only last 10% of difficulty of shares in the pool, reduces the extra cheating income to about 5% but the variance in the pool gets larger. But with multiple pools it get more interesting.
full member
Activity: 238
Merit: 100
Correct me if I am wrong, but isn't the expected number of winning hashes after n attempts simply n*p? 1-(1-p)^n is not the probability to find one block after n hashes, it is the probability to find at least one block after n hashes (and, as I said, in a typical scenario there would not be n hashes, as hashing stops after the first block has been found). (I tend to get these probability calculations wrong all the time, though - my apologies in case I am just spouting nonsense).

Yes, the average is n*p but what we need is the probability of finding indeed at least one block and by reverse not finding any. This is what drives the strategy. If the block is found early, keep mining for the pool. If not, start individual mining and return after the pool finds a block.
full member
Activity: 238
Merit: 100
slush and twobitcoins,

You are right. The block-shares resetting still allows for cheating. I was blinded by this nullification stuff and didn't notice it will erase some cheating gains but not all.

Slush "blinding" approach is probably best strategy currently (except awarding 50 BTC to the person who found the block which solves the porblem once and for all Smiley ) but I still think by some bitcoin network analysis (due to different network distances between the nodes) one can find at least the continent, where the current block is found. And the strategy would be to join the pool if the last block was found in Europe and leave the pool if not.  This way you will be more likely connected to the pool for fresh rounds and less likely for stale.

Contributed method is very tricky because in fact, all that matters is the winning block and the non-winning shares are all wasted.  

By the way, for those who want to have some practical test of this cheating strategy, instead of special functions, here is the Monte Carlo run which finds the round length (with exp(-x) distribution) and performs swiching to individual mining after some fraction of shares. Here is the awk code.
Code:
#!/usr/bin/awk -f
BEGIN{srand();lambda=0.434
print "pool  independ.  total"
for (i=1;i<=10000000;i++)
{
# calculate current round length..
  r=rand()
  y=-log(r)

  if (y > lambda) # we switched from the pool after lambda
  {
    payolaind+=(y-lambda)  # independent payout based on fractions not connected to the pool
    payola+=lambda/y         # pool payout reduced since we stopped mining after lambda
  }
  else
    payola++             #normal pool payout


  if (i%100000==0) print payola/i,payolaind/i,(payola+payolaind)/i
}
}
newbie
Activity: 47
Merit: 0
I must admit I didn't go through every line of the paper, so I just wanted to double check if I understood the problem correctly, in simple terms.

Is the problem this, as an example: assume there are 10 miners in a pool, and they each calculate 1000 hashes before the winning block is found. Then among those 1000 hashes, each of them has created the same expected amount of shares (let's say they have 2 shares on average). That would be the fair case.

The problem is that usually the shares won't be the last hashes to be found. So miner X might already have generated 2 shares among his first 500 hashes. If he stops exactly then, he'll have the same number of hashes as the other minors, for only half the work (on average). Of course had he continued to do 500 more hashes, he likely would have created more shares, too, so the calculation is not THAT simple.

Or in other words: if you stop right after receiving a share, you will have done less work for your shares than the fair miners, on average.

Not sure if all the maths in the paper adds up, as I said, I didn't check thoroughly. For example it might be incorrect to talk about n hashes, because if the winning block is found before n hashes, there wouldn't even be n attempts at hashing (I assume once a block is found, all calculations reset to the start :-) Correct me if I am wrong, but isn't the expected number of winning hashes after n attempts simply n*p? 1-(1-p)^n is not the probability to find one block after n hashes, it is the probability to find at least one block after n hashes (and, as I said, in a typical scenario there would not be n hashes, as hashing stops after the first block has been found). (I tend to get these probability calculations wrong all the time, though - my apologies in case I am just spouting nonsense).

legendary
Activity: 1386
Merit: 1097
As the pool was before, with full statistics, users could check whether their revenue was really ((50/total_shares) * nb_shares). They could also, by talking with each other, evaluate if the pool reported a plausible number of total shares. While there were ways for the pool to cheat, the statistics allowed users to check it didn't cheat too much.

Currently, you don't see realtime stats, but you can compare those numbers on found blocks older than 2 hours, so not such big difference. Also, as I said, this is only temporary solution. Once I'll change some internal things, I'll show realtime stats per actual hour, so everybody will see almost the same numbers as before this last update.

Quote
As long as you release data on how much a share is worth, "cheating" becomes possible.

But you can calculate it itself, too. There is no fancy stats now, but I don't hide any needed info. You know how many shares you submitted and how many reward did you get. You also know that one round is worth of 50BTC. There's nothing hidden! You only cannot calculate/use this numbers at realtime.

Quote
I was wondering if you had found a way to preserve both openness and fairness.

Yes, described above.

Quote
I believe having the value of contributed shares decrease over time would also fix the cheating problem, ensuring that the expected gain of joining is the same at all times.

Tell me how, I'll implement this! But this attitude open completely different way of cheating, I spent on this many hours by thinking. And once I'll need to take a financial risk of pool unlucky/cheating, pool mining won't be for free anymore :-).
Ryo
newbie
Activity: 28
Merit: 1
If you critize my trustworthy

I have no opinion about you, good or bad. My interest is what can be done, not what is actually done.

Quote
then yes, I can cheat users all the time, I don't need to hide stats to do so.

As the pool was before, with full statistics, users could check whether their revenue was really ((50/total_shares) * nb_shares). They could also, by talking with each other, evaluate if the pool reported a plausible number of total shares. While there were ways for the pool to cheat, the statistics allowed users to check it didn't cheat too much.

My understanding of the way your pools works now is: miners find shares, they receive bitcoins, but they can't know how much the pools pays each share. As long as you release data on how much a share is worth, "cheating" becomes possible. I was wondering if you had found a way to preserve both openness and fairness.

I believe having the value of contributed shares decrease over time would also fix the cheating problem, ensuring that the expected gain of joining is the same at all times. And you'd still be able to display the statistics like before.
Pages:
Jump to: