Topic: SOLVED [Bitcoins.lc] Issue with invalid shares (10 BTC bounty) (Read 16737 times)

kjj
legendary
Activity: 1302
Merit: 1026
Ugh.  Why wasn't it merged in a month ago?
legendary
Activity: 1596
Merit: 1100

Oh.  This is an old and well known problem.  Nothing to do with pushpool at all.

luke-jr worked around it in bitcoind with http://luke.dashjr.org/programs/bitcoin/w/bitcoind/luke-jr.git/shortlog/refs/heads/getwork_dedupe but there are other options for varying the coinbase as well.

ius
newbie
Activity: 56
Merit: 0
I have opened a pull request to revert the offending commits: https://github.com/jgarzik/pushpool/pull/24
member
Activity: 222
Merit: 12
The problem lies in poclbm's support for X-Roll-NTime.
We applied two patches for this to our source: the first disables the header that allows X-Roll-NTime, and the second disables the support completely.

That seems to have fixed all the issues we had.

jgarzik: The patch you merged that added this support consisted of two commits, one for each of those; we reverted both.

Now - back to coding, our pool is growing extremely fast.
Sorry about withholding the solution. I'm all for the spirit of the community, and all further patches will be publicly available on my own GitHub.

We're _STILL_ having problems with a low number of duplicates, but nowhere near 5-6%.

Once again, sorry. :)
If you want to, donate the "bounty" to ius, who works hard to help us out with both bugs and improvements.
:D

You talk well. But I do not understand anything. We must quit this business. In this I do not understand much.
kjj
legendary
Activity: 1302
Merit: 1026
The client will already complain if the clock is wrong by a lot.  I think that should be enough.  NTP is trivial to install, configure, and run on any OS.
legendary
Activity: 1428
Merit: 1000
https://www.bitworks.io
The remaining duplicates are probably coming from the clock problem ArtForz found.  Since you are running NTP on your server, I recommend patching bitcoind to disable the crappy clock adjustment.

The file is util.cpp, look for function GetAdjustedTime().

Code:
int64 GetAdjustedTime()
{
    return GetTime() + nTimeOffset;
}

Change to:

Code:
int64 GetAdjustedTime()
{
    return GetTime();
}

That should probably end up in the official client, in case any devs are watching.  If they really want to keep the clock adjustment even though NTP does a much better job, the getwork() function in rpc.cpp should be changed.  pBlock->nTime needs to be set to nPrevTime, which increases monotonically, after the call to IncrementExtraNonce().


Great point, I was wondering about that myself and can definitely see why it causes issues. Applying this to the official client, with an NTP flag for those whose clocks are already accurate, may be worthwhile. Some PCs keep time well while others do poorly because of hardware and/or software.
kjj
legendary
Activity: 1302
Merit: 1026
The remaining duplicates are probably coming from the clock problem ArtForz found.  Since you are running NTP on your server, I recommend patching bitcoind to disable the crappy clock adjustment.

The file is util.cpp, look for function GetAdjustedTime().

Code:
int64 GetAdjustedTime()
{
    return GetTime() + nTimeOffset;
}

Change to:

Code:
int64 GetAdjustedTime()
{
    return GetTime();
}

That should probably end up in the official client, in case any devs are watching.  If they really want to keep the clock adjustment even though NTP does a much better job, the getwork() function in rpc.cpp should be changed.  pBlock->nTime needs to be set to nPrevTime, which increases monotonically, after the call to IncrementExtraNonce().
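
To illustrate why a timestamp that steps backwards can lead to repeated work, here is a toy model in Python (not bitcoind code). It assumes each work unit is identified only by a (time, counter) pair and that the counter restarts whenever the timestamp changes; `issue_work` and `count_duplicates` are made-up names for this sketch.

```python
def issue_work(clock_readings, monotonic):
    """Toy model: each work unit is identified by a (time, counter) pair.

    The counter restarts at 1 whenever the timestamp changes, which is
    only safe if a timestamp is never used again later on.
    """
    issued = []
    last_time = None
    counter = 0
    for reading in clock_readings:
        t = reading
        if monotonic and last_time is not None:
            t = max(t, last_time)  # never let the timestamp step backwards
        if t != last_time:
            counter = 0            # new timestamp, so restart the counter
            last_time = t
        counter += 1
        issued.append((t, counter))
    return issued


def count_duplicates(issued):
    """Number of issued work units that repeat an earlier one."""
    return len(issued) - len(set(issued))


# A clock that briefly steps back by one second (hypothetical readings).
readings = [100, 100, 101, 100, 100, 101]
print(count_duplicates(issue_work(readings, monotonic=False)))
print(count_duplicates(issue_work(readings, monotonic=True)))
```

With these readings the naive version hands out three repeated pairs, while clamping the time so it never decreases hands out none.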
newbie
Activity: 56
Merit: 0
If you want to, donate the "bounty" to ius, who works hard to help us out with both bugs and improvements.
http://blockexplorer.com/tx/6647483f1329e5fab259b5d5619a7c20a28d6b440a1d866d25971dba78001912
sr. member
Activity: 403
Merit: 250
The problem lies in poclbm's support for X-Roll-NTime.
We applied two patches for this to our source: the first disables the header that allows X-Roll-NTime, and the second disables the support completely.

That seems to have fixed all the issues we had.

jgarzik: The patch you merged that added this support consisted of two commits, one for each of those; we reverted both.

Now - back to coding, our pool is growing extremely fast.
Sorry about withholding the solution. I'm all for the spirit of the community, and all further patches will be publicly available on my own GitHub.

We're _STILL_ having problems with a low number of duplicates, but nowhere near 5-6%.

Once again, sorry. :)
If you want to, donate the "bounty" to ius, who works hard to help us out with both bugs and improvements.
legendary
Activity: 1428
Merit: 1000
https://www.bitworks.io
Did he pay for the fix? If so, I find myself conflicted about whether or not it should be released. Maybe cover his cost plus what he lost, and then he should step up in the spirit of the community. You may want to go to the source of the fix and see if he will discuss it, since he posted his name earlier.
newbie
Activity: 56
Merit: 0
I'm offering to pay you 5BTC should you release a detailed explanation of the problem, and the fix.  I don't even mine in your pool (I actually hopped on IRC the first day you started to give you a couple suggestions/features that I would have needed to join).  This way, you are now splitting the cost and the entire community as a whole benefits.
Code:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

As one of the people who was trying to help track the bug down, I'm a little offended too with the "I'm going to withhold the fix", especially given that Pushpool is open source and cost a lot more than a day to develop.

I'm going to raise the bounty by another 2 btc for releasing the fix, but encourage you to just do the right thing instead of being mercenary and taking the money.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iEYEARECAAYFAk30xMYACgkQaj+LAPvd0qQCtQCfQjbjz0HLky0RwIO6j1lqLw36
3fwAmwSbONdoFibFZ/jbZ/bjGE8RIf53
=m0/A
-----END PGP SIGNATURE-----
legendary
Activity: 1596
Merit: 1100
I'm planning to keep that to myself, because I think the other pools based on pushpoold will eventually face the same problem.
This issue cost the pool/me personally 10 BTC + a whole day of hard work with debugging, tcpdump and ~10 people who tried in every way to help us.

So...  you won't even tell the author of pushpool?

pushpool certainly cost me a lot more than "10 BTC + a whole day of hard work."

Zip for donations or thanks, just people asking for free support all the time, too.

full member
Activity: 155
Merit: 100
This has been bugging the crap out of me since I saw you refused to release the actual problem.

I'm offering to pay you 5BTC should you release a detailed explanation of the problem, and the fix.  I don't even mine in your pool (I actually hopped on IRC the first day you started to give you a couple suggestions/features that I would have needed to join).  This way, you are now splitting the cost and the entire community as a whole benefits.

Or heck, just PM me the problem and I'll write up the problem/solution myself after I test it in a dev environment.

-Phil
hero member
Activity: 797
Merit: 1017
I wonder how, say, the Linux project would end up if everyone who spots and corrects a bug asked for something in return for releasing the patch.
member
Activity: 222
Merit: 12
I'm not selling the fix, but other pool operators may contact me and we can negotiate something so both parties are happy.
That sounds pretty contradictory.

"I'm not selling this pile of drugs I've got, but if somebody wants some then they can contact me and we can negotiate something so both parties are happy".
sr. member
Activity: 403
Merit: 250
Just to make things clear:

I'm not selling the fix, but other pool operators may contact me and we can negotiate something so both parties are happy.
AFAIK that's okay with the license too.
member
Activity: 76
Merit: 10
pushpool is released under the GNU General Public License version 2 (LICENSE)

Pretty sure this means if you make modifications to the code (especially for money), you are in breach of this unless you release these modifications to the community (but I'm not a lawyer, might be wrong!)

If you want to recoup the 10BTC 'cost' of fixing, then I would set up a new bitcoin address for donations related to this specific bugfix, once you reach 10BTC and bitcoins.lc have made the money back for the original payment to ius, then split the rest between bitcoins.lc and ius.

...on another note: bitcoins.lc seems to be fully up again.  I would say that having approx an hour downtime since I started mining there on Tuesday seems pretty good, otherwise it's been totally stable - keep up the good work Jine!

Will

The GPL puts you under no obligation to release source code you have changed, it only forbids him from redistributing it without the source code or from trying to sell the product at all; offering a bounty to fix an issue with the code is perfectly fine as far as the GPL is concerned and so long as he doesn't attempt to sell the fix to other pool owners he is perfectly fine.
hero member
Activity: 797
Merit: 1017
pushpool is released under the GNU General Public License version 2 (LICENSE)

Pretty sure this means if you make modifications to the code (especially for money), you are in breach of this unless you release these modifications to the community (but I'm not a lawyer, might be wrong!)

No, GPL forces you to release the source code only if you release a modified version of the software.
hero member
Activity: 767
Merit: 500
pushpool is released under the GNU General Public License version 2 (LICENSE)

Pretty sure this means if you make modifications to the code (especially for money), you are in breach of this unless you release these modifications to the community (but I'm not a lawyer, might be wrong!)

If you want to recoup the 10BTC 'cost' of fixing, then I would set up a new bitcoin address for donations related to this specific bugfix, once you reach 10BTC and bitcoins.lc have made the money back for the original payment to ius, then split the rest between bitcoins.lc and ius.

...on another note: bitcoins.lc seems to be fully up again.  I would say that having approx an hour downtime since I started mining there on Tuesday seems pretty good, otherwise it's been totally stable - keep up the good work Jine!

Will
sr. member
Activity: 403
Merit: 250
newbie
Activity: 42
Merit: 0
I'm planning to keep that to myself, because I think the other pools based on pushpoold will eventually face the same problem.
This issue cost the pool/me personally 10 BTC + a whole day of hard work with debugging, tcpdump and ~10 people who tried in every way to help us.

If I'm to release this, the pool needs something in return.
We'll see if this is just an isolated issue with our pool, or if anyone else will face the same problem.
I agree with you
hero member
Activity: 731
Merit: 503
Libertas a calumnia
The server is up but the pool is down now...
member
Activity: 69
Merit: 10
Kupo!
another server attack ?
member
Activity: 70
Merit: 10
Server is down: no website, and connection problems when mining.

D:
full member
Activity: 238
Merit: 100
Server is down: no website, and connection problems when mining.
hero member
Activity: 797
Merit: 1017
If there's a bug with pushpool (which is what it sounded like from IRC) then you should let it be known. The developer of that was good enough to let you use it freely... share the love.

edit: Also it'll save me a weekend of serious debugging. If I track it down myself I won't be keeping secrets!

+1

If it's about pushpool, then you should tell them.
sr. member
Activity: 403
Merit: 250
I'm planning to keep that to myself, because I think the other pools based on pushpoold will eventually face the same problem.
This issue cost the pool/me personally 10 BTC + a whole day of hard work with debugging, tcpdump and ~10 people who tried in every way to help us.

If I'm to release this, the pool needs something in return.
We'll see if this is just an isolated issue with our pool, or if anyone else will face the same problem.
member
Activity: 222
Merit: 12
What was the problem?
sr. member
Activity: 403
Merit: 250
We finally solved our issue with rejected shares!
Credits to ius over at #bitcoins.lc for helping us out.

His account was just credited with 10 BTC ($270)
Thanks all for trying to help us! We really appreciate it!

If any other pool operator wants to know what it is, please contact me and we can work something out...
newbie
Activity: 56
Merit: 0
I'm chasing a hunch. If anyone currently using bitcoins.lc would like to compare results from disabling Long Polling vs. leaving LP on, I'd find it instructive.
sr. member
Activity: 403
Merit: 250
#bitcoins.lc at irc.quakenet.org or use the webchat:
http://www.bitcoins.lc/chat

/ Jim
kjj
legendary
Activity: 1302
Merit: 1026
Where the hell is your IRC channel?  I've been lurking in what I thought was the channel all day and haven't seen a peep.
sr. member
Activity: 403
Merit: 250
The bounty has not been claimed yet.
Still looking for a solution(!)

A few people over at our IRC channel are working HARD to figure out the problem.

I think a packet sniffer would also tell us the answer to this, and wouldn't require a restart of anything?

We tcpdumped all bitcoind->pushpoold communication and found 4 duplicate getworks.
What we should do about it, or why it's that way - we don't know.
kjj
legendary
Activity: 1302
Merit: 1026
I think you are going to need to restart pushpoold with full logging on, and hope that you get enough information from the logs.

Right now, I can't tell if the problem is in get_work handing out the same work to many people, or if the problem is in check_hash (or hist_lookup) finding improper duplicates.  (both are in msg.c)

The answer is in the history elist, but I don't think there is any way to interrogate it while running.  Looks like the logs might show enough info.
sr. member
Activity: 403
Merit: 250
What are the specs of the computer it's running on (including bandwidth)? Too little RAM could potentially cause shares to not be accepted.

Enough for both DB and MySQL.

Quote
jine@bitcoins:~$ free -m
             total       used       free     shared    buffers     cached
Mem:          2022       1372        649          0        105        611
-/+ buffers/cache:        655       1366
Swap:          234          4        230

Bandwidth is redundant 100mbit (200+200mbit full duplex)
CPU for main VM is 8 total cores of two Xeon E5504
newbie
Activity: 53
Merit: 0
Workers are doing duplicated work.

See screenshot below.


It's from all workers, all over the pool.
Duplicated work, and we really can't figure out why or what to do about it.
I can see my IP :D
kjj
legendary
Activity: 1302
Merit: 1026
Has nothing to do with the midstate.  The midstate is calculated on the first 512 bits, which is identical in the two examples given in IRC.  The difference is in the timestamps, which is in the second half and will not change the midstate at all.

Also, the extraNonce is in the generation transaction.  The mining client never sees it and can't change it.
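
A quick way to check this against the two getwork data values pasted from IRC in this thread is to split the header portion into its fields and compare them. This is a rough Python sketch: the field layout (version, previous block hash, merkle root, time, bits, nonce) is the standard 80-byte block header, only the first 160 hex characters of each value are used, and `split_header` and `differing_fields` are names invented for this example.

```python
# Field widths of the 80-byte block header, in hex characters.
HEADER_FIELDS = [
    ("version", 8),
    ("prev_block", 64),
    ("merkle_root", 64),
    ("time", 8),
    ("bits", 8),
    ("nonce", 8),
]
MIDSTATE_HEX_CHARS = 128  # the first 512 bits (64 bytes) feed the midstate


def split_header(data_hex):
    """Cut the header part of a getwork data string into named fields."""
    fields = {}
    pos = 0
    for name, width in HEADER_FIELDS:
        fields[name] = data_hex[pos:pos + width]
        pos += width
    return fields


def differing_fields(a_hex, b_hex):
    """Names of the header fields whose hex text differs between a and b."""
    a, b = split_header(a_hex), split_header(b_hex)
    return [name for name, _ in HEADER_FIELDS if a[name] != b[name]]


# Header portion (first 160 hex chars) of the two data values from the IRC log.
PREFIX = (
    "00000001"
    "458bbc3db0fdb6d264cfcd58bec77b06259835f18f8df83400000c6800000000"
    "10ef5d24e007b8bf8e73d82cd2d62c2c87ed0f2dc404c7ecfa8ea1b83734d875"
)
DATA_A = PREFIX + "4df2024e" + "1a1d932f" + "00000000"
DATA_B = PREFIX + "4df20251" + "1a1d932f" + "00000000"

print(differing_fields(DATA_A, DATA_B))
print(DATA_A[:MIDSTATE_HEX_CHARS] == DATA_B[:MIDSTATE_HEX_CHARS])
```

For those two values the only differing field is the time, which starts at hex offset 136, past the first 128 hex characters that the midstate covers.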
hero member
Activity: 700
Merit: 500
bitcoind isn't threaded AFAIK (yet)

Some more data:


14:56:31 <@jine> 28  133 total - SELECT * FROM `shares` WHERE `reason` = 'unknown-work'
14:56:54 <@jine> 193  114 total - SELECT *  FROM `shares` WHERE `reason` = 'stale'
14:57:40 <@jine> 239  898 total - SELECT * FROM `shares` WHERE `reason` = 'duplicate'
14:58:18 <@jine> 10  424  570 total - SELECT *  FROM `shares` WHERE `reason` IS NULL

SELECT * FROM shares WHERE solution ="*one of the duplicated solutions*"
Returns: http://jine.be/2


Example shares - query: SELECT * FROM `shares` WHERE `reason` IS NOT NULL LIMIT 465990 , 30
Returns: http://bitcoins.lc/_files/shares.csv

Jobbernowl:

No, but it only caches authentication AFAIK - not shares.

SomeoneWeird:

We were running the stable versions of both before, when the problem started occurring.
Upgraded and restarted both pushpoold and bitcoind without any change.



What are the specs of the computer it's running on (including bandwidth)? Too little RAM could potentially cause shares to not be accepted.
sr. member
Activity: 403
Merit: 250
bitcoind isn't threaded AFAIK (yet)

Some more data:


14:56:31 <@jine> 28  133 total - SELECT * FROM `shares` WHERE `reason` = 'unknown-work'
14:56:54 <@jine> 193  114 total - SELECT *  FROM `shares` WHERE `reason` = 'stale'
14:57:40 <@jine> 239  898 total - SELECT * FROM `shares` WHERE `reason` = 'duplicate'
14:58:18 <@jine> 10  424  570 total - SELECT *  FROM `shares` WHERE `reason` IS NULL

SELECT * FROM shares WHERE solution ="*one of the duplicated solutions*"
Returns: http://jine.be/2


Example shares - query: SELECT * FROM `shares` WHERE `reason` IS NOT NULL LIMIT 465990 , 30
Returns: http://bitcoins.lc/_files/shares.csv

Jobbernowl:

No, but it only caches authentication AFAIK - not shares.

SomeoneWeird:

We were running the stable versions of both before, when the problem started occurring.
Upgraded and restarted both pushpoold and bitcoind without any change.

member
Activity: 222
Merit: 12
I might be completely wrong here, but it looks like there could be some threading issues with bitcoind when using long polling.

edit: Never mind, answered my own question, before the next one is sent to it... duh.
member
Activity: 222
Merit: 12
So does that suggest that the extraNonce in the transaction is getting incremented when the nonce overflows, but the hash of the header is still getting sent out with it?
hero member
Activity: 700
Merit: 500
newbie
Activity: 56
Merit: 0
Summarizing discussion from IRC: the problem is reproducible in that getwork is returning the same midstate for ~14% of requests but slightly different data.

Code:
[07:41] I'm seeing duplicate midstates
[07:42]       2 03dde4101cba6ba4d3a9d98bf6f074324b5ef4104e76257aa1f6e4374df10311
[07:42]       2 1b34e9976f563be464fbab7eb16bcae30aea5bc2428bb39cc168a1ea380fa4a0
[07:42]       2 203f9fc272d15f9a48b7f5c01252e14737302d104669a5a7ad40d5bf20065393
[07:42] (here's how I reproed)
[07:42] for i in `seq 0 99`; do (curl -d '{"method":"getwork","params":[],"id":1}' http://xxx:[email protected]:8080/ > /tmp/state${i} &) ; done
[07:42] awk -F\" '{print($10)}' /tmp/state*|sort|uniq -c|sort|grep -v '^      1'
[08:10] here's one midstate/data
[08:10] "midstate":"befe8ff2573584e44cd82e1f818c7867d83c4d47104b9c504651945c4b9fb8fe","target":"ffffffffffffffffffffffffffffffffffffffffffffffffffffffff00000000","data":"00000001458bbc3db0fdb6d264cfcd58bec77b06259835f18f8df83400000c680000000010ef5d24e007b8bf8e73d82cd2d62c2c87ed0f2dc404c7ecfa8ea1b83734d8754df2024e1a1d932f00000000000000800000000000000000000000000000000000000000000000000000000000000000000000000000000080020000"
[08:10] here's another midstate/data with same midstate, different data
[08:10] "midstate":"befe8ff2573584e44cd82e1f818c7867d83c4d47104b9c504651945c4b9fb8fe","target":"ffffffffffffffffffffffffffffffffffffffffffffffffffffffff00000000","data":"00000001458bbc3db0fdb6d264cfcd58bec77b06259835f18f8df83400000c680000000010ef5d24e007b8bf8e73d82cd2d62c2c87ed0f2dc404c7ecfa8ea1b83734d8754df202511a1d932f00000000000000800000000000000000000000000000000000000000000000000000000000000000000000000000000080020000"
[08:10] notice df202511a1
[08:11] as opposed to df2024e1a1
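
The same tally can be done offline on saved responses. A minimal Python sketch, assuming each saved getwork result has already been parsed into a dict with a "midstate" key (as in the samples above); `repeated_midstates` is a hypothetical helper name and the sample values are placeholders.

```python
from collections import Counter


def repeated_midstates(results):
    """Map each midstate seen more than once to how often it was seen.

    Roughly what `sort | uniq -c | grep -v '^      1'` does in the shell
    pipeline: count every midstate, then keep only the repeated ones.
    """
    counts = Counter(r["midstate"] for r in results)
    return {midstate: n for midstate, n in counts.items() if n > 1}


# Placeholder results standing in for parsed getwork responses.
sample = [
    {"midstate": "aa", "data": "d1"},
    {"midstate": "aa", "data": "d2"},
    {"midstate": "bb", "data": "d3"},
]
print(repeated_midstates(sample))
```

Keeping the "data" value next to each midstate makes it easy to then check whether repeated midstates come with identical or different data.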
member
Activity: 222
Merit: 12
No specific reason why that'd help, but it helps with the process to know that the bug happens regardless. But I'm not doka.
hero member
Activity: 700
Merit: 500
Hi!

We're having serious problems with invalid shares (mainly duplicated work!).
I'm looking for one or more people to help us solve this issue asap.

We have a bounty of 10 BTC for a solution that works and permanently solves the problem

Please contact me on IRC (#bitcoins.lc) or [email protected]
I'll provide any information necessary.

Some general info:
* Invalid shares are 90%+ duplicated work
* Using latest git version of both bitcoind and pushpoold
* MySQL (MariaDB (Aria)) storage for shares
* Latest memcache + dependencies from apt-get (Debian Squeeze)
* 32 bit OS.

We need help asap.

Regards, Jim

First thing(s) i'd suggest:

  • Grab the stable releases of both bitcoin and pushpool
  • How fast is the hard drive? If it's not writing to the database quickly enough, that might be causing stale shares
  • Move to a 64-bit OS

PM me when you're online; I can probably help.
member
Activity: 222
Merit: 12
I don't know much about the pushpool side of things ;)
hero member
Activity: 797
Merit: 1017
This is happening to honest users. I'm one of them :(
kjj
legendary
Activity: 1302
Merit: 1026
Are those the real usernames?  Now that I've signed up, I can see that you use randomly generated worker names.

It looks like someone is trying to scam you by sending in one result from many locations, hoping to earn many shares for little work.

Or is this happening to honest users?

If you can't tell, I've signed up and sent one of my workers to you.  Check your PMs for the IP and username.

Unfortunately, I'm going to be unavailable for about an hour, and it doesn't look like you can wait.  Hopefully someone else can step in and help.

Edit: usernames are random by design
hero member
Activity: 767
Merit: 500
something to do with the way you allocate to workers from main user accounts?  clients running more than one miner on the same worker?  Those are my initial thoughts.  Are you assigning work from a user account but accepting work from a worker account?
sr. member
Activity: 403
Merit: 250
Workers are doing duplicated work.

See screenshot below.


It's from all workers, all over the pool.
Duplicated work, and we really can't figure out why or what to do about it.
hero member
Activity: 630
Merit: 500
Bumping, not because I can help, but because this pool is awesome and needs someone to earn their $300 worth of BTC!
kjj
legendary
Activity: 1302
Merit: 1026
What do you mean invalid shares?  Like you think you found a block, but it isn't accepted by the network because the network found a block moments before you found yours?  Or is someone sending duplicates of their low difficulty solutions to you?
sr. member
Activity: 403
Merit: 250
Hi!

We're having serious problems with invalid shares (mainly duplicated work!).
I'm looking for one or more people to help us solve this issue asap.

We have a bounty of 10 BTC for a solution that works and permanently solves the problem

Please contact me on IRC (#bitcoins.lc) or [email protected]
I'll provide any information necessary.

Some general info:
* Invalid shares are 90%+ duplicated work
* Using latest git version of both bitcoind and pushpoold
* MySQL (MariaDB (Aria)) storage for shares
* Latest memcache + dependencies from apt-get (Debian Squeeze)
* 32 bit OS.

We need help asap.

Regards, Jim