
Topic: Are GPU's Satoshi's mistake? (Read 8213 times)

legendary
Activity: 1148
Merit: 1008
If you want to walk on water, get out of the boat
October 10, 2011, 03:32:18 AM
#85
What if Satoshi is a group of people?
hero member
Activity: 756
Merit: 500
October 10, 2011, 12:33:57 AM
#84
I wonder if Satoshi will ever come out to the open.  Or is he already among us? Smiley
legendary
Activity: 1526
Merit: 1134
October 05, 2011, 03:05:52 PM
#83
Satoshi knew about GPUs back in at least April 2009, only a few months after launch, so it's safe to say he knew about them before the system was designed.

Quote from: satoshi
Eventually, most nodes may be run by specialists with multiple GPU cards.  For now, it's nice that anyone with a PC can play without worrying about what video card they have, and hopefully it'll stay that way for a while.  More computers are shipping with fairly decent GPUs these days, so maybe later we'll transition to that.
sr. member
Activity: 406
Merit: 257
October 04, 2011, 11:21:37 AM
#82
Currently 76 bytes? Not using the getwork protocol. A getwork response without http headers is already close to 600 bytes... to transfer 76 bytes of data.

Good to know.  Never looked inside a getwork request.  I just knew it was 76 bytes of actual header information.  600 for 76 seems kinda "fat" but then again as you point out even at 600b per header the bandwidth is trivial so likely there was no concern with making the getwork more bandwidth efficient.

Nice analysis on complete bandwidth economy.  Really shows bandwidth is a non-issue.  We will hit a wall on larger pools' computational power long before bandwidth even becomes a topic of discussion.   I think a 64-bit nonce solves the pool efficiency problem more elegantly, but the brute force method is just to convert a pool server into an associated collection of independent pool servers (i.e. deepbit goes to a.deepbit, b.deepbit, c.deepbit ... z.deepbit) each processing a portion of pool requests.

Still, had Satoshi imagined that just a few years into this experiment miners would be getting up to 3 GH/s per machine, he likely would have gone with a 64-bit nonce.  When he started, a high-end CPU got what, 100 KH/s?  A 4 billion nonce range lasts a long time at sub-MH/s performance. 
That also shouldn't become a major issue; it's merely the current implementation that scales badly there.
To increment the extraNonce in the coinbase transaction, we currently rehash every transaction in the block and rebuild the whole merkle tree, so the whole thing ends up scaling with (blocksize * requestrate). Adding the obvious optimization of storing the opposite side of the coinbase's merkle branch and rehashing only the coinbase and that branch, we need an additional (log2(#tx in block) * 32) bytes of memory but scale roughly with (log2(#tx in block) * requestrate) for getwork.
At something like a current average block (~10 kB in ~20 transactions), that comes out to ~240 vs. ~8 SHA-256 operations per 4Ghps worth of work.
Scaling to Visa-level 10 ktx/sec (that'd be 3 GB blocks containing ~6M transactions ...), it's ... about 50 SHA-256 operations per 4Ghps worth of work.
So for a pool roughly the size of the current network, that'd be about 24k SHA-256/sec at current tx volume vs. 150k SHA-256/sec at 10k tx/s.
And using something like "miners increment block time themselves", this can be cut down by another factor of 60.
Scaling this for increasing hashrates due to Moore's law... well... that applies to both sides.
So for the getwork+PoW side, I just don't see any hard issues coming up.
I expect to see way bigger problems on the transaction-handling side of things when scaling to such massive levels. Assuming every tx has 2 inputs + 2 outputs on average, you'd be verifying about 20k ECDSA sigs/second, and on every block you're marking ~12M outputs as spent and storing ~12M new outputs in some kind of transactional fashion. Just the list of current unspent outputs would probably be on the order of 10s of GB ... ugh.
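
For what it's worth, the branch-caching optimization described a few lines up is only a handful of lines. A rough Python sketch (function names invented for illustration; it assumes the coinbase is leaf 0 of the tree, which is how Bitcoin lays it out):

Code:
# Illustrative sketch of the merkle-branch trick described above: store the
# sibling hashes along the coinbase's branch once, then each extraNonce bump
# only rehashes the coinbase plus log2(#tx) interior nodes instead of the
# whole tree.
import hashlib

def dhash(b: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def merkle_root(tx_hashes):
    """Full rebuild: hashes every level (what the unoptimized path does)."""
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2:          # odd count: duplicate the last hash
            level.append(level[-1])
        level = [dhash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def coinbase_branch(tx_hashes):
    """Sibling hashes along the coinbase's (leaf 0) path to the root.
    These stay constant while only the coinbase/extraNonce changes."""
    branch, level = [], list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        branch.append(level[1])     # leaf 0's sibling at this level
        level = [dhash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return branch

def root_from_branch(coinbase_hash, branch):
    """Recompute the root after an extraNonce bump: only log2(#tx) hashes."""
    h = coinbase_hash
    for sibling in branch:
        h = dhash(h + sibling)
    return h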
full member
Activity: 168
Merit: 100
October 04, 2011, 10:58:39 AM
#81
Satoshi used google servers and Suupah Komputahz! chi chang!
donator
Activity: 2058
Merit: 1054
October 04, 2011, 10:14:15 AM
#80
Satoshi didn't see the pool miners coming for sure.
Satoshi understands probability, so he clearly expected pools to emerge. It's likely, though, that he didn't think they needed any special consideration in the design.
member
Activity: 87
Merit: 10
October 04, 2011, 08:20:17 AM
#79
Satoshi didn't see the pool miners coming for sure.  But the algorithm still takes their collective power into account.  The most successful miners still pass on a fair amount of BTC.  I guess in the end it comes down to electricity. 
donator
Activity: 1218
Merit: 1079
Gerald Davis
October 04, 2011, 07:45:53 AM
#78
Currently 76 bytes? Not using the getwork protocol. A getwork response without http headers is already close to 600 bytes... to transfer 76 bytes of data.

Good to know.  Never looked inside a getwork request.  I just knew it was 76 bytes of actual header information.  600 for 76 seems kinda "fat" but then again as you point out even at 600b per header the bandwidth is trivial so likely there was no concern with making the getwork more bandwidth efficient.

Nice analysis on complete bandwidth economy.  Really shows bandwidth is a non-issue.  We will hit a wall on larger pools' computational power long before bandwidth even becomes a topic of discussion.   I think a 64-bit nonce solves the pool efficiency problem more elegantly, but the brute force method is just to convert a pool server into an associated collection of independent pool servers (i.e. deepbit goes to a.deepbit, b.deepbit, c.deepbit ... z.deepbit) each processing a portion of pool requests.

Still, had Satoshi thought that just a few years into this experiment miners would be getting up to 3 GH/s per machine (30,000× his original performance), he likely would have gone with a 64-bit nonce.  When he started, a high-end CPU got what, 100 KH/s?  A 4.2 billion nonce range is good for 11.9 hours @ 100 KH/s.  There was no real reason for a larger nonce since the header changes more frequently than that anyway, due to block changes and transactions (even if queued into batches).  A 30,000× increase in performance suddenly shrank that nonce lifespan, though.
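
The nonce-lifetime figures above are easy to double-check; a throwaway, purely illustrative calculation:

Code:
# Back-of-envelope check of the nonce-lifetime figures above.
NONCE_SPACE = 2 ** 32                     # ~4.29 billion values in a 32-bit nonce

for label, hashrate in [("100 KH/s CPU (2009-era)", 100e3),
                        ("3 GH/s GPU rig (2011-era)", 3e9)]:
    seconds = NONCE_SPACE / hashrate
    print(f"{label}: nonce range lasts {seconds:,.0f} s ({seconds / 3600:.1f} h)")

# 100 KH/s -> ~42,950 s (~11.9 hours), matching the figure above
# 3 GH/s   -> ~1.4 s, i.e. a fresh header needed roughly every second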
donator
Activity: 1218
Merit: 1079
Gerald Davis
October 04, 2011, 07:37:35 AM
#77
What DeathAndTaxes said, the Merkle root is the "executive summary" of the transactions. And, inclusion of transactions in the block is on a "best effort" basis - everyone chooses which transactions to include, and currently most miners/pools include all transactions they know. But it's ok if a miner is missing a few recent transactions, he'll get (a header corresponding to) them in the next getwork.

Thanks, this is how I believed it worked but wasn't sure.  In that case the use of a larger nonce value (say 64-bit) would make pool servers even more efficient.  Then again, those who are opposed to the concept of pool mining likely don't want pools to become more efficient.  Grin
sr. member
Activity: 406
Merit: 257
October 04, 2011, 03:55:27 AM
#76
Currently 76 bytes? Not using the getwork protocol. A getwork response without http headers is already close to 600 bytes... to transfer 76 bytes of data.

But yes, optimally it'd be 76 bytes (+ a simple header).

There'd be ways to cut that down even more: version is constant, nBits only changes every 2016 blocks, and hashPrevBlock only changes on a new block, so why send those every time?
Another option: allow miners to update nTime themselves.
Work submits could be cut down in pretty much the same way, requiring only hMerkleRoot, nTime and nNonce. If there are too many, increasing the share difficulty would be trivial.
So, a simple, more efficient protocol would have, per 4Ghps:
Pool to miner: hMerkleRoot + nTime every 60 seconds (or whatever the tx update interval is), hashPrevBlock every 10 minutes on average.
Miner to pool: hMerkleRoot + nTime + nNonce every second (one diff-1 share).

At 100% efficiency, diff-1 shares, and with some overhead, that comes out to around 1 byte/second average send and 45 bytes/second or so average receive for a pool server, per 4Ghps of miners.
Or about 24kbit/s send and 1Mbit/s receive for a pool the size of the whole current bitcoin network. Yeah.
If hashrates increase in the future, increase share difficulty by a few powers of 2 and you cut down the incoming rate accordingly...
So for the pool-miner interface, you can scale it up quite a few orders of magnitude before bandwidth becomes an issue.

For the network side, scaling to the transaction volumes allowed by the current max network rule of 1 MB/block, we need to receive and send the transactions in that block plus the block itself; that comes out to... 53 kbit/s average.
The 1 MB block size limit should be enough to fit about 4 tx/second on average.
So... your average home DSL will become a problem when scaling up more than an order of magnitude above the current limits; we'd *need* some kind of hub-leaf setup beyond that, and assuming the hubs are decent servers you could easily get another 2-3 orders of magnitude... which would be roughly on par with Visa's peak tx capacity levels...
So it doesn't look like bandwidth would become a major issue.
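
The per-4Ghps figures above can be roughly reproduced from the field sizes. A back-of-envelope script; the field sizes follow this post, but the framing overhead and total network hashrate are my own assumptions, so expect the same ballpark rather than exact matches:

Code:
# Back-of-envelope reproduction of the bandwidth estimate above for the
# trimmed-down protocol sketched in this post.
MERKLE_ROOT, NTIME, NNONCE, PREV_HASH = 32, 4, 4, 32   # bytes per field
OVERHEAD = 5            # assumed framing bytes per message
NETWORK  = 12e12        # assumed total hashrate; the 24 kbit/s above implies ~12 TH/s

def per_4ghps(tx_interval=60, block_interval=600, share_interval=1):
    """Average bytes/second to and from the pool, per 4 GH/s of miners."""
    # pool -> miner: merkle root + ntime on every tx update, prev hash per block
    send = (MERKLE_ROOT + NTIME + OVERHEAD) / tx_interval \
         + (PREV_HASH + OVERHEAD) / block_interval
    # miner -> pool: roughly one diff-1 share per second per 4 GH/s
    recv = (MERKLE_ROOT + NTIME + NNONCE + OVERHEAD) / share_interval
    return send, recv

send, recv = per_4ghps()
units = NETWORK / 4e9                                   # how many 4 GH/s "slices"
print(f"per 4 GH/s:        ~{send:.1f} B/s out, ~{recv:.0f} B/s in")
print(f"network-size pool: ~{send * units * 8 / 1e3:.0f} kbit/s out, "
      f"~{recv * units * 8 / 1e6:.1f} Mbit/s in")
# -> roughly 18 kbit/s out and 1.1 Mbit/s in: same ballpark as the
#    ~24 kbit/s / ~1 Mbit/s quoted above (the post assumed a bit more overhead).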
hero member
Activity: 630
Merit: 500
October 04, 2011, 03:29:06 AM
#75
And even if we reach "Visa levels" as described here, miners would only have to download, at peaks, 76*4,000 = 304KB/s if I got it right (a new header each time a new transaction arrives and changes the Merkle Tree).
No, as I explained, the miner doesn't need to get a new header when there's a new transaction. He just keeps mining on a header which doesn't include all the new transactions. When he finishes 4GH he gets a new header with all the recent transactions. That's how it's done right now, it's not a potential future optimization.

That's what I meant two sentences later:

And even if it was, the Merkle Tree doesn't really need to be updated at each new transaction; that can be done in bulk.

So, yeah, as you said, miners definitely don't need lots of bandwidth, not even at "Visa levels". Only pool operators do.
donator
Activity: 2058
Merit: 1054
October 04, 2011, 02:37:32 AM
#74
And even if we reach "Visa levels" as described here, miners would only have to download, at peaks, 76*4,000 = 304KB/s if I got it right (a new header each time a new transaction arrives and changes the Merkle Tree).
No, as I explained, the miner doesn't need to get a new header when there's a new transaction. He just keeps mining on a header which doesn't include all the new transactions. When he finishes 4GH he gets a new header with all the recent transactions. That's how it's done right now, it's not a potential future optimization.
hero member
Activity: 630
Merit: 500
October 04, 2011, 02:32:39 AM
#73
Thank you DeathAndTaxes for the full explanation.

So, currently, it's 76 bytes per 4 GH.  Easily manageable.  And even if we reach "Visa levels" as described here, miners would only have to download, at peaks, 76*4,000 = 304KB/s if I got it right (a new header each time a new transaction arrives and changes the Merkle Tree). I can download at that speed from my home connection today, so it probably wouldn't be a major problem for miners in the future. And even if it was, the Merkle Tree doesn't really need to be updated at each new transaction; that can be done in bulk.
So, nothing that frightening. Only pool operators would need lots of bandwidth, but at this stage, such operators could use local caches distributed in different parts of the world and other techniques to decrease their load.

Interesting.
donator
Activity: 2058
Merit: 1054
October 04, 2011, 12:11:43 AM
#72
Or is it an indirect hash of something in the header which is itself a hash of all transactions? Even if it's that, wouldn't such a header have to be retransmitted each time a new transaction is propagated?
What DeathAndTaxes said, the Merkle root is the "executive summary" of the transactions. And, inclusion of transactions in the block is on a "best effort" basis - everyone chooses which transactions to include, and currently most miners/pools include all transactions they know. But it's ok if a miner is missing a few recent transactions, he'll get (a header corresponding to) them in the next getwork.
donator
Activity: 1218
Merit: 1079
Gerald Davis
October 03, 2011, 07:37:21 PM
#71
I see what you are saying now.  I was updating my post (above) while you were responding.

Most of the need for header changes comes from nonce exhaustion.  Let's look at how often a miner needs to change headers.
a) block change - once per 600 seconds
b) new transaction - once per 13 seconds (at current transaction volume)
c) nonce exhaustion -  once per 4000/(MH/s) seconds

For most miners nonce exhaustion creates the majority of the header changes.  For example, a 2 GH/s miner needs a new header every 2 seconds.  For every block change it exhausts its nonce range 300 times.  For every transaction on the network it exhausts its nonce range about 7 times.

Now transaction volume will grow, but it is unlikely to grow long term faster than Moore's law.  Average hashing power will increase at a rate equal to Moore's law: a doubling every 24 months, or 2^5 = 32-fold every decade.  A decade from now that 2 GH/s miner will be a 64 GH/s miner.  That is ~16 header requests per second.   

Even with real-time inclusion of transactions, nonce exhaustion makes up the majority of the load on a pool server, and that will only increase.  If a server were to delay including transactions to once per minute, by holding all transactions till the next minute and then including them all in a single header update (which would only slightly delay confirmations), then nonce exhaustion makes up an ever greater % of server load.
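
Those three sources of header changes are easy to tabulate. A quick, purely illustrative calculation using the same rates as above (the 4000/(MH/s) figure is the post's rounding of 2^32 hashes):

Code:
# Tabulating the three sources of header changes per the figures above.
BLOCK_INTERVAL = 600      # seconds, average
TX_INTERVAL    = 13       # seconds between transactions, per the post

def changes_per_hour(hashrate_mhs):
    nonce_interval = 4000 / hashrate_mhs      # seconds per nonce-range exhaustion
    return {"block change":     3600 / BLOCK_INTERVAL,
            "new transaction":  3600 / TX_INTERVAL,
            "nonce exhaustion": 3600 / nonce_interval}

for mhs in (2_000, 64_000):                   # 2 GH/s now, 64 GH/s in a decade
    c = changes_per_hour(mhs)
    total = sum(c.values())
    print(f"{mhs / 1000:.0f} GH/s miner:")
    for reason, n in c.items():
        print(f"  {reason:<16} ~{n:7.0f}/hour ({n / total:4.0%})")
# -> nonce exhaustion is ~86% of header changes at 2 GH/s and ~99% at 64 GH/s.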
kjj
legendary
Activity: 1302
Merit: 1026
October 03, 2011, 07:24:45 PM
#70
This would also slow transactions, and/or not decrease traffic by nearly as much as you expect.  The mining pool node is constantly updating its Merkle tree, so a new getwork request includes not just a different coinbase with a new extranonce, but also a different set of transactions, some new.  A 64 bit nonce would roughly triple the average transaction confirmation time, unless the node trips the long polling system, which makes extra traffic.

How?  Regardless of the nonce size, a block will be confirmed on average every 10 minutes. In the worst case, transactions can always be included in the next block.   

Are new transactions after the start of a block included in the current block (as opposed to the next block)?  If so, then on average never updating the transaction list would add 5 minutes to the first confirmation and nothing to subsequent confirmations.   If not, then confirmations are no slower.

If we assume that the average transaction happens about 5 minutes before the next block is found, the current system makes it very, very likely that the transaction will be included in the current block.  This means that the expected waiting time for a transaction is just a bit over 5 minutes.

With a 64 bit nonce, all mining clients will only update their work every 10 minutes (on average), when a new longpoll hits.  So, the average transaction will wait 5 minutes before anyone even starts working on a block that includes it, and then 10 minutes more (on average, of course) for that block to be found.  So, the total wait time is then 15 minutes, instead of 5.  The worst case, sending a new transaction just moments after all the pools update their miners, goes from 20 minutes to 30.
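
A quick numeric check of that waiting-time argument, using the same simplified model as above (blocks every 10 minutes, a transaction arriving at a uniformly random moment; purely illustrative):

Code:
# Quick numeric check of the waiting-time argument above.
import random

BLOCK_INTERVAL = 10.0   # minutes

def avg_wait(work_refreshed_between_blocks: bool, trials=200_000):
    total = 0.0
    for _ in range(trials):
        time_to_next_block = random.uniform(0, BLOCK_INTERVAL)
        if work_refreshed_between_blocks:
            # miners pick the tx up right away: confirmed in the next block
            total += time_to_next_block
        else:
            # miners only refresh on longpoll (new block): the tx misses the
            # block in progress and waits a full extra interval
            total += time_to_next_block + BLOCK_INTERVAL
    return total / trials

print(f"current behaviour:       ~{avg_wait(True):.1f} min")    # ~5 min
print(f"64-bit nonce / longpoll: ~{avg_wait(False):.1f} min")   # ~15 min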
donator
Activity: 1218
Merit: 1079
Gerald Davis
October 03, 2011, 07:12:52 PM
#69
This would also slow transactions, and/or not decrease traffic by nearly as much as you expect.  The mining pool node is constantly updating its Merkle tree, so a new getwork request includes not just a different coinbase with a new extranonce, but also a different set of transactions, some new.  A 64 bit nonce would roughly triple the average transaction confirmation time, unless the node trips the long polling system, which makes extra traffic.

How?  Regardless of the nonce size, a block will be confirmed on average every 10 minutes. In the worst case, transactions can always be included in the next block.  

Are new transactions after the start of a block included in the current block (as opposed to the next block)?  If so, then on average never updating the transaction list would add 5 minutes to the first confirmation and nothing to subsequent confirmations.   If not, then confirmations are no slower.

Still, even w/ a 64-bit nonce there is no reason you couldn't update the merkle tree between blocks. Look at it this way.  Take a hypothetical 1 TH/s pool.  On average it needs to compute and issue roughly 15,000 headers per minute for its pool members.  Looking at Block Explorer, the last 24 hours had 6407 total transactions.  That is on average about 4 per minute.  If the pool had 1,000 members, using a 64-bit nonce and only changing the header on new transactions would cut that down to 4,000 headers per minute.  If the pool only updated headers once per minute (on average adding a few seconds to each transaction's confirmation time) it would be only 1,000 headers per minute.
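
Those per-minute figures check out to within rounding; a quick illustrative reproduction:

Code:
# Reproducing the header-rate arithmetic for the hypothetical 1 TH/s pool above.
POOL_HASHRATE = 1e12     # 1 TH/s
MEMBERS       = 1000
TX_PER_MIN    = 4        # ~6400 tx/day per the post

headers_32bit   = POOL_HASHRATE * 60 / 2**32   # one header per 2^32 hashes
headers_per_tx  = MEMBERS * TX_PER_MIN         # 64-bit nonce, refresh on each tx
headers_per_min = MEMBERS * 1                  # 64-bit nonce, refresh once a minute

print(f"32-bit nonce:                 ~{headers_32bit:,.0f} headers/minute")  # ~14,000
print(f"64-bit nonce, per-tx refresh: ~{headers_per_tx:,} headers/minute")    # 4,000
print(f"64-bit nonce, 1/min refresh:  ~{headers_per_min:,} headers/minute")   # 1,000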
kjj
legendary
Activity: 1302
Merit: 1026
October 03, 2011, 06:59:06 PM
#68
The proof of work is not a hash of the entire block then?

Or is it an indirect hash of something in the header which is itself a hash of all transactions? Even if it's that, wouldn't such a header have to be retransmitted each time a new transaction is propagated?

It is a hash of the header, which contains the Merkle root of all transactions in the block plus the hash of the previous block.  That is how the "chain" is efficiently created.  Since each block contains the hash of the prior block, if you know a previous block is valid then you know the current block is valid by following the chain from the genesis block. Every transaction in the current block is confirmed because the Merkle root is the hash that commits to all the transaction hashes in the block.  If an extra transaction were added or one taken away, the Merkle root hash would be invalid.

https://en.bitcoin.it/wiki/Block_hashing_algorithm

Altogether, the only thing that is hashed is the header, which is 80 bytes.  The nonce is determined by the miner, so the pool actually only transmits 76 bytes.  The miner then tries all nonces from 0 to 2^32 - 1 (roughly 4 billion attempted hashes).  A share is 2^32 hashes, so 1 share ~= 1 header transmitted.

Since the nonce only has 2^32 possibilities, the pool server needs to provide a new header (containing a new extraNonce) after every 4 billion hashes.

Thus the bandwidth requirement (without any overhead) is ~76 bytes every 4 GH (the same header can be used for 4 billion hashes) of hashing power that the miner has.   Even a 40 GH/s miner wouldn't require very much bandwidth.  It would require about 10 headers per second and would produce about 10 shares (lower-difficulty solutions) per second.  The headers would require roughly 800 bytes per second inbound and the outgoing shares roughly 2 kB/s outbound.

Now for the server that bandwidth requirement can be larger, as it will have significantly more aggregate traffic.  A 5 TH/s mining pool would need to issue roughly 1,160 headers per second, but even that is only ~0.7 Mbps.

Still, bandwidth is really a non-issue.  As difficulty rises and the pool gets larger, the computational load on the server becomes the larger bottleneck.  Every 2^32 hashes a miner will need a new header; that requires the pool to change the generation (coinbase) transaction, which requires a new transaction hash, which changes the merkle root and requires rehashing up the tree.

If it ever became a problem where pools simply couldn't handle the load, changing the size of the nonce could make the problem more manageable.  The nonce is only 32 bits and is the only element a pool miner changes, thus every pool miner needs a new header every 4 billion hashes. A 100 MH/s miner needs a new header every ~40 seconds.  A 4 GH/s miner needs a new header every second.  If the nonce value were larger, more hashes could be attempted without changing the header.  For example, with a 64-bit nonce a 4 GH/s miner could run for over a century without exhausting the range, so in practice it would only need a new header when a block is found instead of one every second.  Most miners would never change headers except when a block is found.  The load on the pool server could be cut by a factor of billions.

This would also slow transactions, and/or not decrease traffic by nearly as much as you expect.  The mining pool node is constantly updating its Merkle tree, so a new getwork request includes not just a different coinbase with a new extranonce, but also a different set of transactions, some new.  A 64 bit nonce would roughly triple the average transaction confirmation time, unless the node trips the long polling system, which makes extra traffic.
kjj
legendary
Activity: 1302
Merit: 1026
October 03, 2011, 06:53:33 PM
#67
That's the opposite of independence. It means that the same party needs to do both CPU and GPU to validate their block. So end users can't mine because they don't have GPUs. And there's no way to adjust difficulty separately since you don't have separate blocks to count.

No it wouldn't.  It would simply be a public double signing.

Two algorithms, let's call them C & G (for obvious reasons).

A pool of G miners finds a hash below their target, signs the block, and publishes it to all other nodes in the network.  The block is now half signed.
A pool of C miners then takes the half-signed block and looks for a hash that meets their independent target.  The block is now fully signed.

Simply adjust the rules for a valid block so that those half-signing it can only claim half the reward + half the transaction fees, with the second half handled the same way.  So the G miner (or pool) who half-signs the block gets 25 BTC + 1/2 the transaction fees, and the C miners who complete the half-signed block get the other 25 BTC + 1/2 the transaction fees.

A block isn't considered confirmed until both halves of the hash pair are complete and published.  If you want block signing to take 10 minutes on average, adjust the difficulty for each half so that the average solution takes 5 minutes per half.

While I doubt any dual-algorithm solution is needed, it makes more sense to require both proofs; otherwise bitcoin becomes vulnerable to the weaker of the two algorithms (which is worse than having a single algorithm).

There are subtle, er, issues with this idea.  I think they are actually problems, but I haven't worked through all the details yet, so I'm not confident enough to use that label yet.  Think carefully about the coinbase transactions, and how those are included (or not) in the half signatures.  I'm pretty sure that this system ends up being no better than just the second half system, but it could be modified to be as good as whichever system was slower at the moment.
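
For concreteness, here is one way the half-signing rule quoted above could be expressed as a validity check. This is purely a sketch of the idea, not a real protocol; every name, field, and placeholder hash function below is invented for the example:

Code:
# Illustrative sketch of the two-algorithm "half signing" idea quoted above.
# hash_g / hash_c stand in for the two hypothetical proof-of-work functions
# (a GPU-friendly and a CPU-friendly one); all names are made up.
import hashlib
from dataclasses import dataclass

def hash_g(data: bytes) -> int:
    # placeholder for PoW algorithm G
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def hash_c(data: bytes) -> int:
    # placeholder for PoW algorithm C
    return int.from_bytes(hashlib.sha512(data).digest()[:32], "big")

@dataclass
class DualPowBlock:
    header: bytes
    nonce_g: int    # found by the G miners; would earn 25 BTC + half the fees
    nonce_c: int    # found afterwards by the C miners; earns the other half

def is_half_signed(b: DualPowBlock, target_g: int) -> bool:
    """Stage 1: G miners publish a block whose hash meets their own target."""
    return hash_g(b.header + b.nonce_g.to_bytes(8, "little")) < target_g

def is_fully_signed(b: DualPowBlock, target_g: int, target_c: int) -> bool:
    """Stage 2: C miners extend the half-signed block with their own proof;
    the block only counts as confirmed once both halves are present."""
    if not is_half_signed(b, target_g):
        return False
    half_signed = b.header + b.nonce_g.to_bytes(8, "little")
    return hash_c(half_signed + b.nonce_c.to_bytes(8, "little")) < target_c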
donator
Activity: 1218
Merit: 1079
Gerald Davis
October 03, 2011, 05:35:23 PM
#66
The proof of work is not a hash of the entire block then?

Or is it an indirect hash of something in the header which is itself a hash of all transactions? Even if it's that, wouldn't such a header have to be retransmitted each time a new transaction is propagated?

It is a hash of the header, which contains the Merkle root of all transactions in the block plus the hash of the previous block.  That is how the "chain" is efficiently created.  Since each block contains the hash of the prior block, if you know a previous block is valid then you know the current block is valid by following the chain from the genesis block. Every transaction in the current block is confirmed because the Merkle root is the hash that commits to all the transaction hashes in the block.  If an extra transaction were added or one taken away, the Merkle root hash would be invalid.

https://en.bitcoin.it/wiki/Block_hashing_algorithm

Altogether, the only thing that is hashed is the header, which is 80 bytes.  The nonce is determined by the miner, so the pool actually only transmits 76 bytes.  The miner then tries all nonces from 0 to 2^32 - 1 (roughly 4 billion attempted hashes).  A share is 2^32 hashes, so 1 share ~= 1 header transmitted.

Since the nonce only has 2^32 possibilities, the pool server needs to provide a new header (containing a new extraNonce) after every 4 billion hashes.

Thus the bandwidth requirement (without any overhead) is ~76 bytes every 4 GH (the same header can be used for 4 billion hashes) of hashing power that the miner has.   Even a 40 GH/s miner wouldn't require very much bandwidth.  It would require about 10 headers per second and would produce about 10 shares (lower-difficulty solutions) per second.  The headers would require roughly 800 bytes per second inbound and the outgoing shares roughly 2 kB/s outbound.

Now for the server that bandwidth requirement can be larger, as it will have significantly more aggregate traffic.  A 5 TH/s mining pool would need to issue roughly 1,160 headers per second, but even that is only ~0.7 Mbps.

Still, bandwidth is really a non-issue.  As difficulty rises and the pool gets larger, the computational load on the server becomes the larger bottleneck.  Every 2^32 hashes a miner will need a new header; that requires the pool to change the generation (coinbase) transaction, which requires a new transaction hash, which changes the merkle root and requires rehashing up the tree.

If it ever became a problem where pools simply couldn't handle the load, changing the size of the nonce could make the problem more manageable.  The nonce is only 32 bits and is the only element a pool miner changes, thus every pool miner needs a new header every 4 billion hashes. A 100 MH/s miner needs a new header every ~40 seconds.  A 4 GH/s miner needs a new header every second.  If the nonce value were larger, more hashes could be attempted without changing the header.  For example, with a 64-bit nonce a 4 GH/s miner could run for over a century without exhausting the range, so in practice it would only need a new header when a block is found instead of one every second.  Most miners would never change headers except when a block is found.  The load on the pool server could be cut by a factor of billions.
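
To make the 80-byte-header / 4-billion-nonce point concrete, here is a minimal sketch of what a getwork-style miner does with the 76 fixed bytes it receives. Pure Python and far too slow for real mining; the function names are illustrative, with the field layout per the Block_hashing_algorithm wiki page linked above:

Code:
# The miner receives the 76 fixed header bytes (version, prev block hash,
# merkle root, time, bits) and only iterates the final 4-byte nonce, so one
# header covers 2^32 (~4.3 billion) hashes before a new one is needed.
import hashlib
import struct

def dhash(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def scan_nonces(header76: bytes, target: int, start=0, count=1_000_000):
    """Try `count` nonces against a 76-byte header prefix; return a winning
    (nonce, hash) pair, or None once the range is exhausted (at which point a
    real miner has to ask the pool for a fresh header)."""
    assert len(header76) == 76
    for nonce in range(start, min(start + count, 2 ** 32)):
        candidate = header76 + struct.pack("<I", nonce)   # full 80-byte header
        h = dhash(candidate)
        if int.from_bytes(h, "little") < target:
            return nonce, h[::-1].hex()
    return None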