Pages:
Author

Topic: The case for moving from a 160 bit to a 256 bit Bitcoin address - page 3. (Read 6944 times)

legendary
Activity: 1302
Merit: 1008
Core dev leaves me neg feedback #abuse #political
well we can only speculate why Satoshi designed it that way.  If a bad curve was chosen for ECDSA yes it would be broken but less broken on unused addresses...sounds like a weak argument but maybe still better than nothing?  so I see what you mean.

I see what you mean about the 512 bits.  Is it possible it has something to do with the fact that the 512 version of pubkey is uncompressed, so its safer to hash it again?
hero member
Activity: 770
Merit: 629
I guess they are saying that using the unhashed public key it would be more feasible to design some collision attack, based on its contents, rather than there being a small number of possible outputs.   Makes sense on the surface as a fail-safe precaution.

I don't know exactly what you mean here.  "interaction between ECDS and hash function" sounds totally weird, because it would mean that *inadvertedly* the hash function would a) suffer from a reduced output space if its inputs are public ECDS keys or b) would accidentally expose the private key, or make that easier.
Both arguments don't hold water, because:
a) the private key being a random number, and the relationship private->public key being a bijection (an exponentiation in a cyclic group of prime order) the public key is just as random as any random number, so there's no "structure" in the public key.
b) if a given hash function were to solve accidentally, even partially, the discrete log problem of ECDS, that would be a monumental exploit !!

I wasn't arguing for using the direct public key (although one could consider that).  I was arguing for using the SHA-256 hash of the public key directly, without channelling it through RIPEMD-160, and to use the full 256 hash bits.

Quote
Quote
2) the "double hashing to avoid the extension attack" in this case is also weird, as the original key is only 512 bits (in long format) and hence, there's no need for the application of Merkle-Damgard, as one single block hash primitive has to be called, and there's no "extension" possible.  In other words, this is a protection against a kind of attack that is totally irrelevant in this case.

I'm not following you at all.  Why would 'only 512 bits' make the extension attack not work?  I don't deeply understand how this attack works but why would the attack not be possible?

Well, the extension attack is of no meaning here, for different reasons, but the extension attack comes about when one knows the hash of a given document.  In the Merkel-Damgard construction, the document to be hashed it sliced into blocks of 512 bits, and the hash output of the first block is fed into the hash input of the second block and so on.  So if you know the final hash, you can add a bloc with your own content, irrespective of what was the hashed content up to that point, and use the published hash and your new block, to present the hash of a document, whatever it was, appended with your block.

For instance, if there was a secret text giving a whole explanation of a battle plan, were hashed (and you don't know the content), you could take that hash, add a block of your likings, for instance saying "PS: all of the above is wrong, the attack will take place at noon", and you would be able to calculate the correct hash of the document, with this bloc appended to it.  This would totally alter the meaning of the document, and you would still be able to produce a correct hash.

But when hashing the key, there's no such meaning of appending a bloc.  Nobody is going to look at any fake message with "more stuff in it than just the original key".  Moreover, there's only one single call to the compression function, and Merkle-Damgard isn't even used for this.

This attack is an attack in a totally different setting of the use of a hash than it is with a bitcoin address, and hence doesn't make sense.  It makes sense when you can do a man-in-the-middle attack with a transmitted document of which you can't change the content, but you can append content to it, and you can provide a different hash.
hero member
Activity: 770
Merit: 629
Historically wasn't the original design to send to the public key as the destination - there was no hash in the original design?

Then, in order to save space, the whole hashing to a public address was added on top?

That was my understanding - which is often flawed when it comes to esoteric history.

I have to say I don't know if that was historically so.  But the argument "to save space" doesn't hold water in the actual implementation, because of these reasons:

1) if the public key were published instead of its hash, it wouldn't be necessary to publish it *a second time* during the signature: now people publish first the hash, and next, the public key when signing.  If the address were directly the public key, one would win the hash room, because there wouldn't be any need to publish the key AGAIN.

2) the first argument actually falls down, because from the signature and the text, one can DERIVE the public key.  In that way, the hash COULD win some room, because the public key never needs to be published: from the signature, one can derive it, calculate the hash, and verify whether it corresponds to the address.  But this cannot be the reason, because this feature (the fact that one can calculate the public key from the signature) is not used.

3) there's a lot of other "wasting room" on the transaction, and the most important one to me is the hash of 256 bits indicating the transaction.  This is overkill.   There is also the 32-bit number indicating the "output number of the transaction" as if transactions would contain 4 billion outputs.   This is wasted space.

In fact, the re-use of an address is what makes "indicating the transaction" necessary.  If an address occurred only once, there wouldn't be any need to specify a "transaction hash": given that an address can only correspond to one single past output, that output (and its corresponding transaction) would be uniquely determined by the signature (and hence the corresponding public key, and hence the corresponding address).  So there wouldn't be any need, nor to specify the transaction hash (256 bits saved) nor the output number (32 bits saved).

So I have a hard time imagining that the 100 or so bits saved by the 160 bit hash, instead of the 256 bit key, was a serious reason.

However, however, there is an extremely good reason to use a hash of a public key that is not published.  That reason is: if ever the ECC crypto fails, one can still use a hash-based signature scheme (even if it is just to switch to another cryptographic signing system).  The "public key" is then the "secret key" behind the hash.

==> but this scheme fails miserably if one has multiple use of the same address, because if one has used one of these in the past, then the "secret public key" is now public.

So it looks to me that the whole cryptographic design of this "hashed public key" is quite messy, even though it works.  Whenever one finds a "design rationale" in one direction, there's something else that screws this:
1) winning room on the chain --> screwed by wasting room elsewhere
2) protecting public key in case ECC fails --> screwed by address multiple usage possible

That same multiple address usage also wastes a lot of space, because of the need of indicating which transaction and which output number which have way too large bit consumption for their actual needs.
legendary
Activity: 1302
Merit: 1008
Core dev leaves me neg feedback #abuse #political

I always wondered why Satoshi applied Ripemd-160 after SHA-256 for the address, that seemed inconsistent to me.  


Seems like a good post here on that: https://bitcoin.stackexchange.com/questions/9202/why-does-bitcoin-use-two-hash-functions-sha-256-and-ripemd-160-to-create-an-ad

Yes, I know these arguments, but they don't hold water (well, they are correct, but don't justify anything).  They at best answer the argument "why did Satoshi didn't use RIPEMD-160 alone".  The question is why didn't he stick to just SHA-256, and kept a 256 bit address.  And if he really wanted to decrease the number of bits from 256 to 160 (but why on earth ? - to save space ?  then why waste a lot of space in the signature + key ?), he could simply keep the first 160 bits of the hash.  

Let's go through a few arguments that are given there.
1) the "interaction between ECDS and hash function".  That's totally ridiculous, because given that the secret key is a random number, the public key is a random number too (there's a bijection between both, it is a cyclic group).  So there's no "interaction" between a random number, and the hash function.  But if ever there was a "funny interaction" between ECDS and the first hash function (which makes no sense as I just outlined: it's a random number), say, that it always gives one of the 10 same hashes, then applying a second hash function is not going to solve it, on the contrary.  Talking about interactions, by cascading things, you can only *increase* the chance of an undesirable interaction.


I guess they are saying that using the unhashed public key it would be more feasible to design some collision attack, based on its contents, rather than there being a small number of possible outputs.   Makes sense on the surface as a fail-safe precaution.



Quote
2) the "double hashing to avoid the extension attack" in this case is also weird, as the original key is only 512 bits (in long format) and hence, there's no need for the application of Merkle-Damgard, as one single block hash primitive has to be called, and there's no "extension" possible.  In other words, this is a protection against a kind of attack that is totally irrelevant in this case.

I'm not following you at all.  Why would 'only 512 bits' make the extension attack not work?  I don't deeply understand how this attack works but why would the attack not be possible?


legendary
Activity: 1302
Merit: 1008
Core dev leaves me neg feedback #abuse #political
Historically wasn't the original design to send to the public key as the destination - there was no hash in the original design?

Then, in order to save space, the whole hashing to a public address was added on top?

That was my understanding - which is often flawed when it comes to esoteric history.

My understanding was that there's additional benefits to hashing the pubkey.  For example, if a wallet implementation was flawed and used a bad 'k value'. 
legendary
Activity: 2646
Merit: 1137
All paid signature campaigns should be banned.
Historically wasn't the original design to send to the public key as the destination - there was no hash in the original design?

Then, in order to save space, the whole hashing to a public address was added on top?

That was my understanding - which is often flawed when it comes to esoteric history.
hero member
Activity: 770
Merit: 629

I always wondered why Satoshi applied Ripemd-160 after SHA-256 for the address, that seemed inconsistent to me.  


Seems like a good post here on that: https://bitcoin.stackexchange.com/questions/9202/why-does-bitcoin-use-two-hash-functions-sha-256-and-ripemd-160-to-create-an-ad

Yes, I know these arguments, but they don't hold water (well, they are correct, but don't justify anything).  They at best answer the argument "why did Satoshi didn't use RIPEMD-160 alone".  The question is why didn't he stick to just SHA-256, and kept a 256 bit address.  And if he really wanted to decrease the number of bits from 256 to 160 (but why on earth ? - to save space ?  then why waste a lot of space in the signature + key ?), he could simply keep the first 160 bits of the hash.  

Let's go through a few arguments that are given there.
1) the "interaction between ECDS and hash function".  That's totally ridiculous, because given that the secret key is a random number, the public key is a random number too (there's a bijection between both, it is a cyclic group).  So there's no "interaction" between a random number, and the hash function.  But if ever there was a "funny interaction" between ECDS and the first hash function (which makes no sense as I just outlined: it's a random number), say, that it always gives one of the 10 same hashes, then applying a second hash function is not going to solve it, on the contrary.  Talking about interactions, by cascading things, you can only *increase* the chance of an undesirable interaction.

2) the "double hashing to avoid the extension attack" in this case is also weird, as the original key is only 512 bits (in long format) and hence, there's no need for the application of Merkle-Damgard, as one single block hash primitive has to be called, and there's no "extension" possible.  In other words, this is a protection against a kind of attack that is totally irrelevant in this case.

But the strangest of all these "protections" is the following: bitcoin allows address re-usage.  This could have been prohibited in the protocol (a second transaction with the same address in an OUTPUT would be invalid, in the same way as a double spent input is invalid).  But it isn't.   As such, this super duper protection of the *public* key with convolved hashes, is totally moot if ONE UTXO of a given address is spent, because then the public key is known.  All other UTXO of the same address (in different transactions most probably) have their public key exposed, and only the safety of ECC is protecting them, without hashes.   It is astoundingly weird to go through so much hassle to avoid the possibility to discover the public key from the address, just to allow you to publish that public key for that same address, which can still be used in other UTXO.

legendary
Activity: 1302
Merit: 1008
Core dev leaves me neg feedback #abuse #political

I always wondered why Satoshi applied Ripemd-160 after SHA-256 for the address, that seemed inconsistent to me. 


Seems like a good post here on that: https://bitcoin.stackexchange.com/questions/9202/why-does-bitcoin-use-two-hash-functions-sha-256-and-ripemd-160-to-create-an-ad
legendary
Activity: 2646
Merit: 1137
All paid signature campaigns should be banned.
That's a smart point ! If we can count on the fact that there will never be a hard fork changing this total, of course.

Assuming we stick to the 64 bit size for a value, keep the 21 million Bitcoin cap, and we expand to the maximum number of decimal points offered by a 64 bit number we could comfortably expand the resolution to

264 = 18,446,744,073,709,551,616
      =   2,100,000,000,000,000,000 "millisatoshis"

So a factor of 1000 or about 210

So that would lead to about 261 addresses containing a millisatoshi each.
legendary
Activity: 2646
Merit: 1137
All paid signature campaigns should be banned.
Unless it is directly related to the subject of this thread, for example the fact there can currently be only 3 transactions per second so how long would it take to burn through 255 addresses at only 3 TPS, any posts that stray into "block size debate" territory will be immediately deleted.  There are enough threads about that subject already.
hero member
Activity: 770
Merit: 629
There are only 2,100,000,000,000,000 satoshis.

Therefore the maximum number of Bitcoins addresses that can contain value at any one time is only about 251 and that assumes only one satoshi per address.

At the time the block chain contains 255 transactions it would be somewhere between a petabyte and an exabyte in size.  So a full node would need a multi petabyte drive to store the entire block chain.

That's a smart point ! If we can count on the fact that there will never be a hard fork changing this total, of course.

I always wondered why Satoshi applied Ripemd-160 after SHA-256 for the address, that seemed inconsistent to me.  One can say that this was meant to save some room, but there are many other occasions where a lot of room has been wasted.  For instance, the fact of posting, in a transaction input, as well the signature as the public key is redundant: one can recover the public key from the signature and the message, which would save the room for the key (256 bits in compressed format).  Also, it is strange to have a 256 bit transaction hash: it is hard to think that one would need 256 bits for the transaction, and only 160 bits for the address.

To me, there were 2 possible logical values for the address size: 128 bits, or 256 bits.  128 bits, because this is the intrinsic security level of the ECDS with 256 bit keys ; and 256 bits because this is the total entropy in the key.  160 bits is a strange compromise.  The idea that there could have been a back door in SHA-256, and that RIPEMD-160, of different origin, might have another back door, but both of them together, are "safe" sounds weird, and not a good reason to change the number of bits.  One could have used 2 times RIPEMD-160 on 128 bit parts of the 256 bit hash if that were the problem, to keep 256 bits.

As the number of satoshis is 2^51, in fact, there can, as you rightly point out, never be more than 2^51 valid UTXO at a given moment, so 128 bits would in fact be enough ; but given the wasting of room with the 256 bit transaction hash, and the publishing of both signature and key, keeping the full entropy of the key with a 256 bit address would have been the maximally safe option. 

legendary
Activity: 1302
Merit: 1008
Core dev leaves me neg feedback #abuse #political
this thread has a lot of hot air,  segwit p2wsh uses 256-bit hashes, and there is a 256bit address format proposed.

Is the thread over now?
gmaxwell:  Thanks Greg for your answer.  Sorry I accidentally deleted your post in my LBC delete fest.

LBC:  This thread was never intended to be about LBC.  I wish I had never mentioned it at all.  I have gone back and deleted the posts specifically about the merits/failings of LBC.

The thread is over unless anyone has comments about the application of the birthday problem to the hash space of Bitcoin.

Specifically, is the math presented in the OP correct?  Is this a real concern?

Let's say segwit shall not pass... We would need a new proposal, right?

Also, I think any new address format should be fully backwards compatible.   
legendary
Activity: 2646
Merit: 1137
All paid signature campaigns should be banned.
There are only 2,100,000,000,000,000 satoshis.

Therefore the maximum number of Bitcoins addresses that can contain value at any one time is only about 251 and that assumes only one satoshi per address.

At the time the block chain contains 255 transactions it would be somewhere between a petabyte and an exabyte in size.  So a full node would need a multi petabyte drive to store the entire block chain.
legendary
Activity: 2646
Merit: 1137
All paid signature campaigns should be banned.
this thread has a lot of hot air,  segwit p2wsh uses 256-bit hashes, and there is a 256bit address format proposed.

Is the thread over now?
gmaxwell:  Thanks Greg for your answer.  Sorry I accidentally deleted your post in my LBC delete fest.

LBC:  This thread was never intended to be about LBC.  I wish I had never mentioned it at all.  I have gone back and deleted the posts specifically about the merits/failings of LBC.

The thread is over unless anyone has comments about the application of the birthday problem to the hash space of Bitcoin.

Specifically, is the math presented in the OP correct?  Is this a real concern?
legendary
Activity: 1932
Merit: 2077
Another interesting thing is: how many private keys I have to check to get all addresses? There are "only" 2^160 addresses and 2^256 private keys, so there are 2^96 private keys for each address.

Nope. There are 2^97 private keys for each address. Proof is left to the mathematician.


Ok, for the sake of semplicity I didn't want to talk here about compressed/uncompressed keys ...
legendary
Activity: 1120
Merit: 1037
฿ → ∞
Another interesting thing is: how many private keys I have to check to get all addresses? There are "only" 2^160 addresses and 2^256 private keys, so there are 2^96 private keys for each address.

Nope. There are 2^97 private keys for each address. Proof is left to the mathematician.


What is the maximum key generation rate you have every recorded, even for a short period of time?
I remember about 2500 Mkeys/s.

Yes - somewhere between 2.3 - 2.5 Gkeys/s - interesting part is, that 1.9 Gkeys/s of that done by one man.

legendary
Activity: 1932
Merit: 2077
The birthday problem can be applied to estimate the probability of a collision given the size of a hash function.

.........

In this case we should be able to safely use up to 2103 Bitcoin addresses before collisions would be a problem.

Another interesting thing is: how many private keys I have to check to get all addresses? There are "only" 2^160 addresses and 2^256 private keys, so there are 2^96 private keys for each address.

What is the maximum key generation rate you have every recorded, even for a short period of time?
I remember about 2500 Mkeys/s.
legendary
Activity: 2646
Merit: 1137
All paid signature campaigns should be banned.
The birthday problem can be applied to estimate the probability of a collision given the size of a hash function.

The Taylor series expansion of the exponential function can be used to estimate the probability of a collision as follows:

    p(n, d) = 1 - e-n(n - 1)/(2d)

where n is the number of hashes and d is the total number of possible hashes.

Given a hash function size of h in bits and a number of Bitcoin addresses generated 2x this can be simplified as follows:

    p(n, d) = 1 - e-2x(2x - 1)/(2(2h))

               = 1 - e-(22x - 2x)/(2h + 1)

               = 1 - e-[22x/(2h + 1) - 2x/(2h + 1)]

               = 1 - e-(22x - h - 1 - 2x - h - 1)

Within the limits of my Excel spreadsheet for h = 160 the probability of a collision for various values of x is:

Quote
 x   p(2x, 2160)
83   1
82   0.999664537
81   0.864664717
80   0.39346934
79   0.117503097
78   0.030766766
77   0.007782062
76   0.001951219
75   0.000488162
74   0.000122063
73   3.05171E-05
72   7.62937E-06
71   1.90735E-06
70   4.76837E-07
69   1.19209E-07
68   2.98023E-08
67   7.45058E-09
66   1.86265E-09
65   4.65661E-10
64   1.16415E-10
63   2.91038E-11
62   7.27596E-12
61   1.81899E-12
60   4.54747E-13
59   1.13687E-13
58   2.84217E-14
57   7.10543E-15
56   1.77636E-15
55   0
This yields the interesting result that after "using up" only 283 out of our 2160 Bitcoins addresses we will almost certainly have one or more address collisions.  In fact we really only want to "use up" about 255 Bitcoin addresses in order to keep the probability of a collision very near to zero (within the limits of the estimation function, my simplification, and the resolution abilities of the Excel spreadsheet).

How fast can we use up or generate 255 addresses?  

Well as of this posting the large bitcoin collider project has already sequentially searched the first 252.19 Bitcoin addresses and is on track to search the first 254.46 addresses within a year.  Given some serious resources to attack the problem, orders of magnitude more addresses could be generated and checked per year.

This thread is not about the merits or failing of the LBC so posts related to the LBC will be deleted.  I only use the maximum key generation rate of the LBC as an example of how fast keys can be created.

So far the LBC has had a peak key generation rate of about 2.5 x 109 key pairs per second.  That is about

   (3.154 x 107)(2.5 x 109) = 7.885 x 1016

   or approximately 256 key pairs per year.

If we were to move from the current 160 bit Bitcoin address to a new system that would fully utilize all 256 bit available for Bitcoin addresses the situation is much better:

Quote
   x   p(2x, 2256)
131   1
130   0.999664537
129   0.864664717
128   0.39346934
127   0.117503097
126   0.030766766
125   0.007782062
124   0.001951219
123   0.000488162
122   0.000122063
121   3.05171E-05
120   7.62937E-06
119   1.90735E-06
118   4.76837E-07
117   1.19209E-07
116   2.98023E-08
115   7.45058E-09
114   1.86265E-09
113   4.65661E-10
112   1.16415E-10
111   2.91038E-11
110   7.27596E-12
109   1.81899E-12
108   4.54747E-13
107   1.13687E-13
106   2.84217E-14
105   7.10543E-15
104   1.77636E-15
103   0
In this case we should be able to safely use up to 2103 Bitcoin addresses before collisions would be a problem.

I think we should discuss phasing out the old 160 bit Bitcoin address version and replacing it with a new 256 bit address version.
Pages:
Jump to: