Pages:
Author

Topic: SHA-256 broken, collisions found... Bitcoin then? - page 2. (Read 17151 times)

legendary
Activity: 3472
Merit: 4801
For what values of t1 and t2?


These ones (found them doing a search for the shortest MD5 collision):

[t1 - in hex]
d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89
55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b
d8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70

[t2 - in hex]
d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89
55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b
d8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70


so t1 = t2. doesn't suprise that anyfunc(t1) = anyfunc(t2), right?

that's not a collision. for a collision t1 != t2 and func(t1) = func(t2) must be true.


molecular,

Take another look.
legendary
Activity: 1890
Merit: 1086
Ian Knowles - CIYAM Lead Developer
These ones (found them doing a search for the shortest MD5 collision):

[t1 - in hex]
d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89
55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b
d8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70

[t2 - in hex]
d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89
55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b
d8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70



so t1 = t2. doesn't suprise that anyfunc(t1) = anyfunc(t2), right?

that's not a collision. for a collision t1 != t2 and func(t1) = func(t2) must be true.


If you look even more closely you'll see there are other differences also. Smiley
donator
Activity: 2772
Merit: 1019
For what values of t1 and t2?

These ones (found them doing a search for the shortest MD5 collision):

[t1 - in hex]
d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89
55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b
d8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70

[t2 - in hex]
d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89
55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b
d8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70



so t1 = t2. doesn't suprise that anyfunc(t1) = anyfunc(t2), right?

that's not a collision. for a collision t1 != t2 and func(t1) = func(t2) must be true.
legendary
Activity: 1890
Merit: 1086
Ian Knowles - CIYAM Lead Developer
For what values of t1 and t2?

These ones (found them doing a search for the shortest MD5 collision):

[t1 - in hex]
d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89
55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b
d8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70

[t2 - in hex]
d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89
55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b
d8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70

legendary
Activity: 3472
Merit: 4801
. . .when I tried MD5(t1+MD5(t1)) it produced the same hash as MD5(t2+MD5(t2)).
For what values of t1 and t2?
legendary
Activity: 1890
Merit: 1086
Ian Knowles - CIYAM Lead Developer
as long as you can put in a certain 128 byte block at the end that is aligned on a 64-byte boundary (as far as I understand).

I think you mean at the beginning because when I tried MD5(t1+MD5(t1)) it produced the same hash as MD5(t2+MD5(t2)).
legendary
Activity: 980
Merit: 1008
I believe an attacker could easily produce two different non-standard transactions that hashed to the same txid. That would be a disaster, they could split the blockchain and/or double-spend by broadcasting one version of the transaction to half the network and the other to the other half of the network.
[...]
Also, I'm not sure that the known collision attacks would work for a transaction.  The two examples given are for Postscript and X.509, both of which are notorious for allowing arbitrary garbage.  I'm not sure that the garbage hiding capacity of bitcoin transactions will allow the necessary alignment.
This is why it's important not to allow the transaction to end with arbitrary data. The MD5 collision attack basically renders any previous blocks, in the data being hashed, irrelevant. You can construct a 1KB and a 1GB file with the same hash, as long as you can put in a certain 128 byte block at the end that is aligned on a 64-byte boundary (as far as I understand).
donator
Activity: 2772
Merit: 1019
Except that bitcoin is sha256(sha256(x))

The example was taken from molecoin, a new altcoin I invented to explain stuff to people within the limits of an 80 column terminal.
legendary
Activity: 1176
Merit: 1011
Double-hashing doesn't help at all:  If HASH(t1) == HASH(t2)  then HASH(HASH(t1)) == HASH(HASH(t2))
But does a hash value like HASH(t1) actually appear anywhere? In practical Bitcoin situations, we only have double-hash values like HASH(HASH(t1)), right?

I'd consider a Hash algorithm 'broken' if, for a given hash value H, we can generate data G (within reasonable time) with HASH(G)=H.

But if SHA256 gets broken, it would by no means imply that the double-SHA256 used by Bitcoin was also broken.

but if instead of HASH(HASH(t1)) we do HASH(HASH(t1)+t1) (where the + means append) then we get:
Yes, I agree that (regardless of the fact that SHA256 is not even remotely close to be broken in any way) this kind of 'nested double hashing' would have been safer. Also because it avoids loss of entropy that the current double-hashing suffers from (even though it's probably insignificant).
full member
Activity: 196
Merit: 100
SHA-256 does have a flaw:  I don't understand it.  If you cant explain it to me then it is too complicated.

maybe this helps to figure it out?


nick@zero ~ $ echo "123" | sha256sum
181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b  -
nick@zero ~ $ echo "124" | sha256sum
ca2ebdf97d7469496b1f4b78958f9dc8447efdcb623953fee7b6996b762f6fff  -
nick@zero ~ $ echo "125" | sha256sum
a5e45837a2959db847f7e67a915d0ecaddd47f943af2af5fa6453be497faabca  -
nick@zero ~ $ echo "verylongdatalongerthaneventhechecksumitselfjustaddingrandombitsnow9823480293849 20834092834029834029834028934092834" | sha256sum
3dff4001b5954d595b6d6b3a4ec3971c2eef82da397e6a81a514090052918ed7  -


now let's mine for a bit




nick@zero ~ $ for nonce in {0..999}; do echo $nonce x`echo $nonce | sha256sum`; done | grep x00
691 x0024839ec9632d382486ba7aac7e0bda3b4bda1d4bd79be9ae78e7e1e813ddd8 -
964 x00ae0900e3ba03583e3561d76de50754935c10913065d737f9cf4c8e86e54bda -
996 x009cbb4830299d01fc84a6a56d4f07707d7d073673f6cde576027bafbac75168 -


ah, found 3 blocks, cool


Except that bitcoin is sha256(sha256(x))
donator
Activity: 2772
Merit: 1019
If sha-256 was "broken" in the way md5 is "broken"  (possibility to create a collision if certain prerequisites are given), would this imply that it's also easy to find nonce so that md5(nonce+data) < diff ("create a collision with 0000000000*")?

I'm not sure wether or not "easy collision-finding" implies easy mining, but it might be true (maybe depending on the algorithm used).

I can't even judge which might be harder to accomplish: "easy collision-finding" or "easy mining".

Thanks Gavin for reminding me about the same-hash-transaction-construction-attack, I hadn't considered this here up to this point.

In case easy collision finding implies easy mining, that attack is moot, though, because the chain will be dead anyway due to cheap mining.

No, it doesn't imply that at all.


oh, ok. Then it indeed makes sense to think about what would be possible to do with a collision attack.

As far as I know, no one has ever actually tried to attack a hashing algorithm in a way that would make mining easier.  But all hashing attacks require certain freedoms in the inputs, and the scheme we use for mining doesn't really give an attacker much room to work with.  Go look at the header and think carefully about where each field comes from and how much freedom an attacker has to change it.  The Merkle root in particular is a pain in the ass.  Even if you could run SHA256 backwards and create valid inputs from desired outputs, you'd still have the problem of creating valid(!) transactions for those hashes.

The most realistic attack would be a statistical test that could estimate the likelihood of finding a valid block if we iterate all 232 possible nonces.  If this test was 100% accurate, the attacker could just change his generate transactions until he found one that the test said would produce a block, and then iterate the nonces.  If a test like that was discovered, and then kept secret, the holder of it could monopolize mining.

Just not being secret is enough to mitigate this attack.  If the test is public, everyone will use it, and difficulty will quickly scale up by a factor of 4 billion.  No big deal.  But in reality, the test won't be perfect, it will be weak, and again, difficulty scaling works off of the apparent hashing power of the network, and this would just make the network appear to be hashing faster than it really is.

So "easy mining" attack tends to be harder and less likely to be found than a collision attack? Makes even more sense then to think about the implications.

I can't really judge this, but assuming md5 was used in bitcoin seems to be a good way to think about that.
legendary
Activity: 1890
Merit: 1086
Ian Knowles - CIYAM Lead Developer
Double-hashing doesn't help at all:  If HASH(t1) == HASH(t2)  then HASH(HASH(t1)) == HASH(HASH(t2))

Assuming the collision of HASH(t1) and HASH(t2) is performed with MD5 and

[t1 - in hex]
d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89
55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b
d8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70

[t2 - in hex]
d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89
55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b
d8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70

then the collision hash is: 8bcec679bc54a3634116e859d16c32d4

but if instead of HASH(HASH(t1)) we do HASH(HASH(t1)+t1) (where the + means append) then we get:

HASH(HASH(t1)+t1) = 3d4d97fb6f2b9638a4b881e7db1aefaf and
HASH(HASH(t2)+t2) = 78849bd73dda9f9c1a946d06766d08f2

So maybe one possibility down the track would be to change the double hashing technique?
kjj
legendary
Activity: 1302
Merit: 1026
If sha-256 was "broken" in the way md5 is "broken"  (possibility to create a collision if certain prerequisites are given), would this imply that it's also easy to find nonce so that md5(nonce+data) < diff ("create a collision with 0000000000*")?

I'm not sure wether or not "easy collision-finding" implies easy mining, but it might be true (maybe depending on the algorithm used).

I can't even judge which might be harder to accomplish: "easy collision-finding" or "easy mining".

Thanks Gavin for reminding me about the same-hash-transaction-construction-attack, I hadn't considered this here up to this point.

In case easy collision finding implies easy mining, that attack is moot, though, because the chain will be dead anyway due to cheap mining.

No, it doesn't imply that at all.

As far as I know, no one has ever actually tried to attack a hashing algorithm in a way that would make mining easier.  But all hashing attacks require certain freedoms in the inputs, and the scheme we use for mining doesn't really give an attacker much room to work with.  Go look at the header and think carefully about where each field comes from and how much freedom an attacker has to change it.  The Merkle root in particular is a pain in the ass.  Even if you could run SHA256 backwards and create valid inputs from desired outputs, you'd still have the problem of creating valid(!) transactions for those hashes.

The most realistic attack would be a statistical test that could estimate the likelihood of finding a valid block if we iterate all 232 possible nonces.  If this test was 100% accurate, the attacker could just change his generate transactions until he found one that the test said would produce a block, and then iterate the nonces.  If a test like that was discovered, and then kept secret, the holder of it could monopolize mining.

Just not being secret is enough to mitigate this attack.  If the test is public, everyone will use it, and difficulty will quickly scale up by a factor of 4 billion.  No big deal.  But in reality, the test won't be perfect, it will be weak, and again, difficulty scaling works off of the apparent hashing power of the network, and this would just make the network appear to be hashing faster than it really is.
donator
Activity: 2772
Merit: 1019
If sha-256 was "broken" in the way md5 is "broken"  (possibility to create a collision if certain prerequisites are given), would this imply that it's also easy to find nonce so that md5(nonce+data) < diff ("create a collision with 0000000000*")?

I'm not sure wether or not "easy collision-finding" implies easy mining, but it might be true (maybe depending on the algorithm used).

I can't even judge which might be harder to accomplish: "easy collision-finding" or "easy mining".

Thanks Gavin for reminding me about the same-hash-transaction-construction-attack, I hadn't considered this here up to this point.

In case easy collision finding implies easy mining, that attack is moot, though, because the chain will be dead anyway due to cheap mining.
kjj
legendary
Activity: 1302
Merit: 1026
I believe an attacker could easily produce two different non-standard transactions that hashed to the same txid. That would be a disaster, they could split the blockchain and/or double-spend by broadcasting one version of the transaction to half the network and the other to the other half of the network.

Heh, I meant safe for my transactions, not safe for the network.

Also, I'm not sure that the known collision attacks would work for a transaction.  The two examples given are for Postscript and X.509, both of which are notorious for allowing arbitrary garbage.  I'm not sure that the garbage hiding capacity of bitcoin transactions will allow the necessary alignment.
legendary
Activity: 1652
Merit: 2311
Chief Scientist
...On the other hand, if it (MD5) was used in bitcoin transactions, those transactions would still be totally safe, for at least a few more years, because all of the attacks require conditions that can't be met in the bitcoin system.

I think that is a good thought experiment:  If you replaced all uses of SHA256 in Bitcoin with MD5, what attacks would be possible? (please check my work, I am not an expert on hash collisions)

Well, to generate a collision:
Quote
...the attacker needs to generate ... a 128-byte block of data, aligned on a 64-byte boundary that can be changed freely by the collision-finding algorithm.

Block hashing would be safe, for now; the block header that is hashed is only 80 bytes long, much less than the 128 bytes of wiggle room needed to find a collision.

I believe an attacker could easily produce two different non-standard transactions that hashed to the same txid. That would be a disaster, they could split the blockchain and/or double-spend by broadcasting one version of the transaction to half the network and the other to the other half of the network.

To split the chain the attacker would mine a block containing the 'poison' transaction hash, and then broadcast two versions of the same block, containing the two different-but-same-hash transactions. Half the network would think that block contains 't1', and half 't2'.  Everything would be just fine until the attacker spent the outputs of t1 and/or t2... then Bad Things would happen.

Double-hashing doesn't help at all:  If HASH(t1) == HASH(t2)  then HASH(HASH(t1)) == HASH(HASH(t2))


I do agree with everybody who points out that SHA256 isn't close to being broken. If it does ever start to get close, then I'm sure we could figure out a backwards-compatible fixes and phase them in (something like "a block's coinbase transaction must include a SHA3-based transaction merkle root", create a new version of OP_CHECKSIG that used SHA3, roll out a new alert system that used SHA3, etc).


donator
Activity: 2772
Merit: 1019
True, it wouldn't make much sense in that case.  But I'll bet you every single bitcoin that I own that there will not be a catastrophic break in SHA256 that makes a change in the mining scheme urgent on a timeline of less than a decade.

I'm with you on that one. Totally hypothetical discussion we're having here. We must be bored.
kjj
legendary
Activity: 1302
Merit: 1026
Such a transition wouldn't need to be sudden.  For mining, which will probably never be broken, there is no reason why we couldn't accept two different hash schemes with a sunset of the old hash scheme many years in the future.

Do you mean a rule like: the block is valid if it has sha256 proof-of-work of difficulty x OR SHA3 proof-of-work of difficulty y?

That would not make much sense in the supposed case of sha256 being broken.

True, it wouldn't make much sense in that case.  But I'll bet you every single bitcoin that I own that there will not be a catastrophic break in SHA256 that makes a change in the mining scheme urgent on a timeline of less than a decade.

MD5 is hopelessly broken, but none of the attacks would help with mining.  I am very, very, very confident that SHA256 will not suffer a catastrophic break even worse than what has been found with two decades of work on MD5.

donator
Activity: 2772
Merit: 1019
Such a transition wouldn't need to be sudden.  For mining, which will probably never be broken, there is no reason why we couldn't accept two different hash schemes with a sunset of the old hash scheme many years in the future.

Do you mean a rule like: the block is valid if it has sha256 proof-of-work of difficulty x OR SHA3 proof-of-work of difficulty y?

That would not make much sense in the supposed case of sha256 being broken.
Pages:
Jump to: