Author

Topic: Question about SHA-256 hash functions (Read 232 times)

hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
April 22, 2022, 07:51:56 AM
#15
It is just as easy to find a hash with 19 (or less) leading zero bits as finding a hash with another predefined set of 19 bits. The probability of '0100 1010' is same as '0000 0000', or '1111 1111'. The 'less than' requirement doesn't change anything, because what comes after the desired prefix doesn't matter anyway, since the difficulty target is met and if we have more zeroes or random gibberish after, is irrelevant.
Suppose we have two hashes that start with zero bits:

000aaa

and

000bbb

In the first case all hashes that start with three zeroes are valid (000aaa inclusive if hash collision is ever to be found). That means that all hashes from the range 000[000] - 000[fff] are valid for us.

In the second case all hashes that are less than bbb are valid, but those that are higher than bbb are not.

Which means only hashes from the range 000[000] - 000[bba] are valid.

Are you telling me that despite the huge distinction in the number of valid hashes that would satisfy our condition, the chances of finding each of those are the same? Huh
That's why you need to think in numbers, then it's easier. I tried to show exactly this: in hex notation, it seems both have a difficulty target of '3 leading zeroes'. In reality though, the difficulty target is a number and the numerical interpretation of the hash needs to be smaller than that number.

If you look at the binary representation, in your first case, the smallest allowed number is 0b00000000 00000000 00000000 / 0x0 / 010 and the highest is 0b00000000 00001111 11111111 / 0x000fff / 409510.

In the second case, the smallest allowed number is again 0b00000000 00000000 00000000 / 0x0 / 010 and the highest is 0b00000000 00001011 10111010 / 0x000bba / 300210.

So in short, in one case you have 4095 possible valid hashes and in the other you have only 3002 valid hashes, so it's a harder difficulty target. The number of leading zeroes in hex and binary notation is actually the same in both cases, but the actual number of valid hashes in case 2 is significantly smaller. That's why I think we need to just stick to what Bitcoin actually does (comparing numbers) instead of counting leading characters.
legendary
Activity: 2464
Merit: 4419
🔐BitcoinMessage.Tools🔑
April 22, 2022, 12:22:15 AM
#14
It is just as easy to find a hash with 19 (or less) leading zero bits as finding a hash with another predefined set of 19 bits. The probability of '0100 1010' is same as '0000 0000', or '1111 1111'. The 'less than' requirement doesn't change anything, because what comes after the desired prefix doesn't matter anyway, since the difficulty target is met and if we have more zeroes or random gibberish after, is irrelevant.
Suppose we have two hashes that start with zero bits:

000aaa

and

000bbb

In the first case all hashes that start with three zeroes are valid (000aaa inclusive if hash collision is ever to be found). That means that all hashes from the range 000[000] - 000[fff] are valid for us.

In the second case all hashes that are less than bbb are valid, but those that are higher than bbb are not.

Which means only hashes from the range 000[000] - 000[bba] are valid.

Are you telling me that despite the huge distinction in the number of valid hashes that would satisfy our condition, the chances of finding each of those are the same? Huh
sr. member
Activity: 1190
Merit: 469
April 21, 2022, 11:01:48 PM
#13

I'm yet to see anything that can break the SHA-256 hash function...



Well it's not just you. But do let the forum know if you happen to find something.
legendary
Activity: 3472
Merit: 10611
April 21, 2022, 10:11:36 PM
#12
Apart from the math that was explained before you have to take the process it takes to compute both hashes and break both down into their small steps and compare those.

* What is your message in your first hash? If it is the same 80-byte input then the first step is the same otherwise computing hash of a smaller message (eg. 15 byte ""Hello, world!"") is faster than computing hash of a bigger message (eg. 80 byte header) since you would only have 1 block to compress instead of 2.

* Are you computing double SHA256 or single SHA256? The hash you posted is single SHA. The speed is different here also since a double SHA would add an additional block compression and essentially more steps that increases the cost.

* As for the result, in block headers we aren't comparing values like (x[0]==0, x[1]==0) instead the hash has to be interpreted as a 256-bit integer and then compared with the target.
In your first hash if you are looking for exact starting values the comparison is slightly different and can be slightly faster since you are actually doing the (x[0]==0x19, x[1]==0x27) comparison.
member
Activity: 393
Merit: 44
April 21, 2022, 08:26:08 AM
#11
bit shifting is slower than byte one. nothing stops you from checking first non-zero byte when you found leading zeoroes. say they have uint256 with '>' op everloaded in bitcoin but it doesn't use even sse. it's a wrapper over  openssl's bn which isn't tuned for zeroes in any way.
which is to say, after you found one with zeroes you can do BN_cmp() on that
hero member
Activity: 924
Merit: 5943
not your keys, not your coins!
April 21, 2022, 05:18:03 AM
#10
In the first case, the value can be lower or higher than the initial hash, the only constrain is it needs to have certain characters in the beginning. In the second case, the value should be lower than the initial hash to be considered valid. The only problem is the initial value is already relatively small, and you need to find even smaller.
A random oracle doesn't know any notion of 'small' or 'large', though - so the 'size of the number' doesn't affect the randomness at all.

It was already pointed out that it's a bit problematic talking about 'leading zeroes' with a hexadecimal representation, as for instance a difficulty target could allow hashes leading with 0x0000.0000 (4 bytes of full 0, or 32 bits), but it might also be reduced to 31 bits which would mean 0x0000.0001 would be allowed, while 0x0000.0002 would not (even though same number of 'leading zeroes' in hex).

So I'm just talking in bits now to make it clearer.
It's just as likely to find a number with e.g. 32 leading 0 bits like the number that starts with 0x29f29183 or any other 32-bit prefix.

So I'm changing the question a bit:
Quote
So, my question is, what would be more difficult to find, in terms of computational power, a hash with the same 19 random characters bits like in the first example, or a hash that would be less than the second hash with 19 zeroes?

It is just as easy to find a hash with 19 (or less) [edit: with 'less', I mean a smaller number / more leading zeroes actually] leading zero bits as finding a hash with another predefined set of 19 bits. The probability of '0100 1010' is same as '0000 0000', or '1111 1111'. The 'less than' requirement doesn't change anything, because what comes after the desired prefix doesn't matter anyway, since the difficulty target is met and if we have more zeroes or random gibberish after, is irrelevant.
member
Activity: 393
Merit: 44
April 21, 2022, 05:03:55 AM
#9
you do right shift by hash_width - difficulty bytes. and when you do it you have zeroes on the left  ( except for old systems with dos where you'll have garbage bytes during right or left shift, do not remember ) . then you compare whole thing to zero.
legendary
Activity: 2464
Merit: 4419
🔐BitcoinMessage.Tools🔑
April 21, 2022, 04:03:35 AM
#8
The best search term would be "hash collision".
As far as I know, hash collision means a situation when two different inputs hash into exactly the same output, but that is not what I am asking. I need only part of the output to be the same, not the whole hash.

And in terms of mining, the miner is looking for a value lower than the target, not exact.
So, since 19270ea4f98808a63fbb99bc26a5ee6f0fe8df9c8182cf1d710b115d57250578 is higher than 000000000000000000016923fc405bbb8e9552072b0905adeb4115706e569bb2,
it'll be easier to find a hash that's lower than the former than the latter.

In the first case, the value can be lower or higher than the initial hash, the only constrain is it needs to have certain characters in the beginning. In the second case, the value should be lower than the initial hash to be considered valid. The only problem is the initial value is already relatively small, and you need to find even smaller.

Your formulation "less than the second hash" makes the second about 14 times more difficult, since there are additional constraints - ...63fbxx vs ...000016 in the first case all 256 possibilities are valid, while in the second about 234 are invalid.
I guess that is the answer I am looking for, although I didn't fully understand what you meant.  Grin
legendary
Activity: 952
Merit: 1386
April 21, 2022, 03:20:20 AM
#7
The probability is the same. Fact that something looks "nicer" for use, people, does not matter at all.
Think about it that way:
on the first position you may have 1 of 16 possible characters.
on the the second position you may have 1 of 16 possible characters.
etc.
then you multiply chances and you conclude that chance to get any pair like "00" or "ee" or "27" is exactly the same - 1/256.
Then you may extend that calculations for your 19 characters long example.

Technically there are some minor improvements/performance gains when you work with zeros, because computers are build that way, not another.
For example if you need to iterate on the huge loop, if you write loop like:
[for x=0, until xslightly slower than
[for x=last element, until x position>0, decrement position] as comparing to 0 is faster.
member
Activity: 393
Merit: 44
April 21, 2022, 02:20:56 AM
#6
it's the same.
check for zeroes is faster on cpu miner because you can do this

Code:

__m128i zero = _mm_setzero_si128();


b = _mm_alignr_epi8(m, r, 28);


camp = _mm_cmpeq_epi8(b ,zero);
turt = _mm_movemask_epi8(camp);
if (turt == 0xffff)
{

I highlighted the first 19 characters of each hash with red color to emphasize that the former has random leading characters, but the latter's leading characters are zeroes. So, my question is, what would be more difficult to find, in terms of computational power, a hash with the same 19 random characters like in the first example, or a hash that would be less than the second hash with 19 zeroes? And why?
Both are nearly impossible to find.
impossible for you may be. it's about "less than"
legendary
Activity: 2646
Merit: 6681
Self-proclaimed Genius
April 21, 2022, 02:14:01 AM
#5
Your formulation "less than the second hash" makes the second about 14 times more difficult, since there are additional constraints - ...63fbxx vs ...000016 in the first case all 256 possibilities are valid, while in the second about 234 are invalid.
What I get is the highlight is only there to emphasize the non-zero characters of the first hash.

If that's not the case, then yes you're correct.
hero member
Activity: 1106
Merit: 912
Not Your Keys, Not Your Bitcoin
April 21, 2022, 02:07:35 AM
#4
The SHA-256 hash of "Hello, world!"

19270ea4f98808a63fbb99bc26a5ee6f0fe8df9c8182cf1d710b115d57250578

And the block header of bitcoin block 732811

000000000000000000016923fc405bbb8e9552072b0905adeb4115706e569bb2



I'm yet to see anything that can break the SHA-256 hash function, so it is impossible to find an input of any hash result. One of the cool features of SHA-256 is that it gives you the same fixed length result, you can confirm it from the two examples you did above irrespective of what data you input. Both are 64 characters, If not that you specify the inputs and the results, it will be difficult for anyone to differentiate between the two.



I highlighted the first 19 characters of each hash with red color to emphasize that the former has random leading characters, but the latter's leading characters are zeroes. So, my question is, what would be more difficult to find, in terms of computational power, a hash with the same 19 random characters like in the first example, or a hash that would be less than the second hash with 19 zeroes? And why?


The hello world hash doesn't have anything to do with computational power, the SHA-256 gave you the unique fingerprint of the data you hashed while the block header was the result of the block hashed which was less than the target value(often calculated by the miners).
full member
Activity: 206
Merit: 450
April 21, 2022, 02:01:14 AM
#3
Both highlighted leading characters are "random". Finding a hash with different input and the same highlighted output has exactly the same probability in both cases, hence the same difficulty.

Your formulation "less than the second hash" makes the second about 14 times more difficult, since there are additional constraints - ...63fbxx vs ...000016 in the first case all 256 possibilities are valid, while in the second about 234 are invalid.

legendary
Activity: 2646
Merit: 6681
Self-proclaimed Genius
April 21, 2022, 01:46:56 AM
#2
I highlighted the first 19 characters of each hash with red color to emphasize that the former has random leading characters, but the latter's leading characters are zeroes. So, my question is, what would be more difficult to find, in terms of computational power, a hash with the same 19 random characters like in the first example, or a hash that would be less than the second hash with 19 zeroes? And why?
Both are nearly impossible to find.
Since both are representing a 256-bit number, the search space would be too high to have a chance of finding them.
And SHA-256 is irreversible.

P.S. Obviously, by "finding hash" I mean finding specific inputs which result in specific hashes.
The best search term would be "hash collision".


If you're referring to mining, then the search query is "Bitcoin mining target" (target).

And in terms of mining, the miner is looking for a value lower than the target, not exact.
So, since 19270ea4f98808a63fbb99bc26a5ee6f0fe8df9c8182cf1d710b115d57250578 is higher than 000000000000000000016923fc405bbb8e9552072b0905adeb4115706e569bb2,
it'll be easier to find a hash that's lower than the former than the latter.
legendary
Activity: 2464
Merit: 4419
🔐BitcoinMessage.Tools🔑
April 21, 2022, 01:36:23 AM
#1
This may be a very dumb question, but, honestly, I didn't find the answer in google, because I can't even formulate the search query correctly to find the right answer.

Suppose we have two hashes:

The SHA-256 hash of "Hello, world!"

19270ea4f98808a63fbb99bc26a5ee6f0fe8df9c8182cf1d710b115d57250578

And the block header of bitcoin block 732811

000000000000000000016923fc405bbb8e9552072b0905adeb4115706e569bb2


I highlighted the first 19 characters of each hash with red color to emphasize that the former has random leading characters, but the latter's leading characters are zeroes. So, my question is, what would be more difficult to find, in terms of computational power, a hash with the same 19 random characters like in the first example, or a hash that would be less than the second hash with 19 zeroes? And why?

P.S. Obviously, by "finding hash" I mean finding specific inputs which result in specific hashes.


Jump to: