the number of strings that we cannot get is then:
k (1-1/k) ^n
where n=2^257 (input) and k = 2^256 (output).
The result is: 2^256 * (1 - 1/2^256) ^ (2^257) = 2^256 * ((1 - 1 / 2^256)^(2^256))^(2) = 2^256 * ((1/e)^2) = 0,135 * 2^256 so we can get at this stage the 86,5% of all the 256 bit strings.
I would have appreciated if we could have given this more time (especially me to wrap my head around it) - but sure, maybe someone else can chime in with some insight.
All these models assume a hash function to behave like a random (or pseudo-random) number generator. Normally, a good hash function - by design - tries to
map the expected inputs as evenly as possible over its output range.
see
https://en.wikipedia.org/wiki/Hash_function#UniformityAlso
Note that this criterion only requires the value to be uniformly distributed, not random in any sense. A good randomizing function is (barring computational efficiency concerns) generally a good choice as a hash function, but the converse need not be true.
I had a look at the referenced paper, but I'm still not convinced about the premises to model a hash function as a pseudo-random generator. Please don't forget, that I've been looking at the SHA256 and RIPEMD160 implementation for months! They are both very similar (RIPEMD160 being way more "light-weight") and I cannot see how these would qualify as pseudorandoms.
...
Rico
Hi rico, hi arulbero,
to check wether sha256 and ripemd160 behave like a PNG and meet the expectations regarding the formula k (1-1/k) ^n one could do that:
Hash all possible 2^33 bit values with sha256 and XOR the first half of the bits with the second half. The 128 result is xored in the same manner. The 64 bit result as well. Thus we get a 32 bit output from the 2^33 input. Check how many values of all possible 32bit expression could not be generated. If k (1-1/k) ^n is applicable to sha256 than the result should equal arulbero´s 13,5%. If this is true, than this small model can be used to investigate futher assumptions regarding distribution and prohabilities.
For Ripemd160 the same can be done with a 2^21 input and 2^20 output.
Due to the neutral behavior of XOR it should be possible to scale the 160 respectively 256 bit problem down to a manageable size.
Regards,
Janu$$