Pages:
Author

Topic: Rule 30 automaton as hash function - page 3. (Read 10658 times)

full member
Activity: 126
Merit: 100
July 28, 2014, 10:21:22 AM
I think you have to consider what is your intention for using R30?  As a general purpose non-cryptographic hash?  As a general purpose cryptographic hash?  Or as a PoW function?

Irreducibility of computation isn't really that important for the first two categories but is (in theory) pretty important for a PoW.  Say someone found a way to produce SHA-256 hashes 10,000x faster for a given amount of die space.  It wouldn't do anything to compromise the security of SHA-2 as a cryptographic hash but it would allow one to exploit the bitcoin network for cheap.

This is why I indicated using something like:
Code:
R30(nonce + H(blockheader)) < target

Where:
H= a secure cryptographic hashing function (i.e. SHA-2)
Blockheader = arbitrary length blockheader (excluding the nonce)
Nonce = 64 bit (don't make Satoshi's mistake of using a nonce "too small")


I believe R30 is an excellent hash function for cryptography. Although that's a risky assumption since not even Stephen Wolfram has a definite proof of that I assume. My aim is for R30 to be a general hash function. Using it for Bitcoin proof of work would be an interesting test of concept.
hero member
Activity: 836
Merit: 1030
bits of proof
July 28, 2014, 10:13:15 AM
I think you have to consider what is your intention for using R30?  As a general purpose non-cryptographic hash?  As a general purpose cryptographic hash?  Or as a PoW function?

Irreducibility of computation isn't really that important for the first two categories but is (in theory) pretty important for a PoW.  Say someone found a way to produce SHA-256 hashes 10,000x faster for a given amount of die space.  It wouldn't do anything to compromise the security of SHA-2 as a cryptographic hash but it would allow one to exploit the bitcoin network for cheap.

This is why I indicated using something like:
Code:
R30(nonce + H(blockheader)) < target

Where:
H= a secure cryptographic hashing function (i.e. SHA-2)
Blockheader = arbitrary length blockheader (excluding the nonce)
Nonce = 64 bit (don't make Satoshi's mistake of using a nonce "too small")

Agree, R30 is interesting as perfect POW not as much as cryptographic hash.

As side note: I am not sure Satoshi made a mistake here. Maybe he wanted the block header is updated with new transactions instead of rolling a nonce until a block is found.
donator
Activity: 1218
Merit: 1079
Gerald Davis
July 28, 2014, 10:02:39 AM
Maybe not for Bitcoin but the O(n2) of R30 makes it very computationally intensive for long messages.

One way of reducing the calculations for long messages is to use the Merkle–Damgård construction together with Davies–Meyer single-block-length compression function.

With a message of 1,000,000 bits, taking the square of that results in 1012. By dividing that message into fixed-sized blocks of 1000 bits the result is 10002 * 1000 = 109. A three orders of magnitude improvement.

A similar technique is used in SHA-2 and other standard hash functions.

I think you have to consider what is your intention for using R30?  As a general purpose non-cryptographic hash?  As a general purpose cryptographic hash?  Or as a PoW function?

Irreducibility of computation isn't really that important for the first two categories but is (in theory) pretty important for a PoW.  Say someone found a way to produce SHA-256 hashes 10,000x faster for a given amount of die space.  It wouldn't do anything to compromise the security of SHA-2 as a cryptographic hash but it would allow one to exploit the bitcoin network for cheap.

This is why I indicated using something like:
Code:
R30(nonce + H(blockheader)) < target

Where:
H= a secure cryptographic hashing function (i.e. SHA-2)
Blockheader = arbitrary length blockheader (excluding the nonce)
Nonce = 64 bit (don't make Satoshi's mistake of using a nonce "too small")



full member
Activity: 126
Merit: 100
July 28, 2014, 09:47:02 AM
Maybe not for Bitcoin but the O(n2) of R30 makes it very computationally intensive for long messages.

One way of reducing the calculations for long messages is to use the Merkle–Damgård construction together with Davies–Meyer single-block-length compression function.

With a message of 1,000,000 bits, taking the square of that results in 1012. By dividing that message into fixed-sized blocks of 1000 bits the result is 10002 * 1000 = 109. A three orders of magnitude improvement.

A similar technique is used in SHA-2 and other standard hash functions.
full member
Activity: 126
Merit: 100
July 28, 2014, 08:07:54 AM
The hash function should be defined for certain a block size.

Padding is an optional preprocessing step for the case the block size is not met exactly at input, and is then on bits not bytes.
I would leave that aside until the algorithm is not settled.


I think the fixed block size used in many hash functions today simply is a consequence (limitation) of using the Davies–Meyer compression function.

Treating the whole message as a single unit could be more robust.

Instead of adding length bits in R30 I could simply add for example 1 before and after the message as a preprocessing. That would work too since then even messages containing only zeros would be different when they have different lengths.
hero member
Activity: 836
Merit: 1030
bits of proof
July 28, 2014, 07:48:24 AM
The hash function should be defined for certain a block size.

Padding is an optional preprocessing step for the case the block size is not met exactly at input, and is then on bits not bytes.
I would leave that aside until the algorithm is not settled.

full member
Activity: 126
Merit: 100
July 28, 2014, 07:39:53 AM
Aha. Here is how the length bits are handled in SHA-256:

"Pre-processing:
append the bit '1' to the message
append k bits '0', where k is the minimum number >= 0 such that the resulting message
    length (modulo 512 in bits) is 448.
append length of message (without the '1' bit or padding), in bits, as 64-bit big-endian integer
    (this will make the entire post-processed length a multiple of 512 bits)" -- http://en.wikipedia.org/wiki/SHA-2#Pseudocode

That's actually similar to how I do it in my current R30 version.
full member
Activity: 126
Merit: 100
July 28, 2014, 07:28:28 AM
Remove message length encoding. That's a foreign concept to a hash function.

The message length bits are needed. Otherwise for example messages 000 and 000000 would result in the same initial condition and have the same hash value.
full member
Activity: 126
Merit: 100
July 28, 2014, 07:25:41 AM
I did a worst case test with one million bits messages of 010101010... and it works:

The R30 hash for UUU...U (1000000 bits) is: 10ea3662363b4940ed208b2a70960f7520bd5309e3f059ea5d106cdde9f2d7a1
The R30 hash for UUU...V (1000000 bits) is: 8903d17a9ec35cce2f028907e0fef7ccabbbd12d8f4a9f635e1d0503f395b1e9
hero member
Activity: 836
Merit: 1030
bits of proof
July 28, 2014, 07:11:37 AM
Remove message length encoding. That's a foreign concept to a hash function.
full member
Activity: 126
Merit: 100
July 28, 2014, 04:23:05 AM
Fast Java implementation of R30 using 64-bit longs:

Code:
	public final byte[] digest(byte[] message) {
int maxMessageBytes = message.length;
int maxKeyBytes = maxMessageBytes + 4;
int maxKeyBits = maxKeyBytes * 8;
byte[] hash = new byte[MAX_HASH_BYTES];
byte[] key = new byte[maxKeyBytes];
for (int i = 0; i < maxMessageBytes; i++) {
key[i + 2] = message[i];
}
key[0] = (byte) (maxMessageBytes >>> 24);
key[1] = (byte) (maxMessageBytes >>> 16);
key[maxKeyBytes - 2] = (byte) (maxMessageBytes >>> 8);
key[maxKeyBytes - 1] = (byte) maxMessageBytes;
int maxHashBits = MAX_HASH_BYTES * 8;
int skipRows = maxKeyBits * 7;
int maxCells = 2;
maxCells += maxKeyBits;
maxCells += skipRows;
maxCells += maxHashBits * 2;
int maxLongs = (maxCells + 63) >>> 6;
maxCells = maxLongs << 6;
int cellsMid = maxCells / 2;
long[] cells = new long[maxLongs];
int keyStart = (maxCells - maxKeyBits) / 2;
for (int i = 0; i < key.length; i++) {
int keyChar = key[i];
int bitPos = 0x80;
for (int j = 0; j < 8; j++) {
long b = (keyChar & bitPos) >>> (7 - j);
int bitIndex = keyStart + i * 8 + j;
cells[bitIndex >>> 6] |= b << (63 - (bitIndex % 64));
bitPos >>>= 1;
}
}
int bitCount = 0;
int mid = 0;
int longMid = maxLongs / 2;
int longMidShift = longMid * 2 == maxLongs ? 63 : 31;
int maxRow = skipRows + maxHashBits * 2;
for (int row = 0; row < maxRow; row++) {
int doubleRow = row * 2;
int calcWidth = doubleRow;
if (calcWidth > maxRow - 2) {
calcWidth = maxRow - ((doubleRow) % maxRow) + 2;
} else {
calcWidth += maxKeyBits;
}
int halfWidth = calcWidth / 2 + 2;
int start = (cellsMid - halfWidth) >>> 6;
int end = (cellsMid + halfWidth + 63) >>> 6;
mid = (int) ((cells[longMid] >>> longMidShift) & 0x01);
long carryLeft = 0L;
for (int i = start; i < end; i++) {
long l = cells[i];
long carryRight = i < maxLongs - 1 ? cells[i + 1] >>> 63 : 0;
long cellRight = (l << 1) | carryRight;
long cellLeft = (l >>> 1) | carryLeft;
carryLeft = l << 63;
cells[i] = cellLeft ^ (l | cellRight);
}
if (row < skipRows) {
continue;
}
if (row % 2 == 1) {
if (mid == 1) {
int bufPos = bitCount >>> 3;
hash[bufPos] ^= 1 << (bitCount % 8);
}
bitCount++;
}
}
return hash;
}

Slower JavaScript alternative: http://jsfiddle.net/7DV7Z/
hero member
Activity: 836
Merit: 1030
bits of proof
July 28, 2014, 03:59:31 AM
If I were to use this for real (or had more time for fun) I'd rewrite it using a vector of longs and also parallelise it since computation on (groups of) longs can be executed with CPUs cores in parallel only synchronising at change of generation.

I am sure the BigInteger with the analytic form of rule 30 already beats your javascript with 2 magnitudes  Tongue

It turns out that the performance of the below is pretty dismal if compared with SHA-256 as implemented by the standard java runtime libraries. My guess is that memory allocation / shuffling for increasing length of BigIntegers dominates it. This needs a pre-allocated store of the right size.

Code:
	private static BigInteger r30hash (BigInteger input)
{
BigInteger result = BigInteger.ZERO;
for ( int i = 0; i < (80 * 7 + 32) * 8; ++i )
{
input = input.xor (input.shiftLeft (1).or (input.shiftLeft (2)));
if ( i >= 80 * 7 * 8  && input.testBit (i + 1) )
{
result = result.setBit (0);
}
result = result.shiftLeft (1);
}
return result;
}
hero member
Activity: 836
Merit: 1030
bits of proof
July 27, 2014, 05:36:35 PM
I think with machine code the shifts of the 64 bit registers can automatically be chained. Maybe assembler code in C/C++ would be efficient. Of course, a hardware implementation could be made even much faster than that.

In reality only Bitcoin miners who would use dedicated R30 hardware would be competitive.
Sure. Writing software for this is just for research and fun.

The best in class code in whatever software layer will be still lots of magnitudes away of an ASIC for R30, since it is utmost simple, homogenous and  parallelizable at finest scale with constant memory (if any) need.

This efficiency combined with irreducibility of computation, no inverse and ability to satisfy any difficulty would make it the perfect proof of work.

The ability to satisfy any difficulty is something that yet bugs me.. I wish I could convince myself that it has this property, in the limited generation of 7N.

Since there is 80 bytes input to determine 32 bytes output, there is plenty of freedom to create whatever pattern though.
full member
Activity: 126
Merit: 100
July 27, 2014, 03:59:24 PM
I think with machine code the shifts of the 64 bit registers can automatically be chained. Maybe assembler code in C/C++ would be efficient. Of course, a hardware implementation could be made even much faster than that.

In reality only Bitcoin miners who would use dedicated R30 hardware would be competitive.
hero member
Activity: 836
Merit: 1030
bits of proof
July 27, 2014, 03:25:18 PM
If I were to use this for real (or had more time for fun) I'd rewrite it using a vector of longs and also parallelise it since computation on (groups of) longs can be executed with CPUs cores in parallel only synchronising at change of generation.

I am sure the BigInteger with the analytic form of rule 30 already beats your javascript with 2 magnitudes  Tongue
full member
Activity: 126
Merit: 100
July 27, 2014, 03:11:37 PM
Using the Java long type (64 bits) would make the algorithm fast. Tedious to implement however with all the padding, boundary matching and shifts of bits between the longs etc. I'm too lazy for that at the moment. Plus BigInteger is perhaps fast enough to match the use of longs for the implementation of R30.
full member
Activity: 126
Merit: 100
July 27, 2014, 02:15:17 PM
#99
Here is a more efficient implementation, not yet parallel, but already illustrates the power of simple design for implementation.

Hashing the argument in hex (assumed block header) to 32 bytes of output after 7 * 80 * 8 rounds with rule 30.

Code:
package com.bitsofproof.r30hash;
import java.math.BigInteger;
public class Main
{
    public static void main(String[] args)
    {
   BigInteger input = new BigInteger (args[0], 16);
   BigInteger result = BigInteger.ZERO;
   int rounds = (80 * 7 + 32) * 8;
   for ( int i = 0; i < rounds; ++i )
   {
   BigInteger q = input.shiftLeft (1);
   BigInteger r = input.shiftLeft (2);
   input = input.xor(q.or(r));
   if ( i >= 80 * 7 * 8  )
   {
   if ( input.testBit (i+1) )
   {
   result = result.setBit (0);
   }
   result = result.shiftLeft (1);
   }
   }
  System.out.println(result.toString (16));
    }
}

Neat. I noticed in this paper that indeed Rule 30 can be defined as xi' = xi-1 xor (xi or xi+1):

Cryptography with Cellular Automata -- http://www.stephenwolfram.com/publications/academic/cryptography-cellular-automata.pdf

I wonder about the performance of BigInteger though.
hero member
Activity: 836
Merit: 1030
bits of proof
July 27, 2014, 01:40:06 PM
#98
I don't believe it's a requirement to select the hash bits from the central column ...
It is apparent that chaos converges to order toward both left and right, so intuition says to stay in the middle if you want maximum entropy.
hero member
Activity: 836
Merit: 1030
bits of proof
July 27, 2014, 01:26:07 PM
#97
Here is a more efficient implementation, not yet parallel, but already illustrates the power of simple design for implementation.

Hashing the argument in hex (assumed block header) to 32 bytes of output after 7 * 80 * 8 rounds with rule 30.

Code:
package com.bitsofproof.r30hash;
import java.math.BigInteger;
public class Main
{
    public static void main(String[] args)
    {
   BigInteger input = new BigInteger (args[0], 16);
   BigInteger result = BigInteger.ZERO;
   int rounds = (80 * 7 + 32) * 8;
   for ( int i = 0; i < rounds; ++i )
   {
   BigInteger q = input.shiftLeft (1);
   BigInteger r = input.shiftLeft (2);
   input = input.xor(q.or(r));
   if ( i >= 80 * 7 * 8  )
   {
   if ( input.testBit (i+1) )
   {
   result = result.setBit (0);
   }
   result = result.shiftLeft (1);
   }
   }
  System.out.println(result.toString (16));
    }
}
full member
Activity: 126
Merit: 100
July 27, 2014, 01:11:51 PM
#96
But how to prove that 7N is enough mathematically? Huh

We enumerated all three bit patterns and 7N is enough to propagate any change to the middle column, even in worst case pattern. A pattern with repetition cycle of more than 3 bits or a random pattern must start local triangles that propagate at a higher slope than the worst case two bit pattern 01010 ...

Is this enough?

Well, enough for what? I think we should first be clear of properties needed for the purpose of block header hashing.
My take is:
- no known inverse - Wolfram demonstrated nicely that the middle column gives no clue of the input after a few generations.
- full propagation - the nonce should be able to flip any bit in the hash - we think given with 7N generations.
- proof of work - means no shortcuts, irreducible, given more clearly than in SHA256
- able to meet any difficulty target - thats a hard question, but did someone prove it for SHA256d? I think not. If there was an attempt to prove this, it would be easier for this one.

did I missed some?

Proved for Bitcoin, yes. But what about longer messages? It seems that the slope converges to 6. something for large messages (I have tested up to 100,000 bits). 7N seems enough, but I don't know for sure if it will work for say a 1 megabyte file as the message.
Pages:
Jump to: