Rule 30 automaton as hash function - page 3.

Anders

full member

Activity: 126

Merit: 100

Quote from: DeathAndTaxes on July 28, 2014, 10:02:39 AM

I think you have to consider what is your intention for using R30? As a general purpose non-cryptographic hash? As a general purpose cryptographic hash? Or as a PoW function?

Irreducibility of computation isn't really that important for the first two categories but is (in theory) pretty important for a PoW. Say someone found a way to produce SHA-256 hashes 10,000x faster for a given amount of die space. It wouldn't do anything to compromise the security of SHA-2 as a cryptographic hash but it would allow one to exploit the bitcoin network for cheap.

This is why I indicated using something like:

Code:

R30(nonce + H(blockheader)) < target

Where:
H= a secure cryptographic hashing function (i.e. SHA-2)
Blockheader = arbitrary length blockheader (excluding the nonce)
Nonce = 64 bit (don't make Satoshi's mistake of using a nonce "too small")

I believe R30 is an excellent hash function for cryptography. Although that's a risky assumption since not even Stephen Wolfram has a definite proof of that I assume. My aim is for R30 to be a general hash function. Using it for Bitcoin proof of work would be an interesting test of concept.

grau

hero member

Activity: 836

Merit: 1030

bits of proof

Quote from: DeathAndTaxes on July 28, 2014, 10:02:39 AM

I think you have to consider what is your intention for using R30? As a general purpose non-cryptographic hash? As a general purpose cryptographic hash? Or as a PoW function?

Irreducibility of computation isn't really that important for the first two categories but is (in theory) pretty important for a PoW. Say someone found a way to produce SHA-256 hashes 10,000x faster for a given amount of die space. It wouldn't do anything to compromise the security of SHA-2 as a cryptographic hash but it would allow one to exploit the bitcoin network for cheap.

This is why I indicated using something like:

Code:

R30(nonce + H(blockheader)) < target

Where:
H= a secure cryptographic hashing function (i.e. SHA-2)
Blockheader = arbitrary length blockheader (excluding the nonce)
Nonce = 64 bit (don't make Satoshi's mistake of using a nonce "too small")

Agree, R30 is interesting as perfect POW not as much as cryptographic hash.

As side note: I am not sure Satoshi made a mistake here. Maybe he wanted the block header is updated with new transactions instead of rolling a nonce until a block is found.

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: Anders on July 28, 2014, 09:47:02 AM

Maybe not for Bitcoin but the O(n²) of R30 makes it very computationally intensive for long messages.

One way of reducing the calculations for long messages is to use the Merkle–Damgård construction together with Davies–Meyer single-block-length compression function.

With a message of 1,000,000 bits, taking the square of that results in 10¹². By dividing that message into fixed-sized blocks of 1000 bits the result is 1000² * 1000 = 10⁹. A three orders of magnitude improvement.

A similar technique is used in SHA-2 and other standard hash functions.

I think you have to consider what is your intention for using R30? As a general purpose non-cryptographic hash? As a general purpose cryptographic hash? Or as a PoW function?

Irreducibility of computation isn't really that important for the first two categories but is (in theory) pretty important for a PoW. Say someone found a way to produce SHA-256 hashes 10,000x faster for a given amount of die space. It wouldn't do anything to compromise the security of SHA-2 as a cryptographic hash but it would allow one to exploit the bitcoin network for cheap.

This is why I indicated using something like:

Code:

R30(nonce + H(blockheader)) < target

Where:
H= a secure cryptographic hashing function (i.e. SHA-2)
Blockheader = arbitrary length blockheader (excluding the nonce)
Nonce = 64 bit (don't make Satoshi's mistake of using a nonce "too small")

Anders

full member

Activity: 126

Merit: 100

Maybe not for Bitcoin but the O(n²) of R30 makes it very computationally intensive for long messages.

One way of reducing the calculations for long messages is to use the Merkle–Damgård construction together with Davies–Meyer single-block-length compression function.

With a message of 1,000,000 bits, taking the square of that results in 10¹². By dividing that message into fixed-sized blocks of 1000 bits the result is 1000² * 1000 = 10⁹. A three orders of magnitude improvement.

A similar technique is used in SHA-2 and other standard hash functions.

Anders

full member

Activity: 126

Merit: 100

Quote from: grau on July 28, 2014, 07:48:24 AM

The hash function should be defined for certain a block size.

Padding is an optional preprocessing step for the case the block size is not met exactly at input, and is then on bits not bytes.
I would leave that aside until the algorithm is not settled.

I think the fixed block size used in many hash functions today simply is a consequence (limitation) of using the Davies–Meyer compression function.

Treating the whole message as a single unit could be more robust.

Instead of adding length bits in R30 I could simply add for example 1 before and after the message as a preprocessing. That would work too since then even messages containing only zeros would be different when they have different lengths.

grau

hero member

Activity: 836

Merit: 1030

bits of proof

The hash function should be defined for certain a block size.

Padding is an optional preprocessing step for the case the block size is not met exactly at input, and is then on bits not bytes.
I would leave that aside until the algorithm is not settled.

Anders

full member

Activity: 126

Merit: 100

Aha. Here is how the length bits are handled in SHA-256:

"Pre-processing:
append the bit '1' to the message
append k bits '0', where k is the minimum number >= 0 such that the resulting message
length (modulo 512 in bits) is 448.
append length of message (without the '1' bit or padding), in bits, as 64-bit big-endian integer
(this will make the entire post-processed length a multiple of 512 bits)" -- http://en.wikipedia.org/wiki/SHA-2#Pseudocode

That's actually similar to how I do it in my current R30 version.

Anders

full member

Activity: 126

Merit: 100

Quote from: grau on July 28, 2014, 07:11:37 AM

Remove message length encoding. That's a foreign concept to a hash function.

The message length bits are needed. Otherwise for example messages 000 and 000000 would result in the same initial condition and have the same hash value.

Anders

full member

Activity: 126

Merit: 100

I did a worst case test with one million bits messages of 010101010... and it works:

The R30 hash for UUU...U (1000000 bits) is: 10ea3662363b4940ed208b2a70960f7520bd5309e3f059ea5d106cdde9f2d7a1
The R30 hash for UUU...V (1000000 bits) is: 8903d17a9ec35cce2f028907e0fef7ccabbbd12d8f4a9f635e1d0503f395b1e9

grau

hero member

Activity: 836

Merit: 1030

bits of proof

Remove message length encoding. That's a foreign concept to a hash function.

Anders

full member

Activity: 126

Merit: 100

Fast Java implementation of R30 using 64-bit longs:

Code:

	public final byte[] digest(byte[] message) {
		int maxMessageBytes = message.length;
		int maxKeyBytes = maxMessageBytes + 4;
		int maxKeyBits = maxKeyBytes * 8;
		byte[] hash = new byte[MAX_HASH_BYTES];
		byte[] key = new byte[maxKeyBytes];
		for (int i = 0; i < maxMessageBytes; i++) {
			key[i + 2] = message[i];
		}
		key[0] = (byte) (maxMessageBytes >>> 24);
		key[1] = (byte) (maxMessageBytes >>> 16);
		key[maxKeyBytes - 2] = (byte) (maxMessageBytes >>> 8);
		key[maxKeyBytes - 1] = (byte) maxMessageBytes;
		int maxHashBits = MAX_HASH_BYTES * 8;
		int skipRows = maxKeyBits * 7;
		int maxCells = 2;
		maxCells += maxKeyBits;
		maxCells += skipRows;
		maxCells += maxHashBits * 2;
		int maxLongs = (maxCells + 63) >>> 6;
		maxCells = maxLongs << 6;
		int cellsMid = maxCells / 2;		
		long[] cells = new long[maxLongs];
		int keyStart = (maxCells - maxKeyBits) / 2;
		for (int i = 0; i < key.length; i++) {
			int keyChar = key[i];
			int bitPos = 0x80;
			for (int j = 0; j < 8; j++) {
				long b = (keyChar & bitPos) >>> (7 - j);
				int bitIndex = keyStart + i * 8 + j;
				cells[bitIndex >>> 6] |= b << (63 - (bitIndex % 64));
				bitPos >>>= 1;
			}
		}
		int bitCount = 0;
		int mid = 0;
		int longMid = maxLongs / 2;
		int longMidShift = longMid * 2 == maxLongs ? 63 : 31;
		int maxRow = skipRows + maxHashBits * 2;
		for (int row = 0; row < maxRow; row++) {
			int doubleRow = row * 2;
			int calcWidth = doubleRow;
			if (calcWidth > maxRow - 2) {
				calcWidth = maxRow - ((doubleRow) % maxRow) + 2;
			} else {
				calcWidth += maxKeyBits;
			}
			int halfWidth = calcWidth / 2 + 2;
			int start = (cellsMid - halfWidth) >>> 6;
			int end = (cellsMid + halfWidth + 63) >>> 6;
			mid = (int) ((cells[longMid] >>> longMidShift) & 0x01);
			long carryLeft = 0L;
			for (int i = start; i < end; i++) {
				long l = cells[i];
				long carryRight = i < maxLongs - 1 ? cells[i + 1] >>> 63 : 0;
				long cellRight = (l << 1) | carryRight;
				long cellLeft = (l >>> 1) | carryLeft;
				carryLeft = l << 63;
				cells[i] = cellLeft ^ (l | cellRight);
			}
			if (row < skipRows) {
				continue;
			}
			if (row % 2 == 1) {
				if (mid == 1) {
					int bufPos = bitCount >>> 3;
					hash[bufPos] ^= 1 << (bitCount % 8);
				}
				bitCount++;
			}
		}
		return hash;
	}

Slower JavaScript alternative: http://jsfiddle.net/7DV7Z/

grau

hero member

Activity: 836

Merit: 1030

bits of proof

Quote from: grau on July 27, 2014, 03:25:18 PM

If I were to use this for real (or had more time for fun) I'd rewrite it using a vector of longs and also parallelise it since computation on (groups of) longs can be executed with CPUs cores in parallel only synchronising at change of generation.

I am sure the BigInteger with the analytic form of rule 30 already beats your javascript with 2 magnitudes Tongue

It turns out that the performance of the below is pretty dismal if compared with SHA-256 as implemented by the standard java runtime libraries. My guess is that memory allocation / shuffling for increasing length of BigIntegers dominates it. This needs a pre-allocated store of the right size.

Code:

	private static BigInteger r30hash (BigInteger input)
	{
		BigInteger result = BigInteger.ZERO;
		for ( int i = 0; i < (80 * 7 + 32) * 8; ++i )
		{
			input = input.xor (input.shiftLeft (1).or (input.shiftLeft (2)));
			if ( i >= 80 * 7 * 8  && input.testBit (i + 1) )
			{
				result = result.setBit (0);
			}
			result = result.shiftLeft (1);
		}
		return result;
	}

grau

hero member

Activity: 836

Merit: 1030

bits of proof

Quote from: Anders on July 27, 2014, 03:59:24 PM

I think with machine code the shifts of the 64 bit registers can automatically be chained. Maybe assembler code in C/C++ would be efficient. Of course, a hardware implementation could be made even much faster than that.

In reality only Bitcoin miners who would use dedicated R30 hardware would be competitive.

Sure. Writing software for this is just for research and fun.

The best in class code in whatever software layer will be still lots of magnitudes away of an ASIC for R30, since it is utmost simple, homogenous and parallelizable at finest scale with constant memory (if any) need.

This efficiency combined with irreducibility of computation, no inverse and ability to satisfy any difficulty would make it the perfect proof of work.

The ability to satisfy any difficulty is something that yet bugs me.. I wish I could convince myself that it has this property, in the limited generation of 7N.

Since there is 80 bytes input to determine 32 bytes output, there is plenty of freedom to create whatever pattern though.

Anders

full member

Activity: 126

Merit: 100

I think with machine code the shifts of the 64 bit registers can automatically be chained. Maybe assembler code in C/C++ would be efficient. Of course, a hardware implementation could be made even much faster than that.

In reality only Bitcoin miners who would use dedicated R30 hardware would be competitive.

grau

hero member

Activity: 836

Merit: 1030

bits of proof

If I were to use this for real (or had more time for fun) I'd rewrite it using a vector of longs and also parallelise it since computation on (groups of) longs can be executed with CPUs cores in parallel only synchronising at change of generation.

I am sure the BigInteger with the analytic form of rule 30 already beats your javascript with 2 magnitudes Tongue

Anders

full member

Activity: 126

Merit: 100

Using the Java long type (64 bits) would make the algorithm fast. Tedious to implement however with all the padding, boundary matching and shifts of bits between the longs etc. I'm too lazy for that at the moment. Plus BigInteger is perhaps fast enough to match the use of longs for the implementation of R30.

Anders

full member

Activity: 126

Merit: 100

Quote from: grau on July 27, 2014, 01:26:07 PM

Here is a more efficient implementation, not yet parallel, but already illustrates the power of simple design for implementation.

Hashing the argument in hex (assumed block header) to 32 bytes of output after 7 * 80 * 8 rounds with rule 30.

Code:

package com.bitsofproof.r30hash;
import java.math.BigInteger;
public class Main 
{
    public static void main(String[] args)
    {
	    BigInteger input = new BigInteger (args[0], 16);
	    BigInteger result = BigInteger.ZERO;
	    int rounds = (80 * 7 + 32) * 8;
	    for ( int i = 0; i < rounds; ++i )
	    {
		    BigInteger q = input.shiftLeft (1);
		    BigInteger r = input.shiftLeft (2);
		    input = input.xor(q.or(r));
		    if ( i >= 80 * 7 * 8  )
		    {
			    if ( input.testBit (i+1) )
			    {
				    result = result.setBit (0);
			    }
			    result = result.shiftLeft (1);
		    }
	    }
	   System.out.println(result.toString (16));
    }
}

Neat. I noticed in this paper that indeed Rule 30 can be defined as x_i' = x_i-1 xor (x_i or x_i+1):

Cryptography with Cellular Automata -- http://www.stephenwolfram.com/publications/academic/cryptography-cellular-automata.pdf

I wonder about the performance of BigInteger though.

grau

hero member

Activity: 836

Merit: 1030

bits of proof

Quote from: Peter R on July 27, 2014, 12:38:14 PM

I don't believe it's a requirement to select the hash bits from the central column ...

It is apparent that chaos converges to order toward both left and right, so intuition says to stay in the middle if you want maximum entropy.

grau

hero member

Activity: 836

Merit: 1030

bits of proof

Here is a more efficient implementation, not yet parallel, but already illustrates the power of simple design for implementation.

Hashing the argument in hex (assumed block header) to 32 bytes of output after 7 * 80 * 8 rounds with rule 30.

Code:

package com.bitsofproof.r30hash;
import java.math.BigInteger;
public class Main 
{
    public static void main(String[] args)
    {
	    BigInteger input = new BigInteger (args[0], 16);
	    BigInteger result = BigInteger.ZERO;
	    int rounds = (80 * 7 + 32) * 8;
	    for ( int i = 0; i < rounds; ++i )
	    {
		    BigInteger q = input.shiftLeft (1);
		    BigInteger r = input.shiftLeft (2);
		    input = input.xor(q.or(r));
		    if ( i >= 80 * 7 * 8  )
		    {
			    if ( input.testBit (i+1) )
			    {
				    result = result.setBit (0);
			    }
			    result = result.shiftLeft (1);
		    }
	    }
	   System.out.println(result.toString (16));
    }
}

Anders

full member

Activity: 126

Merit: 100

Quote from: grau on July 27, 2014, 12:19:33 PM

Quote from: Anders on July 27, 2014, 09:54:00 AM

But how to prove that 7N is enough mathematically? Huh

We enumerated all three bit patterns and 7N is enough to propagate any change to the middle column, even in worst case pattern. A pattern with repetition cycle of more than 3 bits or a random pattern must start local triangles that propagate at a higher slope than the worst case two bit pattern 01010 ...

Is this enough?

Well, enough for what? I think we should first be clear of properties needed for the purpose of block header hashing.
My take is:
- no known inverse - Wolfram demonstrated nicely that the middle column gives no clue of the input after a few generations.
- full propagation - the nonce should be able to flip any bit in the hash - we think given with 7N generations.
- proof of work - means no shortcuts, irreducible, given more clearly than in SHA256
- able to meet any difficulty target - thats a hard question, but did someone prove it for SHA256d? I think not. If there was an attempt to prove this, it would be easier for this one.

did I missed some?

Proved for Bitcoin, yes. But what about longer messages? It seems that the slope converges to 6. something for large messages (I have tested up to 100,000 bits). 7N seems enough, but I don't know for sure if it will work for say a 1 megabyte file as the message.

Topic: Rule 30 automaton as hash function - page 3. (Read 10686 times)