
Topic: Rule 30 automaton as hash function - page 2. (Read 10577 times)

full member
Activity: 126
Merit: 100
August 06, 2014, 07:27:38 AM
Cellular automata may actually be difficult to analyze mathematically. Maybe like the game go you mentioned. And proving randomness is really tricky. My guess is that the R30 hash function acts well as a random oracle.
Except you already proved it bad by conventional cryptographic standards in https://bitcointalksearch.org/topic/m.7927510 where you find a collision. A good hash-function is not efficiently distinguishable from a random oracle (other than in the trivial sense where it has a compact implementation).

No! That's an old pre-hash version that absolutely sucks. It was before I had learned proper ways to do it. Here is the current version that uses a Merkle–Damgård construction and the Davies–Meyer compression function that, if I have done the implementation correctly, should be the real deal: http://jsfiddle.net/596XN/
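For reference, a minimal sketch of the textbook Merkle–Damgård chaining with a Davies–Meyer compression step, assuming the Rule 30 block transform is supplied as a generic E(key, state) parameter (names here are illustrative only, not the exact code in the fiddle):

Code:
import java.util.function.BinaryOperator;

// Minimal sketch of Merkle–Damgård chaining with a Davies–Meyer step.
// 'e' stands in for the block transform E(key, state); the Rule 30
// automaton keyed by the message block would play that role.
static byte[] merkleDamgardDaviesMeyer(byte[][] blocks, byte[] iv,
        BinaryOperator<byte[]> e) {
    byte[] h = iv.clone();               // chaining value H_0 = IV
    for (byte[] m : blocks) {
        byte[] enc = e.apply(m, h);      // E_m(H_{i-1}): message block used as the key
        for (int i = 0; i < h.length; i++) {
            h[i] ^= enc[i];              // Davies–Meyer: H_i = E_m(H_{i-1}) XOR H_{i-1}
        }
    }
    return h;                            // final chaining value is the digest
}

(The Java code quoted further down this page chains blocks slightly differently, XORing the previous digest into the block before hashing and into the result afterwards; the sketch above shows the textbook form for comparison.)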

Quote
Being difficult to analyze mathematically is pretty much the polar opposite of what we want in a cryptographic building block— there is just an extreme danger of a complete break lurking right behind an intellectual speed-bump.  It's often said that anyone can come up with a scheme that they can't crack themselves, but it's perhaps more interesting to note that almost anyone can come up with a scheme with a trivial weakness but which is hard enough to analyze that no one will find it until someone's money (or life!) is on the line.

Being easy to analyze mathematically is no guarantee that it will be secure. Take SHA-2, for example, which seems to be based on pretty straightforward plain math, yet someone may still discover a technique to break it. And Stephen Wolfram has pointed out that the math used in science today is only a tiny subset of the universe of possible constructs, a result more of tradition and legacy than of deliberate design.
staff
Activity: 4200
Merit: 8441
August 06, 2014, 05:27:50 AM
Cellular automata may actually be difficult to analyze mathematically. Maybe like the game go you mentioned. And proving randomness is really tricky. My guess is that the R30 hash function acts well as a random oracle.
Except you already proved it bad by conventional cryptographic standards in https://bitcointalksearch.org/topic/m.7927510 where you find a collision. A good hash-function is not efficiently distinguishable from a random oracle (other than in the trivial sense where it has a compact implementation).

I've struggled a bit watching along, worrying that this thread is a train-wreck of everything wrong with amateur cryptography: There are many reasonably well studied and designed functions in this space already (hash functions, and hashcash as proof of work) that benefit from decades of knowledge and experience in building constructs which are actually irreducible for the purposes they're designed for... This is precisely the kind of effort people are counseled against, and for good reason.

Being difficult to analyze mathematically is pretty much the polar opposite of what we want in a cryptographic building block— there is just an extreme danger of a complete break lurking right behind an intellectual speed-bump.  It's often said that anyone can come up with a scheme that they can't crack themselves, but it's perhaps more interesting to note that almost anyone can come up with a scheme with a trivial weakness but which is hard enough to analyze that no one will find it until someone's money (or life!) is on the line.

OTOH, I'm happy people have been having fun; and most of the other POW navel gazing I've seen is even more worthless... I've drafted and canned a couple messages for this thread, struggling with the difficulty in conveying a kind of underlying understanding of what responsible work looks like for cryptographic primitives— telling you to time travel back to 1994 and read sci.crypt with me is not likely to be helpful....

—  but when you get to the point of making claims that the function might actually be useful in cryptographic applications, I feel the need to speak up before another one of these incidents (which was caused by this thread, which went without adequate criticism) happens (read all the messages, not just my first response).

Have fun, by all means— but cryptography is a subtle and difficult science and art. Building a good system is an engineering discipline and having any confidence of security requires formalizing your work and putting in effort which is orders of magnitude more difficult than the basic implementation tinkering. If you don't find that kind of hard core analysis interesting yourself, then sadly it's likely never going to be done for your function. I'd say you were on the right path when you attacked and found a severe cryptographic flaw in your approach, but then something went wrong when you discarded it and continued like it never happened...  If BCT were a forum that I expected competent cryptographic review in, I might also say that something has gone wrong that it's taken this long to get a less than supportive message in this thread.

You were on to something with your attack— why not dig into it more, and instead of adding complexity (which might just be obfuscating weaknesses instead of removing them), just assume this is a fun learning experience and see how many other ways you can break the original construct or what other kinds of seemingly-okay-but-really-broken functions you can build in this space?  I think more than anything else doing cryptanalysis and finding holes in functions has increased my appreciation for the enormous challenge that doing good work in this space involves. (I suppose I could say that in one sense there is no science of cryptography except cryptanalysis).

But this is just my curmudgeonly view, offered for your consideration.
full member
Activity: 126
Merit: 100
August 05, 2014, 06:42:56 PM
OK, the automaton is wrapped around at a width of N cells. It might work. I read somewhere that the random number generator in Mathematica uses Rule 30 with a fixed width like that.

Yes, I think I remember that the book said that some of the rules appeared to be random, but they weren't with certain initial conditions or with different setups from the "canonical" one, but some (or one?) were resistant and I assume that's R30.
It's been almost four years since I read the book, so I could be wrong.

I'm worried however that cryptanalysis would become easier with a truncated cellular automaton. The image I posted from the book A New Kind of Science is about the full cellular automaton, not a truncated one I assume.

Yeah, he always uses the "canonical" one except for showing exceptions or just other ways in which you can organize the different algorithms.
But I don't understand your concern. There are many other models in the book besides cellular automata, and his point is that all of them have some equivalent of R30.
I thought you, like Wolfram, had chosen cellular automata precisely because they are simple to understand and analyze.

My hope was that maybe this could serve as a "provably secure hashing function", instead of just "nobody has broken it".
Maybe to prove that "provably secure hashing functions" (PSHF ?) are an impossibility, who knows.

IMO the most interesting lesson about the book is that unpredictable behavior (in the sense that it can only be predicted by simulating the whole process) can emerge from apparently simple deterministic rules.
Have you played go? Very simple rules, but no machine can beat professional players, who rely on a strong intuition grown through experience.
I believe these two facts are profoundly related.

That's why I've been planning for some time to evolve a neural network to play go and become 9 dan by playing against versions of itself while(success || eternity).
But I got an 8-to-6 job and learned about Bitcoin, so it ended up playing against the machine I wrote in my first year of computer science to play reversi/othello, showing graphs of how much it kicked its ass with generation on the Y coordinate (the "dumb" machine kicked my own ass without any problem or second thought).

Sorry for the long story, that's also why I love your cellular automata hashing function (CAHF?) idea.
I hope my little contribution is useful and I wish I had time to help with the code, but it doesn't look like you'll have problems making 0 and 255 neighbors and trying another version.

To finish, I also like the fact that you're interested in security for your custom hash function and not "anti-ASIC" nonsense.
 


Cellular automata may actually be difficult to analyze mathematically. Maybe like the game go you mentioned. And proving randomness is really tricky. My guess is that the R30 hash function acts well as a random oracle.

"In cryptography, a random oracle is an oracle (a theoretical black box) that responds to every unique query with a (truly) random response chosen uniformly from its output domain. If a query is repeated it responds the same way every time that query is submitted." -- http://en.wikipedia.org/wiki/Random_oracle
legendary
Activity: 1372
Merit: 1002
August 05, 2014, 05:58:14 PM
OK, the automaton is wrapped around at a width of N cells. It might work. I read somewhere that the random number generator in Mathematica uses Rule 30 with a fixed width like that.

Yes, I think I remember that the book said that some of the rules appeared to be random, but they weren't with certain initial conditions or with different setups from the "canonical" one, but some (or one?) were resistant and I assume that's R30.
It's been almost four years since I read the book, so I could be wrong.

I'm worried however that cryptanalysis would become easier with a truncated cellular automaton. The image I posted from the book A New Kind of Science is about the full cellular automaton, not a truncated one I assume.

Yeah, he always uses the "canonical" one except for showing exceptions or just other ways in which you can organize the different algorithms.
But I don't understand your concern. There are many other models in the book besides cellular automata, and his point is that all of them have some equivalent of R30.
I thought you, like Wolfram, had chosen cellular automata precisely because they are simple to understand and analyze.

My hope was that maybe this could serve as a "provably secure hashing function", instead of just "nobody has broken it".
Maybe to prove that "provably secure hashing functions" (PSHF ?) are an impossibility, who knows.

IMO the most interesting lesson about the book is that unpredictable behavior (in the sense that it can only be predicted by simulating the whole process) can emerge from apparently simple deterministic rules.
Have you played go? Very simple rules, but no machine can beat professional players, who rely on a strong intuition grown through experience.
I believe these two facts are profoundly related.

That's why I've been planning for some time to evolve a neural network to play go and become 9 dan by playing against versions of itself while(success || eternity).
But I got an 8-to-6 job and learned about Bitcoin, so it ended up playing against the machine I wrote in my first year of computer science to play reversi/othello, showing graphs of how much it kicked its ass with generation on the Y coordinate (the "dumb" machine kicked my own ass without any problem or second thought).

Sorry for the long story, that's also why I love your cellular automata hashing function (CAHF?) idea.
I hope my little contribution is useful and I wish I had time to help with the code, but it doesn't look like you'll have problems making 0 and 255 neighbors and trying another version.

To finish, I also like the fact that you're interested in security for your custom hash function and not "anti-ASIC" nonsense.
 
full member
Activity: 126
Merit: 100
August 05, 2014, 03:13:29 PM
You mean that the cellular automaton should be truncated to a width of 256 cells?

Yes, but it is important that 0 and 255 are neighbors, instead of 0 always having white on its left and 255 on its right.

I think much of the randomness will be lost then. My guess is it would produce a lot of hash collisions.

What is clear is that it would be much more efficient and easier to optimize.
There's only one way to find out who's right about randomness though...
As I said, I think it will be as good as the infinite-space one or even better. But, please, prove me wrong.


OK, the automaton is wrapped around at a width of N cells. It might work. I read somewhere that the random number generator in Mathematica uses Rule 30 with a fixed width like that. I'm worried however that cryptanalysis would become easier with a truncated cellular automaton. The image I posted from the book A New Kind of Science is about the full cellular automaton, not a truncated one I assume.
legendary
Activity: 1372
Merit: 1002
August 05, 2014, 03:06:33 PM
You mean that the cellular automaton should be truncated to a width of 256 cells?

Yes, but it is important that 0 and 255 are neighbors, instead of 0 always having white on its left and 255 on its right.

I think much of the randomness will be lost then. My guess is it would produce a lot of hash collisions.

What is clear is that it would be much more efficient and easier to optimize.
There's only one way to find out who's right about randomness though...
As I said, I think it will be as good as the infinite-space one or even better. But, please, prove me wrong.
full member
Activity: 126
Merit: 100
August 05, 2014, 03:00:00 PM

Maybe 0.5N is still not enough, but my main point is that you don't need to calculate the state of more than 256 cells per generation.
My prediction is that (for the same number of generations) with fixed size it will not only be faster to calculate, but you'll even get more randomness than with your infinite space approach.


You mean that the cellular automaton should be truncated to a width of 256 cells? I think much of the randomness will be lost then. My guess is it would produce a lot of hash collisions.
legendary
Activity: 1372
Merit: 1002
August 05, 2014, 02:34:33 PM
Sure, you could start with a SHA-256 value as initial condition instead of the full plain text message. I even proposed that somewhere earlier in this thread.

Yes, you answered Peter R when he pointed out that your message-to-256-bits mechanism was too expensive.
 
Then N would be 256 bits. With 128 generations (0.5N) you would get something like this: https://bitcointalksearch.org/topic/m.8045418

Notice how similar the hash values are.

Compare 0.5 N: http://jsfiddle.net/UQ6B7/

With 7N: http://jsfiddle.net/596XN/

I have experimented with 2D cellular automata and they seem to be good at producing random results. The reason I chose Rule 30 is that Stephen Wolfram has done extensive research on it and showed that it produces good random numbers, and that cryptanalysis is likely to be difficult when every other row of the center column is sampled. 2D automata may be at least as good, but that would require a lot of work to find out. It's easier to build on Wolfram's research.

But that's with your infinite space approach, I think you're missing my main point:

Those 256 bits obtained are the initial state (the first row).
You apply R30 N times (N generations) using a constant row size of 256 cells per generation, treating cell 0 and cell 255 as neighbors (so no pyramid here, just a 256 cells-wide column all the time).
With 128 generations all cells should have influenced all other cells.

Maybe 0.5N is still not enough, but my main point is that you don't need to calculate the state of more than 256 cells per generation.
My prediction is that (for the same number of generations) with fixed size it will not only be faster to calculate, but you'll even get more randomness than with your infinite space approach.


full member
Activity: 126
Merit: 100
August 05, 2014, 02:21:15 PM
Reading your answer and more of the thread, I'm not sure what you're doing anymore. I thought the initial state was the data...

My recipe is the following:

Apply SHA256 once, you get 256 bits.
Those 256 bits obtained are the initial state (the first row).
You apply R30 N times (N generations) using a constant row size of 256 cells per generation, treating cell 0 and cell 255 as neighbors (so no pyramid here, just a 256 cells-wide column all the time).
With 128 generations all cells should have influenced all other cells.
So this is constant space and can be optimized to constant time per generation, thus constant time per N generations.
If we assume that at least N = S/2 generations are required, where S is the size of the cell array, then it is O(S^2) time and O(S) space, but here S = 256 is constant.

Can any of you prove that my approach doesn't produce enough randomness?

You see, Wolfram always starts with a single black cell as the initial condition and allows infinite space for simplicity, but that's not a requirement for cellular automata.
They don't even need to be a 2D universe (1D space + time); you could perfectly well initialize a 16x16 binary torus (a 2D space where opposite edges are neighbors) with your 256 bits and use a 3D rule (2D space + time) with far fewer generations. But first you have to find out which rule is equivalent in randomness to Rule 30, and here instead of 2^(2^3) = 256 possible automata there are 2^(2^9) to analyze!


Sure, you could start with a SHA-256 value as initial condition instead of the full plain text message. I even proposed that somewhere earlier in this thread. Then N would be 256 bits. With 128 generations (0.5N) you would get something like this: https://bitcointalksearch.org/topic/m.8045418

Notice how similar the hash values are.

Compare 0.5 N: http://jsfiddle.net/UQ6B7/

With 7N: http://jsfiddle.net/596XN/

I have experimented with 2D cellular automata and they seem to be good at producing random results. The reason I chose Rule 30 is that Stephen Wolfram has done extensive research on it and showed that it produces good random numbers, and that cryptanalysis is likely to be difficult when every other row of the center column is sampled. 2D automata may be at least as good, but that would require a lot of work to find out. It's easier to build on Wolfram's research.
legendary
Activity: 1372
Merit: 1002
August 05, 2014, 01:51:29 PM
Reading your answer and more of the thread, I'm not sure what you're doing anymore. I thought the initial state was the data...

My recipe is the following:

Apply SHA256 once, you get 256 bits.
Those 256 bits obtained are the initial state (the first row).
You apply R30 N times (N generations) using a constant row size of 256 cells per generation, treating cell 0 and cell 255 as neighbors (so no pyramid here, just a 256 cells-wide column all the time).
With 128 generations all cells should have influenced all other cells.
So this is constant space and can be optimized to constant time per generation, thus constant time per N generations.
If we assume that at least N = S/2 generations are required, where S is the size of the cell array, then it is O(S^2) time and O(S) space, but here S = 256 is constant.
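A minimal sketch of that fixed-width step, assuming a plain 256-cell boolean row with wrap-around neighbors (cell 0 and cell 255 adjacent); the method name is illustrative only:

Code:
// One Rule 30 generation on a fixed-width row with wrap-around.
// Rule 30: new[i] = left XOR (center OR right).
static boolean[] nextGeneration(boolean[] row) {
    int n = row.length;                        // e.g. 256 cells
    boolean[] next = new boolean[n];
    for (int i = 0; i < n; i++) {
        boolean left   = row[(i + n - 1) % n]; // wraps: cell 255 is to the left of cell 0
        boolean center = row[i];
        boolean right  = row[(i + 1) % n];     // wraps: cell 0 is to the right of cell 255
        next[i] = left ^ (center || right);
    }
    return next;
}

Iterating this 128 or more times and then reading out the row (or sampling a column) would give the constant-space variant described in the recipe above.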

Can any of you prove that my approach doesn't produce enough randomness?

You see, Wolfram always starts with a single black cell as the initial condition and allows infinite space for simplicity, but that's not a requirement for cellular automata.
They don't even need to be a 2D universe (1D space + time); you could perfectly well initialize a 16x16 binary torus (a 2D space where opposite edges are neighbors) with your 256 bits and use a 3D rule (2D space + time) with far fewer generations. But first you have to find out which rule is equivalent in randomness to Rule 30, and here instead of 2^(2^3) = 256 possible automata there are 2^(2^9) to analyze!
full member
Activity: 126
Merit: 100
August 05, 2014, 11:43:07 AM
Interesting thread, I love cellular automata (well, most bio-inspired algorithms).
Sorry if I missed something because I haven't fully read it yet.

First, SHA256 + R30 (on the resulting 256 bits) should remove some of Peter R's performance concerns, but not all.

Why don't you just calculate the 256-bit column in the middle?
Cell 0 can be the cell to the right of cell 255 (and therefore cell 255 is the one to the left of cell 0).
I don't think this would remove randomness, and if it does it can probably be compensated for with more generations.
But having a constant-size cell array would make it much simpler to optimize.
For example, using SSE2 you can probably implement this with shift and bitwise logic operations alone, holding the whole 256-cell row in a pair of 128-bit XMM registers!

So yeah, maybe SHA256 + R30 could become a hardfork alternative to SHA256d in case it somehow gets broken.
Even for an anti-centralization hardfork in the event that a single entity is so stupid as to consistently control 30%+ of the mining hardware or something like that.


I was wrong about the 2.5N value and grau measured that at least 7N is needed. See for example: https://bitcointalksearch.org/topic/m.8047993

The values in the center column depend on all the values in the initial condition after 7N generations where N is the length of the initial condition in number of bits. It turns out that the rightmost value (bit) of a worst case initial condition takes a nonlinear walk first to the right and then to the left in the cellular automaton. And after 7N steps (generations) of the cellular automaton the rightmost value has reached the center column.

Without the 7N initial steps the hash values would be far from random for similar messages.
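One way to test that kind of claim empirically is to flip the rightmost input bit and count how many generations pass before the center cell of the untruncated automaton first changes. A rough, self-contained sketch (plain boolean arrays, edges kept permanently white, names illustrative):

Code:
// Measure how many generations it takes for a flip of the rightmost
// input bit to first change the center cell (a simple diffusion check).
static int generationsUntilCenterDiffers(boolean[] key) {
    int n = key.length;
    int maxGen = 16 * n;                       // generous upper bound to search
    int width = n + 2 * (maxGen + 2);          // room for the light cone to grow
    boolean[] a = embed(key, width);
    boolean[] b = embed(key, width);
    b[(width - n) / 2 + n - 1] ^= true;        // flip the rightmost key bit in copy b
    int center = width / 2;
    for (int gen = 1; gen <= maxGen; gen++) {
        a = step(a);
        b = step(b);
        if (a[center] != b[center]) {
            return gen;                        // first generation where the center differs
        }
    }
    return -1;                                 // not reached within the bound
}

static boolean[] embed(boolean[] key, int width) {
    boolean[] row = new boolean[width];
    System.arraycopy(key, 0, row, (width - key.length) / 2, key.length);
    return row;
}

static boolean[] step(boolean[] row) {         // Rule 30, edge cells stay white
    boolean[] next = new boolean[row.length];
    for (int i = 1; i < row.length - 1; i++) {
        next[i] = row[i - 1] ^ (row[i] || row[i + 1]);
    }
    return next;
}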
legendary
Activity: 1372
Merit: 1002
August 05, 2014, 07:29:51 AM
Interesting thread, I love cellular automata (well, most bio-inspired algorithms).
Sorry if I missed something because I haven't fully read it yet.

First, SHA256 + R30 (on the resulting 256 bits) should remove some of Peter R's performance concerns, but not all.

Here is a version of the Rule 30 hash function that skips rows 2.5 times the number of bits in the initial condition: http://jsfiddle.net/d3NS6/

The reason is to ensure that the bits become influenced enough as a protection against cryptanalysis. The image shows that 2.5 times the message length in bits is enough to make the bits to the far left and right influence the cells back into the center column after N rows.



The next image shows again how protection against cryptanalysis is achieved:



Why don't you just calculate the 256-bit column in the middle?
Cell 0 can be the cell to the right of cell 255 (and therefore cell 255 is the one to the left of cell 0).
I don't think this would remove randomness, and if it does it can probably be compensated for with more generations.
But having a constant-size cell array would make it much simpler to optimize.
For example, using SSE2 you can probably implement this with shift and bitwise logic operations alone, holding the whole 256-cell row in a pair of 128-bit XMM registers!
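The same word-parallel idea works even without SIMD intrinsics; here is a hedged sketch of one generation over a 256-cell row packed into four 64-bit longs with wrap-around, which is roughly what an SSE2/AVX version would do a couple of words at a time (the bit layout chosen here is illustrative):

Code:
// One Rule 30 generation on 256 cells packed into four longs, with wrap-around.
// Layout: bit 63 of w[0] is cell 0 (leftmost) ... bit 0 of w[3] is cell 255.
static long[] step256(long[] w) {
    long[] next = new long[4];
    for (int i = 0; i < 4; i++) {
        long carryFromLeftWord  = w[(i + 3) & 3] << 63;  // lowest cell of the word to the left
        long carryFromRightWord = w[(i + 1) & 3] >>> 63; // highest cell of the word to the right
        long left  = (w[i] >>> 1) | carryFromLeftWord;   // each position sees its left neighbor
        long right = (w[i] << 1)  | carryFromRightWord;  // each position sees its right neighbor
        next[i] = left ^ (w[i] | right);                 // Rule 30: left XOR (center OR right)
    }
    return next;
}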

So yeah, maybe SHA256 + R30 could become a hardfork alternative to SHA256d in case it somehow gets broken.
Even for an anti-centralization hardfork in the event that a single entity is so stupid as to consistently control 30%+ of the mining hardware or something like that.
full member
Activity: 126
Merit: 100
July 30, 2014, 01:50:57 PM
I created a GitHub project for R30: https://github.com/AndersLindman/R30-hash-function
full member
Activity: 126
Merit: 100
July 29, 2014, 01:17:54 PM
I noticed that the rightmost bit in the initial condition needs to be 1. If all the bits are zero on the right side of the initial condition then the hash values become nonrandom, with collisions for similar messages. The whole left side of the Rule 30 cellular automaton is nonrandom, it seems.

This observation is a key part of the paper I linked on the first page.  You can use that non-randomness to reverse the "right adjacent" states effectively (i.e. taking 2^17 possibilities down to 32 or so).  After messing with it for some time, you can see the importance of the structure left in the CA diagrams as they iterate.  You ideally don't want any recognizable continuities.

One of the papers you linked (with the omega-flip network) overcomes this by alternating with a different rule and performing a permutation between subsequent runs.  That paper also links to the correct testing framework you'd want to use if you intend to go further.  (NIST's STS, apply the strict avalanche criterion, etc.)

I'm a huge fan of CAs and Wolfram's work from the 80's but I was disappointed at the non-peer reviewed approach he took with NKS.  This is one area where his intuition is superseded by decades of rigorous work by cryptography experts.


This image shows how the left side is easy to determine when the bits are sampled too tightly:



Notice that when every other row of the center column is sampled, the left side becomes difficult to determine. That's the sample method I use in my hash function.
sr. member
Activity: 333
Merit: 250
July 29, 2014, 12:49:54 PM
I noticed that the rightmost bit in the initial condition needs to be 1. If all the bits are zero on the right side of the initial condition then the hash values become nonrandom, with collisions for similar messages. The whole left side of the Rule 30 cellular automaton is nonrandom, it seems.

This observation is a key part of the paper I linked on the first page.  You can use that non-randomness to reverse the "right adjacent" states effectively (i.e. taking 2^17 possibilities down to 32 or so).  After messing with it for some time, you can see the importance of the structure left in the CA diagrams as they iterate.  You ideally don't want any recognizable continuities.

One of the papers you linked (with the omega-flip network) overcomes this by alternating with a different rule and performing a permutation between subsequent runs.  That paper also links to the correct testing framework you'd want to use if you intend to go further.  (NIST's STS, apply the strict avalanche criterion, etc.)

I'm a huge fan of CAs and Wolfram's work from the 80's but I was disappointed at the non-peer reviewed approach he took with NKS.  This is one area where his intuition is superseded by decades of rigorous work by cryptography experts.
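For anyone who wants to take that advice, a rough sketch of a strict-avalanche-criterion check for an arbitrary hash function passed in as a parameter (the trial count and acceptance threshold are up to you; this is only an informal check, not the NIST STS):

Code:
import java.util.Random;
import java.util.function.Function;

// Strict avalanche criterion: flipping any single input bit should flip
// each output bit with probability close to 1/2. Returns the observed rate.
static double avalancheRate(Function<byte[], byte[]> hash, int inputBytes, int trials) {
    Random rnd = new Random(42);
    long flipped = 0, total = 0;
    for (int t = 0; t < trials; t++) {
        byte[] msg = new byte[inputBytes];
        rnd.nextBytes(msg);
        byte[] h1 = hash.apply(msg);
        int bit = rnd.nextInt(inputBytes * 8);        // flip one random input bit
        msg[bit >>> 3] ^= (byte) (1 << (bit & 7));
        byte[] h2 = hash.apply(msg);
        for (int i = 0; i < h1.length; i++) {
            flipped += Integer.bitCount((h1[i] ^ h2[i]) & 0xFF);
            total += 8;
        }
    }
    return (double) flipped / total;                  // ideally close to 0.5
}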
full member
Activity: 126
Merit: 100
July 29, 2014, 11:44:21 AM
I noticed that the rightmost bit in the initial condition needs to be 1. If all the bits are zero on the right side of the initial condition then the hash values become nonrandom, with collisions for similar messages. The whole left side of the Rule 30 cellular automaton is nonrandom, it seems.
full member
Activity: 126
Merit: 100
July 29, 2014, 09:21:50 AM
R30 with the Merkle–Damgård construction:

Code:
    private final static int MAX_HASH_BYTES = 32;
    private final static int BLOCK_SIZE_BYTES = MAX_HASH_BYTES;

    public final byte[] digest(InputStream is) throws IOException {
        byte[] digest = null;
        byte[] block = new byte[BLOCK_SIZE_BYTES];
        int bytesRead = 0;
        long totalBytesRead = 0;
        while (bytesRead != -1) {
            // Read the next 32-byte message block.
            bytesRead = 0;
            int blockBytesRead = 0;
            while (blockBytesRead < BLOCK_SIZE_BYTES && bytesRead != -1) {
                bytesRead = is.read(block, blockBytesRead,
                        BLOCK_SIZE_BYTES - blockBytesRead);
                if (bytesRead > 0) {
                    blockBytesRead += bytesRead;
                }
            }
            totalBytesRead += blockBytesRead;
            if (blockBytesRead < BLOCK_SIZE_BYTES) {
                // Zero-pad the final block and, if there is room,
                // encode the total message length in its last 8 bytes.
                for (int i = blockBytesRead; i < BLOCK_SIZE_BYTES; i++) {
                    block[i] = 0;
                }
                if (BLOCK_SIZE_BYTES - blockBytesRead >= 8) {
                    for (int i = 1, j = 0; i < 9; i++, j += 8) {
                        block[BLOCK_SIZE_BYTES - i] =
                                (byte) (totalBytesRead >>> j);
                    }
                } else {
                    // Not enough room for the length; emit one more block.
                    bytesRead = 0;
                }
            }
            // Chain: whiten the block with the previous digest, hash it,
            // and XOR the result into the running digest.
            if (digest != null) {
                for (int i = 0; i < BLOCK_SIZE_BYTES; i++) {
                    block[i] ^= digest[i];
                }
            } else {
                digest = new byte[BLOCK_SIZE_BYTES];
            }
            byte[] nextDigest = digestBlock(block);
            for (int i = 0; i < BLOCK_SIZE_BYTES; i++) {
                digest[i] ^= nextDigest[i];
            }
        }
        return digest;
    }

    private final byte[] digestBlock(byte[] block) {
        int maxKeyBytes = BLOCK_SIZE_BYTES + 1;
        int maxKeyBits = maxKeyBytes * 8;
        byte[] hash = new byte[MAX_HASH_BYTES];
        byte[] key = new byte[maxKeyBytes];
        for (int i = 0; i < BLOCK_SIZE_BYTES; i++) {
            key[i] = block[i];
        }
        key[BLOCK_SIZE_BYTES] = 0x01;
        int maxHashBits = MAX_HASH_BYTES * 8;
        // Skip 7 * keyBits generations before sampling so that every key bit
        // has had time to influence the center column.
        int skipRows = maxKeyBits * 7;
        int maxCells = 2;
        maxCells += maxKeyBits;
        maxCells += skipRows;
        maxCells += maxHashBits * 2;
        int maxLongs = (maxCells + 63) >>> 6;
        maxCells = maxLongs << 6;
        int cellsMid = maxCells / 2;
        long[] cells = new long[maxLongs];
        // Place the key bits in the middle of the row, one cell per bit.
        int keyStart = (maxCells - maxKeyBits) / 2;
        for (int i = 0; i < key.length; i++) {
            int keyChar = key[i];
            int bitPos = 0x80;
            for (int j = 0; j < 8; j++) {
                long b = (keyChar & bitPos) >>> (7 - j);
                int bitIndex = keyStart + i * 8 + j;
                cells[bitIndex >>> 6] |= b << (63 - (bitIndex % 64));
                bitPos >>>= 1;
            }
        }
        int bitCount = 0;
        int mid = 0;
        int longMid = maxLongs / 2;
        int longMidShift = longMid * 2 == maxLongs ? 63 : 31;
        int maxRow = skipRows + maxHashBits * 2;
        for (int row = 0; row < maxRow; row++) {
            // Width of the band worth updating: it grows with the light cone
            // of the key and shrinks again near the end, when far-away cells
            // can no longer reach the sampled center column in time.
            int doubleRow = row * 2;
            int calcWidth = doubleRow;
            if (calcWidth > maxRow - 2) {
                calcWidth = maxRow - ((doubleRow) % maxRow) + 2;
            } else {
                calcWidth += maxKeyBits;
            }
            int halfWidth = calcWidth / 2 + 2;
            int start = (cellsMid - halfWidth) >>> 6;
            int end = (cellsMid + halfWidth + 63) >>> 6;
            mid = (int) ((cells[longMid] >>> longMidShift) & 0x01);
            // One Rule 30 generation, 64 cells at a time:
            // new = left XOR (center OR right).
            long carryLeft = 0L;
            for (int i = start; i < end; i++) {
                long l = cells[i];
                long carryRight = i < maxLongs - 1 ?
                        cells[i + 1] >>> 63 : 0;
                long cellRight = (l << 1) | carryRight;
                long cellLeft = (l >>> 1) | carryLeft;
                carryLeft = l << 63;
                cells[i] = cellLeft ^ (l | cellRight);
            }
            if (row < skipRows) {
                continue;
            }
            // Sample the center cell of every other row as an output bit.
            if (row % 2 == 1) {
                if (mid == 1) {
                    int bufPos = bitCount >>> 3;
                    hash[bufPos] |= 1 << (7 - (bitCount % 8));
                }
                bitCount++;
            }
        }
        return hash;
    }
full member
Activity: 126
Merit: 100
July 28, 2014, 10:47:37 PM
Remove message length encoding. That's a foreign concept to a hash function.

I came to think of a way to remove the length bits: limit the initial condition to a fixed length of 256 bits. If the message is larger than 255 bits, the Merkle–Damgård construction is used by turning the rest of the message into 256-bit blocks. The preprocessing is then only to append the bit value 1 and pad the message with zeros to the nearest 256-bit boundary. If a message is shorter than 255 bits, it gets padded with the one bit and zeros into a single 256-bit block.

And instead of dealing with bit resolution, byte resolution can be used for convenience. Then the byte 0x01 is appended to the message, followed by padding with zero bytes 0x00 to a total of 32 bytes per block.
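A small sketch of that byte-level padding rule, assuming a hypothetical padToBlocks helper (append 0x01, zero-pad to a 32-byte boundary, split into blocks):

Code:
// Pad a message into whole 32-byte blocks: append 0x01, then 0x00 bytes.
static byte[][] padToBlocks(byte[] message) {
    final int BLOCK = 32;
    int padded = ((message.length + 1 + BLOCK - 1) / BLOCK) * BLOCK; // round up after the 0x01 byte
    byte[] buf = new byte[padded];
    System.arraycopy(message, 0, buf, 0, message.length);
    buf[message.length] = 0x01;               // delimiter byte; the rest stays 0x00
    byte[][] blocks = new byte[padded / BLOCK][BLOCK];
    for (int i = 0; i < blocks.length; i++) {
        System.arraycopy(buf, i * BLOCK, blocks[i], 0, BLOCK);
    }
    return blocks;
}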
legendary
Activity: 1162
Merit: 1007
July 28, 2014, 03:26:49 PM
I don't believe it's a requirement to select the hash bits from the central column ...
It is apparent that chaos converges to order toward both left and right, so intuition says to stay in the middle if you want maximum entropy.

But if it wasn't a maximal entropy process, then it should be possible to peel out the predictable part in order to short-cut the computation.  In other words, the computation wouldn't be irreducible.  

For irreducibility to be a useful concept, doesn't it have to be binary: either it's reducible or it's not?  The left side of R30 is reducible along with the far right edge (I think), and the output is reducible for a certain number of rows for an enumerable set of initial conditions, but excluding these, I think the theory is that the rest is irreducible.  So I think if you can prove that the column to the right of center has less entropy than the center column, then you've also proven that R30 is not irreducible (something that AFAIK no one has been able to do).
full member
Activity: 126
Merit: 100
July 28, 2014, 12:26:17 PM
Code:
R30(nonce + H(blockheader)) < target

Where:
H = a secure cryptographic hashing function (e.g. SHA-2)
Blockheader = arbitrary-length block header (excluding the nonce)
Nonce = 64 bits (don't make Satoshi's mistake of using a nonce that's "too small")


I don't know much about Bitcoin, but the nonce could start with 64 bits and then be expanded in the future if needed! R30 would work with the total length of nonce + H(blockheader) being variable.
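As a rough illustration of that proof-of-work check, a sketch that uses Java's standard SHA-256 and a hypothetical r30() call, treating the target as an unsigned 256-bit number:

Code:
import java.math.BigInteger;
import java.security.MessageDigest;

// Proof-of-work test: R30(nonce || H(blockheader)) < target.
// r30() is a stand-in for the Rule 30 hash implementation.
static boolean checkPow(byte[] blockHeader, long nonce, BigInteger target) throws Exception {
    byte[] h = MessageDigest.getInstance("SHA-256").digest(blockHeader);
    byte[] input = new byte[8 + h.length];
    for (int i = 0; i < 8; i++) {
        input[i] = (byte) (nonce >>> (56 - 8 * i));   // 64-bit nonce, big-endian
    }
    System.arraycopy(h, 0, input, 8, h.length);
    byte[] pow = r30(input);                          // hypothetical R30 hash call
    return new BigInteger(1, pow).compareTo(target) < 0;
}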