Mining protocol extension: noncerange

Luke-Jr

legendary

Activity: 2576

Merit: 1186

Ok, this is really ugly, so hopefully JSON-RPC dies soon...

noncerange is provided to the miner in the usual big endian format: 100000001fffffff for all nonces from 0x10000000 to 0x1fffffff; this part is sane at least

However, since SHA256 processes in big-endian integers, while the rest of our block data is given in little-endian, this part gets hairy: the nonce will always come out in the opposite endian as the rest of the data. Since solutions are given as 32-bit big endian chunks, this means that in the solution, your nonce will be written as little endian. So for our example, dddddd1d is acceptable, and 1ddddddd is not.

On the bright side, this means you can do a simple iterative for loop from to and just plug the value into the nonce index in the SHA256 integer data regardless of what endian your platform is.

Luke-Jr

legendary

Activity: 2576

Merit: 1186

After pondering this problem further, I have come to the conclusion that despite its ugliness, this extension should follow the same behaviour of the other response data. That is, the nonces should be treated as little endian (as specified for mining hashes), but sent over the wire as big endian. That means unfortunately, a range from 0x10203040 to 0x50607080 is represented as "4030201080706050", and a winning nonce of 0x20304050 is sent in the solution as "50403020".

Edit: No idea what I was thinking last night. This is nonsense. Big endian means 0x10203040 is encoded "10203040" of course. I need to chat with a GPU miner author :|

kjj

legendary

Activity: 1302

Merit: 1026

Quote from: Luke-Jr on July 29, 2011, 01:46:00 AM

Quote from: kjj on July 29, 2011, 01:41:45 AM

I don't think it will be much of a performance hit to make the mining client keep their counter in a non-native byte ordering. If their native byte ordering is messed up, they will just have to use an 8 bit counter for the inner loop, and have a tiny bit of logic for incrementing and checking the rest when it overflows every 256 hashes.

Even if the conversion from network order to native order took as much work as a full (double) SHA256 hash, the performance it would be under 0.4%. In practice, it will be MUCH less overhead. On an Intel, it won't even take an extra register.

Just make sure that all of your range boundaries are multiples of 256.

Except that AIUI, GPUs don't iterate. They run them all at once...

No, they still iterate. They just use larger increments. Unless there is a chip out there with 4 billion ALUs that I don't know about.

Luke-Jr

legendary

Activity: 2576

Merit: 1186

Quote from: kjj on July 29, 2011, 01:41:45 AM

I don't think it will be much of a performance hit to make the mining client keep their counter in a non-native byte ordering. If their native byte ordering is messed up, they will just have to use an 8 bit counter for the inner loop, and have a tiny bit of logic for incrementing and checking the rest when it overflows every 256 hashes.

Even if the conversion from network order to native order took as much work as a full (double) SHA256 hash, the performance it would be under 0.4%. In practice, it will be MUCH less overhead. On an Intel, it won't even take an extra register.

Just make sure that all of your range boundaries are multiples of 256.

Except that AIUI, GPUs don't iterate. They run them all at once...

kjj

legendary

Activity: 1302

Merit: 1026

I don't think it will be much of a performance hit to make the mining client keep their counter in a non-native byte ordering. If their native byte ordering is messed up, they will just have to use an 8 bit counter for the inner loop, and have a tiny bit of logic for incrementing and checking the rest when it overflows every 256 hashes.

Even if the conversion from network order to native order took as much work as a full (double) SHA256 hash, the performance it would be under 0.4%. In practice, it will be MUCH less overhead. On an Intel, it won't even take an extra register.

Just make sure that all of your range boundaries are multiples of 256.

Luke-Jr

legendary

Activity: 2576

Merit: 1186

Quote from: Schleicher on July 29, 2011, 12:34:32 AM

Quote from: Luke-Jr on July 29, 2011, 12:30:40 AM

0 to 1000000 is very different in little endian and big endian. Since SHA256 is on the byte level, the hash for 1000000 in little endian and big endian will be different.

Sure. That's why you convert all values to the proper endianness before hashing.

And what kind of overhead will that have? I'm under the impression it's pretty bad.

Schleicher

hero member

Activity: 675

Merit: 514

Quote from: Luke-Jr on July 29, 2011, 12:30:40 AM

0 to 1000000 is very different in little endian and big endian. Since SHA256 is on the byte level, the hash for 1000000 in little endian and big endian will be different.

Sure. That's why you convert all values to the proper endianness before hashing.

Luke-Jr

legendary

Activity: 2576

Merit: 1186

Quote from: Schleicher on July 29, 2011, 12:23:29 AM

Quote from: Luke-Jr on July 29, 2011, 12:05:15 AM

Quote from: Schleicher on July 29, 2011, 12:02:29 AM

You can flip endianness while converting the hex values

But then your hashes will all be wrong...?

We only have to agree on a endianness for the communication between server and client. Big or little doesn't really matter. The value has to be converted only once for every getwork request.
Or are we misunderstanding you?

0 to 1000000 is very different in little endian and big endian. Since SHA256 is on the byte level, the hash for 1000000 in little endian and big endian will be different.

Schleicher

hero member

Activity: 675

Merit: 514

Quote from: Luke-Jr on July 29, 2011, 12:05:15 AM

Quote from: Schleicher on July 29, 2011, 12:02:29 AM

You can flip endianness while converting the hex values

But then your hashes will all be wrong...?

We only have to agree on a endianness for the communication between server and client. Big or little doesn't really matter. The value has to be converted only once for every getwork request.
Or are we misunderstanding you?

1bitc0inplz

member

Activity: 112

Merit: 10

In regards to this being an HTTP header....

In order for this to be truly successful, the pool would need to know which miners supported such a feature. The reason for this, is the pool needs to know which miners to send the same work to multiple times (differing only by nonce range). If the pool accidentally sent the same work to multiple miners and they didn't support this feature, the pool would just be wasting somebody's resources.

At the very least, the miner needs to set an HTTP header on the getwork request to inform the server that it supports such a feature.

Luke-Jr

legendary

Activity: 2576

Merit: 1186

Quote from: Schleicher on July 29, 2011, 12:02:29 AM

You can flip endianness while converting the hex values

But then your hashes will all be wrong...?

Schleicher

hero member

Activity: 675

Merit: 514

You can flip endianness while converting the hex values

Luke-Jr

legendary

Activity: 2576

Merit: 1186

Quote from: -ck on July 28, 2011, 08:06:35 PM

Keep the delivery of the data consistent by being in big endian. It should be up to the coders to flip the endianness before feeding it to openCL.

You can't just flip endians when you're hashing.

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Keep the delivery of the data consistent by being in big endian. It should be up to the coders to flip the endianness before feeding it to openCL.

Luke-Jr

legendary

Activity: 2576

Merit: 1186

In the attempt to implement noncerange, I came across a dilemma: while all the existing share data is in big-endian blocks of 32 bits (with larger numbers being little-endian otherwise), it seems OpenCL miners at least choose their range in little-endian. Since nonce is only 32-bits, and all 32-bit fields are standardized on big-endian (not to mention it being the standard network endian), I'd much prefer to maintain the actual nonce field as big-endian. Can someone familiar with OpenCL and/or other mining platforms (SSE2, etc) tell us whether endian will hurt performance, and if so which endian is needed to maintain the current performance? Miner implementations welcome also! (gMinor has broken noncerange support!)

ah42

newbie

Activity: 14

Merit: 0

I think it actually makes good sense for the client to use a header: it is often a common practice to advertise capabilities in HTTP headers (or even e-mail headers...)

However, it might make sense for the server to respond with the noncemask in the JSON content, as the client has already advertised its capability to handle such content.

Luke-Jr

legendary

Activity: 2576

Merit: 1186

No extension is sane if it breaks the specification for the low-level protocol.

However, one real problem found is that a mask like ff00ff00 would be a pain to implement.

Therefore, I am proposing replacing noncemask with noncerange, with a value of two 32-bit integers (encoded as hex) specifying the initial value, and last acceptable value. So for example, to give a miner the first 29 bits of nonce space (up to 536 MH/s):

Code:

"noncerange": "000000001fffffff"

kjj

legendary

Activity: 1302

Merit: 1026

Yeah, just put it in the JSON request. I don't think anyone anywhere is doing validation that would reject it because of extra fields.

Leave the meta-channel for information/negotiation about the channel.

gnukix

newbie

Activity: 13

Merit: 0

Quote from: Luke-Jr on June 29, 2011, 09:31:04 AM

I propose that miners which support this extension send a "X-Mining-Extensions: noncemask" header when requesting work. If the server supports the extension, it should respond with the same header. For other extensions, "X-Mining-Extensions" should be a space-delimited list of elements which can have parameters after an "=" character.

When this extension is active, the server should send an additional field in the JSON-RPC reply: "noncemask" is a hexadecimal-encoded mask of the nonce bits a miner is allowed to change in the header.

For example, if the server sends

Code:

"noncemask" : "70000000"

then the miner should change only the last 29 bits of the nonce.

This allows the server to give the same work to multiple miners with different nonce ranges for each to scan. Combined with X-Roll-Ntime, this can greatly improve efficiency of the work-generating component.

In addition, miners may send a "X-Mining-Hashrate" header set to their average hashrate (in hashes per second) which the upstream server might use to choose a proper sized noncemask.

Thoughts?

I don't understand why this should be a HTTP header and muddy things up with side-band information. Why not just put it in-band, in the JSON request itself? The current clients and servers are all looking for specific fields, so add a field there? Then the parsing is very well defined instead of ad-hoc.

johncarl

newbie

Activity: 13

Merit: 0

Quote from: Luke-Jr on June 29, 2011, 09:31:04 AM

Thoughts?

Would you see this replacing a stale work timeout?

Topic: Mining protocol extension: noncerange (Read 4571 times)