Pages:
Author

Topic: Looking for PoW recommendations. (Read 2573 times)

legendary
Activity: 1400
Merit: 1050
April 02, 2015, 08:35:59 PM
#44
Scientific calculations, floating point and limited range search just cry to be combined. May be there are real-world problems of some value for science or business that can be turned into good PoW? Protein folding (by undisclosed algorithm) is not PoW; searching for prime number sequences has more value in software developed than in calculations done, what else could be used?
Feynman diagrams... (well  Grin from my point of view a little less stupid than prime numbers, but not necessary by far though...)

ps: a miner is just a big basic Monte Carlo generator, so any kind of simulation could be used... but it is unclear to me if there is a benefit for crypto or for science (who already has at its disposal such tool...).

The problem in implementing other stuff than cryptographic functions, mean it is breakable and it is possible to cheat by using such or such approximation (the danger in using floating point calculation and complex mathematical formula)
hero member
Activity: 524
Merit: 500
April 02, 2015, 02:51:10 PM
#43
Scientific calculations, floating point and limited range search just cry to be combined. May be there are real-world problems of some value for science or business that can be turned into good PoW? Protein folding (by undisclosed algorithm) is not PoW; searching for prime number sequences has more value in software developed than in calculations done, what else could be used?
legendary
Activity: 924
Merit: 1132
April 02, 2015, 11:14:52 AM
#42

Draft idea: two values that differ only in one bit will be separated by Hamming distance of 1. Miner could backet hashes by Hamming weight, possibly discarding hashes with low and high weight, then generate Bloom filter for even backets and perform search for odd ones in two neighbor (even, +-1) backets.


Aaaand, props to smolen for coming up with the idea that would in fact put that task in reach of a regular GPU. 

If you start with an exemplar given string of bits, make a hash table, and then every time you have a collision you evict whichever one is furthest in Hamming distance from your example string, then after a while you will collect a bunch of strings clustered very closely in Hamming space and stop updating your table very often - erasing the memory contention constraint and allowing all those parallel processors to hunt, without even needing to consult the table, for strings within a given Hamming distance of the original example.  And when they find one, they light it up; the table of things clustered in Hamming space around that original example string will yield dozens, or even hundreds, of near-triplets that include the new string.

With the parallelism and memory bandwidth advantages, they'd rapidly reach a state where they're able to find near-triplets very efficiently indeed. 

If you do the same trick with the 64G memory structure, you'll be able to collect a much larger Hamming radius and light it up with a much greater fraction of new inputs, but you can't parallelize the search to more than the 8 cores or whatever on your CPU and, because the searching threads don't even need to consult the table if they're looking for things within a small Hamming radius of an example, the GPU can.

So now I get to try to come up with some weird variation on the Cuckoo cycle. 
hero member
Activity: 524
Merit: 500
April 02, 2015, 10:25:49 AM
#41
It might work on a quad channel lga 2011... I think you would exceed bandwidth here under normal circumstances. If I am wrong it is because I'm stupid.
No idea how multiple memory buses are handled by OSes and how to explicitly balance a load between it, so far no coin developer rolled out such task Wink
newbie
Activity: 50
Merit: 0
April 01, 2015, 07:50:56 PM
#40
It might work on a quad channel lga 2011... I think you would exceed bandwidth here under normal circumstances. If I am wrong it is because I'm stupid.
hero member
Activity: 524
Merit: 500
April 01, 2015, 02:47:28 PM
#39
Right now the "Big Honkin' Server" task looks like searching for triplet subnonces whose hashes match - EXCEPT for one bit - for the first fifty bits.  Because of the way the triple is represented, the nonces have to be within a 45-bit range of each other, but that's not really an issue because you can spend all day searching inside a 45-bit range. 
Draft idea: two values that differ only in one bit will be separated by Hamming distance of 1. Miner could backet hashes by Hamming weight, possibly discarding hashes with low and high weight, then generate Bloom filter for even backets and perform search for odd ones in two neighbor (even, +-1) backets.
legendary
Activity: 924
Merit: 1132
April 01, 2015, 12:57:04 PM
#38
As I said, I haven't done much programming on GPUs so I don't know in great detail yet what their capabilities are.  Right now I'm just working from the conventional knowledge that they're highly parallel and have 2Gbytes or so of memory - and memory bandwidth that's about 10x as fast as CPU memory bandwidth. 

Right now the "Big Honkin' Server" task looks like searching for triplet subnonces whose hashes match - EXCEPT for one bit - for the first fifty bits.  Because of the way the triple is represented, the nonces have to be within a 45-bit range of each other, but that's not really an issue because you can spend all day searching inside a 45-bit range. 

You can compute them in parallel, but you're going to need a lot of data structure to store them in. For every one you compute you have fifty different places to look for a matching nonce.  The more previous results you can store, the better the odds that you'll have a couple of matching nonces in that set of places and be able to make a triplet.  In fact if you find more than two you can make more than one triplet.

That's a pretty brutal task.  I have been trying to think of every possible way for a GPU program to "cheat" at it using its 10x memory bandwidth advantage in a 2Gbyte data structure - and then  applying the same compression techniques to a 64G data structure to make the task even more brutal.  This one is supposed to be for big-memory servers only, and GPUs will find a far more profitable use of their time chewing on the SHA256D stuff that the big-memory servers (which don't usually have any use for high-end graphics cards and aren't built with them) can't do nearly as fast as they can.

It's actually kind of fun - I haven't done this kind of "use every trick in the book to absolutely maximize performance given a particular set of resources" programming in years. 
legendary
Activity: 1400
Merit: 1050
April 01, 2015, 10:09:06 AM
#37
The pool operator can give each participating miner pre-signed headers to work on.
What good is a pre-signed header when each signature is only good for one nonce?  

I thought your nonce field defined the 2^64 search space. Re-reading your earlier post, I now see that it defines the triple-nonce solution within the search space as well.

But if every element in your search space requires a new signature then your PoW will be very compute intensive, your search space rather limited in size, and thus not particularly memory hard.
actually it depends a bit how the 3 nonces are searched (it would become mem hard, if this implies to store the hashes generated by previous nonce).
but yes, it would probably be possible to search for 3 nonces in one pass... could take quite some time...
but I am wondering if it could be possible to search the 3 nonces in parallel or is it F1(nonce1) some_operation F2(nonce2) some_operation F3(nonce3)
legendary
Activity: 990
Merit: 1108
April 01, 2015, 09:39:17 AM
#36
The pool operator can give each participating miner pre-signed headers to work on.
What good is a pre-signed header when each signature is only good for one nonce? 

I thought your nonce field defined the 2^64 search space. Re-reading your earlier post, I now see that it defines the triple-nonce solution within the search space as well.

But if every element in your search space requires a new signature then your PoW will be very compute intensive, your search space rather limited in size, and thus not particularly memory hard.
hero member
Activity: 524
Merit: 500
April 01, 2015, 06:59:40 AM
#35
Actually opensource ecc code for the GPU would be pretty nice, and possibly worthwhile.
Long residue math would be fun to do in OpenCL, but is any coin ready for bulk signature verification?
legendary
Activity: 924
Merit: 1132
March 31, 2015, 10:45:37 PM
#34

The pool operator can give each participating miner pre-signed headers to work on.


What good is a pre-signed header when each signature is only good for one nonce? 
legendary
Activity: 990
Merit: 1108
March 31, 2015, 09:57:55 PM
#33
But mostly, what was going on with that was that I was trying to insure that nobody who didn't have the ability to spend the txOut would be finding the hash.

The pool operator can give each participating miner pre-signed headers to work on.

You can only prevent this by having PoW attempts be so cheap that the pool operator
(who can likely do hardware accelerated signing) could not hand out signed headers fast enough
to keep all miners busy.

But your desire for a memory hard PoW suggests rather slow PoW attempts, in which case
the pool operator can easily keep up...
hero member
Activity: 524
Merit: 500
March 31, 2015, 07:22:04 PM
#32
But mostly, what was going on with that was that I was trying to insure that nobody who didn't have the ability to spend the txOut would be finding the hash.
May be there is a way to provide miner with only partial knowledge of the key, say, miner will be able to sign only message with fixed prefix. Unlikely, but careful with hash used in signing algorithm.
With ECDSA ASICs, fast cheap internet and slow PoW pools will continue business as usual.
Homomorphic encryption. Seems not ready yet.
Pool could prevent miner from spending block reward by ceasing his pending payments and broadcasting the key. So pool gets nothing (like in block withholding attack), but uncooperative miner with high probability takes a loss. Variant: association of miners cooperate, put some funds under escrow, establish shared secret (private key), reward is splitted in p2pool fashion. If uncooperative spend is detected, association waives reward by broadcasting the key and confiscates defector's funds (or burns all funds, if there is no way to determine rogue member). I guess we are on territory of reverse game theory here.
legendary
Activity: 924
Merit: 1132
March 31, 2015, 05:25:56 PM
#31
Actually opensource ecc code for the GPU would be pretty nice, and possibly worthwhile.

But mostly, what was going on with that was that I was trying to insure that nobody who didn't have the ability to spend the txOut would be finding the hash.

Pool mining is an existential threat to the future of cryptocurrency.  I understand why people do it, but putting that much control (more specifically putting everyone that much closer to a 51% attack) into the hands of a pool admin is dangerous.

The other 'ringer' is the root of the merkle tree of unspent txOuts.  That allows "rolling root" block chain compression where the oldest blocks can be allowed to fall off the end of the block chain after long enough.

hero member
Activity: 524
Merit: 500
March 30, 2015, 11:17:35 PM
#30
Coinbase txOut is not spendable until more than 100 blocks have passed since it was mined.
Yes, I was wrong about it. Thought it is wallet restriction Sad

It suffices to ensure that the coinbase has only one txOut.
Seems so. Now I have to invent new trick for my miner Smiley

And what kind of 'ringer' did you mean initially? Something with hash?

EDIT: May be the purpose of this header structure is to ensure that miners will do hash+signature per nonce, and encourage developing open-source ECC math for GPU?
legendary
Activity: 924
Merit: 1132
March 30, 2015, 10:50:17 PM
#29

If block can contain transaction that spends coinbase txout (almost sure it can), privkey of coinbase txout could be fully disclosed to everyone, the second transaction will transfer to undisclosed pubkey.

Coinbase txOut is not spendable until more than 100 blocks have passed since it was mined.  Ah, I think I get the attack.  You're saying that *NEXT* time the miner gets a block, he can spend the 1-satoshi txOut from the current coinbase and claim the rest as 'mining fee?'  

Good point.  I'll go make sure it can't happen. coinbase can only have one txOut and it must be the full amount.

EDIT:  I'm editing this thing a lot....

Actually, nope.  It suffices to ensure that the coinbase has only one txOut.  If the miner doesn't make it for the full amount he's due, that's tough toenails; the rest isn't going to appear out of nowhere when he spends his undersized txOut.
hero member
Activity: 524
Merit: 500
March 30, 2015, 10:39:49 PM
#28
And yes, that can be parallelized, and GPU'd, and ASIC'd.
Momentum as in Protoshares? If yes, I have a nice puzzle to solve!

20      32        2   sig of all the above (including nonce) with privkey of coinbase txOut
Useless. EDIT: even if coinbase is restricted to the only txout


How so?  It is my objective to ensure that the person who finds the winning hash is able to spend the txOut.  What better way than ensuring that he needs the Privkey in the first place to form the block that must be hashed?  
I've seen code that checks only the first txOut of coinbase, it could be 1 satoshi dust, the rest is transferred to unsiclosed pubkeyhash.
If block can contain transaction that spends coinbase txout (almost sure it can), privkey of coinbase txout could be fully disclosed to everyone, the second transaction will transfer to undisclosed pubkey.
EDIT: Wait, I was too fast. It's useless for protected closed source miner, such as my Smelter.
legendary
Activity: 924
Merit: 1132
March 30, 2015, 10:34:00 PM
#27
And yes, that can be parallelized, and GPU'd, and ASIC'd.
Momentum as in Protoshares? If yes, I have a nice puzzle to solve!

20      32        2   sig of all the above (including nonce) with privkey of coinbase txOut
Useless. EDIT: even if coinbase is restricted to the only txout


How so?  It is my objective to ensure that the person who finds the winning hash is able to spend the txOut.  What better way than ensuring that he needs the Privkey in the first place to form the block that must be hashed? 
legendary
Activity: 924
Merit: 1132
March 30, 2015, 10:32:14 PM
#26
I'd start with set of buckets with ring buffer inside of each, advance through nonces in big chunks, newly calculated hashes are first uploaded (from cache to main memory) into buckets and then, when entire new chunk is distributed, download each bucket back to fast cache and search for triplets with hash table. Memory access going to be smooth, may be even some HDD bandwidth could be used Smiley

Consider how you'd index a search for "triplets" with a hash collision in 35 bits -- except that any 4 of those 35 bits MUST be different between hash A and hash B, and a different 4 bits MUST be different between hash B and hash C.  

Suddenly, you are either trying to write nonces into many buckets of your table (use too much memory) or trying to read from
about ten-thousand different unpredicted places in a simple hash-indexed table (cache abuse).  Of course, you may be able to do that by the time you calculate another signature over the next subnonce....  
hero member
Activity: 524
Merit: 500
March 30, 2015, 10:24:09 PM
#25
And yes, that can be parallelized, and GPU'd, and ASIC'd.
Momentum as in Protoshares? If yes, I have a nice puzzle to solve!

20      32        2   sig of all the above (including nonce) with privkey of coinbase txOut
Useless. EDIT: even if coinbase is restricted to the only txout
Pages:
Jump to: