Topic: Cuckoo Cycle: a new memory-hard proof-of-work system

legendary
Activity: 3836
Merit: 4969
Doomed to see the future and unable to prevent it
Tromp, how will this hold up against HBM? It seems to be taking longer than expected to become cost-effective.

Too early to tell. The current GPU solver writes + reads about 27GB of data within 1 second on a 1080Ti, which has a bandwidth of 484GB/s. So there are still other bottlenecks to overcome...

Ahh, I haven't read enough to know whether it is CUDA-dependent. I'll try to catch up in the next week; what's the best link for that? I think this was the last thing I read on it:

http://cryptorials.io/beyond-hashcash-proof-work-theres-mining-hashing/

EDIT: I just remembered, this is hardware-agnostic, isn't it? It should be, IIRC. Were you just using the Nvidia as an example?

Pretty agnostic indeed. My benching was done on Intel Core i5/i7 and that NVidia, and I happen to have the memory bandwidth data available for the latter.

I'm rewriting my Cuckoo Cycle paper to have more details on the mean miner. For now, it's best to study its source code.

I wish I still could; unfortunately, those days are past for me. Smiley

I'll just keep an eye on Grin and assume (yeah, I know) that the latest will be mirrored there as well.

BTW, this thread's OP needs some +M.
legendary
Activity: 990
Merit: 1108
Tromp, how will this hold up against HBM? It seems to be taking longer than expected to become cost-effective.

Too early to tell. The current GPU solver writes + reads about 27GB of data within 1 second on a 1080Ti, which has a bandwidth of 484GB/s. So there are still other bottlenecks to overcome...

Ahh, I haven't read enough to know whether it is CUDA-dependent. I'll try to catch up in the next week; what's the best link for that? I think this was the last thing I read on it:

http://cryptorials.io/beyond-hashcash-proof-work-theres-mining-hashing/

EDIT: I just remembered, this is hardware-agnostic, isn't it? It should be, IIRC. Were you just using the Nvidia as an example?

Pretty agnostic indeed. My benching was done on Intel Core i5/i7 and that NVidia, and I happen to have the memory bandwidth data available for the latter.

I'm rewriting my Cuckoo Cycle paper to have more details on the mean miner. For now, it's best to study its source code.
legendary
Activity: 3836
Merit: 4969
Doomed to see the future and unable to prevent it
Tromp, how will this hold up against HBM? It seems to be taking longer than expected to become cost-effective.

Too early to tell. The current GPU solver writes + reads about 27GB of data within 1 second on a 1080Ti, which has a bandwidth of 484GB/s. So there are still other bottlenecks to overcome...

Ahh, I haven't read enough to know whether it is CUDA-dependent. I'll try to catch up in the next week; what's the best link for that? I think this was the last thing I read on it:

http://cryptorials.io/beyond-hashcash-proof-work-theres-mining-hashing/

EDIT: I just remembered, this is hardware-agnostic, isn't it? It should be, IIRC. Were you just using the Nvidia as an example?
legendary
Activity: 990
Merit: 1108
Tromp, how will this hold up against HBM? It seems to be taking longer than expected to become cost-effective.

Too early to tell. The current GPU solver writes + reads about 27GB of data within 1 second on a 1080Ti, which has a bandwidth of 484GB/s. So there are still other bottlenecks to overcome...
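
For a back-of-envelope sense of those figures (a sketch of the arithmetic only, not a benchmark): moving roughly 27GB in about 1 second is only around 6% of the 1080Ti's 484GB/s peak, suggesting the solver is nowhere near saturating the memory bus, hence the other bottlenecks mentioned above.

Code:
# Back-of-envelope check of the figures quoted above (illustrative arithmetic only).
bytes_moved_gb = 27.0    # reads + writes per solve attempt, from the post
solve_time_s   = 1.0     # approximate time per attempt on a 1080 Ti
peak_bw_gbps   = 484.0   # 1080 Ti theoretical memory bandwidth

achieved_bw = bytes_moved_gb / solve_time_s    # ~27 GB/s effective
utilization = achieved_bw / peak_bw_gbps       # ~0.056, i.e. about 6% of peak

print(f"effective bandwidth: {achieved_bw:.0f} GB/s ({utilization:.1%} of peak)")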
legendary
Activity: 3836
Merit: 4969
Doomed to see the future and unable to prevent it
Sorry to necro-bump, but I suspect this is going to be relevant again soon, and this is the thread for it.

Any theories about the bottlenecks in mining with Cuckoo Cycle? I mean, I know it's supposed to be memory-hard, but here is a platform with 96 DIMM slots: http://www.tyan.com/datasheets/DataSheet_FT76-B7922.pdf This thing can hold up to 6 TERABYTES of DRAM. Surely one would start to run into processor bottlenecks before that point, but when? And also, I'm just generally curious about people's thoughts on the most efficient way to mine with Cuckoo Cycle.

Also, fancy seeing you here, aminorex.

He's not the only one who has been monitoring this thread. Smiley

This was a little ahead of its time, but I agree it will become relevant soon.

Tromp, how will this hold up against HBM? It seems to be taking longer than expected to become cost-effective.
legendary
Activity: 990
Merit: 1108
And also, I'm just generally curious about people's thoughts on the most efficient way to mine with Cuckoo Cycle.

Lots of memory is not enough. You need lots of memory bandwidth, and enough cores to saturate it.
Currently, GPUs seem superior to CPUs:

https://github.com/tromp/cuckoo/blob/master/GPU.md

Note there is $35000 in bounties for various performance doublings.
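
As a rough illustration of the point that lots of memory is not enough (the per-channel bandwidth and channel counts below are assumptions, not measurements): a Cuckoo Cycle working set fits in a few GB, so what separates platforms is aggregate bandwidth and the cores needed to drive it.

Code:
# Rough comparison of aggregate memory bandwidth (illustrative numbers, not measurements).
ddr4_channel_bw_gbps = 20.0   # assumed usable bandwidth per DDR4 channel
channels_per_socket  = 6      # assumed channels on a typical server socket
sockets              = 4      # e.g. a large multi-socket board

server_bw_gbps = ddr4_channel_bw_gbps * channels_per_socket * sockets
gpu_bw_gbps    = 484.0        # 1080 Ti figure quoted earlier in the thread

print(f"server DRAM: ~{server_bw_gbps:.0f} GB/s aggregate vs one GPU: ~{gpu_bw_gbps:.0f} GB/s")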
legendary
Activity: 1722
Merit: 1217
Sorry to necro-bump, but I suspect this is going to be relevant again soon, and this is the thread for it.

Any theories about the bottlenecks in mining with Cuckoo Cycle? I mean, I know it's supposed to be memory-hard, but here is a platform with 96 DIMM slots: http://www.tyan.com/datasheets/DataSheet_FT76-B7922.pdf This thing can hold up to 6 TERABYTES of DRAM. Surely one would start to run into processor bottlenecks before that point, but when? And also, I'm just generally curious about people's thoughts on the most efficient way to mine with Cuckoo Cycle.

Also, fancy seeing you here, aminorex.
legendary
Activity: 1596
Merit: 1030
Sine secretum non libertas
legendary
Activity: 990
Merit: 1108
But I was trying to re-introduce the higher-level issue that, while the graph is random, a solver always has two options:  continue working on the current graph or throw it away and try a new one.

Indeed. And this could very well be taken to the extreme with the Cuckoo TMTO you identified: when looking at vertex subsets of size N/k, only look at the first one, and skip all the k-1 others, since they will have a lower chance (even if ever so slightly, due to partial overlap) of producing a solution.

The only cost of switching to the next graph is one sha256 to compute a new siphash key...
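
A minimal sketch of that cheap graph switch, assuming one particular packing of the SHA-256 digest into siphash key words (the reference miner may pack it differently):

Code:
# Sketch: a new graph costs only one SHA-256 of (header, nonce) to derive fresh siphash keys.
# The packing of the digest into key words is an assumption for illustration.
import hashlib, struct

def siphash_keys(header: bytes, nonce: int):
    digest = hashlib.sha256(header + struct.pack("<I", nonce)).digest()
    return struct.unpack("<4Q", digest)   # four 64-bit key words

k0 = siphash_keys(b"example header", nonce=0)
k1 = siphash_keys(b"example header", nonce=1)   # "next graph": one more SHA-256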
dga
hero member
Activity: 737
Merit: 511
AFAIK the probability of there being a k-cycle in a bipartite undirected G(n,p) graph, where k > 3, has not been proven, so we don't have a strong basis to judge the bounds of an ideal solution for Cuckoo Cycle. This is where I would love to be proven wrong, as it would lay to rest the possibility of there being unknown backdoors in Cuckoo Cycle.

By backdoor, do you mean time-memory trade-offs?
These do not depend in any way on the exact value of the probability of the Cuckoo graph
having a k-cycle. The whitepaper already has numerical approximations for those probabilities,
which suffice for any run-time or memory-use analysis...

I share some of optimiz3's concerns.  Let me give an incorrect hypothetical example:

I think you're misreading my answer.
I was telling optimiz3 that it doesn't matter whether the probability is 2.4% or 2.6%, so he should not require me to prove some exact closed-form expression for the probability, and should just accept my numerical approximations as being good enough for analyzing resource usage.

As to your hypothetical graph-statistics filter, that is not something I worry about in the least:-(

Sorry - I was overgeneralizing from his note. I agree with you that the expected probabilities (for the specific N you evaluated) are pretty clear given that the graph is random.

But I was trying to re-introduce the higher-level issue that, while the graph is random, a solver always has two options:  continue working on the current graph or throw it away and try a new one.
legendary
Activity: 990
Merit: 1108
AFAIK the probability of there being a k-cycle in a bipartite undirected G(n,p) graph, where k > 3, has not been proven, so we don't have a strong basis to judge the bounds of an ideal solution for Cuckoo Cycle. This is where I would love to be proven wrong, as it would lay to rest the possibility of there being unknown backdoors in Cuckoo Cycle.

By backdoor, do you mean time-memory trade-offs?
These do not depend in any way on the exact value of the probability of the Cuckoo graph
having a k-cycle. The whitepaper already has numerical approximations for those probabilities,
which suffice for any run-time or memory-use analysis...

I share some of optimiz3's concerns.  Let me give an incorrect hypothetical example:

I think you're misreading my answer.
I was telling optimiz3 that it doesn't matter whether the probability is 2.4% or 2.6%, so he should not require me to prove some exact closed-form expression for the probability, and should just accept my numerical approximations as being good enough for analyzing resource usage.

As to your hypothetical graph-statistics filter, that is not something I worry about in the least:-(
dga
hero member
Activity: 737
Merit: 511
AFAIK the probability of there being a k-cycle in a bipartite undirected G(n,p) graph, where k > 3, has not been proven, so we don't have a strong basis to judge the bounds of an ideal solution for Cuckoo Cycle. This is where I would love to be proven wrong, as it would lay to rest the possibility of there being unknown backdoors in Cuckoo Cycle.

By backdoor, do you mean time-memory trade-offs?
These do not depend in any way on the exact value of the probability of the Cuckoo graph
having a k-cycle. The whitepaper already has numerical approximations for those probabilities,
which suffice for any run-time or memory-use analysis...

I share some of optimiz3's concerns.  Let me give an incorrect hypothetical example:

What if it were possible to establish, with P(error) < 10%, whether or not a graph *had* a 42-cycle, using, e.g., a histogram of the counts of edges incident upon vertices?

(I doubt this is true - I'm pulling something plausible-sounding out of that place we pull such things.)

In other words, if there were some relatively simple-to-measure characteristics of the graph that would let you rapidly find good candidates to spend your effort on, an optimized solver might spend a lot of time *generating graph instances* and relatively little time evaluating them, hardly using much memory at all. With your default parameters of about 2.5% of graphs having a 42-cycle, this could easily result in a 10x speedup using comparatively little chip area or memory.

I'm not saying at all that such a thing exists.  But it's the kind of avenue of attack I worry about.
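
To make the worry concrete, here is a purely hypothetical sketch of such a filter-then-solve loop; generate_edges() and looks_promising() are invented stand-ins, and no statistic of this kind is actually known to work for Cuckoo Cycle:

Code:
# Hypothetical filter-then-solve loop (illustration only; the statistic is made up).
import collections, random

def generate_edges(nonce, n=1 << 10, m=1 << 9):
    rng = random.Random(nonce)          # stand-in for siphash-based edge generation
    return [(rng.randrange(n), rng.randrange(n)) for _ in range(m)]

def looks_promising(edges):
    degrees = collections.Counter(v for e in edges for v in e)
    return max(degrees.values()) >= 3   # made-up threshold on the degree histogram

for nonce in range(100):
    edges = generate_edges(nonce)
    if not looks_promising(edges):
        continue                        # cheap: discard the graph and move on
    # only here would the full memory-hungry cycle-finding solver be run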
legendary
Activity: 990
Merit: 1108
AFAIK the probability of there being a k-cycle in a bipartite undirected G(n,p) graph, where k > 3, has not been proven, so we don't have a strong basis to judge the bounds of an ideal solution for Cuckoo Cycle. This is where I would love to be proven wrong, as it would lay to rest the possibility of there being unknown backdoors in Cuckoo Cycle.

By backdoor, do you mean time-memory trade-offs?
These do not depend in any way on the exact value of the probability of the Cuckoo graph
having a k-cycle. The whitepaper already has numerical approximations for those probabilities,
which suffice for any run-time or memory-use analysis...
newbie
Activity: 37
Merit: 0
We explored this when looking at k-SUM, where it became clear there was an amortized O(sqrt(n)) compute cost and an O(n) space cost, rooted in the probability of a collision, which has been proven via the birthday paradox.

AFAIK the probability of there being a k-cycle in a bipartite undirected G(n,p) graph, where k > 3, has not been proven, so we don't have a strong basis to judge the bounds of an ideal solution for Cuckoo Cycle. This is where I would love to be proven wrong, as it would lay to rest the possibility of there being unknown backdoors in Cuckoo Cycle.
legendary
Activity: 990
Merit: 1108

From your Cuckoo Cycle paper:

Quote
"For example, for L = 2 the problem reduces to finding a birthday collision as in the Momentum proof-
of-work."


I observed that length-2 Cuckoo Cycle is equivalent to Momentum, which in turn is equivalent to finding birthday collisions (ignoring differences in choice of hash function and in number of edges).

What makes you think that says anything about lower bounds for space and time use?

We don't know how much memory and time are needed for finding birthday collisions.
We just have some upper bounds, and some linear trade-offs of memory for time.

We similarly have some upper bounds for the more general Cuckoo Cycle and fewer known trade-offs
(no linear one), which in your terminology makes Momentum the more vulnerable one.
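
For reference, the obvious upper bound alluded to here is the naive hash-table search sketched below, with an illustrative hash and collision width rather than the actual Momentum parameters; its memory grows linearly with the number of nonces tried.

Code:
# Naive birthday-collision search: ~n time and ~n memory (the known upper bound).
# Hash choice and collision width are illustrative, not the actual Momentum spec.
import hashlib

def find_collision(header: bytes, bits: int = 20):
    mask, seen, nonce = (1 << bits) - 1, {}, 0
    while True:
        h = int.from_bytes(hashlib.sha256(header + nonce.to_bytes(8, "little")).digest(), "little") & mask
        if h in seen:
            return seen[h], nonce       # two nonces whose hashes collide on the low bits
        seen[h] = nonce                 # table grows linearly with attempts
        nonce += 1

print(find_collision(b"example header"))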
newbie
Activity: 37
Merit: 0
Quote
Quote from: optimiz3 on Today at 12:26:18 AM

Quote
What mathematical statement about Momentum is trivial to prove?
It is an instance of the birthday problem, whose complexity is understood.

I don't think this is understood.

Why don't you think the complexity of Momentum is understood/proven?

From your Cuckoo Cycle paper:

Quote
"For example, for L = 2 the problem reduces to finding a birthday collision as in the Momentum proof-
of-work."
legendary
Activity: 990
Merit: 1108

Quote
What mathematical statement about Momentum is trivial to prove?
It is an instance of the birthday problem, whose complexity is understood.

I don't think this is understood.

Quote
Quote
What constitutes a vulnerability of a proof-of-work?
The possibility that implementations exist which exploit unaccounted-for (lower and amortized) bounds on memory and/or execution time.

Quote
What is a proof of difficulty?
A proof that the amortized limits on memory and execution time cannot be less than those of the proposed implementation of a PoW.

What is the minimum memory usage for a Momentum implementation?
newbie
Activity: 37
Merit: 0
What doesn't make sense?

Quote
What mathematical statement about Momentum is trivial to prove?

It is an instance of the birthday problem, whose complexity is understood.

Quote
What constitutes a vulnerability of a proof-of-work?

The possibility that implementations exist which exploit unaccounted-for (lower and amortized) bounds on memory and/or execution time.

Quote
What is a proof of difficulty?

A proof that the amortized limits on memory and execution time cannot be less than those of the proposed implementation of a PoW.

Quote
And how do you reconcile the above with my Apr 30 observation that Momentum is a special case of Cuckoo Cycle where cycle length equals 2, a choice which leads it to be particularly memory-easy?

Cycle lengths of 2 and 3 are special cases because they can be restated in terms of the birthday problem and 3-cliques, respectively. They do not generalize to k-cycles (nor to 42-cycles), which is the concern.
legendary
Activity: 990
Merit: 1108
Momentum is trivial to prove for any parameter selection because it is based on a well-known problem.
Cuckoo Cycle is based on a novel graph-theoretic approach which may or may not have vulnerabilities, but without a proof of difficulty for a particular set of parameters, there is no way to assess the vulnerability to future attacks.

I'm afraid you're not making much sense.

If we're going to continue this discussion you should first define your terminology.
What mathematical statement about Momentum is trivial to prove?
What constitutes a vulnerability of a proof-of-work?
What is a proof of difficulty?

And how do you reconcile the above with my Apr 30 observation that Momentum is a special case of Cuckoo Cycle where cycle length equals 2, a choice which leads it to be particularly memory-easy?
newbie
Activity: 37
Merit: 0
Momentum is trivial to prove for any parameter selection because it is based on a well-known problem.

Cuckoo Cycle is based on a novel graph-theoretic approach which may or may not have vulnerabilities, but without a proof of difficulty for a particular set of parameters, there is no way to assess the vulnerability to future attacks.

Optimality comes through applying engineering requirements to a defined set of provable algorithm characteristics.  We don't have that (yet) for Cuckoo Cycle.

EDIT #2: Grammar; also, some searching turned up this recent (2009) paper, "Deviation inequality for the number of k-cycles in a random graph" by Wang and Gao. Perhaps it may be of use?