Cuckoo Cycle Speed Challenge; $2500 in bounties - page 2.

tromp

legendary

Activity: 1000

Merit: 1120

Quote from: dga on July 31, 2014, 09:03:12 PM

Would you consider extending the applicability of your time/memory tradeoff bounty to a theoretical improvement to the asymptotic bounds for the time-memory tradeoff, with a low-speed demonstrator, but not an insanely tuned implementation, proving the feasibility of a sub-quadratic TMTO (but superlinear - i'm guessing some extra log(n) factor) for the edge pruning component of Cuckoo Cycle?

Are you asking for a bounty for using N/k bits and an o(k^2) slowdown?
The problem with asymptotic running times is that they're hard to verify:-(

I'd be happy to generalize the bounty as follows:

$1000/E for an open source implementation that uses at most N/k bits
while running up to 1.5*k^E times slower, for any k>=2 and E>=1.

Or is that still too strict for your taste?

dga

hero member

Activity: 737

Merit: 511

Would you consider extending the applicability of your time/memory tradeoff bounty to a theoretical improvement to the asymptotic bounds for the time-memory tradeoff, with a low-speed demonstrator, but not an insanely tuned implementation, proving the feasibility of a sub-quadratic TMTO (but superlinear - i'm guessing some extra log(n) factor) for the edge pruning component of Cuckoo Cycle?

dga

hero member

Activity: 737

Merit: 511

Quote from: tromp on July 31, 2014, 03:04:49 PM

Quote from: dga on July 31, 2014, 01:44:59 PM

"the complexity of determining a shortest cycle of even length" (I couldn't find a non-paywalled PDF, but could send you a copy if you want). A key difference is the shortest aspect, however. And asymptotically, the V^2 runtime is worse than the |V||E| runtime of the simpler BFS-based approach on the graph you choose for cuckoo.
(Note: I wasn't suggesting it's better -- I mostly wanted to avoid chasing down the Yuster algorithm if you'd already evaluated it. ;-)

Please email me a copy of that Monien paper. Btw, my runtime is O(|E|) = O(|V|), not |V| |E|. I didn't check much graph theory literature because my setting is pretty specific:

the graph is (pseudo-)random.
it is very sparse; in fact a forest, apart from a constant expected number of additional cycle inducing edges.
it is only given implicitly, rather than some adjacency matrix or set of adjacency lists.

That's why the cuckoo hashing literature seemed more relevant, since these graph arise in that setting.

Agreed, and emailed. Just trying to figure out where you've gone in looking for solutions.

The sparsity is very interesting (and what makes edge pruning effective). The nature of the oracle is also fun. I tried yesterday to find a theory postdoc or someone to dangle the problem in front of, but didn't get any bites, just forward pointers.

tromp

legendary

Activity: 1000

Merit: 1120

Quote from: dga on July 31, 2014, 01:44:59 PM

"the complexity of determining a shortest cycle of even length" (I couldn't find a non-paywalled PDF, but could send you a copy if you want). A key difference is the shortest aspect, however. And asymptotically, the V^2 runtime is worse than the |V||E| runtime of the simpler BFS-based approach on the graph you choose for cuckoo.
(Note: I wasn't suggesting it's better -- I mostly wanted to avoid chasing down the Yuster algorithm if you'd already evaluated it. ;-)

Please email me a copy of that Monien paper. Btw, my runtime is O(|E|) = O(|V|), not |V| |E|. I didn't check much graph theory literature because my setting is pretty specific:

the graph is (pseudo-)random.
it is very sparse; in fact a forest, apart from a constant expected number of additional cycle inducing edges.
it is only given implicitly, rather than some adjacency matrix or set of adjacency lists.

That's why the cuckoo hashing literature seemed more relevant, since these graph arise in that setting.

dga

hero member

Activity: 737

Merit: 511

Quote from: tromp on July 31, 2014, 12:09:34 PM

Quote

One thing I'm curious about: your current union-find solution seems to echo that of Monien. Before wasting my time thinking about it more if you've already done the work, did you attempt to apply Yuster & Zwick's algorithm (in "Finding Even Cycles Even Faster")? The only difference is a factor of inverse-of-ackerman's in (V), but it's mostly intriguing because it's simpler from a data structure perspective, which might lead to a faster real-world implementation. Or not.

I thought their approaches are quite different. Can you point to a specific paper and page of Monien describing a union-find like algorithm? My paper has a paragraph on both the similarity to and crucial difference with union-find
which explains why I can't use path-compression (which leads to the Ackermann function). My algorithm also crucially depends on the edge to vertex ratio being at most one half.
But any speedup of this part is mostly irrelevant anyway, as the running time of Cuckoo Cycle is entirely dominated by the edge trimming phase.

"the complexity of determining a shortest cycle of even length" (I couldn't find a non-paywalled PDF, but could send you a copy if you want). A key difference is the shortest aspect, however. And asymptotically, the V^2 runtime is worse than the |V||E| runtime of the simpler BFS-based approach on the graph you choose for cuckoo.

(Note: I wasn't suggesting it's better -- I mostly wanted to avoid chasing down the Yuster algorithm if you'd already evaluated it. ;-)

Quote

My nervousness remains because of the unique structure of your graphs - as you note, they're like GNMp graphs - and it's not clear that there are specific tricks that can be used here. I'm a little less worried about this than I was, but I've still got this concern that the algorithm could be vulnerable to non-breakthrough-level theory developments (i.e., things easier than breaking discrete logs. :-). Still trying to wrap my head around a good formulation of it, and people shouldn't take me too seriously on this, because it's a tickling hunch, not a "gosh, I'm sure" kind of feeling.

I still compare it in my head to Momentum, where there's less structure sitting around, and therefore I'm more confident in the analysis, which I view as a good property in a PoW. But we know that Momentum, even if it were modified to use SIPhash, basically turns into a DRAM bandwidth problem. That's not bad from an ASIC standpoint, but it is comparatively GPU-friendly.

Momentum suffers from a having a linear time-memory trade-off though. It's essentially identical to Cuckoo Cycle with N=2^26 vertices, cycle length 2, and #edges = N rather than N/2. Note that the best known implementations (entirely or partly due to you) are also essentially identical. I believe that by generalizing cycle length, Cuckoo Cycle leaves less room for trade-offs, as witnessed by the fact that Momentum's linear tmto does not carry over...

What's the linear TMTO in momentum? It seems a lot like the edge trimming step in Cuckoo (and you're right - I just repurposed my code from momentum for that step in Cuckoo). But isn't the tmto quadratic in momentum as well? E.g., split the nonce set in two halves n_1, n_2, and then test n_1 x n_1, n_1 x n_2, n_2 x n_2? Or is there something I missed?

Duh, nevermind. Right - rip through and save all outputs with high order bit 0. Repeat again for high order bit 1. Straightforward. Sorry, brain has been in too many meetings today.

tromp

legendary

Activity: 1000

Merit: 1120

Quote from: dga on July 31, 2014, 09:48:36 AM

Quote from: tromp on July 31, 2014, 09:29:26 AM

This thread in Bitcoin Development & Technical Discussion may also be of interest:

https://bitcointalksearch.org/topic/cuckoo-cycle-revisited-717267

For the record, I have no interest in developing my own coin. As a computer scientist, proof-of-work algorithms interest me, and Cuckoo Cycle has become one of my hobbies. Beyond the proof-of-work, crypto currencies have too many complexities (e.g. key management, networking issues) that I don't want to have to deal with.

Cuckoo Cycle will be adopted eventually. I suggested that the Ethereum guys take a serious look at it. Let's see what they come up with and how it compares...

Hi, John -

One more follow-up. I posted in my public review that, beyond the speedups I suggested to you, I was somewhat nervous about algorithmic speedups to the problem still being possible, and I can't shake that nervousness yet. (But I haven't been able to exploit it either, but my failure to do so doesn't mean it's not possible, just that it's not immediately obvious).

Please stay nervous:-) As you noted in the past, losing my nervousness is one of my problems...

Quote

One thing I'm curious about: your current union-find solution seems to echo that of Monien. Before wasting my time thinking about it more if you've already done the work, did you attempt to apply Yuster & Zwick's algorithm (in "Finding Even Cycles Even Faster")? The only difference is a factor of inverse-of-ackerman's in (V), but it's mostly intriguing because it's simpler from a data structure perspective, which might lead to a faster real-world implementation. Or not.

I thought their approaches are quite different. Can you point to a specific paper and page of Monien describing a union-find like algorithm? My paper has a paragraph on both the similarity to and crucial difference with union-find
which explains why I can't use path-compression (which leads to the Ackermann function). My algorithm also crucially depends on the edge to vertex ratio being at most one half.
But any speedup of this part is mostly irrelevant anyway, as the running time of Cuckoo Cycle is entirely dominated by the edge trimming phase.

Quote

My nervousness remains because of the unique structure of your graphs - as you note, they're like GNMp graphs - and it's not clear that there are specific tricks that can be used here. I'm a little less worried about this than I was, but I've still got this concern that the algorithm could be vulnerable to non-breakthrough-level theory developments (i.e., things easier than breaking discrete logs. :-). Still trying to wrap my head around a good formulation of it, and people shouldn't take me too seriously on this, because it's a tickling hunch, not a "gosh, I'm sure" kind of feeling.

I still compare it in my head to Momentum, where there's less structure sitting around, and therefore I'm more confident in the analysis, which I view as a good property in a PoW. But we know that Momentum, even if it were modified to use SIPhash, basically turns into a DRAM bandwidth problem. That's not bad from an ASIC standpoint, but it is comparatively GPU-friendly.

Momentum suffers from a having a linear time-memory trade-off though. It's essentially identical to Cuckoo Cycle with N=2^26 vertices, cycle length 2, and #edges = N rather than N/2. Note that the best known implementations (entirely or partly due to you) are also essentially identical. I believe that by generalizing cycle length, Cuckoo Cycle leaves less room for trade-offs, as witnessed by the fact that Momentum's linear tmto does not carry over...

dga

hero member

Activity: 737

Merit: 511

Quote from: tromp on July 31, 2014, 09:29:26 AM

This thread in Bitcoin Development & Technical Discussion may also be of interest:

https://bitcointalksearch.org/topic/cuckoo-cycle-revisited-717267

For the record, I have no interest in developing my own coin. As a computer scientist, proof-of-work algorithms interest me, and Cuckoo Cycle has become one of my hobbies. Beyond the proof-of-work, crypto currencies have too many complexities (e.g. key management, networking issues) that I don't want to have to deal with.

Cuckoo Cycle will be adopted eventually. I suggested that the Ethereum guys take a serious look at it. Let's see what they come up with and how it compares...

Hi, John -

One more follow-up. I posted in my public review that, beyond the speedups I suggested to you, I was somewhat nervous about algorithmic speedups to the problem still being possible, and I can't shake that nervousness yet. (But I haven't been able to exploit it either, but my failure to do so doesn't mean it's not possible, just that it's not immediately obvious).

One thing I'm curious about: your current union-find solution seems to echo that of Monien. Before wasting my time thinking about it more if you've already done the work, did you attempt to apply Yuster & Zwick's algorithm (in "Finding Even Cycles Even Faster")? The only difference is a factor of inverse-of-ackerman's in (V), but it's mostly intriguing because it's simpler from a data structure perspective, which might lead to a faster real-world implementation. Or not.

My nervousness remains because of the unique structure of your graphs - as you note, they're like GNMp graphs - and it's not clear that there are specific tricks that can be used here. I'm a little less worried about this than I was, but I've still got this concern that the algorithm could be vulnerable to non-breakthrough-level theory developments (i.e., things easier than breaking discrete logs. :-). Still trying to wrap my head around a good formulation of it, and people shouldn't take me too seriously on this, because it's a tickling hunch, not a "gosh, I'm sure" kind of feeling.

I still compare it in my head to Momentum, where there's less structure sitting around, and therefore I'm more confident in the analysis, which I view as a good property in a PoW. But we know that Momentum, even if it were modified to use SIPhash, basically turns into a DRAM bandwidth problem. That's not bad from an ASIC standpoint, but it is comparatively GPU-friendly.

tromp

legendary

Activity: 1000

Merit: 1120

This thread in Bitcoin Development & Technical Discussion may also be of interest:

https://bitcointalksearch.org/topic/cuckoo-cycle-revisited-717267

For the record, I have no interest in developing my own coin. As a computer scientist, proof-of-work algorithms interest me, and Cuckoo Cycle has become one of my hobbies. Beyond the proof-of-work, crypto currencies have too many complexities (e.g. key management, networking issues) that I don't want to have to deal with.

Cuckoo Cycle will be adopted eventually. I suggested that the Ethereum guys take a serious look at it. Let's see what they come up with and how it compares...

fluffypony

donator

Activity: 1274

Merit: 1060

GetMonero.org / MyMonero.com

Quote from: itod on July 31, 2014, 07:23:47 AM

Don't want to troll this wonderful topic, so my last post just to clarify the point of view: If there's already a bounty on the bitcointalk.org, then this is not quite an "academic only" attempt. There's no harm in making the bounty much, much bigger, and there is no scam in launching such a coin with this background. I also believe that tromp is expecting this challenge to be unclaimed, then why not making it more explicit by involving the highest quality coders, which will not make a move for $2500?

If he doesn't want to mess with launching the coin, there are more than enough people who would like to help him doing it. I would help without expecting any monetary reward, simply because I believe this algo is the future and we really need the coin which is CPU only for cryptocurrencies to be widely adopted. Tromp just has to ask and the team around the coin would be gathered in no time.

I think at this stage the bounty is more to refine Cuckoo Cycle at a working-PoC level (for want of a better term). If, thereafter, a cryptocurrency were to adopt it, you can bet they'd discuss it with John and would put paid-for resources on cryptanalysis and code optimisation of the PoW, which would negate the need to launch a separate cryptocurrency merely as a vehicle for this.

itod

legendary

Activity: 1988

Merit: 1077

Honey badger just does not care

Quote from: dga on July 31, 2014, 06:55:13 AM

Quote from: itod on July 31, 2014, 06:44:30 AM

I agree with you that this is not about the "market", but you may have misunderstood the question. I believe he asked if launching the altcoin will bring far greater incentive than $2500 for skilled developers to work on this, since making the GPU miner while all others mine on CPU can get you reward a few orders of magnitude bigger than $2500. That way these technical test could be done much, much quicker. Not a bad idea.

It would. It's what most other proof-of-works have done.

But John is doing it the ethical way by trying to have a firm foundation for it established in a way that doesn't let someone (ahem - like me) come along and make a huge profit first. (And I don't think he wants to run a coin or have his name associated with yet-another-ripoff coin!)

That's the *right* way to do it if you believe that mining should be relatively equitable.

Quote from: fluffypony on July 31, 2014, 06:56:32 AM

The downside to that is that the GPU miner will be kept on the DL for weeks or even months until the coin has been bled dry. This approach is far more ethical, and it means that as and when an existing cryptocurrency adopts Cuckoo Cycle it will already have reasonably performant CPU and GPU miners.

Don't want to troll this wonderful topic, so my last post just to clarify the point of view: If there's already a bounty on the bitcointalk.org, then this is not quite an "academic only" attempt. There's no harm in making the bounty much, much bigger, and there is no scam in launching such a coin with this background. I also believe that tromp is expecting this challenge to be unclaimed, then why not making it more explicit by involving the highest quality coders, which will not make a move for $2500?

If he doesn't want to mess with launching the coin, there are more than enough people who would like to help him doing it. I would help without expecting any monetary reward, simply because I believe this algo is the future and we really need the coin which is CPU only for cryptocurrencies to be widely adopted. Tromp just has to ask and the team around the coin would be gathered in no time.

fluffypony

donator

Activity: 1274

Merit: 1060

GetMonero.org / MyMonero.com

Quote from: itod on July 31, 2014, 06:44:30 AM

I agree with you that this is not about the "market", but you may have misunderstood the question. I believe he asked if launching the altcoin will bring far greater incentive than $2500 for skilled developers to work on this, since making the GPU miner while all others mine on CPU can get you reward a few orders of magnitude bigger than $2500. That way these technical test could be done much, much quicker. Not a bad idea.

The downside to that is that the GPU miner will be kept on the DL for weeks or even months until the coin has been bled dry. This approach is far more ethical, and it means that as and when an existing cryptocurrency adopts Cuckoo Cycle it will already have reasonably performant CPU and GPU miners.

dga

hero member

Activity: 737

Merit: 511

Quote from: itod on July 31, 2014, 06:44:30 AM

Quote from: fluffypony on July 31, 2014, 05:45:22 AM

Quote from: mtve on July 31, 2014, 05:20:58 AM

Just a wild idea - wouldn't it be easier to start one more altcoin and see market reaction?

The market reaction is irrelevant, this thread revolves around deep technical tests.

I agree with you that this is not about the "market", but you may have misunderstood the question. I believe he asked if launching the altcoin will bring far greater incentive than $2500 for skilled developers to work on this, since making the GPU miner while all others mine on CPU can get you reward a few orders of magnitude bigger than $2500. That way these technical test could be done much, much quicker. Not a bad idea.

It would. It's what most other proof-of-works have done.

But John is doing it the ethical way by trying to have a firm foundation for it established in a way that doesn't let someone (ahem - like me) come along and make a huge profit first. (And I don't think he wants to run a coin or have his name associated with yet-another-ripoff coin!)

That's the *right* way to do it if you believe that mining should be relatively equitable.

itod

legendary

Activity: 1988

Merit: 1077

Honey badger just does not care

Quote from: fluffypony on July 31, 2014, 05:45:22 AM

Quote from: mtve on July 31, 2014, 05:20:58 AM

Just a wild idea - wouldn't it be easier to start one more altcoin and see market reaction?

The market reaction is irrelevant, this thread revolves around deep technical tests.

I agree with you that this is not about the "market", but you may have misunderstood the question. I believe he asked if launching the altcoin will bring far greater incentive than $2500 for skilled developers to work on this, since making the GPU miner while all others mine on CPU can get you reward a few orders of magnitude bigger than $2500. That way these technical test could be done much, much quicker. Not a bad idea.

fluffypony

donator

Activity: 1274

Merit: 1060

GetMonero.org / MyMonero.com

Quote from: mtve on July 31, 2014, 05:20:58 AM

Just a wild idea - wouldn't it be easier to start one more altcoin and see market reaction?

The market reaction is irrelevant, this thread revolves around deep technical tests.

mtve

newbie

Activity: 9

Merit: 0

Just a wild idea - wouldn't it be easier to start one more altcoin and see market reaction?

peter378

member

Activity: 86

Merit: 10

Quote from: tromp on July 29, 2014, 11:26:57 AM

This news story

http://www.kitguru.net/components/cpu/anton-shilov/intel-core-i7-5960x-haswell-e-de-lidded-interesting-discoveries-found/

suggests that 8+ core Core i7 CPUs are not far off.
Even with an expected initial price premium, they are bound to be way more perfomance/price competitive than Xeons.

They should be called Core i8 CPUs or Core i9 CPUs if they have more than 7 cores. Core i7 does not sound right for an 8+ core.

tromp

legendary

Activity: 1000

Merit: 1120

Quote from: dga on July 30, 2014, 04:48:12 PM

That SHA256 is defined in openssl/sha.h for MacOS - I wa smostly pointing out the deprecation.

For Ubuntu, I had to define a little macro to do the init/update/final.

Otherwise, I get a link error linking against SHA256.

I also updated the makefile to separate the LIBS, because the -l should appear *after* the .o files that reference them, and that was needed when I added -lssl for compiling on ubuntu.

Thanks for elaborating, and for providing a ptach file.
I thought that all openssl/sha.h define SHA256(), but apparently not.

Patches applied to https://github.com/tromp/cuckoo ...

dga

hero member

Activity: 737

Merit: 511

Quote from: tromp on July 30, 2014, 01:57:10 PM

Quote from: dga on July 30, 2014, 12:46:31 PM

Just an fyi:

Code:

unsigned char *SHA256(const unsigned char *d, size_t n,unsigned char *md) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;

Where do I have that code? I don't see that in my repository?!

Quote

And does not compile on Ubuntu 14.04.

I've put a quick hack patch for making it work under Linux -

http://www.cs.cmu.edu/~dga/crypto/cuckoo-linux.patch

but didn't test it again under macos.

I can compile fine on my SUSE Linux box. But of course, I'd like it to work on all Linux distributions. Before applying your patch though, I'd like to identify the exact issue you're having...

That SHA256 is defined in openssl/sha.h for MacOS - I wa smostly pointing out the deprecation.

For Ubuntu, I had to define a little macro to do the init/update/final.

Otherwise, I get a link error linking against SHA256.

I also updated the makefile to separate the LIBS, because the -l should appear *after* the .o files that reference them, and that was needed when I added -lssl for compiling on ubuntu.

tromp

legendary

Activity: 1000

Merit: 1120

Quote from: dga on July 30, 2014, 12:46:31 PM

Just an fyi:

Code:

unsigned char *SHA256(const unsigned char *d, size_t n,unsigned char *md) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;

Where do I have that code? I don't see that in my repository?!

Quote

And does not compile on Ubuntu 14.04.

I've put a quick hack patch for making it work under Linux -

http://www.cs.cmu.edu/~dga/crypto/cuckoo-linux.patch

but didn't test it again under macos.

I can compile fine on my SUSE Linux box. But of course, I'd like it to work on all Linux distributions. Before applying your patch though, I'd like to identify the exact issue you're having...

dga

hero member

Activity: 737

Merit: 511

Just an fyi:

Code:

unsigned char *SHA256(const unsigned char *d, size_t n,unsigned char *md) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;

And does not compile on Ubuntu 14.04.

I've put a quick hack patch for making it work under Linux -

http://www.cs.cmu.edu/~dga/crypto/cuckoo-linux.patch

but didn't test it again under macos.

Topic: Cuckoo Cycle Speed Challenge; $2500 in bounties - page 2. (Read 6524 times)