Topic: Momentum Proof-of-Work Bounty - 30 BTC [PAID] (page 5)

hero member
Activity: 770
Merit: 566
fractally

I had considered the potential for such a design.  A DHT will not perform well enough; you would need 100BaseT Ethernet or perhaps gigabit before the hash table could be filled quickly enough.  From what I can tell, a SUPER COMPUTER with a ton of RAM and high-speed interconnects would give you the best performance.  Here is the difference: such a super computer could be built from commodity parts.


Well, if the hash is so fast, you could just collaborate by splitting the birthday space into N ranges, with each computer throwing away any result that isn't in its range. This trades hashing speed for memory, so it works with, rather than against, your PoW.

What range do you suggest for N and how fast is the hash?
This should be possible to test.

Any time you throw away hashes because they fall outside of your memory range, you move the computational complexity from the p(n) curve to the q(n) curve in my white paper.  You save on memory, but the cost is far greater.  Memory and CPU power must always be paired proportionally or you will have idle memory or wasted CPU cycles.
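To make that tradeoff concrete, here is a minimal sketch of the range-split scheme under discussion (the helper names are mine, and a toy hash stands in for the memory-hard BirthdayHash): each of n_machines keeps only birthdays in its own slice, and the marked line is where every out-of-range hash (CPU work that produced nothing) gets discarded.

Code:
// Sketch of the proposed range split. Machine `id` of `n_machines` keeps only
// birthdays in its slice of the 32-bit birthday space; every other hash is
// computed and then discarded, which is the wasted work described above.
#include <cstdint>
#include <functional>
#include <unordered_map>

// Toy stand-in for BirthdayHash(nonce + H); the real one is memory-hard.
uint32_t birthday_hash(uint64_t nonce) {
    return static_cast<uint32_t>(std::hash<uint64_t>{}(nonce));
}

// True if `bday` lands in this machine's contiguous slice of the hash space.
bool in_my_range(uint32_t bday, uint32_t id, uint32_t n_machines) {
    return (static_cast<uint64_t>(bday) * n_machines) >> 32 == id;
}

void scan(uint32_t id, uint32_t n_machines, uint64_t max_nonce) {
    std::unordered_map<uint32_t, uint64_t> seen;  // birthday -> nonce
    for (uint64_t nonce = 0; nonce < max_nonce; ++nonce) {
        uint32_t bday = birthday_hash(nonce);
        if (!in_my_range(bday, id, n_machines))
            continue;  // (n_machines - 1) / n_machines of all work dies here
        if (seen.count(bday)) {
            // Collision: pass (nonce, seen[bday]) on for the difficulty check.
        }
        seen.emplace(bday, nonce);
    }
}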

newbie
Activity: 23
Merit: 0

I had considered the potential for such a design.  A DHT will not perform well enough; you would need 100BaseT Ethernet or perhaps gigabit before the hash table could be filled quickly enough.  From what I can tell, a SUPER COMPUTER with a ton of RAM and high-speed interconnects would give you the best performance.  Here is the difference: such a super computer could be built from commodity parts.


Well, if the hash is so fast, you could just collaborate by splitting the birthday space into N ranges, with each computer throwing away any result that isn't in its range. This trades hashing speed for memory, so it works with, rather than against, your PoW.

What range do you suggest for N and how fast is the hash?
This should be possible to test.
hero member
Activity: 770
Merit: 566
fractally
I agree with FreeTrade: why use a memory-intensive algorithm for BirthdayHash if you're gonna fill up your RAM with hashes anyway?

But either way, here is why it is worse than scrypt:
Your proposed algorithm (Momentum) scales super-linearly with the number of computers that can be dedicated to solving the proof-of-work. Set up N computers storing generated hashes in a distributed hash table between them. They now have N times the memory capacity of a single computer and will fill up their memory just as fast. This could scale to pretty large N's: you don't need low latency, since you could accumulate hashes on each computer and send them in batches.

And this is bad of course (would lead to centralization).

Suggested modification:
Embrace this and encourage every single miner to be part of the same DHT (not that simple, but you get the picture), and share the reward between the computer generating the hash and the one that had it stored. This could work on the current P2P system already in place.

I agree that if there were a way to efficiently merge the memory of all computers on the internet, this would be great.  Unfortunately, the memory bandwidth would be limited by the internet and thus not practical at all.  However, I suspect that every office environment could achieve this at a slightly smaller scale.

What is the cost per unit of computational power / memory as the amount of memory / power approaches infinity?  I suspect that it goes non-linear at larger and larger sizes, hence the limits on super computers today.  Eventually the speed of light becomes your problem.
hero member
Activity: 770
Merit: 566
fractally
>Assuming a cryptographically secure hashing function Hash(x) and a Sequential-
>Memory-Hard hashing function BirthdayHash(x) such as scrypt, the algorithm for this
>proof of work can be defined as follows:
>Given a block of data D, calculate H = Hash(D).
>Find nonce A and nonce B such that BirthdayHash(A +H) == BirthdayHash( B+H)
>If Hash(H + A + B) < TargetDifficulty then a result has been found, otherwise keep
>searching

Okay - just wanted to mention two issues, more to check whether I've correctly understood than anything else.

1. My first instinct would be to choose a fixed number, say F, and build a rainbow table for values of x where

BirthdayHash(x) = F

Then in order to solve
BirthdayHash(A +H) == BirthdayHash( B+H)

I would set A and B to values such that A=x1-H and B=x2-H.

This assumes ruling out the trivial A=B, of course.

This is trivial to correct of course, with a salt or half a dozen other ways, but I just wanted to confirm my understanding is correct.

2. I'm confused as to why we'd want a memory-intensive hash for BirthdayHash(x) - it seems the harder this hash is, the more opportunity for parallelization. As noted by Nathaniel, the ideal setup would be the GPU working on the hashes, with the CPU looking for collisions. In order to create the bottleneck on the CPU/memory, we'd want to flood it with potential matches so it was maxed out, and to do that we'd want to make the hash as simple as possible.

Again - have I understood that correctly, or am I missing something?

BirthdayHash(x), where x is a 256-bit (or 512-bit) number and the output is a 32-bit number, would be hard to build a rainbow table for.  Such a rainbow table would probably use far more memory.

The "+" operator is concatenation rather than addition, so the math you are doing is not valid.


BirthdayHash is only memory-intensive enough to limit the cache available to the GPU.  Scrypt-like hash functions have some benefit in fighting ASIC designs as long as their validation time does not exceed 1 ms.  By combining Scrypt-like functions with Momentum you get the best of all currently known techniques.
hero member
Activity: 770
Merit: 566
fractally
I agree with FreeTrade: why use a memory-intensive algorithm for BirthdayHash if you're gonna fill up your RAM with hashes anyway?

But either way, here is why it is worse than scrypt:
Your proposed algorithm (Momentum) scales super-linearly with the number of computers that can be dedicated to solving the proof-of-work. Set up N computers storing generated hashes in a distributed hash table between them. They now have N times the memory capacity of a single computer and will fill up their memory just as fast. This could scale to pretty large N's: you don't need low latency, since you could accumulate hashes on each computer and send them in batches.

And this is bad of course (would lead to centralization).

Suggested modification:
Embrace this and encourage every single miner to be part of the same DHT (not that simple, but you get the picture), and share the reward between the computer generating the hash and the one that had it stored. This could work on the current P2P system already in place.

I had considered the potential for such a design.  A DHT will not perform well enough; you would need 100BaseT Ethernet or perhaps gigabit before the hash table could be filled quickly enough.  From what I can tell, a SUPER COMPUTER with a ton of RAM and high-speed interconnects would give you the best performance.  Here is the difference: such a super computer could be built from commodity parts.

So the question becomes: how would such a super computer compete with, say, a corporate office that works together via the high-speed local network to mine at night?  The other question: if you are going to build a SUPER COMPUTER to perform this mining computation, then you probably have a dual-use machine that could be used for all kinds of research.  This is far better than Scrypt or SHA256.
newbie
Activity: 23
Merit: 0
I agree with FreeTrade: why use a memory-intensive algorithm for BirthdayHash if you're gonna fill up your RAM with hashes anyway?

But either way, here is why it is worse than scrypt:
Your proposed algorithm (Momentum) scales super-linearly with the number of computers that can be dedicated to solving the proof-of-work. Set up N computers storing generated hashes in a distributed hash table between them. They now have N times the memory capacity of a single computer and will fill up their memory just as fast. This could scale to pretty large N's: you don't need low latency, since you could accumulate hashes on each computer and send them in batches.

And this is bad of course (would lead to centralization).

Suggested modification:
Embrace this and encourage every single miner to be part of the same DHT (not that simple, but you get the picture), and share the reward between the computer generating the hash and the one that had it stored. This could work on the current P2P system already in place.
donator
Activity: 1464
Merit: 1047
I outlived my lifetime membership:)

You should change it or stick to your original offer.  Also, "convince me" is very subjective; it sounds like you want someone to do work for free for you.

You must not know Dan. He's a real programmer (darn good too...really darn good) with a real reputation.
legendary
Activity: 1666
Merit: 1010
he who has the gold makes the rules
legendary
Activity: 1470
Merit: 1030
>Assuming a cryptographically secure hashing function Hash(x) and a Sequential-
>Memory-Hard hashing function BirthdayHash(x) such as scrypt, the algorithm for this
>proof of work can be defined as follows:
>Given a block of data D, calculate H = Hash(D).
>Find nonce A and nonce B such that BirthdayHash(A +H) == BirthdayHash( B+H)
>If Hash(H + A + B) < TargetDifficulty then a result has been found, otherwise keep
>searching

Okay - just wanted to mention two issues, more to check whether I've correctly understood than anything else.

1. My first instinct would be to choose a fixed number, say F, and build a rainbow table for values of x where

BirthdayHash(x) = F

Then in order to solve
BirthdayHash(A +H) == BirthdayHash( B+H)

I would set A and B to values such that A=x1-H and B=x2-H.

This assumes ruling out the trivial A=B, of course.

This is trivial to correct of course, with a salt or half a dozen other ways, but I just wanted to confirm my understanding is correct.

2. I'm confused as to why we'd want a memory-intensive hash for BirthdayHash(x) - it seems the harder this hash is, the more opportunity for parallelization. As noted by Nathaniel, the ideal setup would be the GPU working on the hashes, with the CPU looking for collisions. In order to create the bottleneck on the CPU/memory, we'd want to flood it with potential matches so it was maxed out, and to do that we'd want to make the hash as simple as possible.

Again - have I understood that correctly, or am I missing something?
hero member
Activity: 770
Merit: 566
fractally
But you can still gain linear increases in matches just by increasing the hashrate, so at the very least you can have an ASIC or GPU that hashes and sends the results to a CPU that manages the memory and checks for matches. GPUs will still be crazy good for it, since they could hash hundreds of times faster than the CPU, and with 2-3 GB of memory on board the CPU would need too much memory to pull even. I think a GPU/CPU hybrid would be best in class until ASICs arrive, with the GPU doing mostly hashing and then sending the results to system RAM and the CPU, which checks for matches.

With this linear increase just from raising the hashrate, Momentum ASICs will be at least as good as scrypt ASICs and probably much better, since they could just extend an off-the-shelf system with lots of RAM and use a CPU to manage the hash table and match checking. The design might be a bit harder, but that's just a short-term concern.

The reason it is GPU-resistant has to do with bus speeds and the unpredictable runtime of my scrypt-like hash function.  I did not discuss this in the white paper, which was meant to be more generic.  I also use hardware-accelerated AES to give the CPU an edge over the GPU.
newbie
Activity: 19
Merit: 0
But you can still gain linear increases in matches just by increasing the hashrate, so at the very least you can have an ASIC or GPU that hashes and sends the results to a CPU that manages the memory and checks for matches. GPUs will still be crazy good for it, since they could hash hundreds of times faster than the CPU, and with 2-3 GB of memory on board the CPU would need too much memory to pull even. I think a GPU/CPU hybrid would be best in class until ASICs arrive, with the GPU doing mostly hashing and then sending the results to system RAM and the CPU, which checks for matches.

With this linear increase just from raising the hashrate, Momentum ASICs will be at least as good as scrypt ASICs and probably much better, since they could just extend an off-the-shelf system with lots of RAM and use a CPU to manage the hash table and match checking. The design might be a bit harder, but that's just a short-term concern.
hero member
Activity: 770
Merit: 566
fractally
I have considered the potential of hardware optimizations and believe that they could be built; with some of the advanced content-addressable memory they could be very quick.

There are TWO things that would need to be optimized in an ASIC.

1) An ASIC that is able to handle Scrypt-like hash functions.
2) A large number of transistors dedicated to memory.

Optimizing one without the other is pointless.

Both of these things are possible to build and the result would be a mining beast.    However, memory uses electricity and the scrypt-like hash functions are much harder to create ASICs for.

So I contend that the economic costs would push the threshold at which investing in an ASIC makes sense much higher than it would be with either Bitcoin or Scrypt.

So the real question becomes: what is the computational advantage per additional transistor in SHA256 vs Scrypt vs Momentum, and what is the cost of deploying it?

One thing is clear: this is likely GPU-proof, and a much harder ASIC to develop than Scrypt's.

newbie
Activity: 19
Merit: 0
Once the RAM is full everything is a read, so it'll be perfectly parallel with some negligible constant-time overhead.

Improvements over scrypt for the BirthdayHash will help a bit, but like you said, any tricks generally translate fine into ASICs.

For your example: since the Mac Pro has 16 times more RAM and hashing power, it will be exactly 16^2 = 256 times better. Well worth the 50x cost. Since this is quadratic, it will only get worse when you get into 128 GB servers.

I like the elegance of having the RAM be the main resource for hashing, but I think we have a ways to go before making it practical. I'll keep thinking about it.

Nathaniel
hero member
Activity: 770
Merit: 566
fractally
Nathaniel,
     Thank you for your insightful response.   Here is what I already know to be true about the algorithm.

1) The faster you can fill the RAM to the max amount, the more collisions per second you can find.
2) Once RAM is full, the performance is equal to SCRYPT, with the exception that checking the centralized store is a synchronization point.  If a way can be found to require sequential access to the stored birthdays, that would be the holy grail, because then you couldn't fill the RAM in parallel.
3) In my particular implementation I do not use SCRYPT, but an algorithm based upon the same sequential-memory-hard design.  The twist is that I use AES (hardware accelerated on the CPU) + 'unpredictable execution time' to trip up GPUs and other parallel architectures.  An ASIC could probably design around it.  I left the white paper more abstract, but my particular implementation is better than Scrypt even at the Birthday Hash level.  (A toy sketch of the unpredictable-runtime idea follows below.)
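Here is a rough guess at the shape of that trick (an assumption on my part, not fractally's actual hash, and it leaves out the sequential-memory-hard part entirely): keep applying hardware AES rounds until a value derived from the evolving state says to stop, so the loop length cannot be predicted ahead of time and GPU lanes running in lockstep diverge.

Code:
// Toy sketch of a data-dependent-runtime AES loop (an assumption about the
// idea described above, not the real implementation). Compile with -maes.
#include <cstdint>
#include <wmmintrin.h>  // AES-NI intrinsics

uint32_t toy_unpredictable_hash(__m128i state, __m128i key) {
    uint32_t low;
    do {
        state = _mm_aesenc_si128(state, key);               // one hardware AES round
        low = static_cast<uint32_t>(_mm_cvtsi128_si32(state));
    } while ((low & 0xFF) > 64);                            // data-dependent exit
    return low;  // the number of rounds executed was decided by the data itself
}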

The most compelling argument you make is that there is a non-linear gain from building larger systems, i.e., doubling the RAM and doubling the CPU to match would result in more than a 2x performance gain.  Though I would argue that the cost of building larger systems is also non-linear: doubling the RAM and CPU of a high-end system costs more than 2x the base system.

Let's compare the range of mining machines:

1) Mac Pro with 12 cores and 32 GB of RAM
2) 1.6 GHz 4-core ARM A9 with 2 GB of RAM

I would estimate that the ratio of RAM to CPU is the same in both cases, but the Mac Pro would generate significantly more collisions per second because it has the RAM plus the ability to fill it in a timely manner.  The question becomes which is the better value: 50 ARM systems at $100 each, or 1 Mac Pro at $5000+?

newbie
Activity: 19
Merit: 0
bytemaster,

First, I think the idea is quite interesting and novel. Let me summarize what you are proposing to make sure we’re on the same page. Then, I will raise my concerns with the current proposal (partly in hopes of collecting a bounty and partly because it’s a really cool idea that could be the grounds for some interesting new cryptocurrencies).

The question is which of Momentum or scrypt gives specialized hardware designs the smaller performance increase over typical commodity hardware already in the hands of users.

Paraphrase of your section 3 in the paper with some notes:
Take data D (transactions, previous block hash, etc.), calculate H = Hash(D).
Find nonce A and nonce B such that BirthdayHash(A+H) == BirthdayHash(B+H) (in practice nonces C, D, etc. could also be required to match)
If Hash(H+A+B) < TargetDifficulty then you win.


My argument against Momentum compared to scrypt:
The way you describe the algorithm and present it in the paper is all in terms of finding pairs of birthdays that match from a large set, and the probability of finding a match. While this is the intuition behind how you can require lots of RAM while still being able to verify a correct answer quickly, it does not capture the real goal of a miner. A miner does not want to find one match, but as many matches as possible, because a miner wants as many chances at finding a hash under the difficulty as possible. When looking at the number of matches, we should talk in terms of expected value.

Note that since Hash() is supposed to be trivially fast and is called so much less than the other functions, for clarity I just assume we have an oracle that gives us a win-or-loss answer for each match, even though technically each match has to be run through Hash(H + A + B) to determine if it's a winner. For the rest of this explanation I'll just talk in terms of expected matches.

Let B = the number of 'birthdays' (the size of BirthdayHash's output space).
Let S = the number of BirthdayHash(nonce+H) values stored in RAM.
Let M = the maximum value S can reach given a particular system's RAM.

If we find a new BirthdayHash(A + H), the expected number of matches ( BirthdayHash(A+H)==BirthdayHash(B+H) ) = S / B
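(For concreteness, with made-up numbers: if B = 2^32 and S = 2^26 hashes are already stored, each new BirthdayHash is expected to yield 2^26 / 2^32 = 1/64 matches, i.e., about one match per 64 new hashes.)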

Now we calculate the total number of expected matches during a time frame (average block creation time, for instance). We take the number of BirthdayHash() we compute in that time (N). S will increase from 0 to N, unless M < N, in which case S stops growing once it reaches M.

N * Mean(S) / B = Total expected matches

If M >= N:
Mean(S) = (1 + 2 + 3 + … + N)/N = (N+1)/2

If N > M:
Mean(S) = ((1 + 2 + 3 + … + M) + M(N-M))/N ≈ M, if N >> M

If M >= N: the system scales the same as scrypt, just with the number of hashes, so I make a scrypt ASIC until N > M. At that point they would be basically the same.

If N > M but not by much: keep deploying scrypt ASICs; there is a minor hit once RAM is full, but it doesn't hurt too badly yet. If the goal is to land in this small sliver, I think it will be very hard to predict where it will be ahead of time, and it will likely vary drastically between CPU/GPU/ASIC, so hitting it right on for one may miss the others.

If N >> M:
This is when the lack of RAM kills straight scrypt ASICs as a possibility, so I assume this is the scenario that Momentum is designed for and would be tweaked to get into.
Now the expected value is: N * M / B = Total expected matches

This means that two variables affect the total amount of mining: N, the number of BirthdayHash() computed per unit time, and M, which is basically the amount of RAM. The fundamental problem I see here is that we now have quadratic scaling with increased resources rather than linear.
For example:
You have a standard system that performs N BirthdayHash() and has M RAM, which then expects N*M/B matches. Someone who buys a system that costs five times as much now has 5N*5M/B matches, or 25 times as much output, while only paying 5 times as much.

With off-the-shelf hardware there will perhaps be a natural limit on RAM, such that the cost increases dramatically as people have to start buying high-end servers, but custom ASICs can use standard commodity RAM, and while no doubt they would face some scaling and bus issues, even a few multiples of the amount of RAM normal users have would multiply with the amount of hashing power. At the same time, even if RAM isn't increased past commodity systems, ASIC makers who use just as much RAM as commodity systems can still add the same multiplier as a straight scrypt ASIC would, since just increasing the number of hashes per second still scales the number of matches linearly. For this reason I believe that Momentum is worse than scrypt in terms of customized hardware in the long run. In the short term, such hardware would no doubt be harder to design than straight scrypt ASICs.
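The quadratic scaling claim is easy to check numerically. Here is a small self-contained simulation (toy parameters of my choosing; a counter array stands in for the miner's RAM-resident hash table) comparing a baseline system against one with 5x the hash rate and 5x the memory; the output ratio lands near 25x, matching the N*M/B analysis.

Code:
#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

// Matches found when `n` random birthdays are drawn from `b` buckets but only
// the first `m` can be remembered (RAM limit), mirroring N * Mean(S) / B above.
double count_matches(uint64_t n, uint64_t m, uint64_t b, std::mt19937_64& rng) {
    std::uniform_int_distribution<uint64_t> dist(0, b - 1);
    std::vector<uint32_t> stored_per_bucket(b, 0);
    uint64_t stored = 0, matches = 0;
    for (uint64_t i = 0; i < n; ++i) {
        uint64_t h = dist(rng);
        matches += stored_per_bucket[h];   // every stored equal birthday is a match
        if (stored < m) { ++stored_per_bucket[h]; ++stored; }  // store until RAM is full
    }
    return static_cast<double>(matches);
}

int main() {
    std::mt19937_64 rng(42);
    const uint64_t b = 1ull << 20;         // toy birthday space
    const uint64_t n = 1ull << 16;         // hashes per block time (baseline)
    const uint64_t m = 1ull << 12;         // RAM slots (baseline)
    double base = count_matches(n, m, b, rng);
    double big  = count_matches(5 * n, 5 * m, b, rng);
    std::cout << "baseline matches: " << base << "\n";
    std::cout << "5x system:        " << big << "  (~" << big / base << "x)\n";
}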

Another interesting piece of the algorithm to look at is the number of nonces that have to have matching birthdays. Scaling this up to 3 or 4+ shouldn't affect the fundamental hashes-per-second * RAM scaling (I don't have time to do the math, but each should be so reliant on the pair matching as to be almost identical), although it will affect the storage strategy and how you use the RAM. In fact, once the number of birthdays that must all match is greater than 2, pools could use this to gain an advantage. Consider a pool that, in addition to hashing on the same block, also shares any partial matches (when BirthdayHash(A+H) == BirthdayHash(B+H) and you need C and D matching as well). Each user of the pool can then improve their odds by saving partial matches and junking some single results they had. In this scenario it makes sense to join the largest pool, which gives you better partial matches, so your expected matches also go up, increasing your contribution. This is a dangerous incentive, as even with BTC we have seen some large pools form without any particular incentive beyond lower variance.


All that said, it is an interesting idea and I'll keep thinking about ways to utilize/improve it. It's also quite possible that I've misunderstood your algorithm, so let me know what you think. These are the kinds of ideas that we need to share and discuss in order to build new innovative cryptocurrencies and improve upon existing ones.

Let me know if anything is unclear. I'm not sure how this board does math notation.


Thanks,

Nathaniel

legendary
Activity: 1470
Merit: 1030
+1 - This is a very interesting development for CPU coins.

I'll need a while to digest the white paper, but I'm liking the sound of it.

Your parameters sound similar to (the now dead) MemoryCoin - so I wonder if we have similar ideas in mind.
http://memorycoin.org/manifesto/

Some people had expressed doubts that the MemoryCoin hash was GPU-resistant, so I'm very open to a new algorithm.

It so happens I'm just back from holiday and considering whether to try to restart MemoryCoin or try something new. Not particularly interested in the bounty, but if you wanted to put that towards services and promotion, we might have something.
hero member
Activity: 770
Merit: 566
fractally
In the linked white paper I have documented a new proof-of-work system that can be used for BitMessage, altcoins, or other distributed autonomous corporations.  This new proof of work can require gigabytes of memory to solve, but almost no memory to verify; as a result, the ideal ASIC is memory.  If one were to create a Scrypt ASIC capable of 1 giga-hash per second, then one would require 10 terabytes of RAM to most efficiently find one solution at the target difficulty for the Momentum proof-of-work every 10 minutes on average.

I have provided a proof-of-concept implementation here: https://github.com/InvictusInnovations/BitShares/blob/master/src/momentum.cpp
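For readers who don't want to dig through the repository, the search described above can be sketched roughly as follows (both primitives are toy stand-ins so the sketch compiles; the real BirthdayHash in momentum.cpp is the scrypt-like, memory-hard function, and '+' means concatenation):

Code:
// Minimal sketch of the Momentum search loop from the white paper. Hash and
// BirthdayHash are toy stand-ins (the real ones live in momentum.cpp); the
// unordered_map is the collision store that soaks up RAM.
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

uint64_t Hash(const std::string& s) {            // fast hash, cheap to verify
    return std::hash<std::string>{}(s);
}
uint32_t BirthdayHash(const std::string& s) {    // stand-in for the memory-hard hash
    return static_cast<uint32_t>(std::hash<std::string>{}(s));
}

bool search(const std::string& block_data, uint64_t target, uint64_t max_nonce) {
    const std::string h = std::to_string(Hash(block_data));  // H = Hash(D)
    std::unordered_map<uint32_t, uint64_t> seen;             // birthday -> nonce
    for (uint64_t a = 0; a < max_nonce; ++a) {
        uint32_t bday = BirthdayHash(std::to_string(a) + h);
        auto it = seen.find(bday);
        if (it != seen.end()) {
            // BirthdayHash(A+H) == BirthdayHash(B+H): candidate pair found.
            uint64_t b = it->second;
            if (Hash(h + std::to_string(a) + std::to_string(b)) < target)
                return true;                                 // block solved
        }
        seen[bday] = a;  // remember this birthday for future collisions
    }
    return false;        // exhausted nonce range; rebuild D and try again
}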

This 30 BTC bounty is for two different goals.

One goal is to produce an implementation of this proof-of-work system that negates the algorithmic advantage given to the use of memory and allows much faster solutions.  If you are able to convince me this proof-of-work is no better than Scrypt, then you will win the 30 BTC bounty.  The objective proof that someone has convinced me will be whether or not I use Momentum for BitShares.  I have far more on the line in choosing a bad proof of work than in not paying this out.

Edit: It has been pointed out to me that this could be implemented in 1 day by a motivated individual, so the bounty for creating the following alt-coin has been reduced to 5 BTC.
The 5 BTC bounty from this thread has been supplanted by: https://bitcointalksearch.org/topic/protoshares-bounty-30-btc-start-mining-bitshares-in-november-315973

The other way to win the bounty is to create a fork of the latest Bitcoin codebase that leverages the proof-of-work implementation in my repository.  I will be more than willing to help with any such implementation.  The condition on this forked codebase is that the resultant 'alt-coin' must not be launched or have any pre-mining, and I get to set the release date.  It must have 5-minute block intervals, a maximum money supply of 2 million, and a 4-year mining period until the full money supply has been mined, with a halving of the mining reward every 3 months.

You can read the full white paper for the proof-of-work here: http://invictus-innovations.com/s/MomentumProofOfWork.pdf

This bounty expires November 15th, 2013.  

Only one 30 BTC bounty will be paid... either to the creator of the alt-coin or to the person who breaks the proof-of-work.
With the reduction in alt-coin generation bounty, I will pay both bounties if both are won.