Author

Topic: Ethash not fundamentally memory-hard (just practically) (Read 154 times)

sr. member
Activity: 588
Merit: 251
While recently writing code to generate the Ethereum DAG cache, I realized the algorithm is not fundamentally memory hard.  The memory it uses is basically a look-up table of pre-computed keccak hashes.  As explained on http://keccak.team/, the keccak sponge function is designed to be simpler to implement in silicon (use less chip area) than sha-256.

The DAG cache is made by sequentially running a keccack-512 hash on the seed to fill the cache, then mixing those 64-byte hashes with a few XOR operations.  The sequence of running n-rounds of keccak-512 can be decomposed to an operation that is constant in time, so that 1000 rounds can be computed in approximately the same time as 10 rounds, and with approximately the same amount of silicon area.  In the past this decomposition/optimization of multi-round hashing has been a mostly manual process.  In other words a silicon designer making a circuit that uses a single-round sha-256 and a 3-round sha-256 would need to insert the code (i.e. something like Verilog) for both an optimized single-round and 3-round sha-256.  Ethereum requires over a half-million rounds of keccak-512 to create the DAG cache, so the only practical way this could be designed in hardware would be to generate the code in software.

Why hasn't anyone done this yet?  Well, just because Ethash is not fundamentally memory hard, the alternative is area-hard.  Implementing a half-million different multi-round keccak-512 functions in silicon would take a lot of area.  If each implementation takes 10,000 gates (20 gates per bit), then the ASIC implementation would require at least 5 billion gates.  And that's only for the DAG cache, which is used to create the full DAG by mixing 256 cache entries to create one DAG entry.  The Ethash function itself mixes 64 DAG items, where each item is two consecutive entries that were created from the cache.

So while it is not fundamentally memory-hard, it is effectively memory hard at this time given the huge amount of chip area required to avoid the memory cache.
Jump to: