While you're here, could you speak to the Groestl (Grøstl) algo's performance on NVIDIA compared with other algos?
thanks,
Dorian (child_harold)
Groestl is quite slow because most implementations use a lot of random lookups in so-called T-tables. Regardless of whether you manage to get cached read access to these T-tables or put them into shared memory, the hardware still has issues with random access from many threads. I suppose the same problem is seen on AMD devices.
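A rough way to see the shared-memory side of this is to count bank conflicts for one warp. The sketch below is only a simulation in Python, not miner code, and it rests on assumptions that are not from this thread: 32 shared-memory banks of 4-byte words, a warp of 32 threads, and a hypothetical lookup table of 256 words. The function names are made up for illustration. Threads that read different words in the same bank are serialized; threads that read the very same word are served by a broadcast.

```python
import random
from collections import defaultdict

NUM_BANKS = 32   # assumption: 32 shared-memory banks, one 4-byte word each
WARP_SIZE = 32   # assumption: 32 threads issue the lookup together


def conflict_degree(word_indices):
    """Number of serialized passes one warp-wide table read would need.

    Distinct words that map to the same bank must be read one after
    another; identical words are broadcast and cost nothing extra.
    """
    banks = defaultdict(set)
    for idx in word_indices:
        banks[idx % NUM_BANKS].add(idx)
    return max(len(words) for words in banks.values())


def average_random_degree(table_words=256, trials=10000, seed=1):
    """Average conflict degree when every thread picks a random table entry."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        lookups = [rng.randrange(table_words) for _ in range(WARP_SIZE)]
        total += conflict_degree(lookups)
    return total / trials


if __name__ == "__main__":
    print("sequential access:", conflict_degree(range(WARP_SIZE)))
    print("worst-case stride:", conflict_degree([i * NUM_BANKS for i in range(WARP_SIZE)]))
    print("random access (average):", average_random_degree())
```

Sequential indices need a single pass, a stride equal to the bank count needs one pass per thread, and data-dependent random indices fall in between, which is the access pattern a T-table lookup produces.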
There are ways to work around these performance issues, and we have implemented one. I won't be more specific. Our different approach uses more power, but it is also up to 3 times as fast.
Christian
Thanks for the reply, Christian.
Three questions, please:
1. Are the hardware issues experienced with random access from many threads something like "parallelization resistance"? In other words, would producing ASIC hardware for this algo be more difficult, and would the resulting ASIC advantage be lower than for other algos in use, such as SHA-256?
2. So NVIDIA will run 3x current speed with your "different approach"!?
3. Can you imagine an efficient ARM miner for this algo (a BOUNTY has been offered)?
Any other info about what makes this algo different (quirks, surprises, etc.) from others used in crypto mining would be enlightening.
Thank you, Christian, for your generous contribution to this conversation.