"High-Performance CUDA Kernel Execution on FPGAs" (P.S. FPGA is 4x faster than a CUDA GPU)
https://cadlab.cs.ucla.edu/~cong/papers/FCUDA_extAbstract_ICS09_final3.pdf
Just for example: there are many programmable ASICs, and ASICs combined with FPGAs.
FPGA Miner https://www.ebay.com/itm/BlackMiner-F1-FPGA-Better-than-Bitmain-Antminer-ASIC-UK-In-Hand-w-PSU-and-SD/124207553835?hash=item1ceb58dd2b:g:nMgAAOSwT9lcq7AE
"FPGAs can be uploaded with a public or a private bitstream(s) so that you can mine a new algorithm (you can't do this with ASICs). "
If the price of the equipment and the cost of implementing the technology are not important to you, you had better look at quantum computers. Instead of normal bits (with 2 possible values, 0 or 1), they operate on qubits: the state of a qubit can be represented as a point on a sphere (the Bloch sphere), so it has a continuum of possible states.
What do you think, will a quantum computer be faster than an ASIC?
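To make the sphere picture concrete, here is a minimal sketch, assuming the standard Bloch-sphere parametrization of a single qubit. The function names `bloch_to_amplitudes` and `measurement_probabilities` are made up for this illustration and do not come from any library.

```python
import cmath
import math


def bloch_to_amplitudes(theta, phi):
    """Map Bloch-sphere angles to the two complex amplitudes of a qubit.

    Assumed convention: |psi> = cos(theta/2)|0> + exp(i*phi) * sin(theta/2)|1>,
    with theta in [0, pi] and phi in [0, 2*pi).
    """
    alpha = complex(math.cos(theta / 2), 0.0)
    beta = cmath.exp(1j * phi) * math.sin(theta / 2)
    return alpha, beta


def measurement_probabilities(theta, phi):
    """Return (P(0), P(1)) when the qubit is measured in the 0/1 basis."""
    alpha, beta = bloch_to_amplitudes(theta, phi)
    return abs(alpha) ** 2, abs(beta) ** 2


# A point on the equator of the sphere gives a 50/50 outcome,
# whatever the phase angle phi is.
p0, p1 = measurement_probabilities(math.pi / 2, 0.3)
print(p0, p1)
```

Every (theta, phi) pair is a different state, but a measurement still returns only 0 or 1, so the "many states" do not directly translate into many readable values.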
Right now a quantum computer is for sending a signal to another galaxy ))) No, non-commutativity is interesting for education, because not all people know about 2+3<>3+2 (if you know quantum physics, I think you know that in the quantum world 2+3 quanta do not equal 3+2 quanta !!!) F.... But, in my opinion, a GPU is not as good as an old ASIC with an FPGA for performing 10 trillion operations.
So, in my opinion only, the most wanted targets for the Kangaroo project are a client-server mode (without bugs and working well) and FPGA/ASIC support.
I also think it would be an interesting idea to implement Hensel's lift in Kangaroo, as in this code, for finding ranges, for example: https://github.com/elliptic-shiho/ecpy/blob/ccdb872124ca2c218b8a7261a2956efd5ec83705/ecpy/elliptic_curve/sssa_attack.py#L1
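For anyone who has not seen Hensel's lift before, here is a minimal sketch of the basic scalar version: it takes a root of a polynomial modulo p and lifts it to a root modulo p^k. This is not the linked ecpy code and not anything from Kangaroo; the function name `hensel_lift` and the example polynomial are only assumptions for illustration. It needs Python 3.8+ for `pow(x, -1, m)`.

```python
def hensel_lift(f, df, root, p, k):
    """Lift a simple root of f modulo p to a root modulo p**k.

    f    -- polynomial as a Python callable on integers
    df   -- its derivative as a Python callable
    root -- integer with f(root) == 0 (mod p) and df(root) != 0 (mod p)
    """
    if f(root) % p != 0:
        raise ValueError("root is not a root of f modulo p")
    if df(root) % p == 0:
        raise ValueError("df(root) is 0 modulo p, simple lifting does not apply")

    modulus = p
    for _ in range(k - 1):
        modulus *= p
        # Newton-style step: r <- r - f(r) / f'(r), computed modulo the
        # next power of p. df(root) is invertible because it is coprime to p.
        inverse = pow(df(root), -1, modulus)
        root = (root - f(root) * inverse) % modulus
    return root


# Example: x^2 - 2 has the root 3 modulo 7 (9 - 2 = 7).
f = lambda x: x * x - 2
df = lambda x: 2 * x

r = hensel_lift(f, df, 3, 7, 3)
print(r, f(r) % 7**3)
```

Each pass of the loop adds one more base-p digit to the root. How this would be used for narrowing key ranges in Kangaroo is an open question here, not something the sketch answers.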