Can anyone try my tool on a 2080ti? On a 2080S it gets around 1300MKeys/sec when using 24-bit DP.
I tried your tool (DP18) on a V100.
[2020-06-22.11:35:35] [Info] DP: 0 TP: 0 853.74 Mpt/s (64 iter/s)
[2020-06-22.11:35:37] [Info] Verifying 40336 results
[2020-06-22.11:35:45] [Info] DP: 0 TP: 0 992.50 Mpt/s (75 iter/s)
[2020-06-22.11:35:48] [Info] Verifying 40362 results
[2020-06-22.11:35:55] [Info] DP: 0 TP: 0 991.18 Mpt/s (75 iter/s)
Kangaroo on a server too, configured in the same way, however, it is not clear how many kangaroo are running in parallel with your program and what grid setting is used.
GPU: GPU #0 Tesla V100-PCIE-16GB (80x64 cores) Grid(160x128) (207.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^21.32 kangaroos [11.2s]
[2000.07 MK/s][GPU 2000.07 MK/s][Count 2^37.48][01:52][Server OK]