All of JLP's programs are perfectly optimized, and if it were possible to improve the result, the author would definitely have done it.
There is only one way to "double the speed":
use not only the addition of the kangaroo jump, but also the subtraction.
This will not require many resources, but will double the number of tested points and thus the number of distinguished points.
But it will also slow down the progress of all kangaroos.
Maybe this helps, or maybe it just overloads the hash table with extra distinguished points.
With this modification you also need to turn off the check for dead kangaroos, because a kangaroo hitting the same position that was left
after the subtraction does not mean that the kangaroo is following the trail of another kangaroo.
But as I said above, the author would have done it if it had worked.
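To illustrate why the subtraction "will not require many resources": in affine coordinates P + J and P - J use the same denominator (xJ - xP), so a single modular inversion serves both points. The sketch below is only a toy check on secp256k1 in Python 3.8+, not code from JLP's Kangaroo; the names add_and_sub, double and on_curve are made up for this example, and it assumes P is not equal to J or -J.

```python
# Toy sketch: compute P + J and P - J with one shared modular inversion.
# Not taken from JLP's Kangaroo; all function names here are hypothetical.
P_FIELD = 2**256 - 2**32 - 977  # secp256k1 field prime

# secp256k1 generator point
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)

def on_curve(pt):
    """Check y^2 = x^3 + 7 over the secp256k1 field."""
    x, y = pt
    return (y * y - x * x * x - 7) % P_FIELD == 0

def double(pt):
    """Affine point doubling, used here only to build a second test point."""
    x, y = pt
    lam = 3 * x * x * pow(2 * y, -1, P_FIELD) % P_FIELD
    x3 = (lam * lam - 2 * x) % P_FIELD
    y3 = (lam * (x - x3) - y) % P_FIELD
    return (x3, y3)

def add_and_sub(p, j):
    """Return (p + j, p - j). Assumes p is neither j nor -j."""
    (x1, y1), (x2, y2) = p, j
    # The expensive part: one inversion of (x2 - x1), shared by both results.
    inv = pow((x2 - x1) % P_FIELD, -1, P_FIELD)

    # p + j
    lam_add = (y2 - y1) * inv % P_FIELD
    xa = (lam_add * lam_add - x1 - x2) % P_FIELD
    ya = (lam_add * (x1 - xa) - y1) % P_FIELD

    # p - j, i.e. p + (x2, -y2): same denominator, different numerator
    lam_sub = (-y2 - y1) * inv % P_FIELD
    xs = (lam_sub * lam_sub - x1 - x2) % P_FIELD
    ys = (lam_sub * (x1 - xs) - y1) % P_FIELD

    return (xa, ya), (xs, ys)

if __name__ == "__main__":
    jump = double(G)                    # use 2G as the jump point
    plus, minus = add_and_sub(G, jump)  # 3G and -G
    print(on_curve(plus), on_curve(minus))
```

Both results can then be tested against the distinguished point mask. Which of the two points the kangaroo continues from, and whether the extra DPs actually help, is exactly the open question in the post above.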
Etar, you are a smart programmer. It’s really not that much to mod. I made some simple mods and increased speeds by at least 3x on most cards.
But hey, y’all can say what you want to. Water off my back.
Maybe I’ll make a video to compare how many DPs JLP’s stock Kangaroo can get in x amount of minutes and then compare that to mine. But even then, there would be doubters.
But we can do a test; a timed test. Someone can give me a range, say a 44-bit range, and I will run it with my mod at DP 29 and post the results. A 44-bit range with DP 29 should yield 2^15 DPs in 38 minutes using 1 RTX 4090.
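For anyone who wants to check the numbers of such a timed test, here is a small sketch of the arithmetic. It assumes the 2^15 figure comes from 2^44 total jumps divided by 2^29 (one distinguished point per 2^DP jumps on average); that reading is an assumption, and the function names are invented for this example.

```python
# Sketch of the DP arithmetic for a timed test.
# Assumption: on average one distinguished point per 2**dp_bits jumps.

def dp_count_log2(jumps_log2, dp_bits):
    """log2 of the expected number of DPs after 2**jumps_log2 jumps."""
    return jumps_log2 - dp_bits

def implied_jump_rate(jumps_log2, minutes):
    """Jumps per second needed to do 2**jumps_log2 jumps in the given time."""
    return 2**jumps_log2 / (minutes * 60)

if __name__ == "__main__":
    print("expected DPs: 2^%d" % dp_count_log2(44, 29))
    print("implied rate: %.2e jumps/s" % implied_jump_rate(44, 38))
```

Under that assumption, 2^15 DPs in 38 minutes works out to roughly 7.7e9 jumps per second, which is the number a side-by-side run against the stock build would have to confirm.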