I think there's a paper about this already:
https://www.iacr.org/archive/pkc2010/60560372/60560372.pdf
You probably need to detect fruitless cycles with this method (kangaroos stuck in a loop that contains no distinguished points).
Thanks for the reading.
The abstract says the method is meant "to solve the DLP in an interval of size N with heuristic average case expected running time of close to 1.36√N group operations for groups with fast inversion".
We do not have fast inversions. Their experiments also showed that in practice the total number of operations was not 1.36sqrt(N), but 1.46-1.49sqrt(N).
1.49 is about 25% less than 2. The question is: if we perform 2sqrt(N) operations at a speed of 1000MKey/sec, what will the speed be for the method that needs 1.49sqrt(N) operations? If it is only 5-10% lower, that is probably a win. But what if it drops by half, to 500MKey/sec?
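To put numbers on that trade-off, here is a rough sketch, assuming total run time is simply (operation count) / (speed) and that nothing else changes. The function names are made up for illustration, and the candidate speeds are just the hypothetical values from the question above, not measurements.

```python
def relative_time(ops_const, speed_mkeys):
    """Run time in units of sqrt(N) / 1e6 seconds: c * sqrt(N) ops at a given MKey/sec."""
    return ops_const / speed_mkeys


def break_even_speed(base_const, base_speed, new_const):
    """Speed at which the new method takes exactly as long as the baseline."""
    return base_speed * new_const / base_const


# Baseline: 2*sqrt(N) operations at 1000 MKey/sec (figures from the post).
baseline = relative_time(2.0, 1000.0)

# Below this speed, the 1.49*sqrt(N) method is slower overall.
threshold = break_even_speed(2.0, 1000.0, 1.49)
print(f"break-even speed: {threshold:.0f} MKey/sec")

# Hypothetical speeds for the 1.49*sqrt(N) method: 5% slower, 10% slower, half speed.
for speed in (950.0, 900.0, 500.0):
    ratio = relative_time(1.49, speed) / baseline
    verdict = "faster" if ratio < 1.0 else "slower"
    print(f"{speed:.0f} MKey/sec -> {ratio:.2f}x baseline time ({verdict})")
```

Under this simple model the break-even point is 745 MKey/sec: a 5-10% speed drop still leaves the 1.49sqrt(N) method ahead, while at 500 MKey/sec it would take 1.49 times as long as the baseline.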