Author

Topic: Pollard's kangaroo ECDLP solver - page 143. (Read 60189 times)

member
Activity: 873
Merit: 22
$$P2P BTC BRUTE.JOIN NOW ! https://uclck.me/SQPJk
May 02, 2020, 11:29:18 PM
#52
@Luc, then you provide release ?Smiley

Br.
member
Activity: 109
Merit: 13
A positive attitude changes everything *_*
May 02, 2020, 07:06:13 PM
#51
But with both signs (+-) or not?

Could you try with an average of 2^21 - 2^22 instead?
u try with an average of 2^21 - 2^22 instead?

Yes at the beginning I used the y sign to select positive or negative jump then I tried an otehr set of random negative jumps without success.
It seems that this "bownian motion" is rather tricky to tune.

Tomorrow, I'll try other things...


Only negative jumps? I would try with positive/negative jumps and a greater average (my sensation)
To me it looks promising. 1.50sqrt(N) would be amazing!
Another thing to pay attention to is: when we translate our interval, we have only 2^(n-1) x-coordinates, and then half DP. For the tuning it's like we worked in a 2^(n-1) space.


I was thinking another thing,
if you want to overcome/avoid the brownian-motion tuning problem,

go back and put together

1) the linearity (and therefore the simplicity) of the classic approach (that you used in the release 1.3)
2) the power of the symmetry

Exploiting the symmetry means that we use equivalence classes to increase the probability of a collision (same number of tries, half 'points'); in our case, given a point P(x,y), 'x' decides the width of the jump and 'y' only the direction.
In the secp256k1 there are 2 points that share the same 'x', (x,y) and (x,-y).

In a interval of consecutive points, all x-coordinates are different, but if we move the interval like we did it becomes symmetric.

The drawback of our approach is that the kangaroo, instead of going in only one direction, keep going back and forth, creating many loops. Besides, not only we don't know where a wild kangaroo is (obviously we don't know its position-private key), but we don't control even the direction of its motion (and often it comes out from the interval)


But this curve has another symmetry that respects the linearity (unlike the negative map that overturns the interval and the kangaroo algorithm): endomorphism.

There are 3 points with the same 'y' , that means that this curve has about 2^256 points and
 
1) 2^256 / 2 different x-coordinates
2) 2^256 / 3 different y-coordinates

Why don't we use the symmetry that mantains the linearity of the interval?

We can work on the y-coordinates, and a jump should be depend on 'y' and not on 'x'.
We set equivalence classes where [P] = {P, Q, R} if and only if  yP = yQ = yR

What is the relation between the private key of these points P, Q, R?

If P=k*G, then Q=k*lambda*G and R =k*lambda^2*G (endomorphism)
and P=(x,y) -> Q=(beta*x, y) -> R=(beta^2*x,y)

observation:

if P=k*G, i.e. k is the private key of P respect of the generator G, then
lambda*P = k*lambda*G, i.e. the private key of P'=lambda*P respect of the generator G'=lambda*G is always k.

In other words, if we know that k lies in [a,b], resolving the problem P=k*G is equivalent to resolve the problem P'=k*G'
The interval is the same (from the scalar's point of view) but there are actually 3 different intervals of points. They are all consecutive from a different generator's point of view:

           point                                scalar
[a*G,   (a+1)*G, ....,  b*G]     <-> [a,b]   interval 1
[a*G',  (a+1)*G', ....,  b*G']    <-> [a,b]   interval 2  where G'=lambda*G
[a*G'', (a+1)*G'', ...., b*G'']   <-> [a,b]   interval 3  where G''=lambda*G

We could work on  3 "spaces" simultaneously :

P=k*G, with all points with generator G   -> interval 1     jumps (1*G, 2*G, ,,,2^(n-1)*G)

P'=k*G' with all points with generator G' -> interval 2     jumps(lambda*G, lambda*2*G, .... lambda*2^(n-1)*G)

P''=k*G'' with all points with generator G'' -> interval 3  jumps(lambda^2*G, lambda^2*2*G, .... lambda^2*2^(n-1)*G)

if the movements of a kangaroo depends only by the y-coordinates, that means that if a kangaroo reach a point T in the first interval and another kangaroo reachs a point W in the second interval with the same y, we will have a collision!
From that point they will walk together along the same path (across the same equivalence classes) until they reach a DP (in this case a y-coordinate with some zero bits), we detect this collision.

EDIT:
We work in this way in a space of 3*2^n points (3 intervals), but with only 2^n y-coordinates, then we should have a collision with a higher probability.  WRONG

I just wanted to understand this: https://bitcointalksearch.org/topic/off-topic-5245379

legendary
Activity: 1948
Merit: 2097
May 02, 2020, 10:41:11 AM
#50
Anyway i did a test with dp 3 and it also converge to 2^21 (same algorithm, no change on the range in function of dp).

This is tricky, there is something I need to understand with those random walks...

The data of the article are:

N = 2^40
m = 2^11
θ = 2^−5

with  1.47*2^20, very strange...

Anyway I learned a lot of things in these 2 days, thank you!


I found that distinguished points are not all equal from the point of view of the probability, if you want to precompute them you have to choose carefully (don't simple reuse the DP you found in one run)

https://eprint.iacr.org/2012/458.pdf

I have the feeling that this algorithm win in average complexity by favoriting keys on the middle of the range as the density of wild is increased in the center. It has a worse 'worst case' and a much better 'best case' than the classic algorithm.

Maybe a hybrid approach, with some precomputed DP loaded in hashtable (with the precomputed DP near only to the extreme of the range) to compensate.
sr. member
Activity: 462
Merit: 701
May 02, 2020, 10:16:46 AM
#49
Yes GPU is not good on 40bit range, this 40bit file is for making stats on CPU.
To make accurate statistics you need to do it on CPU, the granularity of the counting on GPU is too large but I will try to find something in order to make tests on GPU.

Anyway i did a test with dp 3 and it also converge to 2^21 (same algorithm, no change on the range in function of dp).
This is tricky, there is something I need to understand with those random walks...

I have the feeling that this algorithm win in average complexity by favoriting keys on the middle of the range as the density of wild is increased in the center. It has a worse 'worst case' and a much better 'best case' than the classic algorithm.
sr. member
Activity: 443
Merit: 350
May 02, 2020, 09:45:18 AM
#48
-snip-
@MrFreeDragon
You need a large number of tests on key uniformly distributed in the range to get a correct estimation, at least 1000 trials.
Did you try with a jump table size of 32 or 64 ? NB_JUMP in GPUEngine.h:24 ?

I added the file I use for stats, it contains 1000 40 bits key uniformly distributed.
https://github.com/JeanLucPons/Kangaroo/blob/master/VC_CUDA8/in40_1000.txt

I tried with NB_JUMP 64 (as was by default)

As for 1000 40bit keys I gues that 40bit is very small range for GPU tests. I started your example file with 1000 jeys, and even it shows the average operations number 2^21.6, the GPU performs 2^25-2^26 operations just for 5-7 seconds and finds the key (in many cases it shows even Count 2^-inf). So the result is not reasonable. The GPU is only starting, but solves the key... and I do not see the exact speed of it during such tests. Or change grid size... for the test purpuses

The larger key is required, but with longer time of course.

There should be some range with a good balance between test results vs time required. With 5 min per key we need 3 days for the test. With 1 min per key only 16-17 hours. For 1080ti 70bit key could found for 1 minute with your program. May be 70bit key is that balance
member
Activity: 873
Merit: 22
$$P2P BTC BRUTE.JOIN NOW ! https://uclck.me/SQPJk
May 02, 2020, 09:20:02 AM
#47


@COBRAS
I release the code ASAP Wink

This will be great Luc.

If we reduce the number of calculations by reducing the dimension of the range in length and bits, we can get closer to a positive result, I think. So I think that processing calculation ranges is also very important Luc

Biggest Thanks. Smiley

sr. member
Activity: 462
Merit: 701
May 02, 2020, 08:48:32 AM
#46
With these data:
N = 2^40
m = 2^11
θ = 2^−5


still 2^21?

I have to try, I let you know...

@COBRAS
I release the code ASAP Wink
member
Activity: 873
Merit: 22
$$P2P BTC BRUTE.JOIN NOW ! https://uclck.me/SQPJk
May 02, 2020, 08:34:39 AM
#45
@Luc

Can you realise in your code modificarion of ranges ?


From:

Start-dfggsdgdsfh1111111
End- FFFFFABC112222233

To:
O

ABC888888

Etc...

Huh?



Biiigesr thank you.

member
Activity: 873
Merit: 22
$$P2P BTC BRUTE.JOIN NOW ! https://uclck.me/SQPJk
May 02, 2020, 08:23:11 AM
#44
@Jean_Luc

In you examples:

0
FFFFFFFFFFFFFF
02E9F43F810784FF1E91D8BC7C4FF06BFEE935DA71D7350734C3472FE305FEF82A

[22.67 MKey/s][GPU 13.04 MKey/s][Count 2^29.06][Dead 0][28s][89.1MB]


Key# 0 Pub:  0x02E9F43F810784FF1E91D8BC7C4FF06BFEE935DA71D7350734C3472FE305FEF82A
       Priv: 0x378ABDEC51BC5D

Done: Total time 29s

I this your example in readme on github resalts was only 29 sec (!!!) this is more faster then use a ranges like this:

Start:49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5E0000000000000000
Stop :49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EFFFFFFFFFFFFFFFF

And I think more because  no need to research for RANGES(!)

I think what PUBKEY can not be > PRIVKEY, so our ranges not from 0.....FFFFFFFFFFFFFF, but from 0....(FFFFFFFF minus UnknownX HuhHuh?(mayby minus PYBKEY Huh??))

Huh
legendary
Activity: 1948
Merit: 2097
May 02, 2020, 08:19:01 AM
#43
I did a test using dp 0 (much slower):
The algorithm runs in 2^20.554 = ~1.47 sqrt(N) (1000 trials) as announced on the paper so it seems ok but I have to figure out why it is not the case for dp > 0. There is a relationship betwen dp and range size (m√2/3πθ) that I missed.
Will try to undersand, how to choose this m.

this is the difference ?

Quote
To detect small cycles we stored the previous 30 group elements in the walk
in the case N ≈ 2^34 (respectively, 30, 45 group elements for N ≈ 2^40, 2^48). Each
new step in the walk was compared with the previous 30 (respectively, 35, 45)
group elements visited. If this group element had already been visited then the
walk is in a cycle. We used a deterministic method to jump out of the cycle (using
a jump of distinct length from the other jumps used in the pseudorandom walk)
so that the pseudorandom walk as a whole remained deterministic
. The cost of
searching the list of previous group elements is not included in our experimental
results, but our count of group operations does include the “wasted” steps from
being in a cycle.

With these data:
N = 2^40
m = 2^11
θ = 2^−5


still 2^21?
sr. member
Activity: 462
Merit: 701
May 02, 2020, 08:08:43 AM
#42
I did a test using dp 0 (much slower):
The algorithm runs in 2^20.554 = ~1.47 sqrt(N) (1000 trials) as announced on the paper so it seems ok but I have to figure out why it is not the case for dp > 0. There is a relationship betwen dp and range size (m√2/3πθ) that I missed.
Will try to undersand, how to choose this m.

legendary
Activity: 1948
Merit: 2097
May 02, 2020, 04:24:36 AM
#41
This is a more update article:

https://www.ams.org/journals/mcom/2013-82-282/S0025-5718-2012-02641-X/S0025-5718-2012-02641-X.pdf

The number of bad collision is also link to the kangaroo density.

A note: probably half of your tames start from a even position (+24G, +1466G,... ) and the other half from a odd position (37G, 54365G, ....), and if you use the jumps: 1G, 2G, 4G, 8G,16G,... they are (except one) all even, then you have actually 2 groups that move in distinct areas. The same for the wild kangaroos.

It is like you worked in 2 subintervals, not good for the collision probability. At least you should concentrate all the wild kangaroos in a unique subinterval (all even or odd).

see this:


Quote
Four Kangaroo Method. We can make a further improvement. The starting points of W1 and W2,
namely x and −x, are integers of the same parity. Hence, to obtain wild-wild collisions more quickly one
should take walks whose jumps are all even length. However, now there is the possibility that the wild walks
will never collide with the tame walk. The solution is to have two tame kangaroos, T1 and T2, starting on
adjacent integers (one even and one odd). There is no possibility of a useless tame-tame collision, but there
is a chance of tame-wild collisions, as desired (though with only one of the tame kangaroos).
...
We now estimate the complexity of the method.
...

Hence, using 4 kangaroos gives an algorithm for the DLP in an interval whose heuristic average case
expected running time is (1.714 + o(1)).sqrt(N) group operations.
sr. member
Activity: 462
Merit: 701
May 02, 2020, 04:17:31 AM
#40
Yes
The above test was done with a stop&retry at 2^21.
I removed it and got 2^21.008 on 500 trials, so very similar.
As it is, it seems to converge to 2.sqrt(N) with a jump table of 16 positive random jumps and 16 negative random jumps.
The number of bad collision is also link to the kangaroo density.
legendary
Activity: 1948
Merit: 2097
May 02, 2020, 03:44:02 AM
#39
Yes, i think this might be due to the average jump size which is too large for wild.
I will try other things and what you suggest.

Remember this too:

Quote
To guard against bad steps of type 1 van Oorschot and Wiener [22] set a maximum number of steps in
a walk. They choose this maximum to be 20/θ steps and show that the proportion of bad
points of type 1 is at most 5 × 10^−8

...
Steven D. Galbraith and Raminder S. Ruprai:

We terminated walks which ran for 5/θ steps without arriving at a distinguished point (the usual recommendation is 20/θ steps; see [22]). This will give
us a slightly worse running time than optimal.


https://www.math.auckland.ac.nz/~sgal018/crypto-book/ch14.pdf

Quote
Exercise 14.5.12.
The distance between −a and a is even, so a natural trick is to use jumps of even length.  Since we don’t know whether a is even or odd, if this is done we don’t know whether to start the tame kangaroo at g^u or g^u+1.  However, one can consider a variant of the algorithm with two wild kangaroos (one starting from h and one from h−1) and two tame kangaroos (one starting from g^u and one from g^u+1) and with jumps of even length.
This is called the four-kangaroo algorithm.
Explain why the correct choice for the mean step size is m = 0.375√2w and why the heuristic average-case expected number of group operations is approximately 1.714√w. Galbraith, Pollard and Ruprai [226] have combined the idea of Exercise 14.5.12 and the Gaudry-Schost algorithm (see Section 14.7) to obtain an algorithm for the discrete logarithm problem in an interval of length w that performs (1.660 + o(1))√w group operations.

Galbraith, Ruprai and Pollard [226] showed that one can improve the kangaroo method by exploiting inversion in the group. This research actually grew out of writing this chapter. Sometimes it pays to work slowly.
member
Activity: 873
Merit: 22
$$P2P BTC BRUTE.JOIN NOW ! https://uclck.me/SQPJk
May 02, 2020, 03:42:08 AM
#38
P.s. v.1.4 is more stable - computation not downgrade with time and around stable level (in GPU this bee 500 Mkes/sec).

@Luc. Hardare is ready for any tests. Wink
sr. member
Activity: 462
Merit: 701
May 02, 2020, 03:36:48 AM
#37
Yes, i think this might be due to the average jump size which is too large for wild.
I will try other things and what you suggest.
I let you informed.
Thanks Wink
legendary
Activity: 1948
Merit: 2097
May 02, 2020, 03:11:19 AM
#36
Good news !
By changing the spreading of wild kangaroo to -N/8 N/8 the it seems to improve things. (using the Steven D. Galbraith?and Raminder S. method).
I got an average of 2^20.997 ~ 2.sqrt(n) on 1000 trials.

Great!! Now under 2.sqrt(n)!

I would be nice too a comparison with the classic approach (no equivalence classes) with these modifications:

T' = [−2N/3, 2N/3 - (m/theta)]  
W' = [k − N, k − N/3 - m/theta] ∪ [k + N/3, k + N - m/theta]

To recap:
the standard 1.4 version takes 2.21.sqrt(n) and the new 'equivalence classes/symmetry' based version takes 2.sqrt(n), -15% ! There is room for improvement!
member
Activity: 873
Merit: 22
$$P2P BTC BRUTE.JOIN NOW ! https://uclck.me/SQPJk
May 02, 2020, 02:57:34 AM
#35
Good news !
By changing the spreading of wild kangaroo to -N/8 N/8 the it seems to improve things. (using the Steven D. Galbraith?and Raminder S. method).
I got an average of 2^20.997 ~ 2.sqrt(n) on 1000 trials.


:-)
Beeeeeegest thank you.
Br.
sr. member
Activity: 462
Merit: 701
May 02, 2020, 02:48:52 AM
#34
Good news !
By changing the spreading of wild kangaroo to -N/8 N/8 the it seems to improve things. (using the Steven D. Galbraith?and Raminder S. method).
I got an average of 2^20.997 ~ 2.sqrt(n) on 1000 trials.
legendary
Activity: 1948
Merit: 2097
May 02, 2020, 02:33:25 AM
#33
I think you are missing the point I am making.

You will find a collision if the solution is in one of the additional ranges opened up by the endomorphism.

First what is "our problem"?  Finding a that some point Q = kP has a discrete log with respect to some point P in a specific compact range of k -- [0, 2^80].  But why?    Instead you can solve a related problem: Find k for Q = kP where k's value is [+/-]1*[0, 2^78]*lambda^[0,2] in less time, even though that 'sparse' range has 1.5x the possible values.

Thanks. I admittedly didn't understand your previous post.

I think also if the interest is just in DL solving bragging rights or just the intellectual challenge in exploiting all the struture, the endomorphism is also interesting-- for that there isn't a particular problem structure so why not use the one that results in the fastest solver. Smiley  In the prime order group essentially all k values are equivalent, which is why you're able to shift the search around to look for that range wherever you want it in the 2^256 interval.

I'm trying to resolve the classic discrete logarithm problem via a 3-dimensional discrete logarithm problem ( https://www.math.auckland.ac.nz/~sgal018/RamCiren.pdf) using endomorphism, probably not a good idea.

If you resolve Q=a*P + b*P' + c*P'' with P'=lambda*P, P'' = lambda^2*P,
then you have that Q = k*P, where k=a + b*lambda + c*lambda^2

Jump to: