Author

Topic: Pollard's kangaroo ECDLP solver - page 141. (Read 60189 times)

legendary
Activity: 1948
Merit: 2097
May 06, 2020, 01:59:48 PM
#92
This might be interesting:

https://www.researchgate.net/publication/339319953_Using_Equivalent_Class_to_Solve_Interval_Discrete_Logarithm_Problem

https://link.springer.com/chapter/10.1007%2F978-3-030-41579-2_23

Quote
To accelerate solving IDLP, we first introduce the concept of jumping distance and expanding factor to decide whether to perform the class operation or not. When the value of the expanding factor is greater than a given value, the class operation will be performed, such that each decision on jumps is locally optimal. The improved method takes an average of (1+o(1))√N times of class operation, where N is the size of a given interval.
member
Activity: 873
Merit: 22
$$P2P BTC BRUTE.JOIN NOW ! https://uclck.me/SQPJk
May 06, 2020, 01:42:35 AM
#91
@Luc. Hello.

Please make sure that if the code does not find the privkey , it is switched to the next one. And you can make the code downgrade dimension of pubkey automaticaly ??

Huh

I'm looking forward to the release.

BIIG THANK YOU !!!! Smiley
member
Activity: 144
Merit: 10
May 05, 2020, 01:34:22 PM
#90
Not looking too good either on the CPU side of things with symmetry enabled; I terminated the application after 4.5 hours.

AMD Ryzen 7 1800X CPU with Symmetry:
Code:
./kangaroo -t 21 -d 16 in70.txt
Kangaroo v1.4gamma (with symmetry)
Start:1FFFFFFFFFFFFFFFFF
Stop :3FFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 21
Range width: 2^70
Jump Avg distance: 2^34.96
Number of kangaroos: 2^14.39
Suggested DP: 20
Expected operations: 2^35.59
Expected RAM: 61.4MB
DP size: 16 [0xffff000000000000]
SolveKeyCPU Thread 2: 1024 kangaroos
SolveKeyCPU Thread 4: 1024 kangaroos
SolveKeyCPU Thread 5: 1024 kangaroos
SolveKeyCPU Thread 3: 1024 kangaroos
SolveKeyCPU Thread 0: 1024 kangaroos
SolveKeyCPU Thread 1: 1024 kangaroos
SolveKeyCPU Thread 6: 1024 kangaroos
SolveKeyCPU Thread 7: 1024 kangaroos
SolveKeyCPU Thread 9: 1024 kangaroos
SolveKeyCPU Thread 8: 1024 kangaroos
SolveKeyCPU Thread 12: 1024 kangaroos
SolveKeyCPU Thread 10: 1024 kangaroos
SolveKeyCPU Thread 11: 1024 kangaroos
SolveKeyCPU Thread 13: 1024 kangaroos
SolveKeyCPU Thread 14: 1024 kangaroos
SolveKeyCPU Thread 17: 1024 kangaroos
SolveKeyCPU Thread 19: 1024 kangaroos
SolveKeyCPU Thread 15: 1024 kangaroos
SolveKeyCPU Thread 18: 1024 kangaroos
SolveKeyCPU Thread 16: 1024 kangaroos
SolveKeyCPU Thread 20: 1024 kangaroos
[44.34 MK/s][GPU 0.00 MK/s][Count 2^39.28][Dead 277][04:32:02 (Avg 19:29)][32.4MB]
jr. member
Activity: 43
Merit: 1
May 05, 2020, 10:10:07 AM
#89
OK, i you want to test large range and symmetry, put 512 in the NB_JUMP...

I'm currently working on the save work options:
Code:
-w workfile: Specify file to save work into
 -i workfile: Specify file to load work from
 -wi workInterval: Periodic interval (in seconds) for saving work

Concerning optimization of the complexity, I will investigate on the Wild compression and Tame/Wild ratio which seems more promising to me...


That's very good, the save work option will be very useful when testing [95-105]-bit intervals
And also multiPubKey search at one time Smiley
member
Activity: 144
Merit: 10
May 05, 2020, 08:46:58 AM
#88
OK, i you want to test large range and symmetry, put 512 in the NB_JUMP...

I'm currently working on the save work options:
Code:
-w workfile: Specify file to save work into
 -i workfile: Specify file to load work from
 -wi workInterval: Periodic interval (in seconds) for saving work

Concerning optimization of the complexity, I will investigate on the Wild compression and Tame/Wild ratio which seems more promising to me...


That's very good, the save work option will be very useful when testing [95-105]-bit intervals
sr. member
Activity: 462
Merit: 701
May 05, 2020, 08:41:58 AM
#87
OK, i you want to test large range and symmetry, put 512 in the NB_JUMP...

I'm currently working on the save work options:
Code:
-w workfile: Specify file to save work into
 -i workfile: Specify file to load work from
 -wi workInterval: Periodic interval (in seconds) for saving work

Concerning optimization of the complexity, I will investigate on the Wild compression and Tame/Wild ratio which seems more promising to me...
member
Activity: 144
Merit: 10
May 05, 2020, 08:33:57 AM
#86
Thanks for the tests Wink
Stop your program, it will never ends if you have let the NB_JUMP to 32 and even with 512 I don't know if it will end Wink
The second test with CPU is not displaying: "Kangaroo v1.4gamma (with symmetry)" ?
Did you git pull between the correction I made concerning this (commit at 13:22 UTC+2) ?


No, it was from yesterday’s commit, now running CPU tests with latest commit.
sr. member
Activity: 462
Merit: 701
May 05, 2020, 08:20:57 AM
#85
Thanks for the tests Wink
Stop your program, it will never ends if you have let the NB_JUMP to 32 and even with 512 I don't know if it will end Wink
The second test with CPU is not displaying: "Kangaroo v1.4gamma (with symmetry)" ?
Did you git pull between the correction I made concerning this (commit at 13:22 UTC+2) ?

member
Activity: 144
Merit: 10
May 05, 2020, 07:59:20 AM
#84
@Jean_Luc

Below are the CPU/GPU results for an interval 70-bit in size with and without the use of symmetry. So far, the use of symmetry is not looking too good for intervals greater than ~45-bit in size.

AMD Ryzen 7 1800X CPU without Symmetry:
Code:
 ./kangaroo -t 16 -d 16 in70.txt
Kangaroo v1.4beta
Start:1FFFFFFFFFFFFFFFFF
Stop :3FFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 16
Range width: 2^70
Jump Avg distance: 2^35.02
Number of kangaroos: 2^14.00
Suggested DP: 20
Expected operations: 2^36.15
Expected RAM: 89.6MB
DP size: 16 [0xffff000000000000]
SolveKeyCPU Thread 1: 1024 kangaroos
SolveKeyCPU Thread 9: 1024 kangaroos
SolveKeyCPU Thread 3: 1024 kangaroos
SolveKeyCPU Thread 10: 1024 kangaroos
SolveKeyCPU Thread 2: 1024 kangaroos
SolveKeyCPU Thread 8: 1024 kangaroos
SolveKeyCPU Thread 5: 1024 kangaroos
SolveKeyCPU Thread 0: 1024 kangaroos
SolveKeyCPU Thread 15: 1024 kangaroos
SolveKeyCPU Thread 12: 1024 kangaroos
SolveKeyCPU Thread 4: 1024 kangaroos
SolveKeyCPU Thread 11: 1024 kangaroos
SolveKeyCPU Thread 13: 1024 kangaroos
SolveKeyCPU Thread 6: 1024 kangaroos
SolveKeyCPU Thread 14: 1024 kangaroos
SolveKeyCPU Thread 7: 1024 kangaroos
[53.01 MK/s][GPU 0.00 MK/s][Count 2^36.10][Dead 0][26:31 (Avg 23:55)][91.4MB]
Key# 0 Pub:  0x0290E6900A58D33393BC1097B5AED31F2E4E7CBD3E5466AF958665BC0121248483
       Priv: 0x349B84B6431A6C4EF1

Done: Total time 26:32

AMD Ryzen 7 1800X CPU with Symmetry:
Code:
./kangaroo -t 16 -d 16 in70.txt
Kangaroo v1.4gamma
Start:1FFFFFFFFFFFFFFFFF
Stop :3FFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 16
Range width: 2^70
Jump Avg distance: 2^34.00
Number of kangaroos: 2^14.00
Suggested DP: 20
Expected operations: 2^35.55
Expected RAM: 59.7MB
DP size: 16 [0xffff000000000000]
SolveKeyCPU Thread 2: 1024 kangaroos
SolveKeyCPU Thread 0: 1024 kangaroos
SolveKeyCPU Thread 3: 1024 kangaroos
SolveKeyCPU Thread 1: 1024 kangaroos
SolveKeyCPU Thread 4: 1024 kangaroos
SolveKeyCPU Thread 5: 1024 kangaroos
SolveKeyCPU Thread 6: 1024 kangaroos
SolveKeyCPU Thread 7: 1024 kangaroos
SolveKeyCPU Thread 8: 1024 kangaroos
SolveKeyCPU Thread 9: 1024 kangaroos
SolveKeyCPU Thread 11: 1024 kangaroos
SolveKeyCPU Thread 10: 1024 kangaroos
SolveKeyCPU Thread 12: 1024 kangaroos
SolveKeyCPU Thread 13: 1024 kangaroos
SolveKeyCPU Thread 15: 1024 kangaroos
SolveKeyCPU Thread 14: 1024 kangaroos
[35.34 MK/s][GPU 0.00 MK/s][Count 2^39.22][Dead 416][05:39:09 (Avg 23:47)][99.6MB]
Key# 0 [1S]Pub:  0x0290E6900A58D33393BC1097B5AED31F2E4E7CBD3E5466AF958665BC0121248483
       Priv: 0x349B84B6431A6C4EF1
[  0] 2^39.219 Dead:416 Avg:2^39.219 (2^35.554)

Done: Total time 05:39:10

Nvidia RTX 2070 GPU without Symmetry:
Code:
./kangaroo -t 0 -d 14 -gpu -gpuId 0 in70.txt
Kangaroo v1.4gamma
Start:1FFFFFFFFFFFFFFFFF
Stop :3FFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^70
Jump Avg distance: 2^33.97
Number of kangaroos: 2^20.17
Suggested DP: 14
Expected operations: 2^36.40
Expected RAM: 423.2MB
DP size: 14 [0xfffc000000000000]
GPU: GPU #0 GeForce RTX 2070 (36x64 cores) Grid(72x128) (117.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.17 kangaroos in 7948.8ms
[857.53 MK/s][GPU 857.53 MK/s][Count 2^35.58][Dead 1][01:08 (Avg 01:45)][243.9MB]
Key# 0 [1S]Pub:  0x0290E6900A58D33393BC1097B5AED31F2E4E7CBD3E5466AF958665BC0121248483
       Priv: 0x349B84B6431A6C4EF1
[  0] 2^35.606 Dead:1 Avg:2^35.606 DeadAvg:1.0 (2^36.400)

Done: Total time 01:18

Nvidia RTX 2070 GPU with Symmetry still running:
Code:
./kangaroo -t 0 -d 14 -gpu -gpuId 0 in70.txt
Kangaroo v1.4gamma (with symmetry)
Start:1FFFFFFFFFFFFFFFFF
Stop :3FFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^70
Jump Avg distance: 2^34.98
Number of kangaroos: 2^20.17
Suggested DP: 14
Expected operations: 2^36.06
Expected RAM: 333.8MB
DP size: 14 [0xfffc000000000000]
GPU: GPU #0 GeForce RTX 2070 (36x64 cores) Grid(72x128) (153.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.17 kangaroos in 8044.0ms
[826.63 MK/s][GPU 826.63 MK/s][Count 2^39.99][Dead 1845][25:06 (Avg 01:26)][27.0MB]
sr. member
Activity: 462
Merit: 701
May 05, 2020, 06:30:51 AM
#83
I committed the code for the GPU.
Symmetry is not enabled by default, you have to edit the Constants.h file and comment out the USE_SYMMETRY define.

Code:
// Use symmetry
//#define USE_SYMMETRY

You can change the jump table size in the same file but high number decrease GPU performance.

Code:
// Number of random jumps
// Max 512 for the GPU
#define NB_JUMP 32

Unfortunately, symmetry on large range without killing GPU performance seems difficult as lots of kangaroo end in infinite cycle.
After all, the title of the article was: "Using Equivalence Classes to Accelerate Solving the Discrete Logarithm Problem in a Short Interval"
They well mention this problem of cycle in the article.

The trick of saving the last jump avoid cycle of 2, but not the others.
May be solution could be to precompute a random jump table in order to minimize cycles, but it could be difficult for large random walks of 2^dp. If we recreate kangaroo at each ~2^dp iterations, the creation of a kangaroo is expensive.
Symmetry can bring a maximum gain of sqrt(2) = ~1.41 , so to win, we have to find a way to do all the job in less than 41%.


sr. member
Activity: 462
Merit: 701
May 05, 2020, 12:57:07 AM
#82
Hi,

Thanks for this detailed report, very interesting !
I did few quick tests with a logic slightly different than yours, still only positive jump in the table and a I get the next one if equal to the last one.

Code:
#ifdef USE_SYMMETRY
        // Limit cycle
        if( jmp == herd[g].lastJump ) jmp = (herd[g].lastJump + 1) % NB_JUMP;
#endif

After 100 trials, DP=5, 40bit search, 2048 kangaroos:

With symmetry: Here only cycles or natural path collisions which have a distinguished point on it are detected.
The average length of walks is Avg/2048 = 2^20.614/2048 = 783

NB_JUMP = 32
[100] 2^20.263 Dead:44 Avg:2^20.823 DeadAvg:71.7 (2^20.614)

NB_JUMP = 64
[100] 2^19.985 Dead:17 Avg:2^20.662 DeadAvg:28.4 (2^20.614)

NB_JUMP = 128
[100] 2^19.346 Dead:4 Avg:2^20.581 DeadAvg:19.6 (2^20.614)

NB_JUMP = 256
[100] 2^20.938 Dead:20 Avg:2^20.510 DeadAvg:15.1 (2^20.614)

Then above 256 the dead average seems to converge to ~15, which should be the natural collision rate when using symmetry for this configuration.

No symmetry: Here we have only natural collisions in same herd, which is independent of the table size.

NB_JUMP = 64
[100] 2^22.203 Dead:5 Avg:2^21.098 DeadAvg:1.3 (2^21.097)

NB_JUMP = 128
[100] 2^19.432 Dead:0 Avg:2^21.044 DeadAvg:1.2 (2^21.097)

NB_JUMP = 256
[100] 2^20.805 Dead:1 Avg:2^20.959 DeadAvg:1.4 (2^21.097)

legendary
Activity: 1948
Merit: 2097
May 04, 2020, 04:36:13 PM
#81
Recap of my tests:


1)

normal generation of jumps

20000 walks
num_jumps: 32 + 32
max_length of each walk = 32
jumpbit = 20

7516 loops of length 2, 118 of length 4:

(7516+118+4+2) / 20000 = 38,2 % of the walks have a loop before 32 steps

7516/20000 = 37,58% of the walks have a loop with 2 only points

[7516, 0, 118, 0, 4, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

----------------------------------------------------------------------------------------------------------------------------------------------------

2)

if(next_jump != -prev_jump): ok
else
 next_jump = prev_jump  #(you double the previous jump)

20000 walks
num_jumps: 32 + 32
max_length of each walk = 32
jumpbit = 20

0 loops of length 2, only 139 loops of length 4 and 7 of length 6:

(139+7+9+1+3) / 20000 = 0,795 % against  38,2%!

[0, 0, 139, 0, 7, 0, 1, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

----------------------------------------------------------------------------------------------------------------------------------------------------

3)

if(next_jump != -prev_jump): ok
else
 next_jump = -sign(next_jump)*floor(distAvg)*G   #a different step, with the opposite sign of the normal step

20000 walks
num_jumps: 32 + 32
max_length of each walk = 32
jumpbit = 20

0 loops of length 2, only 128 loops of length 4 and 6 of length 6, 2 of length 8:

(128 + 1 + 6 +2 + 1 + 1) / 20000 = 0,695% of the walks have a loop before 32 steps

[0, 0, 128, 1, 6, 0, 2, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

-------------------------------------------------------------------------------------------------------------------------------------------------------

4)

if(next_jump != -prev_jump): ok
else
 next_jump = prev_jump  #(you double the previous jump)

20000 walks
num_jumps: 64 + 64
max_length of each walk = 32
jumpbit = 20

0 loops of length 2, only 40 loops of length 4:

(40+1+1) / 20000 = 0,21 % of the walks enter in a loop in less than 32 steps

[0, 0, 0, 40, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

loops of length 4 are of "second order":

probability that (a+b+c+d=0) = (1/128)*(127/128)*(1/128)
probability that each sequence of 4 consecutive points don't create a loop = (1 - (1/128)*(127/128)*(1/128))^32
probability to find a loop in a walk of 32 steps: 1 - (1 - (1/128)*(127/128)*(1/128))^32 =  0.001936

expected value of number of walks with a loop of 4 points: 2000 * 0.001936 = 38,7
-------------------------------------------------------------------------------------------------------------------------------------------------------

5)

if(next_jump != -prev_jump): ok
else
 next_jump = prev_jump  #(you double the previous jump)

20000 walks
num_jumps: 64 + 64
max_length of each walk = 128
jumpbit = 20

157 loops of length 4, only 4 loops of length 6:

(157+4+2+1+1+1+1+1+1+2+1+1+1+1+1+1+1) / 20000 = 0,89% of the walks enter in a loop in less than 128 steps


[0, 0, 157, 0, 4, 0, 2, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
sr. member
Activity: 462
Merit: 701
May 04, 2020, 12:41:25 PM
#80
OK thansk I'll try tomorow
The 1.4gamma is still under dev...
member
Activity: 144
Merit: 10
May 04, 2020, 12:27:34 PM
#79
@Jean_Luc

For larger search intervals, the previous version (v1.4beta) appears to be significantly faster in terms of wall clock time than the current version (v1.4gamma). I am ready to test with a 90-bit search intervals once the latest GPU code is ready.

AMD Ryzen 7 1800X Eight-Core Processor with program defaults (v1.4beta):
Code:
./kangaroo -t 16 in65.txt
Kangaroo v1.4beta
Start:FFFFFFFFFFFFFFFF
Stop :1FFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 16
Range width: 2^65
Jump Avg distance: 2^32.01
Number of kangaroos: 2^14.00
Suggested DP: 17
Expected operations: 2^33.65
Expected RAM: 8.8MB
DP size: 17 [0xffff800000000000]
SolveKeyCPU Thread 9: 1024 kangaroos
SolveKeyCPU Thread 7: 1024 kangaroos
SolveKeyCPU Thread 10: 1024 kangaroos
SolveKeyCPU Thread 14: 1024 kangaroos
SolveKeyCPU Thread 1: 1024 kangaroos
SolveKeyCPU Thread 11: 1024 kangaroos
SolveKeyCPU Thread 2: 1024 kangaroos
SolveKeyCPU Thread 12: 1024 kangaroos
SolveKeyCPU Thread 6: 1024 kangaroos
SolveKeyCPU Thread 8: 1024 kangaroos
SolveKeyCPU Thread 4: 1024 kangaroos
SolveKeyCPU Thread 13: 1024 kangaroos
SolveKeyCPU Thread 15: 1024 kangaroos
SolveKeyCPU Thread 0: 1024 kangaroos
SolveKeyCPU Thread 5: 1024 kangaroos
SolveKeyCPU Thread 3: 1024 kangaroos
[52.89 MK/s][GPU 0.00 MK/s][Count 2^33.00][Dead 1][03:06 (Avg 04:14)][10.5MB]
Key# 0 Pub:  0x0230210C23B1A047BC9BDBB13448E67DEDDC108946DE6DE639BCC75D47C0216B1B
       Priv: 0x1A838B13505B26867

Done: Total time 03:06

AMD Ryzen 7 1800X Eight-Core Processor with DP=8 (v1.4beta):

Code:
./kangaroo -t 16 -d 8 in65.txt
Kangaroo v1.4beta
Start:FFFFFFFFFFFFFFFF
Stop :1FFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 16
Range width: 2^65
Jump Avg distance: 2^31.97
Number of kangaroos: 2^14.00
Suggested DP: 17
Expected operations: 2^33.65
Expected RAM: 4011.6MB
DP size: 8 [0xff00000000000000]
SolveKeyCPU Thread 7: 1024 kangaroos
SolveKeyCPU Thread 11: 1024 kangaroos
SolveKeyCPU Thread 5: 1024 kangaroos
SolveKeyCPU Thread 10: 1024 kangaroos
SolveKeyCPU Thread 1: 1024 kangaroos
SolveKeyCPU Thread 13: 1024 kangaroos
SolveKeyCPU Thread 9: 1024 kangaroos
SolveKeyCPU Thread 4: 1024 kangaroos
SolveKeyCPU Thread 0: 1024 kangaroos
SolveKeyCPU Thread 2: 1024 kangaroos
SolveKeyCPU Thread 12: 1024 kangaroos
SolveKeyCPU Thread 6: 1024 kangaroos
SolveKeyCPU Thread 15: 1024 kangaroos
SolveKeyCPU Thread 8: 1024 kangaroos
SolveKeyCPU Thread 14: 1024 kangaroos
SolveKeyCPU Thread 3: 1024 kangaroos
[49.60 MK/s][GPU 0.00 MK/s][Count 2^33.43][Dead 0][04:20 (Avg 04:31)][3443.0MB]
Key# 0 Pub:  0x0230210C23B1A047BC9BDBB13448E67DEDDC108946DE6DE639BCC75D47C0216B1B
       Priv: 0x1A838B13505B26867

Done: Total time 04:33
legendary
Activity: 1948
Merit: 2097
May 04, 2020, 12:11:53 PM
#78
Many thanks.
I will try this tomorrow. Hope the test of the last jump and the symmetry will no kill GPU performance.
We have to find a determinist logic to get an other jump if we get the opposite.
Thanks Wink

It is only a test between 2 integers (32 bit <-> 32 bit, a previous index with a current index in an array of 32/64 elements,  I don't think it is too much )

you compute the next jump as usual, then

if(next_jump != -prev_jump): ok
else
 next_jump = prev_jump  (you double the previous jump)

P -> P+k*G -> P+k*G+k*G

Note: in this case the inverse of (x2-x1) is the same:

(P + k*G) + k*G has the same inverse of (P+k*G) - k*G,

then you can compute the inverse before to check the sign of the y-coordinate and the prev_jump
member
Activity: 144
Merit: 10
May 04, 2020, 11:59:43 AM
#77
@Kpot87
Please send a private message and I will respond when I have a minute or two.

@Jean_Luc
Please see below for 65-bit interval tests with your latest CPU (Kangaroo v1.4gamma) changes:

AMD Ryzen 7 1800X Eight-Core Processor with program defaults:
Code:
./kangaroo -t 16 in65.txt
Kangaroo v1.4gamma
Start:FFFFFFFFFFFFFFFF
Stop :1FFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 16
Range width: 2^65
Jump Avg distance: 2^31.00
Number of kangaroos: 2^14.00
Suggested DP: 17
Expected operations: 2^33.33
Expected RAM: 7.3MB
DP size: 17 [0xffff800000000000]
SolveKeyCPU Thread 5: 1024 kangaroos
SolveKeyCPU Thread 0: 1024 kangaroos
SolveKeyCPU Thread 4: 1024 kangaroos
SolveKeyCPU Thread 2: 1024 kangaroos
SolveKeyCPU Thread 3: 1024 kangaroos
SolveKeyCPU Thread 7: 1024 kangaroos
SolveKeyCPU Thread 1: 1024 kangaroos
SolveKeyCPU Thread 6: 1024 kangaroos
SolveKeyCPU Thread 9: 1024 kangaroos
SolveKeyCPU Thread 8: 1024 kangaroos
SolveKeyCPU Thread 10: 1024 kangaroos
SolveKeyCPU Thread 11: 1024 kangaroos
SolveKeyCPU Thread 12: 1024 kangaroos
SolveKeyCPU Thread 13: 1024 kangaroos
SolveKeyCPU Thread 14: 1024 kangaroos
SolveKeyCPU Thread 15: 1024 kangaroos
[33.27 MK/s][GPU 0.00 MK/s][Count 2^35.06][Dead 145][19:25 (Avg 05:25)][7.5MB]
Key# 0 [1S]Pub:  0x0230210C23B1A047BC9BDBB13448E67DEDDC108946DE6DE639BCC75D47C0216B1B
       Priv: 0x1A838B13505B26867
[  0] 2^35.061 Dead:145 Avg:2^35.061 (2^33.333)

Done: Total time 19:27

AMD Ryzen 7 1800X Eight-Core Processor with DP=8:
Code:
./kangaroo -t 16 -d 8 in65.txt
Kangaroo v1.4gamma
Start:FFFFFFFFFFFFFFFF
Stop :1FFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 16
Range width: 2^65
Jump Avg distance: 2^31.00
Number of kangaroos: 2^14.00
Suggested DP: 17
Expected operations: 2^33.05
Expected RAM: 2659.0MB
DP size: 8 [0xff00000000000000]
SolveKeyCPU Thread 0: 1024 kangaroos
SolveKeyCPU Thread 4: 1024 kangaroos
SolveKeyCPU Thread 8: 1024 kangaroos
SolveKeyCPU Thread 10: 1024 kangaroos
SolveKeyCPU Thread 9: 1024 kangaroos
SolveKeyCPU Thread 12: 1024 kangaroos
SolveKeyCPU Thread 7: 1024 kangaroos
SolveKeyCPU Thread 3: 1024 kangaroos
SolveKeyCPU Thread 1: 1024 kangaroos
SolveKeyCPU Thread 5: 1024 kangaroos
SolveKeyCPU Thread 13: 1024 kangaroos
SolveKeyCPU Thread 14: 1024 kangaroos
SolveKeyCPU Thread 15: 1024 kangaroos
SolveKeyCPU Thread 2: 1024 kangaroos
SolveKeyCPU Thread 11: 1024 kangaroos
SolveKeyCPU Thread 6: 1024 kangaroos
[34.98 MK/s][GPU 0.00 MK/s][Count 2^34.42][Dead 638][12:46 (Avg 04:14)][1250.3MB]
Key# 0 [1S]Pub:  0x0230210C23B1A047BC9BDBB13448E67DEDDC108946DE6DE639BCC75D47C0216B1B
       Priv: 0x1A838B13505B26867
[  0] 2^34.425 Dead:639 Avg:2^34.425 (2^33.054)

Done: Total time 12:52
sr. member
Activity: 462
Merit: 701
May 04, 2020, 11:54:06 AM
#76
Many thanks.
I will try this tomorrow. Hope the test of the last jump and the symmetry will no kill GPU performance.
We have to find a determinist logic to get an other jump if we get the opposite.
Thanks Wink
legendary
Activity: 1948
Merit: 2097
May 04, 2020, 11:20:46 AM
#75
I use a random jump table with a controlled average.
I store as before, no inforamtion on symetry is needed, the key is solved later.


Code:
void Kangaroo::CreateJumpTable() {

  int jumpBit = rangePower / 2;
  if(jumpBit > 128) jumpBit = 128;
  int maxRetry = 100;
  bool ok = false;
  double maxAvg = pow(2.0,(double)jumpBit - 0.95);
  double minAvg = pow(2.0,(double)jumpBit - 1.05);
  double distAvg;
  //::printf("Jump Avg distance min: 2^%.2f\n",log2(minAvg));
  //::printf("Jump Avg distance max: 2^%.2f\n",log2(maxAvg));

  // Kangaroo jumps

  // Positive only
  while(!ok && maxRetry>0 ) {
    Int totalDist;
    totalDist.SetInt32(0);
    for(int i = 0; i < NB_JUMP; ++i) {
      jumpDistance[i].Rand(jumpBit);
      if(jumpDistance[i].IsZero())
        jumpDistance[i].SetInt32(1);
      totalDist.Add(&jumpDistance[i]);
    }
    distAvg = totalDist.ToDouble() / (double)(NB_JUMP);
    ok = distAvg>minAvg && distAvg    maxRetry--;
  }

  for(int i = 0; i < NB_JUMP; ++i) {
    Point J = secp->ComputePublicKey(&jumpDistance[i]);
    jumpPointx[i].Set(&J.x);
    jumpPointy[i].Set(&J.y);
  }

  if(!ok) {
    ::printf("Warning, jump Avg distance out of bounds: 2^%.2f (restart the program)\n",log2(distAvg));
  } else {
    ::printf("Jump Avg distance: 2^%.2f\n",log2(distAvg));
  }

}

I simulated many walks with your CreateJumpTable (I did a porting in python).

I found a very interesting thing: finally I know how are these loops!!

I started 2000 walks (not together, I was interested in 'selfcollisions' only) from a position '0', and using only integer numbers I found out that:

almost all cycles have length exactly = 2!

I generated 32 positive jumps (and the 32 negative, don't worry, it is like you did but I prefer to write them)

Code:
jumpBit = 20
[b]Jump Avg distance[/b]:
minAvg, [b]distAvg[/b], maxAvg
506428.82601934916 [b]530326.5625[/b] 542776.9763909484

[907868, -907868, 716251, -716251, 1007711, -1007711, 720977, -720977, 901512, -901512, 835718, -835718, 363381, -363381, 475363, -475363, 770732, -770732, 217555, -217555, 67168, -67168, 615504, -615504, 584908, -584908, 768134, -768134, 450427, -450427, 926634, -926634, 458021, -458021, 854265, -854265, 589522, -589522, 280841, -280841, 172112, -172112, 32886, -32886, 279644, -279644, 525179, -525179, 200017, -200017, 372430, -372430, 66448, -66448, 905754, -905754, 1000160, -1000160, 331424, -331424, 170629, -170629, 401275, -401275]

and these are the results:

I got 787 (39,35%) collisions (only cycles!!) over 2000 walks (I stopped each walk after 32 steps) but the most important thing is that I understood why.

783 over 787 has length 2!!  Only 2 loops have length 4 and other 2 loops have length 6.

This is logic: what is the chance that a point Q produces a loop of length 2?

Q -> R -> Q   The probability is 1/64 (1/(number of effectively different jumps), if the jumps from R and Q is opposite to the jumps from Q and R.

Then when you produce a sequence of points, each point you generate has the probability 1/(number of effectively different jumps)) to create a loop of length 2 with the next point!

Some basic computation:  

probability to meet a 1 point that create a loop of length 2 with the next point: 1/NUM_JUMPS
probability to generate 32 (or 2^DP) consecutive points that not produce a 2-loop : (1 - 1/NUM_JUMPS)^32
probability to generate 32 consecutive points with at least one that produces a 2-loop: 1 - (1-NUM_JUMPS)^32

in my case: 1 - (1 - 1/64)^32 = 0.39586 probability of collision in a walk

The bad news is that I had to put a large number of random jump (65536) to avoid cycles and this will decrease GPU performance Sad
I will investigate on choosing these jumps in order to reduce probability of cycle and having the smallest possible jump table.


This explain perfectly why you have to use 65536 jumps instead of 32, because in this way you got:

1 - (1 - 1/65536)^32 = 0.000488

The simplest way to avoid to use a large number of jumps and to avoid almost all loops is to modify the condition of the jump, you have to store the last movement (jump):

you have: P -> Q -> R

when you have to decide how to do the next jump (from Q to R), do it normally unless the random jump is the opposite of the previous, in that case instead of doing -k*G do +k*G (in this way you avoid the loop and mantain the symmetry).

EDIT: the probability to have a loop of length 3 is instead very low, because you should have 3 numbers (jump) a,b,c such that a + b + c = 0 and that is not frequent (a,b,c are 3 random numbers between 32(64) generated in a 2^20 space!)

Second test:

20000 walks:

7610 loops of length 2, 101 loops of length 4, 5 loops of length 6,  ....
this is the sequence of lenght of each loop:

-> 7610, 0, 101, 0, 5, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 3, 0, 9, 0

38,05 %  of walk fall in a 2-loop!
38,66 % of loops in walks limited to 32 steps ...

jr. member
Activity: 43
Merit: 1
May 04, 2020, 10:50:18 AM
#74
@Jean_Luc

On the GPU side of things, I get different estimations for "Suggested DP", "Expected operations", and "Expected RAM" for the same problem/interval size on machines with dissimilar hardware configurations, please see below. It may be also helpful to add standard estimations based on the interval size alone. For example, on one my of servers configured with 8x RTX 2080 Ti while using the program defaults, the performance in terms iterations per second quickly degrades from ~10G to less than ~1G.

Thanks

8x RTX 2080 Tis with PL=240W:
Code:
./Kangaroo14 -t 0 -d 20 -gpu -gpuId 0,1,2,3,4,5,6,7 in85.txt
Kangaroo v1.4beta
Start:FFFFFFFFFFFFFFFFFFFFF
Stop :1FFFFFFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^85
Jump Avg distance: 2^41.99
Number of kangaroos: 2^24.09
Suggested DP: 17
Expected operations: 2^44.36
Expected RAM: 1646.6MB
DP size: 20 [0xfffff00000000000]
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
GPU: GPU #2 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#2: creating kangaroos...
GPU: GPU #5 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#5: creating kangaroos...
GPU: GPU #6 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#6: creating kangaroos...
GPU: GPU #3 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#3: creating kangaroos...
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #7 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#7: creating kangaroos...
GPU: GPU #4 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#4: creating kangaroos...
SolveKeyGPU Thread GPU#7: 2^21.09 kangaroos in 19929.3ms
SolveKeyGPU Thread GPU#2: 2^21.09 kangaroos in 20113.4ms
SolveKeyGPU Thread GPU#5: 2^21.09 kangaroos in 20136.4ms
SolveKeyGPU Thread GPU#6: 2^21.09 kangaroos in 20160.7ms
SolveKeyGPU Thread GPU#4: 2^21.09 kangaroos in 20179.5ms
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos in 20247.4ms
SolveKeyGPU Thread GPU#3: 2^21.09 kangaroos in 22114.7ms
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos in 22203.1ms
[9802.06 MK/s][GPU 9802.06 MK/s][Count 2^44.27][Dead 2][41:14 (Avg 38:27)][1550.0MB]
Key# 0 Pub:  0x0329C4574A4FD8C810B7E42A4B398882B381BCD85E40C6883712912D167C83E73A
       Priv: 0x11720C4F018D51B8CEBBA8

Done: Total time 41:48

1x RTX 2070 with PL=120W:
Code:
./Kangaroo14 -t 0 -d 20 -gpu -gpuId 0 in85.txt
Kangaroo v1.4beta
Start:FFFFFFFFFFFFFFFFFFFFF
Stop :1FFFFFFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^85
Jump Avg distance: 2^42.03
Number of kangaroos: 2^20.17
Suggested DP: 21
Expected operations: 2^43.65
Expected RAM: 1003.7MB
DP size: 20 [0xfffff00000000000]
GPU: GPU #0 GeForce RTX 2070 (36x64 cores) Grid(72x128) (117.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.17 kangaroos in 9119.2ms
[768.80 MK/s][GPU 768.80 MK/s][Count 2^43.99][Dead 3][07:13:01 (Avg 04:58:44)][1278.3MB]
Key# 0 Pub:  0x0329C4574A4FD8C810B7E42A4B398882B381BCD85E40C6883712912D167C83E73A
       Priv: 0x11720C4F018D51B8CEBBA8

Done: Total time 07:13:16



Hi! How you connect 8RTX ? Rizer? Thanks
sr. member
Activity: 462
Merit: 701
May 04, 2020, 10:07:15 AM
#73
Many thanks.
Note that here you have an expected of 2^21 because it take in consideration the overload due dp and the number of kangaroos.
I see that it over estimate a bit.
These results are very interesting to adapt the calculation of expected memory and operations.

To answer a bit in details to HardwareCollector, these calculations are dependent on the range size, the number of kangaroo (so the number of thread, grid size, number of GPU,... ) and dp (number of distinguished point bits) so it is normal that you get different results on different configurations.

The calculation of suggested DP is a poor approximation , it need some works to improve it.

Jump to: