Topic: Pollard's kangaroo ECDLP solver - page 136. (Read 55544 times)

legendary
Activity: 1914
Merit: 2071
May 10, 2020, 05:16:46 AM
The current record for an ECDLP solved on a curve over a prime field is 114 bits:

https://ellipticnews.wordpress.com/2018/04/22/114-bit-ecdlp-solved-on-a-curve-with-automorphisms-over-a-prime-field/

Quote
The curve has j-invariant 0, and so has an automorphism group of size 6. Hence, it is possible to perform the Pollard rho algorithm using equivalence classes of size 6.

They used n = 1024 partitions for the random walk, and the “hash function” was chosen to be the least significant log_2(n) bits of the x-coordinate of the current curve point.

The paper writes that “The parallel implementation of the rho method by adopting a client-server model, using 2000 CPU cores took about 6 months”. They seem to have been lucky to get a collision earlier than expected: “the result of the authors attack is little bit better than the average number of rational points where a simple collision attack stops.”
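
A minimal Python sketch of the partition "hash function" described in the quote, assuming the x-coordinate is available as a plain integer. The name `partition_index` is made up for this illustration and does not come from the paper.

```python
def partition_index(x_coordinate: int, n_partitions: int = 1024) -> int:
    """Return the least significant log2(n_partitions) bits of x_coordinate."""
    # Masking with n - 1 only selects the low bits when n is a power of two.
    if n_partitions <= 0 or n_partitions & (n_partitions - 1):
        raise ValueError("n_partitions must be a power of two")
    return x_coordinate & (n_partitions - 1)


# With n = 1024 the index is the low 10 bits of x.
example_index = partition_index(0x12345)
print(hex(example_index))
```

Each random walk step would then use this index to pick one of the n precomputed jumps.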



For secp256k1, the current record is an ECDLP solved in an interval of 104 bits (key #105 of the "puzzle transaction"):

https://www.blockchain.com/btc/tx/08389f34c98c606322740c0be6a7125d9860bb8d5cb182c02f98461e5fa6cd15

That key was found on 2019-09-23.


This is the next public key (#110, with a private key in the range [2^109, 2^110 - 1], 109 bits). People have been looking for it for over 7.5 months (about 225 days):
 
0309976ba5570966bf889196b7fdf5a0f9a1e9ab340556ec29f8bb60599616167d

(address: 12JzYkkN76xkwvcPT6AWKZtGX6w2LAgsJg)

Pollard's kangaroo ECDLP solver needs about 2*(2^(109/2)) = 2^55.5 steps to retrieve this private key. A GPU that computes 2^30 steps/sec would take 2^25.5 seconds, about 550 days.
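
The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 2*sqrt(N) step count and the 2^30 steps/sec rate are the figures from this post, and the function name is made up for the illustration.

```python
SECONDS_PER_DAY = 86400


def kangaroo_log2_steps(interval_bits: float) -> float:
    """log2 of 2 * 2^(interval_bits / 2), the step count used in the post."""
    return 1 + interval_bits / 2


log2_steps = kangaroo_log2_steps(109)   # exponent of the step count
log2_rate = 30                          # 2^30 steps per second (figure from the post)
log2_seconds = log2_steps - log2_rate   # exponent of the running time in seconds
days = 2.0 ** log2_seconds / SECONDS_PER_DAY

print(f"steps 2^{log2_steps}, time 2^{log2_seconds} s, about {days:.0f} days")
```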



A good article / recap about ECDLP:

https://ellipticnews.wordpress.com/2016/04/07/ecdlp-in-less-than-square-root-time/

sr. member
Activity: 462
Merit: 696
May 09, 2020, 08:35:58 AM
Yes, the jumps have to be fixed; otherwise the paths differ and the work files become incompatible.
Hope the cafe will be good Smiley
member
Activity: 144
Merit: 10
May 09, 2020, 08:00:08 AM

For testing there are 2 things: you need a large number of tests, and the keys must be uniformly distributed in the range. If you always test with the same key, it is not representative.


Thanks for pointing that out again. Somehow I missed it the first time you mentioned that the seed for creating the kangaroo jumps is now fixed.

Code:
// https://github.com/JeanLucPons/Kangaroo/blob/e7f481f6ad86338288e43cb8758700459ac4b800/Kangaroo.cpp#L638
// Kangaroo jumps
// Constant seed for compatibilty of workfiles
rseed(0x600DCAFE);
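
Why a constant seed keeps work files compatible can be illustrated with a small Python sketch. Only the seed value 0x600DCAFE comes from the snippet above. Python's `random.Random` stands in for the program's own generator, and the table size and jump width are hypothetical values picked for the example.

```python
import random

JUMP_SEED = 0x600DCAFE  # constant from the quoted Kangaroo.cpp line
NB_JUMP = 32            # hypothetical table size, for illustration only
MAX_JUMP_BITS = 40      # hypothetical jump width, for illustration only


def make_jump_table(seed: int) -> list[int]:
    """Build a table of pseudo-random jump distances from a given seed."""
    rng = random.Random(seed)  # stand-in generator, not the one used by Kangaroo
    return [rng.getrandbits(MAX_JUMP_BITS) + 1 for _ in range(NB_JUMP)]


table_a = make_jump_table(JUMP_SEED)      # first run
table_b = make_jump_table(JUMP_SEED)      # resumed run with the same seed
table_c = make_jump_table(JUMP_SEED + 1)  # run with a different seed

print("same seed, same jumps:", table_a == table_b)
print("other seed, same jumps:", table_a == table_c)
```

With identical jump tables a kangaroo landing on a given point always continues along the same path, so distinguished points saved by one run stay meaningful for another.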
sr. member
Activity: 462
Merit: 696
May 09, 2020, 12:15:39 AM
@MrFreeDragon:

I'm looking at your test.

@HardwareCollector:

I ran a test of 100 trials: 40-bit range, 2^12 kangaroos, dp=10 (roughly the square root of your config; dp=9.5 is not possible Wink).
The 100 keys are uniformly distributed in the range.

[ 98] 2^21.962 Dead:0 Avg:2^22.016 DeadAvg:1.4 (2^22.603)
[ 99] 2^21.276 Dead:0 Avg:2^22.010 DeadAvg:1.4 (2^22.603)
[100] 2^20.427 Dead:0 Avg:2^22.001 DeadAvg:1.4 (2^22.603)

The calculation of the overhead gives 2^22.603 and the actual average is 2^22.001. The exact average for dp=0 is 2^21.056.

In this case it overestimates, because I don't know the exact analytic expression of the time complexity of the DP method.
I know that it converges to ~cuberoot(16 * numberOfKangaroo * N * 2^dp) when numberOfKangaroo >> sqrt(N)/2^dp, and that numberOfKangaroo * 2^dp is an asymptote when numberOfKangaroo << sqrt(N)/2^dp. Here N is the range size and 2^dp is lower than sqrt(N).
In this test we are somewhere between the 2 cases, where the approximation is not really good.

To compute the exact expression, it is like the birthday paradox with 2 tables, but drawing bunches of 2^dp random numbers nbKangaroo times, alternately in the 2 tables. Quite a nightmare, never detailed in any of the papers I read.

You can also see that the last 3 trials were under the average. This is because the expected number of operations also depends on where the private key is in the range, and because the deviation is large.

For testing there are 2 things: you need a large number of tests, and the keys must be uniformly distributed in the range. If you always test with the same key, it is not representative.

Edit: correction, it was log2(numberOfKangaroo) >> log2(sqrt(N)) - log2(2^dp), so numberOfKangaroo >> sqrt(N)/2^dp.
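
The two expressions mentioned in this post can be evaluated for the 40-bit test with a short Python sketch. It assumes the reported overhead is simply the dp=0 average plus numberOfKangaroo * 2^dp. That is a guess that fits the printed figure, not something confirmed from the source code, and both function names are made up here.

```python
import math


def log2_additive_estimate(log2_dp0_average: float, log2_kangaroos: float, dp_bits: int) -> float:
    """log2 of (dp=0 average + numberOfKangaroo * 2^dp); an assumed form."""
    return math.log2(2.0 ** log2_dp0_average + 2.0 ** (log2_kangaroos + dp_bits))


def log2_cube_root_estimate(range_bits: float, log2_kangaroos: float, dp_bits: int) -> float:
    """log2 of cuberoot(16 * numberOfKangaroo * N * 2^dp), with N = 2^range_bits."""
    return (4 + log2_kangaroos + range_bits + dp_bits) / 3.0


# Parameters of the test above: 40-bit range, 2^12 kangaroos, dp=10,
# and the dp=0 average of 2^21.056 quoted in the post.
additive = log2_additive_estimate(21.056, 12, 10)
cube_root = log2_cube_root_estimate(40, 12, 10)

print(f"additive estimate : 2^{additive:.3f}")
print(f"cube-root estimate: 2^{cube_root:.3f}")
```

For these parameters the additive form lands near the reported 2^22.6, while the cube-root form gives 2^22, close to the measured average. That is a single data point, so it should not be read as more than a sanity check.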
member
Activity: 144
Merit: 10
May 08, 2020, 06:48:54 PM
I did run some more tests with 80-bit intervals and my luck seems to be very consistent with d=19-20.

1x RTX 2070, Total time 33:27
Code:
./Kangaroos -t 0 -d 20 -gpu -gpuId 0 input_80_bit_interval.txt
Kangaroo v1.4
Start:7FFFFFFFFFFFFFFFFFFF
Stop :FFFFFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^20.17
Suggested DP: 19
Expected operations: 2^41.68
Expected RAM: 140.1MB
DP size: 20 [0xfffff00000000000]
GPU: GPU #0 GeForce RTX 2070 (36x64 cores) Grid(72x128) (117.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.17 kangaroos [9.9s]
[862.68 MK/s][GPU 862.68 MK/s][Count 2^40.45][Dead 0][33:16 (Avg 01:08:00)][45.9/79.7MB]
Key# 0 [1S]Pub:  0x037E1238F7B1CE757DF94FAA9A2EB261BF0AEB9F84DBF81212104E78931C2A19DC
       Priv: 0xEA1A5C66DCC11B5AD180

Done: Total time 33:27

8x RTX 2080 Ti, Total time 02:28
Code:
./Kangaroos -t 0 -d 19 -gpu -gpuId 0,1,2,3,4,5,6,7 -w server_1 -wi 300 input_80_bit_interval.txt
Kangaroo v1.4
Start:7FFFFFFFFFFFFFFFFFFF
Stop :FFFFFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^24.09
Suggested DP: 15
Expected operations: 2^43.40
Expected RAM: 858.1MB
DP size: 19 [0xffffe00000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #7 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#7: creating kangaroos...
GPU: GPU #2 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#2: creating kangaroos...
GPU: GPU #3 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#3: creating kangaroos...
GPU: GPU #6 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#6: creating kangaroos...
GPU: GPU #5 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#5: creating kangaroos...
GPU: GPU #4 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#4: creating kangaroos...
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
SolveKeyGPU Thread GPU#6: 2^21.09 kangaroos [21.1s]
SolveKeyGPU Thread GPU#2: 2^21.09 kangaroos [21.2s]
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos [21.3s]
SolveKeyGPU Thread GPU#7: 2^21.09 kangaroos [23.4s]
SolveKeyGPU Thread GPU#3: 2^21.09 kangaroos [23.7s]
SolveKeyGPU Thread GPU#5: 2^21.09 kangaroos [24.0s]
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos [24.6s]
SolveKeyGPU Thread GPU#4: 2^21.09 kangaroos [24.8s]
[9827.80 MK/s][GPU 9827.80 MK/s][Count 2^39.93][Dead 0][02:00 (Avg 19:43)][63.1/97.2MB]
Key# 0 [1S]Pub:  0x037E1238F7B1CE757DF94FAA9A2EB261BF0AEB9F84DBF81212104E78931C2A19DC
       Priv: 0xEA1A5C66DCC11B5AD180

Done: Total time 02:28
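
The "DP size" lines in these logs pair a bit count with a 64-bit mask. A minimal Python sketch of that relationship follows. It assumes that a point counts as distinguished when the masked top bits of its x-coordinate are all zero. That test is an assumption made for the illustration, and the function names are not taken from the program.

```python
def dp_mask(dp_bits: int) -> int:
    """64-bit mask with the dp_bits most significant bits set."""
    if not 0 <= dp_bits <= 64:
        raise ValueError("dp_bits must be between 0 and 64")
    return ((1 << dp_bits) - 1) << (64 - dp_bits)


def is_distinguished(x_top64: int, dp_bits: int) -> bool:
    """Assumed test: the masked bits of the top 64 bits of x are all zero."""
    return (x_top64 & dp_mask(dp_bits)) == 0


# Masks for the two runs above (-d 20 and -d 19).
print(f"DP size: 20 [0x{dp_mask(20):016x}]")
print(f"DP size: 19 [0x{dp_mask(19):016x}]")
```

Under that assumption about one point in 2^dp is distinguished, which is why a larger dp means fewer stored points but longer paths between them.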
sr. member
Activity: 462
Merit: 696
May 08, 2020, 03:29:10 PM
2 times more kangaroos with the same dp means an overhead 2 times larger, and it was solved in 2^42 (no luck here), plus 6GB written to disk.
While the file is being saved the GPUs wait, because the table is locked.
I'll have a look at your results in detail tomorrow Wink
Thanks for the test...
sr. member
Activity: 443
Merit: 350
May 08, 2020, 03:21:33 PM
-snip-
I tried to continue the job 7 times - the 1st time the 2080ti solved the key in 1 extra minute (with 2^41.9 total operations), but I stopped all the other 6 attempts when they reached 2^42.3 group operations (2 times more than expected).

However, I'm also a bit surprised here. I would need more info to try to understand, especially the number of kangaroos in each configuration and the evolution of the number of distinguished point bits in the work files...



I ran the same test again with your recent release (for a 2^80 range key):

1) Start work on 2x2080ti (12min work)
2) Continue work on 1x2080ti (15min work)
3a) Continue work from (2) on 2x2080ti (launched 5 times)
3b) Continue work from (2) on Tesla T4 (launched one time)

At the start (1) the expected time to solve was 19-20 min; when continuing on the less powerful machine the expected time changed to 35 min.
In fact, the continued job in (3a) was solved in about 40 min on average, and in (3b) it was solved in 1h 48min (expected only 1h 11min).

Interestingly, when the job continued on the lower-end GPU (Tesla T4), the program expected fewer operations (2^41.16 compared to 2^41.38 at the start of the work). Probably the expected number of operations is calculated from the suggested DP, while the work uses the DP from the original run, for which the expected number of operations was different.

Here are the statistics from all the steps:

Code:
---------------------------------------------------------
(1) Start work on 2x2080ti

$ ./kangaroo -gpu -gpuId 0,1 -w work2080 -wi 150 -t 0 VC_CUDA8/in80.txt
Kangaroo v1.4
Start:B60E83280258A40F9CDF1649744D730D6E939DE92A2B00000000000000000000
Stop :B60E83280258A40F9CDF1649744D730D6E939DE92A2BE19BFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^22.09
Suggested DP: 17
Expected operations: 2^41.38
Expected RAM: 846.5MB
DP size: 17 [0xffff800000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos [12.9s]
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos [14.1s]
[2454.82 MK/s][GPU 2454.82 MK/s][Count 2^38.08][Dead 0][02:14 (Avg 19:28)][69.5/103.6MB]
SaveWork: work2080...............done [69.6 MB] [00s] Fri May  8 16:47:27 2020
[2453.25 MK/s][GPU 2453.25 MK/s][Count 2^39.16][Dead 0][04:45 (Avg 19:28)][144.8/188.0MB]
SaveWork: work2080...............done [144.8 MB] [01s] Fri May  8 16:49:59 2020
[2448.46 MK/s][GPU 2448.46 MK/s][Count 2^39.77][Dead 0][07:16 (Avg 19:31)][219.8/281.2MB]
SaveWork: work2080...............done [219.8 MB] [01s] Fri May  8 16:52:31 2020
[2450.64 MK/s][GPU 2450.64 MK/s][Count 2^40.19][Dead 0][09:46 (Avg 19:30)][293.8/373.8MB]
SaveWork: work2080...............done [293.9 MB] [02s] Fri May  8 16:55:01 2020
[2445.49 MK/s][GPU 2445.49 MK/s][Count 2^40.51][Dead 0][12:17 (Avg 19:32)][367.9/466.4MB]
SaveWork: work2080...............done [367.9 MB] [02s] Fri May  8 16:57:32 2020
[2292.20 MK/s][GPU 2292.20 MK/s][Count 2^40.52][Dead 0][12:21 (Avg 20:50)][369.0/467.7MB]  ^C


---------------------------------------------------------
(2) Continue work on 1x2080ti

$ ./kangaroo -gpu -gpuId 0 -i work2080 -w work2080 -wi 150 -t 0
Kangaroo v1.4
Loading: work2080
Start:B60E83280258A40F9CDF1649744D730D6E939DE92A2B00000000000000000000
Stop :B60E83280258A40F9CDF1649744D730D6E939DE92A2BE19BFFFFFFFFFFFFFFFF
Keys :1
LoadWork: [HashTalbe 367.9/466.4MB] [01s]
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^21.09
Suggested DP: 18
Expected operations: 2^41.23
Expected RAM: 761.5MB
DP size: 17 [0xffff800000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos [11.8s]
[1224.73 MK/s][GPU 1224.73 MK/s][Count 2^40.64][Dead 0][14:33 (Avg 35:02)][402.0/508.9MB]
SaveWork: work2080...............done [402.0 MB] [01s] Fri May  8 17:00:46 2020
[1224.25 MK/s][GPU 1224.25 MK/s][Count 2^40.77][Dead 0][17:03 (Avg 35:03)][439.1/555.3MB]
SaveWork: work2080...............done [439.1 MB] [02s] Fri May  8 17:03:16 2020
[1224.60 MK/s][GPU 1224.60 MK/s][Count 2^40.89][Dead 0][19:34 (Avg 35:02)][476.1/601.6MB]
SaveWork: work2080...............done [476.1 MB] [02s] Fri May  8 17:05:47 2020
[1225.37 MK/s][GPU 1225.37 MK/s][Count 2^41.00][Dead 0][22:05 (Avg 35:01)][513.1/647.8MB]
SaveWork: work2080...............done [513.1 MB] [02s] Fri May  8 17:08:18 2020
[1223.11 MK/s][GPU 1223.11 MK/s][Count 2^41.10][Dead 0][24:36 (Avg 35:05)][550.1/694.1MB]
SaveWork: work2080...............done [550.1 MB] [03s] Fri May  8 17:10:50 2020
[1225.35 MK/s][GPU 1225.35 MK/s][Count 2^41.19][Dead 0][27:08 (Avg 35:01)][587.1/740.4MB]
SaveWork: work2080...............done [587.1 MB] [03s] Fri May  8 17:13:23 2020
[1137.83 MK/s][GPU 1137.83 MK/s][Count 2^41.19][Dead 0][27:14 (Avg 37:43)][587.6/741.0MB]  ^C

---------------------------------------------------------
(3a) Continue work on 2x2080ti (5 times):

$ ./kangaroo -gpu -gpuId 0,1 -i work2080 -t 0
Kangaroo v1.4
Loading: work2080
Start:B60E83280258A40F9CDF1649744D730D6E939DE92A2B00000000000000000000
Stop :B60E83280258A40F9CDF1649744D730D6E939DE92A2BE19BFFFFFFFFFFFFFFFF
Keys :1
LoadWork: [HashTalbe 587.1/740.4MB] [01s]
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^22.09
Suggested DP: 17
Expected operations: 2^41.38
Expected RAM: 846.5MB
DP size: 17 [0xffff800000000000]
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos [12.1s]
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos [14.4s]
[2444.55 MK/s][GPU 2444.55 MK/s][Count 2^41.48][Dead 0][31:31 (Avg 19:32)][718.7/904.8MB]
Key# 0 [1S]Pub:  0x0284A930C243C0C2F67FDE3E0A98CE6DB0DB9AB5570DAD9338CADE6D181A431246
       Priv: 0xB60E83280258A40F9CDF1649744D730D6E939DE92A2BE19B0D19A3D64A1DE032

Done: Total time 31:49


$ ./kangaroo -gpu -gpuId 0,1 -i work2080 -t 0
Kangaroo v1.4
Loading: work2080
Start:B60E83280258A40F9CDF1649744D730D6E939DE92A2B00000000000000000000
Stop :B60E83280258A40F9CDF1649744D730D6E939DE92A2BE19BFFFFFFFFFFFFFFFF
Keys :1
LoadWork: [HashTalbe 587.1/740.4MB] [01s]
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^22.09
Suggested DP: 17
Expected operations: 2^41.38
Expected RAM: 846.5MB
DP size: 17 [0xffff800000000000]
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos [12.9s]
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos [13.1s]
[2436.24 MK/s][GPU 2436.24 MK/s][Count 2^42.54][Dead 2][57:24 (Avg 19:36)][1491.2/1870.5MB]
Key# 0 [1S]Pub:  0x0284A930C243C0C2F67FDE3E0A98CE6DB0DB9AB5570DAD9338CADE6D181A431246
       Priv: 0xB60E83280258A40F9CDF1649744D730D6E939DE92A2BE19B0D19A3D64A1DE032

Done: Total time 57:54


$ ./kangaroo -gpu -gpuId 0,1 -i work2080 -t 0
Kangaroo v1.4
Loading: work2080
Start:B60E83280258A40F9CDF1649744D730D6E939DE92A2B00000000000000000000
Stop :B60E83280258A40F9CDF1649744D730D6E939DE92A2BE19BFFFFFFFFFFFFFFFF
Keys :1
LoadWork: [HashTalbe 587.1/740.4MB] [01s]
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^22.09
Suggested DP: 17
Expected operations: 2^41.38
Expected RAM: 846.5MB
DP size: 17 [0xffff800000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos [12.9s]
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos [13.5s]
[2439.37 MK/s][GPU 2439.37 MK/s][Count 2^41.97][Dead 3][41:04 (Avg 19:35)][1003.9/1261.4MB]
Key# 0 [1S]Pub:  0x0284A930C243C0C2F67FDE3E0A98CE6DB0DB9AB5570DAD9338CADE6D181A431246
       Priv: 0xB60E83280258A40F9CDF1649744D730D6E939DE92A2BE19B0D19A3D64A1DE032

Done: Total time 41:26


$ ./kangaroo -gpu -gpuId 0,1 -i work2080 -t 0
Kangaroo v1.4
Loading: work2080
Start:B60E83280258A40F9CDF1649744D730D6E939DE92A2B00000000000000000000
Stop :B60E83280258A40F9CDF1649744D730D6E939DE92A2BE19BFFFFFFFFFFFFFFFF
Keys :1
LoadWork: [HashTalbe 587.1/740.4MB] [01s]
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^22.09
Suggested DP: 17
Expected operations: 2^41.38
Expected RAM: 846.5MB
DP size: 17 [0xffff800000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos [12.5s]
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos [13.9s]
[2446.30 MK/s][GPU 2446.30 MK/s][Count 2^41.35][Dead 0][29:25 (Avg 19:32)][655.4/825.8MB]
Key# 0 [1S]Pub:  0x0284A930C243C0C2F67FDE3E0A98CE6DB0DB9AB5570DAD9338CADE6D181A431246
       Priv: 0xB60E83280258A40F9CDF1649744D730D6E939DE92A2BE19B0D19A3D64A1DE032

Done: Total time 29:42


$ ./kangaroo -gpu -gpuId 0,1 -i work2080 -t 0
Kangaroo v1.4
Loading: work2080
Start:B60E83280258A40F9CDF1649744D730D6E939DE92A2B00000000000000000000
Stop :B60E83280258A40F9CDF1649744D730D6E939DE92A2BE19BFFFFFFFFFFFFFFFF
Keys :1
LoadWork: [HashTalbe 587.1/740.4MB] [01s]
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^22.09
Suggested DP: 17
Expected operations: 2^41.38
Expected RAM: 846.5MB
DP size: 17 [0xffff800000000000]
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos [11.9s]
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos [15.2s]
[2442.00 MK/s][GPU 2442.00 MK/s][Count 2^42.02][Dead 1][42:16 (Avg 19:34)][1040.2/1306.8MB]
Key# 0 [1S]Pub:  0x0284A930C243C0C2F67FDE3E0A98CE6DB0DB9AB5570DAD9338CADE6D181A431246
       Priv: 0xB60E83280258A40F9CDF1649744D730D6E939DE92A2BE19B0D19A3D64A1DE032

Done: Total time 42:42

---------------------------------------------------------
(3b) Continue work on Tesla T4

$ ./kangaroo -t 0 -gpu -i work_from2080 -w work_teslaT4 -wi 300
Kangaroo v1.4
Loading: work_from2080
Start:B60E83280258A40F9CDF1649744D730D6E939DE92A2B00000000000000000000
Stop :B60E83280258A40F9CDF1649744D730D6E939DE92A2BE19BFFFFFFFFFFFFFFFF
Keys :1
LoadWork: [HashTalbe 587.1/740.4MB] [01s]
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^20.32
Suggested DP: 19
Expected operations: 2^41.16
Expected RAM: 726.5MB
DP size: 17 [0xffff800000000000]
GPU: GPU #0 Tesla T4 (40x64 cores) Grid(80x128) (129.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.32 kangaroos [8.1s]
[572.34 MK/s][GPU 572.34 MK/s][Count 2^41.27][Dead 1][31:59 (Avg 01:11:29)][621.1/782.9MB]
SaveWork: work_teslaT4...............done [621.2 MB] [02s] Fri May  8 17:34:58 2020
[574.98 MK/s][GPU 574.98 MK/s][Count 2^41.35][Dead 1][37:00 (Avg 01:11:09)][656.1/826.7MB]
SaveWork: work_teslaT4...............done [656.1 MB] [03s] Fri May  8 17:40:01 2020
[573.82 MK/s][GPU 573.82 MK/s][Count 2^41.43][Dead 1][42:01 (Avg 01:11:18)][690.8/870.1MB]
SaveWork: work_teslaT4...............done [690.8 MB] [04s] Fri May  8 17:45:03 2020
[573.58 MK/s][GPU 573.58 MK/s][Count 2^41.50][Dead 1][47:02 (Avg 01:11:20)][725.6/913.5MB]
SaveWork: work_teslaT4...............done [725.6 MB] [04s] Fri May  8 17:50:04 2020
[573.76 MK/s][GPU 573.76 MK/s][Count 2^41.57][Dead 1][52:03 (Avg 01:11:18)][760.3/956.9MB]
SaveWork: work_teslaT4...............done [760.3 MB] [04s] Fri May  8 17:55:05 2020
[575.16 MK/s][GPU 575.16 MK/s][Count 2^41.63][Dead 1][57:05 (Avg 01:11:08)][795.1/1000.3MB]
SaveWork: work_teslaT4...............done [795.1 MB] [05s] Fri May  8 18:00:08 2020
[573.54 MK/s][GPU 573.54 MK/s][Count 2^41.69][Dead 1][01:02:05 (Avg 01:11:20)][829.6/1043.5MB]
SaveWork: work_teslaT4...............done [829.6 MB] [05s] Fri May  8 18:05:08 2020
[573.72 MK/s][GPU 573.72 MK/s][Count 2^41.75][Dead 1][01:07:05 (Avg 01:11:19)][864.0/1086.5MB]
SaveWork: work_teslaT4...............done [864.0 MB] [06s] Fri May  8 18:10:09 2020
[575.11 MK/s][GPU 575.11 MK/s][Count 2^41.81][Dead 1][01:12:06 (Avg 01:11:08)][898.5/1129.7MB]
SaveWork: work_teslaT4...............done [898.5 MB] [06s] Fri May  8 18:15:10 2020
[573.57 MK/s][GPU 573.57 MK/s][Count 2^41.86][Dead 1][01:17:08 (Avg 01:11:20)][932.9/1172.6MB]
SaveWork: work_teslaT4...............done [932.9 MB] [06s] Fri May  8 18:20:11 2020
[575.05 MK/s][GPU 575.05 MK/s][Count 2^41.91][Dead 1][01:22:09 (Avg 01:11:09)][967.4/1215.8MB]
SaveWork: work_teslaT4...............done [967.4 MB] [07s] Fri May  8 18:25:13 2020
[574.34 MK/s][GPU 574.34 MK/s][Count 2^41.97][Dead 2][01:27:11 (Avg 01:11:14)][1001.9/1258.8MB]
SaveWork: work_teslaT4...............done [1001.9 MB] [07s] Fri May  8 18:30:15 2020
[574.75 MK/s][GPU 574.75 MK/s][Count 2^42.01][Dead 2][01:32:11 (Avg 01:11:11)][1036.1/1301.7MB]
SaveWork: work_teslaT4...............done [1036.1 MB] [07s] Fri May  8 18:35:16 2020
[573.45 MK/s][GPU 573.45 MK/s][Count 2^42.06][Dead 2][01:37:11 (Avg 01:11:21)][1070.3/1344.4MB]
SaveWork: work_teslaT4...............done [1070.4 MB] [08s] Fri May  8 18:40:17 2020
[575.05 MK/s][GPU 575.05 MK/s][Count 2^42.11][Dead 2][01:42:13 (Avg 01:11:09)][1104.6/1387.2MB]
SaveWork: work_teslaT4...............done [1104.6 MB] [08s] Fri May  8 18:45:19 2020
[575.27 MK/s][GPU 575.27 MK/s][Count 2^42.15][Dead 2][01:47:14 (Avg 01:11:07)][1138.8/1430.1MB]
SaveWork: work_teslaT4...............done [1138.8 MB] [08s] Fri May  8 18:50:20 2020
[572.15 MK/s][GPU 572.15 MK/s][Count 2^42.16][Dead 2][01:47:57 (Avg 01:11:30)][1142.8/1435.0MB]
Key# 0 [1S]Pub:  0x0284A930C243C0C2F67FDE3E0A98CE6DB0DB9AB5570DAD9338CADE6D181A431246
       Priv: 0xB60E83280258A40F9CDF1649744D730D6E939DE92A2BE19B0D19A3D64A1DE032

Done: Total time 01:48:18

---------------------------------------------------------
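
As a cross-check of the "Suggested DP" and "Expected operations" lines in the v1.4 logs above, here is a rough Python sketch. Both formulas are guesses fitted to the printed numbers: suggested DP as floor(rangeBits/2 - log2(kangaroos)), and expected operations as 2.08*sqrt(N) + kangaroos * 2^dp. Neither is confirmed from the source code, the 2.08 constant is an assumption, and other versions (for example v1.4beta) print values that do not fit.

```python
import math

Z0 = 2.08  # assumed constant for the dp=0 average, Z0 * sqrt(N)


def suggested_dp(range_bits: float, log2_kangaroos: float) -> int:
    """Guessed rule: floor(range_bits / 2 - log2(number of kangaroos))."""
    return math.floor(range_bits / 2 - log2_kangaroos)


def log2_expected_ops(range_bits: float, log2_kangaroos: float, dp_bits: int) -> float:
    """Guessed rule: log2(Z0 * sqrt(N) + kangaroos * 2^dp)."""
    return math.log2(Z0 * 2.0 ** (range_bits / 2) + 2.0 ** (log2_kangaroos + dp_bits))


# (label, log2 kangaroos, DP in use, logged suggested DP, logged log2 expected ops)
runs = [
    ("(1) 2x2080ti", 22.09, 17, 17, 41.38),
    ("(2) 1x2080ti", 21.09, 17, 18, 41.23),
    ("(3b) Tesla T4", 20.32, 17, 19, 41.16),
]

for label, log2_k, dp_used, logged_dp, logged_ops in runs:
    ops = log2_expected_ops(80, log2_k, dp_used)
    print(f"{label}: suggested DP {suggested_dp(80, log2_k)} (log {logged_dp}), "
          f"expected 2^{ops:.2f} (log 2^{logged_ops})")
```

If these guesses are right, the expected operations in the continued runs seem to follow the DP in use (17) rather than the suggested DP, and the lower figure on the Tesla T4 would simply reflect its smaller number of kangaroos.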
sr. member
Activity: 443
Merit: 350
May 08, 2020, 02:27:48 PM
@MrFreeDragon
You have to take into consideration that in the first test it wrote a ~2GB file and a ~4GB one, and created 2^24 kangaroos.
In the second, DP18 with 2^24 kangaroos gives an overhead of ~2^42, so more than the average. However, it was solved in only 2^41, more than 2 times less than expected: he was lucky Smiley
This is the problem with multiple powerful GPUs: with 2^24 kangaroos, an 80-bit range is too small.
The client/server mode is a good idea as it will avoid memory consumption on the client side, but the server should be well tuned; the overhead due to DP still applies.

@HardwareCollector

I agree

I compared the results for the default settings (with the DP selected by the program).
In HardwareCollector's 1st case the DP was 15, as in my tests.
He had 2^24 kangaroos created, but I had 2^23 (because I had just 4x2080ti and he used 8x2080ti): 2 times more cards, so 2 times more kangaroos.

However, the number of kangaroos created per card is the same (for 8x as for 4x): 2^21.09. So it should not be an issue...

sr. member
Activity: 462
Merit: 696
May 08, 2020, 01:53:00 PM
@MrFreeDragon
You have to take into consideration that in the first test it wrote a ~2GB file and a ~4GB one, and created 2^24 kangaroos.
In the second, DP18 with 2^24 kangaroos gives an overhead of ~2^42, so more than the average. However, it was solved in only 2^41, more than 2 times less than expected: he was lucky Smiley
This is the problem with multiple powerful GPUs: with 2^24 kangaroos, an 80-bit range is too small.
The client/server mode is a good idea as it will avoid memory consumption on the client side, but the server should be well tuned; the overhead due to DP still applies.

@HardwareCollector

I agree
member
Activity: 144
Merit: 10
May 08, 2020, 01:42:15 PM

In your test you used 8xRTX2080ti. It is strange to me that you spent 15 minutes to solve one key. Earlier I ran the test for an 80-bit range with 4xRTX2080ti, and in all cases the key was solved in 6-7 minutes. I guess something is wrong with your 15-minute result.

I am not that surprised because hardware configurations do matter, which seems to be an issue on my end. The optimal number of kangaroos launched and distinguished point mask is based on your hardware configuration. Ideally, with a centralized server implementation, it should be solely based on the interval size and TMTO that you are satisfied with.
sr. member
Activity: 443
Merit: 350
May 08, 2020, 12:21:10 PM
-snip-
Start:7FFFFFFFFFFFFFFFFFFF
Stop :FFFFFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^80
.....
Done: Total time 14:53

In your test you used 8xRTX2080ti. It is strange to me that you spent 15 minutes to solve one key. Earlier I ran the test for an 80-bit range with 4xRTX2080ti, and in all cases the key was solved in 6-7 minutes. I guess something is wrong with your 15-minute result.

For the test I used the key from the repository, VC_CUDA8/in80.txt.
This key is "in open space", not shifted to the 0 value.

Code:
TEST1:

$ ./kangaroo -gpu -gpuId 0,1,2,3 -t 0 VC_CUDA8/in80.txt
Kangaroo v1.4beta
Start:B60E83280258A40F9CDF1649744D730D6E939DE92A2B00000000000000000000
Stop :B60E83280258A40F9CDF1649744D730D6E939DE92A2BE19BFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^39.97
Number of kangaroos: 2^23.09
Suggested DP: 15
Expected operations: 2^41.15
Expected RAM: 5672.9MB
DP size: 15 [0xfffe000000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
GPU: GPU #3 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#3: creating kangaroos...
GPU: GPU #2 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#2: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos in 12460.6ms
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos in 13354.0ms
SolveKeyGPU Thread GPU#2: 2^21.09 kangaroos in 14831.5ms
SolveKeyGPU Thread GPU#3: 2^21.09 kangaroos in 15159.3ms
[4917.62 MK/s][GPU 4917.62 MK/s][Count 2^40.26][Dead 0][05:04 (Avg 08:15)][3066.1MB]
Key# 0 Pub:  0x0284A930C243C0C2F67FDE3E0A98CE6DB0DB9AB5570DAD9338CADE6D181A431246
       Priv: 0xB60E83280258A40F9CDF1649744D730D6E939DE92A2BE19B0D19A3D64A1DE032

Done: Total time 05:49

------------------------------------------------------------------

TEST2:

$ ./kangaroo -gpu -gpuId 0,1,2,3 -t 0 VC_CUDA8/in80.txt
Kangaroo v1.4beta
Start:B60E83280258A40F9CDF1649744D730D6E939DE92A2B00000000000000000000
Stop :B60E83280258A40F9CDF1649744D730D6E939DE92A2BE19BFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.04
Number of kangaroos: 2^23.09
Suggested DP: 15
Expected operations: 2^41.15
Expected RAM: 5672.9MB
DP size: 15 [0xfffe000000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
GPU: GPU #2 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#2: creating kangaroos...
GPU: GPU #3 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#3: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos in 12070.3ms
SolveKeyGPU Thread GPU#3: 2^21.09 kangaroos in 12887.1ms
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos in 13696.3ms
SolveKeyGPU Thread GPU#2: 2^21.09 kangaroos in 14832.6ms
[4925.05 MK/s][GPU 4925.05 MK/s][Count 2^40.61][Dead 0][06:28 (Avg 08:14)][3907.7MB]
Key# 0 Pub:  0x0284A930C243C0C2F67FDE3E0A98CE6DB0DB9AB5570DAD9338CADE6D181A431246
       Priv: 0xB60E83280258A40F9CDF1649744D730D6E939DE92A2BE19B0D19A3D64A1DE032

Done: Total time 07:19

------------------------------------------------------------------

TEST3 (the same key repeated 3 times, and shift range to the end (close to order)):

$ ./kangaroo -gpu -gpuId 0,1,2,3 -t 0 in80order.txt
Kangaroo v1.4beta
Start:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF47BE9FBFD25E8CD0364141
Stop :FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364140
Keys :3
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.01
Number of kangaroos: 2^23.09
Suggested DP: 15
Expected operations: 2^41.15
Expected RAM: 5672.9MB
DP size: 15 [0xfffe000000000000]
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #3 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#3: creating kangaroos...
GPU: GPU #2 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#2: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos in 12114.8ms
SolveKeyGPU Thread GPU#3: 2^21.09 kangaroos in 12728.4ms
SolveKeyGPU Thread GPU#2: 2^21.09 kangaroos in 13667.7ms
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos in 14680.0ms
[4907.53 MK/s][GPU 4907.53 MK/s][Count 2^38.75][Dead 0][01:46 (Avg 08:16)][1080.8MB]
Key# 0 Pub:  0x033F93598445D64434F6D8F92621EBA31346864990095A13D142E8E83D5CC701EB
       Priv: 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03ACCEC02631A542173
[4905.64 MK/s][GPU 4905.64 MK/s][Count 2^41.11][Dead 0][09:24 (Avg 08:16)][5515.9MB]
Key# 1 Pub:  0x033F93598445D64434F6D8F92621EBA31346864990095A13D142E8E83D5CC701EB
       Priv: 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03ACCEC02631A542173
[4904.92 MK/s][GPU 4904.92 MK/s][Count 2^40.40][Dead 0][05:52 (Avg 08:16)][3391.6MB]
Key# 2 Pub:  0x033F93598445D64434F6D8F92621EBA31346864990095A13D142E8E83D5CC701EB
       Priv: 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03ACCEC02631A542173

Done: Total time 18:50


So, in all cases the average time to solve an 80-bit range with 4x2080ti was 6-7 minutes.
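
The 6-7 minute figure can be reproduced from the "Done: Total time" lines of the three tests above. A small Python sketch, counting TEST3 as three solved keys (the helper name is made up):

```python
def to_seconds(mm_ss: str) -> int:
    """Convert a 'MM:SS' total time from the logs into seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


# (total time from the log, number of keys solved in that run)
tests = [("05:49", 1), ("07:19", 1), ("18:50", 3)]

total_seconds = sum(to_seconds(t) for t, _ in tests)
total_keys = sum(keys for _, keys in tests)
average_seconds = total_seconds / total_keys

print(f"{total_keys} keys in {total_seconds} s, "
      f"about {average_seconds / 60:.1f} min per key")
```

The total times include the kangaroo creation phase, so the pure search time per key is a little lower.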
sr. member
Activity: 462
Merit: 696
May 08, 2020, 11:09:42 AM
Thanks again for the tests Wink

What you can try is to store the work file on an NFS mount and write a script on another host that checks for collisions using the -wm option.
That should be more or less equivalent to what you suggest.
Of course, using a centralized server is much more convenient.
Interesting. I know sockets rather well and have good experience with server coding, so it should not be too hard to write an efficient distributed solution.

Edit: Don't forget to update your repository Wink

member
Activity: 144
Merit: 10
May 08, 2020, 10:44:17 AM
@Jean_Luc

I did some (90-bit) tests with yesterday’s commit, and an instance running on a single powerful server always seemed to solve the DL much faster, up to 2x faster in most cases, than a distributed model with distinguished point persistence. I will need to run more tests in the 100-bit range to see if this observation holds.

I do agree that it’s very difficult to come up with a formula for calculating the optimal distinguished point mask and memory requirements for solving large intervals (95+ bits in size) with diverse hardware configurations. The issue in my case seems to be the lack of a centralized server for saving distinguished points and checking for collisions. So I came up with an idea: modify the code not to use the internal hash table, but to periodically (every 5 minutes) send the accumulated distinguished points to a centralized server for storage and verification. I will be testing this idea sometime this weekend for comparison against an instance running on a single powerful server.
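
A rough Python sketch of what such a central collector could look like. Everything here is hypothetical: the class and field names, the batch interface, and the convention that a tame kangaroo sits at (rangeStart + distance)*G while a wild one sits at P + distance*G, so that a tame/wild collision yields key = rangeStart + tameDistance - wildDistance. It does not describe how the program stores or compares distinguished points.

```python
from dataclasses import dataclass
from typing import Optional

TAME, WILD = 0, 1


@dataclass(frozen=True)
class DistinguishedPoint:
    x: int         # x-coordinate of the point (any hashable identifier works here)
    distance: int  # distance travelled by the kangaroo so far
    kind: int      # TAME or WILD


class DPServer:
    """Central table that receives batches of distinguished points from clients."""

    def __init__(self, range_start: int) -> None:
        self.range_start = range_start
        self.table: dict[int, DistinguishedPoint] = {}

    def submit(self, batch: list[DistinguishedPoint]) -> Optional[int]:
        """Store a batch; return the key if a tame/wild collision shows up."""
        for dp in batch:
            seen = self.table.get(dp.x)
            if seen is None:
                self.table[dp.x] = dp
            elif seen.kind != dp.kind:
                tame, wild = (seen, dp) if seen.kind == TAME else (dp, seen)
                return self.range_start + tame.distance - wild.distance
            # Same kind on the same point: a redundant path, nothing to solve.
        return None


# Toy usage with made-up numbers: two clients report the same point.
server = DPServer(range_start=1000)
first = server.submit([DistinguishedPoint(x=123, distance=100, kind=TAME)])
second = server.submit([DistinguishedPoint(x=123, distance=40, kind=WILD)])
print(first, second)
```

Clients would then only need to keep their kangaroos in memory and push a batch every few minutes, while the table and the collision check live on the server.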

I ran some small tests with your latest commit, version 1.4, and so far it's looking good.

80-bit Default:
Code:
./Kangaroos -t 0 -gpu -gpuId 0,1,2,3,4,5,6,7 -w server_1 -wi 300 input_80_bit_interval.txt
Kangaroo v1.4
Start:7FFFFFFFFFFFFFFFFFFF
Stop :FFFFFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^24.09
Suggested DP: 15
Expected operations: 2^41.38
Expected RAM: 3350.0MB
DP size: 15 [0xfffe000000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #4 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#4: creating kangaroos...
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
GPU: GPU #5 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#5: creating kangaroos...
GPU: GPU #7 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#7: creating kangaroos...
GPU: GPU #2 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#2: creating kangaroos...
GPU: GPU #3 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#3: creating kangaroos...
GPU: GPU #6 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#6: creating kangaroos...
SolveKeyGPU Thread GPU#7: 2^21.09 kangaroos [21.4s]
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos [21.9s]
SolveKeyGPU Thread GPU#5: 2^21.09 kangaroos [21.5s]
SolveKeyGPU Thread GPU#3: 2^21.09 kangaroos [21.8s]
SolveKeyGPU Thread GPU#4: 2^21.09 kangaroos [22.0s]
SolveKeyGPU Thread GPU#2: 2^21.09 kangaroos [21.9s]
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos [22.0s]
SolveKeyGPU Thread GPU#6: 2^21.09 kangaroos [22.1s]
[8893.11 MK/s][GPU 8893.11 MK/s][Count 2^40.97][Dead 0][04:36 (Avg 05:22)][2009.8/2518.7MB]
SaveWork: savefile...............done [2010.0 MB] [14s] Fri May  8 13:23:32 2020
[8197.71 MK/s][GPU 8197.71 MK/s][Count 2^41.96][Dead 2][09:37 (Avg 05:49)][3980.3/4981.8MB]
SaveWork: savefile...............done [3980.5 MB] [32s] Fri May  8 13:28:51 2020
[6947.03 MK/s][GPU 6947.03 MK/s][Count 2^42.31][Dead 2][13:12 (Avg 06:52)][5077.2/6353.1MB]
Key# 0 [1S]Pub:  0x037E1238F7B1CE757DF94FAA9A2EB261BF0AEB9F84DBF81212104E78931C2A19DC
       Priv: 0xEA1A5C66DCC11B5AD180

Done: Total time 14:53

80-bit d=18:
Code:
./Kangaroos -t 0 -d 18 -gpu -gpuId 0,1,2,3,4,5,6,7 -w server_2 -wi 300 input_80_bit_interval.txt
Kangaroo v1.4
Start:7FFFFFFFFFFFFFFFFFFF
Stop :FFFFFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^24.09
Suggested DP: 15
Expected operations: 2^42.66
Expected RAM: 1024.2MB
DP size: 18 [0xffffc00000000000]
GPU: GPU #7 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#7: creating kangaroos...
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #5 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#5: creating kangaroos...
GPU: GPU #4 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#4: creating kangaroos...
GPU: GPU #6 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#6: creating kangaroos...
GPU: GPU #3 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#3: creating kangaroos...
GPU: GPU #2 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#2: creating kangaroos...
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(136x128) (213.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
SolveKeyGPU Thread GPU#1: 2^21.09 kangaroos [21.6s]
SolveKeyGPU Thread GPU#0: 2^21.09 kangaroos [21.8s]
SolveKeyGPU Thread GPU#5: 2^21.09 kangaroos [21.8s]
SolveKeyGPU Thread GPU#4: 2^21.09 kangaroos [21.9s]
SolveKeyGPU Thread GPU#7: 2^21.09 kangaroos [22.3s]
SolveKeyGPU Thread GPU#2: 2^21.09 kangaroos [21.9s]
SolveKeyGPU Thread GPU#6: 2^21.09 kangaroos [22.4s]
SolveKeyGPU Thread GPU#3: 2^21.09 kangaroos [24.2s]
[8860.95 MK/s][GPU 8860.95 MK/s][Count 2^40.97][Dead 0][04:34 (Avg 13:05)][252.2/321.7MB]
SaveWork: savefile...............done [252.2 MB] [02s] Fri May  8 13:41:21 2020
[8871.56 MK/s][GPU 8871.56 MK/s][Count 2^41.21][Dead 1][05:26 (Avg 13:04)][297.6/378.5MB]
Key# 0 [1S]Pub:  0x037E1238F7B1CE757DF94FAA9A2EB261BF0AEB9F84DBF81212104E78931C2A19DC
       Priv: 0xEA1A5C66DCC11B5AD180

Done: Total time 05:55
sr. member
Activity: 462
Merit: 696
May 08, 2020, 08:34:29 AM
The release 1.4 is ready.
Available on github

https://github.com/JeanLucPons/Kangaroo/releases

Quote
-  Added load/save/merge work file
-  Memory usage improvements
-  Added max step option

Thanks for testing it ;)
sr. member
Activity: 462
Merit: 696
May 08, 2020, 05:03:13 AM
I did a test:

Starting from scratch on a 64-bit range, saving the work file every 10 sec. It solved the key in 2^33.34 operations, a bit more than the average, and the last work file was saved at 2^33.27.

Code:
C:\C++\Kangaroo\VC_CUDA10>x64\Release\Kangaroo.exe -d 11 -t 0 -w work1.wrk -wi 10 -gpu ..\VC_CUDA8\in64.txt
Kangaroo v1.4notready
Start:5B3F38AF935A3640D158E871CE6E9666DB862636383386EE0000000000000000
Stop :5B3F38AF935A3640D158E871CE6E9666DB862636383386EEFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^64
Jump Avg distance: 2^31.99
Number of kangaroos: 2^18.58
Suggested DP: 13
Expected operations: 2^33.18
Expected RAM: 193.1MB
DP size: 11 [0xFFE0000000000000]
GPU: GPU #0 GeForce GTX 1050 Ti (6x128 cores) Grid(12x256) (45.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^18.58 kangaroos [1.9s]
[183.89 MK/s][GPU 183.89 MK/s][Count 2^29.99][Dead 0][08s (Avg 52s)][17.9/47.6MB]
SaveWork: work1.wrk...............done [18.0 MB] [00s] Fri May  8 09:37:01 2020
[151.40 MK/s][GPU 151.40 MK/s][Count 2^31.17][Dead 0][18s (Avg 01:04)][37.9/71.6MB]
SaveWork: work1.wrk...............done [38.0 MB] [00s] Fri May  8 09:37:12 2020
[149.85 MK/s][GPU 149.85 MK/s][Count 2^31.80][Dead 0][28s (Avg 01:04)][57.8/91.8MB]
SaveWork: work1.wrk...............done [57.9 MB] [00s] Fri May  8 09:37:22 2020
[148.78 MK/s][GPU 148.78 MK/s][Count 2^32.24][Dead 0][39s (Avg 01:05)][77.7/111.9MB]
SaveWork: work1.wrk...............done [77.7 MB] [00s] Fri May  8 09:37:33 2020
[147.04 MK/s][GPU 147.04 MK/s][Count 2^32.58][Dead 0][49s (Avg 01:06)][97.5/132.9MB]
SaveWork: work1.wrk...............done [97.6 MB] [00s] Fri May  8 09:37:44 2020
[145.74 MK/s][GPU 145.74 MK/s][Count 2^32.85][Dead 0][01:00 (Avg 01:06)][117.2/155.0MB]
SaveWork: work1.wrk...............done [117.2 MB] [00s] Fri May  8 09:37:55 2020
[144.31 MK/s][GPU 144.31 MK/s][Count 2^33.07][Dead 1][01:11 (Avg 01:07)][136.8/178.2MB]
SaveWork: work1.wrk...............done [136.9 MB] [01s] Fri May  8 09:38:06 2020
[143.51 MK/s][GPU 143.51 MK/s][Count 2^33.27][Dead 2][01:22 (Avg 01:07)][156.4/202.3MB]
SaveWork: work1.wrk...............done [156.5 MB] [01s] Fri May  8 09:38:17 2020
[136.98 MK/s][GPU 136.98 MK/s][Count 2^33.31][Dead 2][01:26 (Avg 01:10)][160.3/207.0MB]
Key# 0 [1S]Pub:  0x03BB113592002132E6EF387C3AEBC04667670D4CD40B2103C7D0EE4969E9FF56E4
       Priv: 0x5B3F38AF935A3640D158E871CE6E9666DB862636383386EE510F18CCC3BD72EB

[  0] 2^33.338 Dead:2 Avg:2^33.338 DeadAvg:2.0 (2^33.179)


work file info:

Code:
Loading: work1.wrk
Version: 0
DP bits: 11
Start  :5B3F38AF935A3640D158E871CE6E9666DB862636383386EE0000000000000000
Stop   :5B3F38AF935A3640D158E871CE6E9666DB862636383386EEFFFFFFFFFFFFFFFF
Key    :03BB113592002132E6EF387C3AEBC04667670D4CD40B2103C7D0EE4969E9FF56E4
Count  : 10355736576 2^33.270
Time   :01:22
DP Size: 156.5/202.4MB
DP Cnt : 5063157 2^22.272
DP Max : 41 [@ 006CDC]
DP Min : 4 [@ 003037]
DP Avg : 19.31

I took this work file and restarted the search 10 times:

Code:
[  0] 2^31.073 Dead:0 Avg:2^31.073 DeadAvg:0.0 (2^33.179)
[  0] 2^32.051 Dead:2 Avg:2^32.051 DeadAvg:2.0 (2^33.179)
[  0] 2^30.842 Dead:0 Avg:2^30.842 DeadAvg:0.0 (2^33.179)
[  0] 2^31.480 Dead:1 Avg:2^31.480 DeadAvg:1.0 (2^33.179)
[  0] 2^31.268 Dead:0 Avg:2^31.268 DeadAvg:0.0 (2^33.179)
[  0] 2^30.116 Dead:0 Avg:2^30.116 DeadAvg:0.0 (2^33.179)
[  0] 2^30.999 Dead:0 Avg:2^30.999 DeadAvg:0.0 (2^33.179)
[  0] 2^32.083 Dead:1 Avg:2^32.083 DeadAvg:1.0 (2^33.179)
[  0] 2^30.306 Dead:0 Avg:2^30.306 DeadAvg:0.0 (2^33.179)
[  0] 2^33.031 Dead:4 Avg:2^33.031 DeadAvg:4.0 (2^33.179)

So an average of 2^31.583 (~3 times less than the expected 2^33.179)

The dp overhead is about 2^29.58 = 2^31.583 / 4.
I would expect 2^31.583/3: one dp overhead when starting the first time, one when writing the file and one when restarting the search...
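
These figures can be rechecked from the numbers in the logs above. The short sketch below averages the ten restart counts (given as log2 values), compares the result with the expected 2^33.179, and computes the dp overhead as nbKangaroo * 2^dp with nbKangaroo = 2^18.58 and dp = 11. The function name `log2_mean` is only illustrative.

```python
import math

# log2 of the operation counts of the ten restarted searches
restart_log2_counts = [31.073, 32.051, 30.842, 31.480, 31.268,
                       30.116, 30.999, 32.083, 30.306, 33.031]


def log2_mean(log2_values):
    """log2 of the arithmetic mean of the counts 2^v."""
    total = sum(2.0 ** v for v in log2_values)
    return math.log2(total / len(log2_values))


avg = log2_mean(restart_log2_counts)      # log2 of the average count
expected = 33.179                         # log2 of the expected operations
ratio = 2.0 ** (expected - avg)           # expected / measured average

log2_kangaroos = 18.58                    # from "Number of kangaroos"
dp_bits = 11                              # from "DP size"
log2_overhead = log2_kangaroos + dp_bits  # log2(nbKangaroo * 2^dp)
overheads_in_avg = 2.0 ** (avg - log2_overhead)

print(f"average: 2^{avg:.3f}")
print(f"expected / average: {ratio:.2f}")
print(f"dp overhead: 2^{log2_overhead:.2f}")
print(f"average / dp overhead: {overheads_in_avg:.2f}")
```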

If you save the kangaroos, you get rid of this extra overhead and pay it only once, when starting the first time, but you need to continue on the same hardware or on hardware that can handle all the kangaroos.

More runs and tests are needed to fully understand this...

sr. member
Activity: 462
Merit: 696
May 08, 2020, 03:30:50 AM
The statistics will not be present in the final release.
It is more for developers.
You can remove it by removing the #define STATS in Kangaroo.cpp:796
copper member
Activity: 188
Merit: 0
May 08, 2020, 03:04:44 AM
I committed the mods. Linux user can try them. (Edit: or Windows user who compile,
I updated project files)

Good afternoon.
Could you print the statistics at the end of the work cycle?
It is very inconvenient to scroll through all the console output when there are several keys in the input file.
sr. member
Activity: 462
Merit: 696
May 08, 2020, 12:12:28 AM
Many thanks for this test @HardwareCollector and @MrFreeDragon ;)

Yes, this is a bit tricky, and this extra delay can be due to the overhead of the dp. I will add a note to the README about that.

First, the suggested dp is a bad approximation and needs improvement; you have to tune it manually according to the available RAM and the expected operations. The goal is to be as close as possible to 2sqrt(N).

When you continue a job on a different configuration, the overhead will differ depending on how many kangaroos you have: each kangaroo needs 2^dp steps (on average) to reach a distinguished point. So if you continue on a configuration with many more kangaroos, the overhead will be much higher, and during the merge it will add a count but only count/(nbKangaroo*2^dp) distinguished points to the hashtable.

Each time you do a merge, the count recorded in the merge will get an error of ~nbKangaroo*2^dp, which is the counting granularity of the actual distinguished points, and this will impact the estimation...

Another thing is that when you merge 2 work files with different dp bits, the lowest will be saved.

If you override the dp bits by hand using -d and you choose a larger value (let's say +2), you will benefit from only 1 distinguished point in 4 in the hashtable.
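
As a rough way to see the effect of dp and of the number of kangaroos, the expected operations can be modelled as 2sqrt(N) plus one overhead of nbKangaroo*2^dp. This is only a sketch under that assumption; the estimate printed by the tool may use a slightly different constant, and `log2_expected_ops` is a hypothetical helper. For the 80-bit runs in this thread (2^24.09 kangaroos) it lands close to the reported 2^41.38 (dp 15) and 2^42.66 (dp 18).

```python
import math


def log2_expected_ops(range_bits: float, log2_kangaroos: float,
                      dp_bits: int, k: float = 2.0) -> float:
    """log2 of k*sqrt(N) + nbKangaroo*2^dp, with N = 2^range_bits."""
    base = k * 2.0 ** (range_bits / 2.0)
    overhead = 2.0 ** (log2_kangaroos + dp_bits)
    return math.log2(base + overhead)


for dp in (12, 15, 18, 21):
    ops = log2_expected_ops(80, 24.09, dp)
    print(f"dp={dp:2d}  expected operations ~ 2^{ops:.2f}")
```

In this model, once nbKangaroo*2^dp is comparable to 2sqrt(N), every extra dp bit costs close to a factor of two in operations, which is why a dp chosen for a small rig can hurt on a rig with many more kangaroos.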
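
The "1 in 4" figure follows from how the dp mask selects points. The sketch below assumes the mask covers the top dp bits of a 64-bit word, which matches the masks printed in the logs of this thread (for example `DP size: 15 [0xfffe000000000000]`), and that a point is distinguished when those bits are all zero. That zero-test and the helper names are assumptions made for illustration.

```python
def dp_mask(dp_bits: int) -> int:
    """64-bit mask with the top dp_bits bits set."""
    return ((1 << dp_bits) - 1) << (64 - dp_bits)


def is_distinguished(x_top64: int, dp_bits: int) -> bool:
    """Assumed test: all masked bits of the top 64-bit word are zero."""
    return (x_top64 & dp_mask(dp_bits)) == 0


def surviving_fraction(dp_old: int, extra_bits: int) -> float:
    """Fraction of dp_old-bit distinguished points still distinguished
    when the mask is widened by extra_bits."""
    total_bits = dp_old + extra_bits
    shift = 64 - total_bits
    # Enumerate every pattern of the top total_bits bits.
    candidates = [v << shift for v in range(1 << total_bits)]
    old = [c for c in candidates if is_distinguished(c, dp_old)]
    new = [c for c in old if is_distinguished(c, total_bits)]
    return len(new) / len(old)


print(hex(dp_mask(15)))           # mask for dp = 15
print(surviving_fraction(11, 2))  # dp 11 -> dp 13
```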


I tried to continue the job 7 times - for the 1st time 2080ti solved the key for the extra 1 minute (with total operations 2^41.9), but all other 6 attempts I stopped while they reach 2^42.3 group operations (actually 2 times more than the expected).

However, I'm also a bit surprised here. I would need more info to try to understand, especially the number of kangaroos in each configuration and the evolution of the number of distinguished point bits in the work files...

sr. member
Activity: 443
Merit: 350
May 07, 2020, 07:05:29 PM
I committed the mods. Linux user can try them. (Edit: or Windows user who compile,
I updated project files)

./kangaroo -w save.work -wi 10 in.txt (Save work file every 10 sec)
./kangaroo -w save2.work -wi 10 in.txt (Save work file every 10 sec)
-snip
Thanks to test

I ran the test with separate work files on GPU, then merged them, continued on CPU, transferred to another machine with another GPU and moved back. I did not save the kangaroos, only the work (so the kangaroos were created again each time).

The test was made with a 2^80 range (for info: to solve it on 4xRTX2080ti I need only 6-8 minutes).

1) I started the job on 1xRTX2080ti, with expected operations 2^41.38 and about 38 min expected time --> ran it for 22 min and saved work1 (2^40.37 operations)
2) Started the same job from the beginning on the same 1xRTX2080ti and ran it for 2 min, saving work2 (2^36.5 operations)
3) Merged work1 and work2 into work3, and ran work3 for 1 minute
4) Continued the job on CPU (4 threads) - started and stopped 3 times.
5) Transferred the job to another machine with a Tesla T4 (expected time to finish there was 1:16 hours, with expected operations 2^41.26). I continued on the Tesla from 2^40.84 operations, saving the work to the same file; after 7 minutes I stopped and started again.
6) The Tesla T4 did not finish the work within the expected time and operations. It actually finished with a total time of 2:22 hours (1 hour more than expected) and total operations of 2^42.26 - 2 times more.

While the Tesla T4 was trying to finish the job, I copied the current work file from the Tesla T4 machine back to the 2080ti and continued the job on the same machine where I started (continued from 2^41.8)
I tried to continue the job 7 times: the first time the 2080ti solved the key after 1 extra minute (with total operations 2^41.9), but I stopped the other 6 attempts when they reached 2^42.3 group operations (2 times more than expected).

Suggested DP on the 2080ti was 18, but on the Tesla T4 it was 19; however, the DP size stayed at 18 (as it was when the job was started). So I do not think that the different machines used different distinguished point patterns.

The feature to save, merge and continue work is very good. Does it take care of hardware configuration changes? The idea was to implement "a pause" button, and continue the "movie" later, or later on another screen.

As you developed this, perhaps you understand the reason for the 2 times longer delay?
- Just no luck
- Creation of new kangaroos many times (instead of saving them and continue the full job)
- Configuration change (2080ti, then CPU, and then Tesla T4)
sr. member
Activity: 443
Merit: 350
May 07, 2020, 03:56:24 PM
Jean_Luc, does this command $ ./kangaroo -wm save.work save2.work save3.work mean that two work files (save.work and save2.work) are merged into a third file, save3.work, and that all further work is then saved to save3.work? That is how I understood it.

I expected to see normal txt tables in the work files :) Distinguished points with distance, X, and kangaroo type. So my intention was to perform the search for a known public key (in the same range), then stop the work and manually change all the "wild" values in the DPs to "tame" values, and after that restart the program for the "unknown" public key. So, I wanted to test the idea about RangeWidth^(1/3) group operations (cube root instead of square root) if we have RangeWidth^(1/3) distinguished points at the start.

-snip-
With 90-bit space, you need to perform 2^60 operations to get 2^30 DP and then only 2^30 operations to retrieve each key.
But if you need to find only 2^10 private keys, it is faster to run 2^10 times the program -> 2^10*2^46 = 2^56 operations.
Makes sense if all the work is performed by one machine... However, if the precomputation work is delegated to "free" machines/servers/users, we could benefit from these operations ;)
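
The trade-off in the quoted figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above for a 90-bit space (2^60 operations of precomputation, 2^30 per key afterwards, 2^46 per independent run); the function names are illustrative.

```python
import math


def precomputed_cost(num_keys: int, log2_precompute: int = 60,
                     log2_per_key: int = 30) -> int:
    """Total operations: one-off precomputation plus a cheap search per key."""
    return 2 ** log2_precompute + num_keys * 2 ** log2_per_key


def repeated_cost(num_keys: int, log2_per_run: int = 46) -> int:
    """Total operations when the program is simply run once per key."""
    return num_keys * 2 ** log2_per_run


def break_even_keys(log2_precompute: int = 60, log2_per_key: int = 30,
                    log2_per_run: int = 46) -> float:
    """Number of keys at which both approaches cost the same."""
    return 2 ** log2_precompute / (2 ** log2_per_run - 2 ** log2_per_key)


keys = 2 ** 10
print(math.log2(repeated_cost(keys)))     # independent runs for 2^10 keys
print(math.log2(precomputed_cost(keys)))  # precomputation for 2^10 keys
print(math.log2(break_even_keys()))       # log2 of the break-even key count
```

With these figures the precomputation only pays off on a single machine beyond roughly 2^14 keys, unless its cost is spread over otherwise idle hardware as suggested.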