Author

Topic: Pollard's kangaroo ECDLP solver - page 137. (Read 59389 times)

sr. member
Activity: 642
Merit: 316
May 13, 2020, 06:33:10 AM
A first draft of the client/server is ready on github (source only).
The client and server can be run on both windows or linux.
There is few words on the readme.
Thanks to test it Wink
Can you make release?
sr. member
Activity: 462
Merit: 701
May 13, 2020, 06:22:42 AM
A first draft of the client/server is ready on github (source only).
The client and server can be run on both windows or linux.
There is few words on the readme.
Thanks to test it Wink
sr. member
Activity: 462
Merit: 701
May 12, 2020, 09:29:59 PM
Hi,

I'm working on the client/server solution.
Almost ready, commit probably tomorrow.
Difficulty to find an efficient protocol for handling dead kangaroo and server backup.

@MrFreeDragon

HT Max    : 9 [@ 01598C]
HT Min    : 0 [@ 000003]
HT Avg    : 0.88
HT SDev   : 0.94

This is the distribution of DP in the HashTable:
9 DP at hash 01598C (largest entry)
0 DP at hash 000003 (smallest)
0.88 DP per hash in average
0.94 standard deviation of DP distribution in the HashTable

@Hardware collector:
I'm sure you will solve the #110 Wink
With the client/server version, only the server (without GPU) will need a large amount of memory.

@arulbero
I didn't worked yet on endomorphism optimization and studied your proposal.
I will be on holidays (in mountains, far from computer) the next week, I would like to try to publish the client/server release first...


sr. member
Activity: 443
Merit: 350
May 12, 2020, 05:59:38 PM
Can you please describe what does these stats mean in workfile info (showed with -winfo):

HT Max    : 9 [@ 01598C]
HT Min    : 0 [@ 000003]
HT Avg    : 0.88
HT SDev   : 0.94

All other stats are clear.

PS. Some small misprint also in usage section README.md: -gpuId gpuId1,gpuId2,...: List of GPU(s) to use, default is 0 (Id missed)
Also in all headers of all your files there a first line:  * This file is part of the BSGS distribution (https://github.com/JeanLucPons/Kangaroo)
However BSGS remained from baby step giant step as I understand.
member
Activity: 144
Merit: 10
May 12, 2020, 12:01:51 PM
Multiple search is possible but I won't work on it first.
I will work on a distributed client/server version.
The #110 will be likely solved soon.
With the distributed version, I'm almost sure the #115 will also be solved.

N = Interval size, number of group elements.
Then, with current implementation, most instances will be solved between ~(1/2)(sqrt(N)) and ~(7)(sqrt(N)) number of group operations, with the average case being ~(2)(sqrt(N)).

If we set the distinguished points (DP) mask to 23-bit, then the number of DPs needed to solve the #110 challenge will fall within ~1.519G (very lucky) to 21.259G (very unlucky). Obviously, for each additional bit that we add to the DP mask, we reduce the storage requirements by a factor of two, but may also increase the solution time.

We need ~16-bytes for each DP: 8-btyes for x-coordinate, 1-byte for herd type, 5-bytes for slow storage lookup index, and 2-bytes for hash table overhead. So we only need ~22.64GiB (very lucky) to ~333.68GiB (very unlucky) of RAM with DP=23.

If increasing DP mask doesn’t adversely affect the solution time by much, then I can see the following challenges falling before the end of this year: #110, #115, and #120.


I've solved several instances of the DLP where the solution's number of expected operations were on the very lucky side. The frequency increased as DP mask was lowered:
Code:
./dlpserver in80.txt
ECDLP Server Started and Listening on Port 8090...
ECDLP File Merger Process Started...
Exiting File Merger Process...
Receiving savefile_1589254133 (11458340 bytes)
Loading: savefile_1589254133
MergeWork: [HashTalbe 2.0/4.0MB] [00s]
Appending...
Range width: 2^80
Dead kangaroo: 0
DP[16-bit]: count 2^18.16 [12s] Calculated Operations:[2^34.16]
Receiving savefile_1589254163 (19464260 bytes)
Loading: savefile_1589254163
MergeWork: [HashTalbe 10.9/34.4MB] [00s]
Appending...
Range width: 2^80
Dead kangaroo: 0
DP[16-bit]: count 2^19.67 [42s] Calculated Operations:[2^35.67]
.
.
.
Receiving savefile_1589254864 (19403588 bytes)
Loading: savefile_1589254864
MergeWork: [HashTalbe 389.8/493.7MB] [00s]
Appending...
Range width: 2^80
Dead kangaroo: 0
DP[16-bit]: count 2^23.66 [12:22] Calculated Operations:[2^39.66]
Receiving savefile_1589254894 (19371428 bytes)
Loading: savefile_1589254894
MergeWork: [HashTalbe 406.3/514.4MB] [00s]
Appending...
Range width: 2^80

Start:80000000000000000000
Stop :17FFFFFFFFFFFFFFFFFFF
Key# 0 [1S]Pub:  0x037E1238F7B1CE757DF94FAA9A2EB261BF0AEB9F84DBF81212104E78931C2A19DC
       Priv: 0xEA1A5C66DCC11B5AD180

Done: Total time 12:22

Code:
./dlpserver in79.txt
ECDLP Server Started and Listening on Port 8090...
ECDLP File Merger Process Started...
Exiting File Merger Process...
Receiving savefile_1589205241 (39837508 bytes)
Loading: savefile_1589205241
MergeWork: [HashTalbe 2.0/4.0MB] [00s]
Appending...
Range width: 2^79
Dead kangaroo: 0
DP[14-bit]: count 2^20.17 [12s] Calculated Operations:[2^34.17]
.
.
.
Receiving savefile_1589205739 (71532132 bytes)
Loading: savefile_1589205739
MergeWork: [HashTalbe 1033.3/1298.1MB] [00s]
Appending...
Range width: 2^79
Dead kangaroo: 0
DP[14-bit]: count 2^25.10 [08:30] Calculated Operations:[2^39.10]
Receiving savefile_1589205770 (71452420 bytes)
Loading: savefile_1589205770
MergeWork: [HashTalbe 1099.5/1380.9MB] [00s]
Appending...
Range width: 2^79

Start:80000000000000000000
Stop :FFFFFFFFFFFFFFFFFFFF
Key# 0 [1S]Pub:  0x037E1238F7B1CE757DF94FAA9A2EB261BF0AEB9F84DBF81212104E78931C2A19DC
       Priv: 0xEA1A5C66DCC11B5AD180

Done: Total time 08:30

Code:
./Kangaroos -t 0 -d 14 -gpu -gpuId 0,1 in80.txt
Kangaroo v1.5alpha
Start:100000000000000000000
Stop :1FFFFFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^80
Jump Avg distance: 2^40.03
Number of kangaroos: 2^21.52
Suggested DP: 18
Expected operations: 2^41.09
Expected RAM: 5443.0MB
DP size: 14 [0xfffc000000000000]
GPU: GPU #0 GeForce RTX 2070 (36x64 cores) Grid(72x128) (117.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #1 GeForce GTX 1080 Ti (28x128 cores) Grid(56x256) (177.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.17 kangaroos [10.3s]
SolveKeyGPU Thread GPU#1: 2^20.81 kangaroos [16.1s]
[1346.70 MK/s][GPU 1346.70 MK/s][Count 2^40.55][Dead 0][22:42 (Avg 28:52)][2995.6/3751.0MB]
Key# 0 [1S]Pub:  0x035E9F3C5AA41562E796041B0AA65EAE3E1ECD6E9000C246E617F9E8580634DBE1
       Priv: 0x1FFFFFFFFFFFFFFFFFFFE

Done: Total time 23:32
member
Activity: 348
Merit: 34
May 11, 2020, 10:44:04 PM
Multiple search is possible but I won't work on it first.
I will work on a distributed client/server version.
The #110 will be likely solved soon.
With the distributed version, I'm almost sure the #115 will also be solved.

110 solve depend on multiple pubkeys search at once, in others , you all will spend time in jumping here and there, mean you need lot of gpu powers, you need lot of ram, etc, and you can not manage all things togather, and no one have these all option available, only my last word, if you design multiple pubkeys search ver, that will be solution for just within 3 days Smiley
Enjoy your research!!!

I am not sure the multiple pubkey search will really help to solve #110 within 3 days.
As I see your idea is to "decrease" the pow of the key and solve the lower bits key, however decreasing the power you will also receive the incorrect public keys which will be the dust for you and will "burn" the resources.

This is the example what you want to do (correct me if I am not right):
1) The key #110 (with kno public key P110) has range with width 2^109 (start range is 2^109, end range is 2^110-1). To solve it, the current GPU solver needs approx. 2.22*sqrt(2^109) = 2^55.65
2) You want to divide this key by some number and receive the "lower bit" public key and solve that new one. Let's imagine you want to receive the key with range 2^90 instead of 2^109, so you need to divide by 2^(109-90) = 2^19 = 524 288. However as you do not know the remainder from that division you should check all the keys.
3) You  make 2^19 (524 288) additions (or divisions, it does not matter) of P110 with points 0*G, 1*G, 2*G, 3*G, ... 2^19-1*G (total 2^19 points) and receive 2^19 new points. All these new points are subsequent points, and the scalar of one of that points could be divided by 2^19 with 0 remainder.
4) Now you divide (i.e multiply by inverse scalar) every of these new points by 2^19, and receive another set of 2^19 points, where one of them will be in range 2^90
5) New range for that divided points is now from 2^90 to 2^91-1 (with width exactly 2^90).

However, you do not know which point from generated 2^19 is that one with the private key in the new range. So you should check all of them, all 524 288 points (2^19). Yes, the solver could find 90 bit range for some hours, but only for one key to search. If you put 2^19 public keys, (a) the speed will not be the same as for one, (b) you will have "wrong" DPs in the hashtable generated by wild kangaroos started from "wrong" points. Only 1 point from 2^19 is correct, others are wrong and will never solved (because of the incorrect range for them).

Now, let's calculate the number of operations required to solve these 2^19 new public keys with new ranges (with one key in range 2^90). For one key, the total number of operations is 2.22*sqrt(2^90) = 2^46.15; However we have not one, but 2^19 key to check. So, the total number of operations for all new generated public keys is 2^19 * 2^46.15 = 2^65.15

So, with the whole range you could solve the key for 2^55.65 operations, but now need 2^65.15 operations. This is 725 times more (2^9.5)

Moreover, this approximation does not count the new overheads due to incorrect DPs in hashtable.
Even you have several independent public keys in the same range, running their solving altogether at the same time will proportionally decrease the speed to find the collision.
Dear Friends
your mention method of division pubkeys point, are an other way, and maybe not workable as you said, my Proof of Exact 1 key inside low pow you tested,
you stated " total number of operations for all new generated public keys is 2^19 * 2^46.15 = 2^65.15 "
its also not multi pubkeys search method, its look like each key create its own dp points, and start, and think 2**19 point how much ram you need, ? here for 1 key in big range peoples facing ram problem.
its already i mention in above my message
multi pubkey search will be multi pubkeys support ver
example
bitcrack load 1 key my gpu 45mkeys, 100k keys load 40mkeys, gradually last 25 million keys load 33mkeys, and 925mb/1gb ram used for 25m keys,
similar when  kangaroo ver support multiple pubkeys checking, it show decress speed some how 10% 20% etc, not like " total number of operations for all new generated public keys is 2^19 * 2^46.15 = 2^65.15"
Jean Luc understand how multikey search could work,
hope you all will have ability to understand what method could be applied for multi support
sr. member
Activity: 443
Merit: 350
May 11, 2020, 08:27:54 PM
Multiple search is possible but I won't work on it first.
I will work on a distributed client/server version.
The #110 will be likely solved soon.
With the distributed version, I'm almost sure the #115 will also be solved.

110 solve depend on multiple pubkeys search at once, in others , you all will spend time in jumping here and there, mean you need lot of gpu powers, you need lot of ram, etc, and you can not manage all things togather, and no one have these all option available, only my last word, if you design multiple pubkeys search ver, that will be solution for just within 3 days Smiley
Enjoy your research!!!

I am not sure the multiple pubkey search will really help to solve #110 within 3 days.
As I see your idea is to "decrease" the pow of the key and solve the lower bits key, however decreasing the power you will also receive the incorrect public keys which will be the dust for you and will "burn" the resources.

This is the example what you want to do (correct me if I am not right):
1) The key #110 (with kno public key P110) has range with width 2^109 (start range is 2^109, end range is 2^110-1). To solve it, the current GPU solver needs approx. 2.22*sqrt(2^109) = 2^55.65
2) You want to divide this key by some number and receive the "lower bit" public key and solve that new one. Let's imagine you want to receive the key with range 2^90 instead of 2^109, so you need to divide by 2^(109-90) = 2^19 = 524 288. However as you do not know the remainder from that division you should check all the keys.
3) You  make 2^19 (524 288) additions (or divisions, it does not matter) of P110 with points 0*G, 1*G, 2*G, 3*G, ... 2^19-1*G (total 2^19 points) and receive 2^19 new points. All these new points are subsequent points, and the scalar of one of that points could be divided by 2^19 with 0 remainder.
4) Now you divide (i.e multiply by inverse scalar) every of these new points by 2^19, and receive another set of 2^19 points, where one of them will be in range 2^90
5) New range for that divided points is now from 2^90 to 2^91-1 (with width exactly 2^90).

However, you do not know which point from generated 2^19 is that one with the private key in the new range. So you should check all of them, all 524 288 points (2^19). Yes, the solver could find 90 bit range for some hours, but only for one key to search. If you put 2^19 public keys, (a) the speed will not be the same as for one, (b) you will have "wrong" DPs in the hashtable generated by wild kangaroos started from "wrong" points. Only 1 point from 2^19 is correct, others are wrong and will never solved (because of the incorrect range for them).

Now, let's calculate the number of operations required to solve these 2^19 new public keys with new ranges (with one key in range 2^90). For one key, the total number of operations is 2.22*sqrt(2^90) = 2^46.15; However we have not one, but 2^19 key to check. So, the total number of operations for all new generated public keys is 2^19 * 2^46.15 = 2^65.15

So, with the whole range you could solve the key for 2^55.65 operations, but now need 2^65.15 operations. This is 725 times more (2^9.5)

Moreover, this approximation does not count the new overheads due to incorrect DPs in hashtable.
Even you have several independent public keys in the same range, running their solving altogether at the same time will proportionally decrease the speed to find the collision.
member
Activity: 873
Merit: 22
$$P2P BTC BRUTE.JOIN NOW ! https://uclck.me/SQPJk
May 11, 2020, 05:55:55 PM
Hello  Smiley

Quote

Expected RAM: 228.1MB
DP size: 23 [0xFFFFFE0000000000]
GPU: GPU #0 Tesla T4 (40x64 cores) Grid(80x128) (129.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.32 kangaroos [6.4s]
[511.32 MK/s][GPU 511.32 MK/s][Count 2^33.64][Dead 0][30s (Avg 1.1d)][2.0/4.2MB]  ^C
C:\Chromidrol2020>chromatom2020 -t 0  -gpu in.txt
Kangaroo v1.4
Start:0
Stop :FFFFFFFFFFFFFFFFFFFFF
Keys :2
Number of CPU thread: 0
Range width: 2^84
Jump Avg distance: 2^42.03
Number of kangaroos: 2^20.32
Suggested DP: 21
Expected operations: 2^43.43
Expected RAM: 228.1MB
DP size: 21 [0xFFFFF80000000000]
GPU: GPU #0 Tesla T4 (40x64 cores) Grid(80x128) (129.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.32 kangaroos [6.5s]
[511.17 MK/s][GPU 511.17 MK/s][Count 2^43.02][Dead 0][05:32:59 (Avg 06:27:23)][132.1/172.6MB]  ^C
C:\Chromidrol2020>


Code working fine I think.

p.s. Im not go away from there.   Grin



legendary
Activity: 1932
Merit: 2077
May 11, 2020, 03:13:56 PM
So, the cost for 128 devices will be 128/8 * 620 = 9.9 kUSD per week.. Current prize value of #110 is 1.1x8.8kUSD = 9.7 kUSD, so the prize is more o less the same like the investments need to find it. But there is no 100% guarantee to solve the key, it is only 50%.

To be precise, the current prize is 1.1 BTC + 1.1 BCH + 1.1 BSV = 1.1 * (9300 USD) = 10230 USD


Multiple search is possible but I won't work on it first.
I will work on a distributed client/server version.
The #110 will be likely solved soon.
With the distributed version, I'm almost sure the #115 will also be solved.

In that case it would be a record, 114 bit.

******************************************************************************

I rewrite my idea with more details:

we need 2.sqrt(N) steps to get a collision in a N-space,

we need 2.sqrt(2N) = sqrt(2).2.sqrt(N) steps to get a collision in a 2N-space


if we find a way to generate sqrt(2).2.sqrt(N) points faster than 2.sqrt(N), we get a collision in less time, even if we double our searching space.

If we use the endomorphism, we can double our interval, from

[1,2,3, ..., n-1,n]

to

[1,2,3,....., n-1,n] + [lambda, 2*lambda, 3*lambda, ....., n*lambda]

We have 2 types of jumps (always positive): 16 small steps (in first interval) + 16 big steps (in second interval)


For example:
(1G, 2G, 4G, 8G, 24G, 37G,....., 512G)  + (lambda*18G, lambda*72G, ..., lambda*816G)

+ 1 jump from range1 to range2  or  from range1 to range2:     P -> lambda*P  or lambda*P -> lambda^2*(lambda*P)

You have to know at each step in which interval the kangaroo is.

We set the probability to perform a jump between 2 different ranges = 50%, the other 50% is for normal jumps.


When a kangaroo is in the point P(x,y):

if x-coordinate > 2^255 then:
 change interval
else:
 do a normal jumps (x-coordinate % 16, small or big steps depends on the interval in which P is)


If  x-coordinate > 2^255 then:
 you compute a single multiplication:   beta*x (or beta^2*x, it depends on the interval in which P is)
else:
  you compute the normal jump P+k*G (or P+s*lambda*G, it depends on the interval in which P is)

 (3 multiplications for the inverse + 1M + 1S for the x-coordinate + 1M for the y-coordinate = about 6M + some additions)


You cannot have loops, you have only to avoid 2 consecutive jumps between 2 ranges like: P -> lambda*P -> lambda^2*(lambda*P) = P, in this case you do a normal step P+kG instead of computing lambda*P:

P -> lambda*P -> lambda^2*(lambda*P) = P  instead of doing lambda^2*(lambda*P) you do P+k*G  (this jump depends on the x-coordinate of P; note you don't have to compute lambda^2*(lambda*P), you mantain in the memory the last point P)

if another kangaroo falls in the same (lambda*P) from another point you will have a collision:

Q -> Q + lambda*7*G = lambda*P -> lambda^2*(lambda*P) = P -> lambda*P  instead of doing again lambda*P, you do P+k*G as before


If we look at a couple of consecutive jumps, we will have:

                                            probability
normal jump + normal jump = (1/2)^2
normal jump + lambda jump = (1/2)^2
lambda jump + normal jump =  (1/2)^2
lambda jump + lambda jump= (1/2)^2 but -> loop, it is forbidden
-> it becomes: lambda jump  + lambda jump + normal jump


On average for each 2 points we need to perform (1/2)^2*12M + (1/2)^2*2*7M + (1/2)^2*7M = 8.25M,    4,125 M for each point.


If you double the interval:

[1,2,3,....., n-1,n] + [lambda, 2*lambda, 3*lambda, ....., n*lambda]

then you perform sqrt(2).2.sqrt(N) steps in the same time you are performing now sqrt(2).2.(4,125/6).sqrt(N)= 1.945*sqrt(N)


But

we have a collision not only when 2 kangaroos (a wild and a tame) meet in a single range (first o second); even when they fall in 2 points P and lambda*P (with the same y-coordinate), there is a 75% that they will continue on the same path:
                                
P -> lambda*P -> P    there is a collision (same path)
P -> lambda*P -> lambda*P + lambda*7*G  there is a collision (same path)
P -> P+12*G        and  lambda*P -> P     there is a collision (same path)
P -> P+12*G   and  lambda*P ->   lambda*P + lambda*7*G  there is no collision
 there is no a collision, one kangaroo jumps from P to P+12*G and the other one jumps from  lambda*P to lambda*P + lambda*7*G

In a N-space, we need 2 lists of sqrt(N) points to form sqrt(N)*sqrt(N) = N couples. N couples over N^2 possible couples, on average we should generate N couples to get one with the same elements (a collision)

In our 2N-space we have 2N couples with the same 2 points + 2N couples with 2 related points (P,lambda*P). On average 3/4 of (P,lambda*P) will produce (if xP > 2^255 or if x(lambda*P)>2^255) a collision,
then we have (2N + 3/4*2N) couples of collisions over (2N*2N) possible couple.
The probability to find a collision for each couple is:

   (7/4 * 2N) / (4N^2) = 14/16 * 1/N = 7/8 * 1/N  instead of 1/2N, this means 7/4 bigger.

We need than 8N/7 couples to have a collision, 2 lists of sqrt(8N/7) points -> 2,14.sqrt(N)  'cheap' points ->
2,14.(4,125/6).sqrt(N) = 1,47.sqrt(N) normal points
jr. member
Activity: 43
Merit: 1
May 11, 2020, 03:00:42 PM
Yeah the multi pubkey search as for me more interesting and on github there was a first topic about this)
newbie
Activity: 22
Merit: 3
May 11, 2020, 01:50:28 PM
Who else other than you has that kind of GPU power?
Those of us lucky to have 4 2080 Ti's have no chance of solving anything in time.

A lot of miners do, but we do not have a server with the much RAM available to solve the (110-bit interval) problem as fast as possible. So we need to come up with the best time-memory trade-off parameters. Also, I by far are more interested in the intellectual challenge, although the prize (in BTC) can be a very good motivator for some.

Why don't you focus on the harder 115 and 120 bit problems and give the smaller people a chance? You have 20-30 times the GPUs of most people here.
member
Activity: 348
Merit: 34
May 11, 2020, 01:14:40 PM
Multiple search is possible but I won't work on it first.
I will work on a distributed client/server version.
The #110 will be likely solved soon.
With the distributed version, I'm almost sure the #115 will also be solved.

110 solve depend on multiple pubkeys search at once, in others , you all will spend time in jumping here and there, mean you need lot of gpu powers, you need lot of ram, etc, and you can not manage all things togather, and no one have these all option available, only my last word, if you design multiple pubkeys search ver, that will be solution for just within 3 days Smiley
Enjoy your research!!!
sr. member
Activity: 462
Merit: 701
May 11, 2020, 12:32:18 PM
Here is loaded file with new code

So it takes now up to 6sec to transfer the kangaroos, try with a timeout of ~10sec. I don't know if cuda can do parallel transfer on multiple GPU...
I'll add a way to set the up the timeout in the next release and see if I can speed up more this data transfer...
jr. member
Activity: 91
Merit: 3
May 11, 2020, 10:56:59 AM
Here is loaded file with new code

Code:
Keys :1
LoadWork: [HashTalbe 8.5/28.1MB] [00s]
Number of CPU thread: 0
Range width: 2^86
Jump Avg distance: 2^42.96
Number of kangaroos: 2^22.49
Expected operations: 2^44.47
Expected RAM: 901.5MB
DP size: 20 [0xFFFFF00000000000]
Fetch kangaroos: 983040
Fetch kangaroos: 983040
Fetch kangaroos: 983040
Fetch kangaroos: 983040
Fetch kangaroos: 983040
Fetch kangaroos: 983040
LoadWork: [2^22.49 kangaroos loaded] [0 created] [01s]
GPU: GPU #2 P104-100 (15x128 cores) Grid(30x256) (99.0 MB used)
GPU: GPU #0 P104-100 (15x128 cores) Grid(30x256) (99.0 MB used)
GPU: GPU #1 P104-100 (15x128 cores) Grid(30x256) (99.0 MB used)
SolveKeyGPU Thread GPU#2: 2^19.91 kangaroos [0.8s]
SolveKeyGPU Thread GPU#0: 2^19.91 kangaroos [0.8s]
GPU: GPU #4 GeForce GTX 1070 (15x128 cores) Grid(30x256) (99.0 MB used)
GPU: GPU #5 GeForce GTX 1070 (15x128 cores) Grid(30x256) (99.0 MB used)
GPU: GPU #3 GeForce GTX 1070 (15x128 cores) Grid(30x256) (99.0 MB used)
SolveKeyGPU Thread GPU#1: 2^19.91 kangaroos [1.6s]
SolveKeyGPU Thread GPU#3: 2^19.91 kangaroos [1.0s]
SolveKeyGPU Thread GPU#5: 2^19.91 kangaroos [1.0s]
SolveKeyGPU Thread GPU#4: 2^19.91 kangaroos [1.1s]
[1999.63 MK/s][GPU 1999.63 MK/s][Count 2^38.23][Dead 0][13:26 (Avg 03:23:47)][11.4/35.5MB]
SaveWork timweout !
[62.04 MK/s][GPU 62.04 MK/s][Count 2^38.25][Dead 0][14:27 (Avg 4.6d)][11.5/35.7MB]  6MB]
SaveWork timweout !
[59.46 MK/s][GPU 59.46 MK/s][Count 2^38.26][Dead 0][15:28 (Avg 4.8d)][11.6/35.9MB]
SaveWork: safe.work1..............................done [551.6 MB] [25s] Mon May 11 18:03:44 2020
[1949.67 MK/s][GPU 1949.67 MK/s][Count 2^38.51][Dead 0][16:30 (Avg 03:29:00)][13.4/39.7MB]
SaveWork timweout !
[59.95 MK/s][GPU 59.95 MK/s][Count 2^38.52][Dead 0][17:32 (Avg 4.7d)][13.5/39.9MB]  7MB]
SaveWork timweout !
[59.30 MK/s][GPU 59.30 MK/s][Count 2^38.53][Dead 0][18:33 (Avg 4.8d)][13.6/40.1MB]
SaveWork: safe.work1..............................done [553.6 MB] [28s] Mon May 11 18:06:52 2020
[1966.21 MK/s][GPU 1966.21 MK/s][Count 2^38.72][Dead 0][19:33 (Avg 03:27:15)][15.2/43.0MB]
SaveWork timweout !
[60.46 MK/s][GPU 60.46 MK/s][Count 2^38.73][Dead 0][20:36 (Avg 4.7d)][15.3/43.2MB]  1MB]
SaveWork timweout !
[61.23 MK/s][GPU 61.23 MK/s][Count 2^38.74][Dead 0][21:38 (Avg 4.6d)][15.4/43.3MB]
SaveWork: safe.work1..............................done [555.4 MB] [26s] Mon May 11 18:09:56 2020
[2002.32 MK/s][GPU 2002.32 MK/s][Count 2^38.92][Dead 0][22:39 (Avg 03:23:31)][17.1/46.3MB]
SaveWork timweout !
[62.15 MK/s][GPU 62.15 MK/s][Count 2^38.93][Dead 0][23:41 (Avg 4.6d)][17.2/46.4MB]  3MB]
SaveWork timweout !
[60.69 MK/s][GPU 60.69 MK/s][Count 2^38.93][Dead 0][24:41 (Avg 4.7d)][17.3/46.6MB]
SaveWork timweout !
[62.82 MK/s][GPU 62.82 MK/s][Count 2^38.94][Dead 0][25:41 (Avg 4.5d)][17.4/46.7MB]
SaveWork timweout !
[58.73 MK/s][GPU 58.73 MK/s][Count 2^38.95][Dead 0][26:44 (Avg 4.8d)][17.5/46.9MB]
SaveWork timweout !
[59.46 MK/s][GPU 59.46 MK/s][Count 2^38.96][Dead 0][27:46 (Avg 4.8d)][17.6/47.0MB]
SaveWork: safe.work1..............................done [557.6 MB] [28s] Mon May 11 18:16:05 2020
[282.34 MK/s][GPU 282.34 MK/s][Count 2^38.98][Dead 0][28:19 (Avg 1.0d)][17.8/47.4MB]

and this is -winfo from Workfile

Code:
Version   : 0
DP bits   : 20
Start     : 2000000000000000000000000000
Stop      : 3FFFFFFFFFFFFFFFFFFFFFFFFFFF
Key       : 0309976BA5570966BF889196B7FDF5A0F9A1E9AB340556EC29F8BB60599616167D
Count     : 534868131840 2^38.960
Time      : 27:46
DP Size   : 17.6/47.0MB
DP Count  : 510335 2^18.961
HT Max    : 11 [@ 0090E4]
HT Min    : 0 [@ 000008]
HT Avg    : 1.95
HT SDev   : 1.40
Kangaroos : 5898240 2^22.492
newbie
Activity: 17
Merit: 0
May 11, 2020, 10:37:28 AM
I did a small patch to try to speed up kangaroo transfer and also changed the timeout to 3 sec. Could you try it and tell me if it improves something ?

I had same problem. Also, when the save fails, the GPU downs from ~500M to ~20M.

The 3sec patch fixed it. Thanks


jr. member
Activity: 91
Merit: 3
May 11, 2020, 10:02:52 AM
Yeah I will test when I am back on PC
sr. member
Activity: 462
Merit: 701
May 11, 2020, 09:39:05 AM
I did a small patch to try to speed up kangaroo transfer and also changed the timeout to 3 sec. Could you try it and tell me if it improves something ?
sr. member
Activity: 462
Merit: 701
May 11, 2020, 08:28:55 AM
When you save kangaroo, the CPU has to get back all kangaroos from GPUs memory this might be long.
Could you reload a work and post the output ?
jr. member
Activity: 91
Merit: 3
May 11, 2020, 08:17:40 AM
Code:
Number of kangaroos: 2^22.49
Suggested DP: 20
Expected operations: 2^44.47
Expected RAM: 901.5MB
DP size: 20 [0xFFFFF00000000000]
GPU: GPU #1 P104-100 (15x128 cores) Grid(30x256) (99.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
GPU: GPU #2 P104-100 (15x128 cores) Grid(30x256) (99.0 MB used)
SolveKeyGPU Thread GPU#2: creating kangaroos...
GPU: GPU #0 P104-100 (15x128 cores) Grid(30x256) (99.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #3 GeForce GTX 1070 (15x128 cores) Grid(30x256) (99.0 MB used)
SolveKeyGPU Thread GPU#3: creating kangaroos...
GPU: GPU #4 GeForce GTX 1070 (15x128 cores) Grid(30x256) (99.0 MB used)
SolveKeyGPU Thread GPU#4: creating kangaroos...
GPU: GPU #5 GeForce GTX 1070 (15x128 cores) Grid(30x256) (99.0 MB used)
SolveKeyGPU Thread GPU#5: creating kangaroos...
SolveKeyGPU Thread GPU#2: 2^19.91 kangaroos [11.5s]
SolveKeyGPU Thread GPU#1: 2^19.91 kangaroos [11.7s]
SolveKeyGPU Thread GPU#0: 2^19.91 kangaroos [13.9s]
SolveKeyGPU Thread GPU#3: 2^19.91 kangaroos [11.7s]
SolveKeyGPU Thread GPU#5: 2^19.91 kangaroos [11.2s]
SolveKeyGPU Thread GPU#4: 2^19.91 kangaroos [11.5s]
[1984.76 MK/s][GPU 1984.76 MK/s][Count 2^36.22][Dead 0][45s (Avg 03:25:19)][4.3/14.4MB]
SaveWork timweout !
[62.13 MK/s][GPU 62.13 MK/s][Count 2^36.28][Dead 0][01:47 (Avg 4.6d)][4.4/14.8MB]  5MB]
SaveWork timweout !
[59.95 MK/s][GPU 59.95 MK/s][Count 2^36.33][Dead 0][02:49 (Avg 4.7d)][4.5/15.2MB]
SaveWork: safe.work1..............................done [544.5 MB] [26s] Mon May 11 15:17:09 2020
[523.14 MK/s][GPU 523.14 MK/s][Count 2^36.54][Dead 0][03:22 (Avg 12:58:57)][4.9/16.6MB]

Code:
[1984.76 MK/s][GPU 1984.76 MK/s][Count 2^36.22][Dead 0][45s (Avg 03:25:19)][4.3/14.4MB]
SaveWork timweout !
[62.13 MK/s][GPU 62.13 MK/s][Count 2^36.28][Dead 0][01:47 (Avg 4.6d)][4.4/14.8MB]  5MB]
SaveWork timweout !
[59.95 MK/s][GPU 59.95 MK/s][Count 2^36.33][Dead 0][02:49 (Avg 4.7d)][4.5/15.2MB]
SaveWork: safe.work1..............................done [544.5 MB] [26s] Mon May 11 15:17:09 2020
[2034.60 MK/s][GPU 2034.60 MK/s][Count 2^37.12][Dead 0][03:50 (Avg 03:20:17)][6.3/21.7MB]
SaveWork timweout !
[61.08 MK/s][GPU 61.08 MK/s][Count 2^37.15][Dead 0][04:52 (Avg 4.6d)][6.4/22.0MB]  8MB]
SaveWork timweout !
[57.19 MK/s][GPU 57.19 MK/s][Count 2^37.18][Dead 0][05:55 (Avg 4.9d)][6.5/22.3MB]
SaveWork timweout !
[63.30 MK/s][GPU 63.30 MK/s][Count 2^37.21][Dead 0][06:55 (Avg 4.5d)][6.6/22.6MB]
SaveWork timweout !
[62.27 MK/s][GPU 62.27 MK/s][Count 2^37.24][Dead 0][07:58 (Avg 4.5d)][6.7/22.9MB]
SaveWork timweout !
[60.29 MK/s][GPU 60.29 MK/s][Count 2^37.26][Dead 0][08:58 (Avg 4.7d)][6.8/23.2MB]
SaveWork: safe.work1..............................done [546.8 MB] [26s] Mon May 11 15:23:19 2020
[1999.43 MK/s][GPU 1999.43 MK/s][Count 2^37.71][Dead 0][09:59 (Avg 03:23:48)][8.5/28.4MB]
SaveWork timweout !
[1306.40 MK/s][GPU 1306.40 MK/s][Count 2^37.72][Dead 0][10:09 (Avg 05:11:55)][8.6/28.4MB]
jr. member
Activity: 91
Merit: 3
May 11, 2020, 08:15:10 AM
yes 6 GPUs and i use -ws
Jump to: