Pages:
Author

Topic: Pollard's kangaroo ECDLP solver - page 9. (Read 60189 times)

newbie
Activity: 9
Merit: 0
January 12, 2024, 02:54:28 PM
I pushed an update to Kangaroo-256 a couple of days ago that fixes the botched GPU implementation. https://gitea.datahoarding.agency/ZenulAbidin/Kangaroo-256

Next I want to add the ability to create your own dpmask - whether you want certain bits to always be set and others always be cleared. I think I will need two different masks for this, one for the bits that should be set and another for the ones that should be cleared.

Currently all of the dpmask bits are at the left of every kangaroo but I think the ability to define it at random positions could influence the speed it takes to find a collision. I could even make the dpmask to change at random it I want to.


Hi NotATether, can you please explain what this version with 256 can do that the original version from JeanLucPons can not do ?

- Is it just faster ?
- can solve ranges that the JeanLucPons version can not solve (like puzzle 160 for example because the key space is wider)?

I know that i am a bit later as that was posted 3 years ago, but i find the subject very interesting.

Regards
jr. member
Activity: 44
Merit: 2
December 26, 2023, 03:56:42 AM
but I had tweaked the way DPs were found.

Can we have your code?
full member
Activity: 1232
Merit: 242
Shooters Shoot...
December 25, 2023, 09:57:03 PM
Quote
4090 cards have a lot more cache than previous ones.
Looking for changes to fit to cache as much data as possible to speed up kangaroo and prevent memory bottleneck kangaroo have.

Code def needs tweaked to see if there as any pickup in performance.

Code also needs tweaked in how it finds DPs.

The best speed I got out of Kangaroo with a 4090 was 7,750 MKey/s and an A100 got 7,350 MKey/s; but I had tweaked the way DPs were found.

I haven't messed with the code since #125 was found though.

Hope you have success!

Just found old exe file and ran a full 44 bit range to check avg speed; 3060Ti = 2,685 MKey/s Kangaroo.
jr. member
Activity: 44
Merit: 2
December 25, 2023, 04:56:53 PM


But why you use 01 as starting point?   Roll Eyes Roll Eyes

I already told you:


For sequential private keys there is the NextKey function.

DoubleDirect does, as the name indicates, only the double of P.

Then, if you start from G:

G
P = DoubleDirect(G) :  G -> P = 2G
P = NextKey(P)         :  P -> P+1 = 3G
P = NextKey(P)         :  P -> P+1 = 4G
P = NextKey(P)         :  P -> P+1 = 5G

and so on


If you use 0000000000000000000000000000000000000000000000000000000000000002

as starting point it should work.

You can't compute G+G = 2G with NextKey, what is not clear?


Point Secp256K1::NextKey(Point &key) {
  // Input key must be reduced and different from G
  // in order to use AddDirect
  return AddDirect(key,G);
 }

You are using the NextKey in the only case you can't (and it is clearly written)

Thank you very much. Now I understand, finally Wink It working on CPU fine. Moving on to GPU Smiley

4090 cards have a lot more cache than previous ones.
Looking for changes to fit to cache as much data as possible to speed up kangaroo and prevent memory bottleneck kangaroo have.
legendary
Activity: 1948
Merit: 2097
December 25, 2023, 04:06:47 PM


But why you use 01 as starting point?   Roll Eyes Roll Eyes

I already told you:


For sequential private keys there is the NextKey function.

DoubleDirect does, as the name indicates, only the double of P.

Then, if you start from G:

G
P = DoubleDirect(G) :  G -> P = 2G
P = NextKey(P)         :  P -> P+1 = 3G
P = NextKey(P)         :  P -> P+1 = 4G
P = NextKey(P)         :  P -> P+1 = 5G

and so on


If you use 0000000000000000000000000000000000000000000000000000000000000002

as starting point it should work.

You can't compute G+G = 2G with NextKey, what is not clear?


Point Secp256K1::NextKey(Point &key) {
  // Input key must be reduced and different from G
  // in order to use AddDirect
  return AddDirect(key,G);
 }

You are using the NextKey in the only case you can't (and it is clearly written)
jr. member
Activity: 44
Merit: 2
December 25, 2023, 03:15:19 PM

Hard to tell anything without the results of what your program is spitting out.

What happens when you use NextKey? Give exact examples.

// test 1 (WORKING) returns correct public key and BTC address for 0x01 and 0x02 private keys
privateKey.SetBase16((char*)"0000000000000000000000000000000000000000000000000000000000000001");
point = secp->ComputePublicKey(&privateKey);
std::cout << " (" << secp->GetPublicKeyHex(true, point) << ") [" << secp->GetAddress(P2PKH, true, point) << "]" << std::endl;

privateKey.AddOne();

point = secp->ComputePublicKey(&privateKey);
std::cout << " (" << secp->GetPublicKeyHex(true, point) << ") [" << secp->GetAddress(P2PKH, true, point) << "]" << std::endl;


// test 2 (NOT WORKING) returns correct public key and BTC address for 0x01 private key only
privateKey.SetBase16((char*)"0000000000000000000000000000000000000000000000000000000000000001");
point = secp->ComputePublicKey(&privateKey);
std::cout << " (" << secp->GetPublicKeyHex(true, point) << ") [" << secp->GetAddress(P2PKH, true, point) << "]" << std::endl;

point = secp->NextKey(point);

std::cout << " (" << secp->GetPublicKeyHex(true, point) << ") [" << secp->GetAddress(P2PKH, true, point) << "]" << std::endl;


// test 3 as arulbero suggested (NOT WORKING) returns correct public key and BTC address for 0x01 private key only
privateKey.SetBase16((char*)"0000000000000000000000000000000000000000000000000000000000000001");
Point P(secp->ComputePublicKey(&privateKey));
std::cout << " (" << secp->GetPublicKeyHex(true, secp->ComputePublicKey(&privateKey)) << ") [" << secp->GetAddress(P2PKH, true, secp->ComputePublicKey(&privateKey)) << "]" << std::endl;

P = secp->NextKey(P);   

std::cout << " (" << secp->GetPublicKeyHex(true, P) << ") [" << secp->GetAddress(P2PKH, true, P) << "]" << std::endl;
legendary
Activity: 1948
Merit: 2097
December 25, 2023, 02:57:48 PM
do you set bool reduce = True?

Yes.
I'm using ComputePublicKey() function from vanitysearch project, not from kangaroo project. It do reduction by default at the end of ComputePublicKey() function.
NextKey() function still not working as it should.

Looking at this code:

https://github.com/JeanLucPons/Kangaroo/blob/354bb80491752262eaca3613557e1bd306b5414d/SECPK1/SECP256K1.cpp#L42C1-L52C4

Code:
  // Compute Generator table
  Point N(G);
  for(int i = 0; i < 32; i++) {
    GTable[i * 256] = N;
    N = DoubleDirect(N);
    for (int j = 1; j < 255; j++) {
      GTable[i * 256 + j] = N;
      N = AddDirect(N, GTable[i * 256]);
    }
    GTable[i * 256 + 255] = N; // Dummy point for check function
  }


I would try this:

Code:
Int* privateKey = "some random private key";

Point P(secp->ComputePublicKey(&privateKey));


while(1) {

      P = NextKey(P)
}
jr. member
Activity: 44
Merit: 2
December 25, 2023, 02:42:26 PM
do you set bool reduce = True?

Yes.
I'm using ComputePublicKey() function from vanitysearch project, not from kangaroo project. It do reduction by default at the end of ComputePublicKey() function.
NextKey() function still not working as it should.
legendary
Activity: 1948
Merit: 2097
December 25, 2023, 02:15:22 PM

This is NOT working:
point = secp->ComputePublicKey(&privateKey);
while(1)
{   
   point = secp->NextKey(point);
}


https://github.com/JeanLucPons/Kangaroo/blob/354bb80491752262eaca3613557e1bd306b5414d/SECPK1/SECP256K1.cpp#L59C7-L59C60

Secp256K1::ComputePublicKey(Int *privKey,bool reduce)

do you set bool reduce = True?
full member
Activity: 1232
Merit: 242
Shooters Shoot...
December 25, 2023, 02:10:10 PM

UPD: 25.12.2023
Actually DoubleDirect() also not working. It starting to calculate wrong keys after some interval. Why this is can happen?
The only function work so far for sequential private keys to public keys is ComputePublicKey().


For sequential private keys there is the NextKey function.

DoubleDirect does, as the name indicates, only the double of P.

Then, if you start from G:

G
P = DoubleDirect(G)  :  G -> P = 2G
P = NextKey(P)         :  P -> P+1 = 3G
P = NextKey(P)         :  P -> P+1 = 4G
P = NextKey(P)         :  P -> P+1 = 5G

and so on


But what if I need to start not from G point...

Int* privateKey = "some random private key";
Point* point;

This is working:
point = secp->ComputePublicKey(&privateKey);
while(1)
{
   privateKey.AddOne();
   point = secp->ComputePublicKey(&privateKey);
}


This is NOT working:
point = secp->ComputePublicKey(&privateKey);
while(1)
{   
   point = secp->NextKey(point);
}



Hard to tell anything without the results of what your program is spitting out.

What happens when you use NextKey? Give exact examples.
jr. member
Activity: 44
Merit: 2
December 25, 2023, 01:52:39 PM

UPD: 25.12.2023
Actually DoubleDirect() also not working. It starting to calculate wrong keys after some interval. Why this is can happen?
The only function work so far for sequential private keys to public keys is ComputePublicKey().


For sequential private keys there is the NextKey function.

DoubleDirect does, as the name indicates, only the double of P.

Then, if you start from G:

G
P = DoubleDirect(G)  :  G -> P = 2G
P = NextKey(P)         :  P -> P+1 = 3G
P = NextKey(P)         :  P -> P+1 = 4G
P = NextKey(P)         :  P -> P+1 = 5G

and so on


But what if I need to start not from G point...

Int* privateKey = "some random private key";
Point* point;

This is working:
point = secp->ComputePublicKey(&privateKey);
while(1)
{
   privateKey.AddOne();
   point = secp->ComputePublicKey(&privateKey);
}


This is NOT working:
point = secp->ComputePublicKey(&privateKey);
while(1)
{   
   point = secp->NextKey(point);
}

legendary
Activity: 1948
Merit: 2097
December 25, 2023, 10:00:36 AM

UPD: 25.12.2023
Actually DoubleDirect() also not working. It starting to calculate wrong keys after some interval. Why this is can happen?
The only function work so far for sequential private keys to public keys is ComputePublicKey().


For sequential private keys there is the NextKey function.

DoubleDirect does, as the name indicates, only the double of P.

Then, if you start from G:

G
P = DoubleDirect(G)  :  G -> P = 2G
P = NextKey(P)         :  P -> P+1 = 3G
P = NextKey(P)         :  P -> P+1 = 4G
P = NextKey(P)         :  P -> P+1 = 5G

and so on
jr. member
Activity: 44
Merit: 2
December 22, 2023, 02:16:56 PM

Thanks.
DoubleDirect() works, but it only about 10% faster, than do entire private key to public key calculation with ComputePublicKey() function.
Is any faster way possible to do this 'increment private key by 1 and get public key' method?

UPD: 25.12.2023
Actually DoubleDirect() also not working. It starting to calculate wrong keys after some interval. Why this is can happen?
The only function work so far for sequential private keys to public keys is ComputePublicKey().
legendary
Activity: 1948
Merit: 2097
December 22, 2023, 11:36:37 AM
What is reduced public key?

A public key is a point of the elliptic curve. A point can be represented by 3 coordinates (X,Y,Z) (projective coordinates) or by 2 coordinates (x,y) (affine coordinates).

The passage from projective to affine coordinates is the "reduction"

x = X/Z
y = Y/Z

https://github.com/JeanLucPons/Kangaroo/blob/354bb80491752262eaca3613557e1bd306b5414d/SECPK1/Point.cpp#L64



Point Secp256K1::NextKey(Point &key) {
  // Input key must be reduced and different from G
  // in order to use AddDirect
  return AddDirect(key,G);
}

Is anybody can explain me why this function in secp256k1.cpp file not work correctly?
As I understand it should return next public key like as increment private key by 1 and compute public key from it.

The addition between 2 points in projective coordinates is performed by the function Add:

https://github.com/JeanLucPons/Kangaroo/blob/354bb80491752262eaca3613557e1bd306b5414d/SECPK1/SECP256K1.cpp#L369

and it is faster, because it avoids the inverse of x;


the addition between the same 2 points in affine coordinates is performed instead by the function AddDirect:

https://github.com/JeanLucPons/Kangaroo/blob/354bb80491752262eaca3613557e1bd306b5414d/SECPK1/SECP256K1.cpp#L238.

To compute NextKey function (that calls the AddDirect function), you have:

a) to use a point in affine coordinates (because the AddDirect function needs affine coordinates)
b) to avoid the case P = G, because to perform G+G (in the general P + P ) the AddDirect function doesn't work

you need instead the DoubleDirect function:
https://github.com/JeanLucPons/Kangaroo/blob/354bb80491752262eaca3613557e1bd306b5414d/SECPK1/SECP256K1.cpp#L438  
jr. member
Activity: 44
Merit: 2
December 22, 2023, 09:20:26 AM
Hello.

Point Secp256K1::NextKey(Point &key) {
  // Input key must be reduced and different from G
  // in order to use AddDirect
  return AddDirect(key,G);
}

Is anybody can explain me why this function in secp256k1.cpp file not work correctly?
As I understand it should return next public key like as increment private key by 1 and compute public key from it.
What is reduced public key?

Thanks.
jr. member
Activity: 44
Merit: 2
December 14, 2023, 05:09:26 AM
--snip--
Yes. I tried all possible grid size combinations. Also, I tried to compile on latest cuda version using more recent compute_89 shader model. Same result. The maximum I have on 4090 is about 2500 MK/s. On 3090 is 2200 MK/s. 4090 should be faster. Looks like Kangaroo need some optimization for large number of cores 4090 have. Any ideas where this optimization can be made? Thanks.

I see. While i have no idea which optimization can be done, you might want to try fork of this software or implementation written by different people. Here are few example,
https://github.com/ZenulAbidin/Kangaroo-256
https://github.com/PawelGorny/Kangaroo
https://github.com/secoc/Pollard-Rho-kangaroo

P.S. i haven't tried any of those.

After some tests I found that memory speed is an issue. It GPU memory bound, not GPU chip. This is why it can't scale well for more GPU cores.
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
December 12, 2023, 03:43:07 AM
--snip--
Yes. I tried all possible grid size combinations. Also, I tried to compile on latest cuda version using more recent compute_89 shader model. Same result. The maximum I have on 4090 is about 2500 MK/s. On 3090 is 2200 MK/s. 4090 should be faster. Looks like Kangaroo need some optimization for large number of cores 4090 have. Any ideas where this optimization can be made? Thanks.

I see. While i have no idea which optimization can be done, you might want to try fork of this software or implementation written by different people. Here are few example,
https://github.com/ZenulAbidin/Kangaroo-256
https://github.com/PawelGorny/Kangaroo
https://github.com/secoc/Pollard-Rho-kangaroo

P.S. i haven't tried any of those.
jr. member
Activity: 44
Merit: 2
December 11, 2023, 08:53:58 AM
Hello. What parameters or something to change in the code to speed-up this program on 4090 card? It only 1.2 times faster than on 3090 card, but it should be about 1.9 times faster. Thanks.

After searching for all post in this thread[1] and github repository[2], it looks like you're first person who mention "4090" or any RTX 4000 series. And while it's obvious, have you tried all guidance on github repository. For example,

-g g1x,g1y,g2x,g2y,...: Specify GPU(s) kernel gridsize, default is 2*(MP),2*(Core/MP)

Powerfull GPUs with large number of cores won't be very efficient on small range, you can try to decrease the grid size in order to have less kangaroos but the GPU performance may not be optimal

[1] https://ninjastic.space/search?content=4090&topic_id=5244940
[2] https://github.com/search?q=repo%3AJeanLucPons%2FKangaroo%204090&type=issues

Yes. I tried all possible grid size combinations. Also, I tried to compile on latest cuda version using more recent compute_89 shader model. Same result. The maximum I have on 4090 is about 2500 MK/s. On 3090 is 2200 MK/s. 4090 should be faster. Looks like Kangaroo need some optimization for large number of cores 4090 have. Any ideas where this optimization can be made? Thanks.
newbie
Activity: 13
Merit: 0
December 11, 2023, 07:09:02 AM
Hi, i'm studying Kangaroo Software made by JeanLucPons and I'm trying to know if there is a possibility to compute more pubkeys during an attack.

Searching in the official git I've found that Jean said:

Yes,
I will add some note about this on the readme. It is a bit tricky.
Multi key support is not yet supported, for this you will need first to create a large tame array for a given range and then attack keys with it.

What does it mean? large tame array? Means maybe doing an one pubkey attack in a specific range complete that range using save.work and use it for other keys to attack? If I wanna do multi pubkey attack?

Thank you guys for replies.

legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
December 11, 2023, 06:12:21 AM
Hello. What parameters or something to change in the code to speed-up this program on 4090 card? It only 1.2 times faster than on 3090 card, but it should be about 1.9 times faster. Thanks.

After searching for all post in this thread[1] and github repository[2], it looks like you're first person who mention "4090" or any RTX 4000 series. And while it's obvious, have you tried all guidance on github repository. For example,

-g g1x,g1y,g2x,g2y,...: Specify GPU(s) kernel gridsize, default is 2*(MP),2*(Core/MP)

Powerfull GPUs with large number of cores won't be very efficient on small range, you can try to decrease the grid size in order to have less kangaroos but the GPU performance may not be optimal

[1] https://ninjastic.space/search?content=4090&topic_id=5244940
[2] https://github.com/search?q=repo%3AJeanLucPons%2FKangaroo%204090&type=issues
Pages:
Jump to: