
Topic: Solving ECDLP with Kangaroos: Part 1 + 2 + RCKangaroo - page 3

member
Activity: 127
Merit: 32
Puzzle 135 remains technically feasible with very large resources like those used by RetiredCoder (400 RTX 4090 GPUs, if he is telling the truth) and a very well optimized RCKangaroo; depending on the position of the private key, it would take between 0 and 1.2 years to solve. Depending on the price of Bitcoin the operation can be profitable, but only if you can actually field resources on that scale. For 99% of the people who try to solve it, it is pointless; unless they believe in phenomenal, lottery-level luck, it will be a waste of money and time.
newbie
Activity: 6
Merit: 0
If one RTX4090 card can do it in +/- 249934 days (68 years), then eighty RTX4090 cards can do it in less than a year.


Who taught you mathematics?
249934 days ≈ 684 years, so eighty cards would still need about 8.5 years, not less than one.
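A quick sanity check of that arithmetic, as a throwaway Python sketch (the 249934-day figure is the single-RTX-4090 estimate quoted above):

Code:
days = 249934                          # single RTX 4090 estimate from the RCKangaroo log
years_one_card = days / 365.25
print(round(years_one_card, 1))        # ~684.3 years on one card
print(round(years_one_card / 80, 1))   # ~8.6 years even with eighty cards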
member
Activity: 873
Merit: 22
If one RTX4090 card can do it in +/- 249934 days (68 years), then eighty RTX4090 cards can do it in less than a year.

And if it were not one key in a 2^130 range, but 2^45 keys in a 2^90 range, how long would it take?
jr. member
Activity: 56
Merit: 2
If one RTX4090 card can do it in +/- 249934 days (68 years), then eighty RTX4090 cards can do it in less than a year.
?
Activity: -
Merit: -
To solve puzzle #135 with RCKangaroo, a single RTX 4090 would need hundreds of years of computation.

Mathematicians need to keep working hard to create algorithms better than the kangaroo algorithm in order to complete this task.

+/- 249934d

Code:
CUDA devices: 1, CUDA driver/runtime: 12.7/12.6
GPU 0: NVIDIA GeForce RTX 4090, 23.99 GB, 128 CUs, cap 8.9, PCI 1, L2 size: 73728 KB
Total GPUs for work: 1

MAIN MODE

Solving public key
X: 145D2611C823A396EF6712CE0F712F09B9B4F3135E3E0AA3230FB9B6D08D1E16
Y: 667A05E9A1BDD6F70142B66558BD12CE2C0F9CBC7001B20C8A6A109C80DC5330
Offset: 0000000000000000000000000000004000000000000000000000000000000000

Solving point: Range 134 bits, DP 16, start...
SOTA method, estimated ops: 2^67.202, RAM for DPs: 96468992.188 GB. DP and GPU overheads not included!
Estimated DPs per kangaroo: 3292808260.267.
GPU 0: allocated 4437 MB, 786432 kangaroos.
GPUs started...
MAIN: Speed: 7859 MKeys/s, Err: 0, DPs: 1154K/2589569785738K, Time: 0d:00h:00m, Est: 249934d:16h:16m
MAIN: Speed: 7822 MKeys/s, Err: 0, DPs: 2359K/2589569785738K, Time: 0d:00h:00m, Est: 251116d:22h:21m
MAIN: Speed: 7822 MKeys/s, Err: 0, DPs: 3563K/2589569785738K, Time: 0d:00h:00m, Est: 251116d:22h:21m
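For what it's worth, the Est column above is consistent with the printed 2^67.202 operations divided by the measured speed; a rough cross-check in Python, using only numbers taken from this log:

Code:
ops = 2 ** 67.202               # estimated operations printed by RCKangaroo
speed = 7859e6                  # first MAIN speed line, in keys/s
seconds = ops / speed
print(round(seconds / 86400))   # ~250,000 days, matching the ~249934d estimate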
newbie
Activity: 6
Merit: 0
To solve puzzle #135 with RCKangaroo, a single RTX 4090 would need hundreds of years of computation.

Mathematicians need to keep working hard to create algorithms better than the kangaroo algorithm in order to complete this task.
member
Activity: 165
Merit: 26
Whenever I see a post going like "splitting the range is a good idea and not an issue at all", after all the countless and obvious proofs to the contrary, I do 50 push-ups to compensate for the 41% extra steps required to solve the same problem in 2 ranges.
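For anyone wondering where the 41% figure comes from: expected kangaroo work scales with the square root of the interval size, so searching two halves when you don't know which one holds the key costs 2*sqrt(N/2) = sqrt(2)*sqrt(N), about 41% more than sqrt(N). A throwaway Python check (the value of N is just an example):

Code:
import math

N = 2 ** 134                        # e.g. the puzzle #135 interval, purely for illustration
one_range = math.sqrt(N)            # expected ops over the whole interval, up to a constant factor
two_ranges = 2 * math.sqrt(N / 2)   # each half costs sqrt(N/2); worst case you pay for both
print(two_ranges / one_range)       # ~1.414, i.e. ~41% extra steps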
newbie
Activity: 22
Merit: 1
Hi all,

first of all, thanks to RetiredCoder for his research and work and for providing it to the community!

I am new here, but I have followed all the posts for a while and am trying to get my head around it Cheesy

I have a dumb question: it's mentioned that there's no workload-splitting tooling, but let's say I have 2 machines I can run. Can I split the workload, taking puzzle #85 as a sample?
Do I just split the range parameter as well as the start point?

Your question isn’t dumb at all! Yes, if you're running a workload like solving puzzle #85 and have multiple machines to use, splitting the range and start point is a common approach to divide the workload.

How it works:
Range Parameter:
If the puzzle involves iterating over a range of numbers or states, you can split that range across the machines. For instance, if the range is [0, 100], you could let:

Machine 1 handle [0, 49]
Machine 2 handle [50, 100]
Start Point:
If there's a start parameter involved, ensure each machine knows where to begin for its portion of the range.

Independent Processing:
Make sure that each machine can independently process its assigned range without depending on results from the other. This ensures no overlap or missed parts.

Steps to Implement:
Divide the workload logically based on the problem's parameters (e.g., ranges or chunks of input data).
Ensure the start and end points for each machine are mutually exclusive.
If the workload has side effects or shared state, manage synchronization carefully (e.g., use locks or shared memory if needed, or avoid shared state altogether).
Example:
Assuming puzzle #85 involves calculating something over a range [0, 1000]:

On Machine 1, run the code with start=0 and end=499.
On Machine 2, run the code with start=500 and end=1000.
This approach scales well as long as:

The problem is divisible.
The results from one range don’t depend on another range.
Let me know if you'd like more detailed help with setting up the splitting logic!

Thanks for the fast response, and yes, maybe I need more clarity on the splitting. I only started getting my head around the details 2 weeks ago, and sometimes a lot of question marks come up.

Let's take puzzle #81, where the private key range is 100000000000000000000:1ffffffffffffffffffff

So if I have only one machine, I would do:
Code:
RCKangaroo.exe -dp 16 -range 80 -start 100000000000000000000 -pubkey 351e605fac813965951ba433b7c2956bf8ad95ce

And for 2 machines, would I split the range in half, so that the first half ends at 17fffffffffffffffffff?

Machine 1:
Code:
RCKangaroo.exe -dp 16 -range 80 -start 100000000000000000000 -pubkey 351e605fac813965951ba433b7c2956bf8ad95ce

Machine 2:
Code:
RCKangaroo.exe -dp 16 -range 80 -start 180000000000000000000 -pubkey 351e605fac813965951ba433b7c2956bf8ad95ce


It's the range part I am not sure about.
I did not find the "end" parameter you mentioned in RCKangaroo; I'm fairly sure it uses the range parameter to determine "the end".


RCKangaroo supports only the start of the range, but this is not an issue. You can simply split the range into multiple pieces depending on how many machines you have. For example, if the key range is from 100000000000000000000 to 1ffffffffffffffffffff, and you split the range into two parts, it would look like:

Machine 1:
Code:
Start at 100000000000000000000

Machine 2:
Code:
Start at 180000000000000000000

"Don't worry about the end of the range, as some machines will find the key before reaching the end."

PS: You can use AI to create a Python tool that splits the hex range. Wink
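Along those lines, here is a minimal sketch of such a splitter in plain Python. It assumes an even split and uses puzzle #81's range as the example; adjust the part count to the number of machines:

Code:
def split_range(start_hex: str, end_hex: str, parts: int):
    """Split [start, end] into `parts` equal chunks and return each chunk's start in hex."""
    start, end = int(start_hex, 16), int(end_hex, 16)
    step = (end - start + 1) // parts
    return [format(start + i * step, 'x') for i in range(parts)]

# Puzzle #81's range split across two machines:
starts = split_range('100000000000000000000', '1ffffffffffffffffffff', 2)
for i, s in enumerate(starts, 1):
    print(f"Machine {i}: -start {s}")
# Machine 1: -start 100000000000000000000
# Machine 2: -start 180000000000000000000

Each half is 2^79 keys; whether you keep -range 80 or drop to -range 79 depends on how you want each machine to treat the end of its half (assuming -range simply sets the bit length of the interval that begins at -start), since RCKangaroo itself only takes the start, as noted above.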
?
Activity: -
Merit: -
Hi all,

first of all, thanks to RetiredCoder for his research and work and for providing it to the community!

I am new here, but I have followed all the posts for a while and am trying to get my head around it Cheesy

I have a dumb question: it's mentioned that there's no workload-splitting tooling, but let's say I have 2 machines I can run. Can I split the workload, taking puzzle #85 as a sample?
Do I just split the range parameter as well as the start point?

Your question isn’t dumb at all! Yes, if you're running a workload like solving puzzle #85 and have multiple machines to use, splitting the range and start point is a common approach to divide the workload.

How it works:
Range Parameter:
If the puzzle involves iterating over a range of numbers or states, you can split that range across the machines. For instance, if the range is [0, 100], you could let:

Machine 1 handle [0, 49]
Machine 2 handle [50, 100]
Start Point:
If there's a start parameter involved, ensure each machine knows where to begin for its portion of the range.

Independent Processing:
Make sure that each machine can independently process its assigned range without depending on results from the other. This ensures no overlap or missed parts.

Steps to Implement:
Divide the workload logically based on the problem's parameters (e.g., ranges or chunks of input data).
Ensure the start and end points for each machine are mutually exclusive.
If the workload has side effects or shared state, manage synchronization carefully (e.g., use locks or shared memory if needed, or avoid shared state altogether).
Example:
Assuming puzzle #85 involves calculating something over a range [0, 1000]:

On Machine 1, run the code with start=0 and end=499.
On Machine 2, run the code with start=500 and end=1000.
This approach scales well as long as:

The problem is divisible.
The results from one range don’t depend on another range.
Let me know if you'd like more detailed help with setting up the splitting logic!

Thanks for the fast response, and yes, maybe I need more clarity on the splitting. I only started getting my head around the details 2 weeks ago, and sometimes a lot of question marks come up.

Let's take puzzle #81, where the private key range is 100000000000000000000:1ffffffffffffffffffff

So if I have only one machine, I would do:
Code:
RCKangaroo.exe -dp 16 -range 80 -start 100000000000000000000 -pubkey 351e605fac813965951ba433b7c2956bf8ad95ce

And for 2 machines, would I split the range in half, so that the first half ends at 17fffffffffffffffffff?

Machine 1:
Code:
RCKangaroo.exe -dp 16 -range 80 -start 100000000000000000000 -pubkey 351e605fac813965951ba433b7c2956bf8ad95ce

Machine 2:
Code:
RCKangaroo.exe -dp 16 -range 80 -start 180000000000000000000 -pubkey 351e605fac813965951ba433b7c2956bf8ad95ce


It's the range part I am not sure about.
I did not find the "end" parameter you mentioned in RCKangaroo; I'm fairly sure it uses the range parameter to determine "the end".




newbie
Activity: 22
Merit: 1
Hi all,

first of all, thanks to RetiredCoder for his research and work and for providing it to the community!

I am new here, but I have followed all the posts for a while and am trying to get my head around it Cheesy

I have a dumb question: it's mentioned that there's no workload-splitting tooling, but let's say I have 2 machines I can run. Can I split the workload, taking puzzle #85 as a sample?
Do I just split the range parameter as well as the start point?

Your question isn’t dumb at all! Yes, if you're running a workload like solving puzzle #85 and have multiple machines to use, splitting the range and start point is a common approach to divide the workload.

How it works:
Range Parameter:
If the puzzle involves iterating over a range of numbers or states, you can split that range across the machines. For instance, if the range is [0, 100], you could let:

Machine 1 handle [0, 49]
Machine 2 handle [50, 100]
Start Point:
If there's a start parameter involved, ensure each machine knows where to begin for its portion of the range.

Independent Processing:
Make sure that each machine can independently process its assigned range without depending on results from the other. This ensures no overlap or missed parts.

Steps to Implement:
Divide the workload logically based on the problem's parameters (e.g., ranges or chunks of input data).
Ensure the start and end points for each machine are mutually exclusive.
If the workload has side effects or shared state, manage synchronization carefully (e.g., use locks or shared memory if needed, or avoid shared state altogether).
Example:
Assuming puzzle #85 involves calculating something over a range [0, 1000]:

On Machine 1, run the code with start=0 and end=499.
On Machine 2, run the code with start=500 and end=1000.
This approach scales well as long as:

The problem is divisible.
The results from one range don’t depend on another range.
Let me know if you'd like more detailed help with setting up the splitting logic!
?
Activity: -
Merit: -
Hi all,

first of all, thanks to RetiredCoder for his research and work and for providing it to the community!

I am new here, but I have followed all the posts for a while and am trying to get my head around it Cheesy

I have a dumb question: it's mentioned that there's no workload-splitting tooling, but let's say I have 2 machines I can run. Can I split the workload, taking puzzle #85 as a sample?
Do I just split the range parameter as well as the start point?
member
Activity: 873
Merit: 22
I appreciate your great work!
It works fine on my A3000 and it's about 30% faster than JPL!
Will you work on a client/server version as the next step?
Thanks.

Code:
CUDA devices: 1, CUDA driver/runtime: 12.5/12.5
GPU 0: NVIDIA RTX A3000 Laptop GPU, 5.70 GB, 32 CUs, cap 8.6, PCI 1, L2 size: 3072 KB
Total GPUs for work: 1

MAIN MODE

Solving public key
X: 145D2611C823A396EF6712CE0F712F09B9B4F3135E3E0AA3230FB9B6D08D1E16
Y: 667A05E9A1BDD6F70142B66558BD12CE2C0F9CBC7001B20C8A6A109C80DC5330
Offset: 0000000000000000000000000000004000000000000000000000000000000000

Solving point: Range 135 bits, DP 16, start...
SOTA method, estimated ops: 2^67.202, RAM for DPs: 96468992.188 GB. DP and GPU overheads not included!
Estimated DPs per kangaroo: 13171233041.067.
GPU 0: allocated 1118 MB, 196608 kangaroos.
GPUs started...
MAIN: Speed: 969 MKeys/s, Err: 0, DPs: 141K/2589569785738K, Time: 0d:00h:00m, Est: 2027075d:23h:50m
MAIN: Speed: 959 MKeys/s, Err: 0, DPs: 288K/2589569785738K, Time: 0d:00h:00m, Est: 2048213d:09h:16m
MAIN: Speed: 957 MKeys/s, Err: 0, DPs: 435K/2589569785738K, Time: 0d:00h:00m, Est: 2052493d:20h:58m
MAIN: Speed: 957 MKeys/s, Err: 0, DPs: 582K/2589569785738K, Time: 0d:00h:00m, Est: 2052493d:20h:58m
MAIN: Speed: 955 MKeys/s, Err: 0, DPs: 724K/2589569785738K, Time: 0d:00h:00m, Est: 2056792d:06h:58m
MAIN: Speed: 955 MKeys/s, Err: 0, DPs: 871K/2589569785738K, Time: 0d:00h:01m, Est: 2056792d:06h:58m
MAIN: Speed: 954 MKeys/s, Err: 0, DPs: 1018K/2589569785738K, Time: 0d:00h:01m, Est: 2058948d:06h:10m


The A3000 is much slower than the 4090.
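The two logs in this thread bear that out: the speed ratio matches the ratio of the time estimates almost exactly. A quick illustrative check in Python, using the figures from the logs above:

Code:
rtx4090 = 7859    # MKeys/s from the RTX 4090 log
a3000 = 954       # MKeys/s from the A3000 log
print(round(rtx4090 / a3000, 1))     # ~8.2x faster
print(round(2058948 / 249934, 1))    # ~8.2x shorter time estimate, as expected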
newbie
Activity: 9
Merit: 0
I appreciate your great work!
It works fine on my A3000 and it's about 30% faster than JPL!
Will you work on a client/server version as the next step?
Thanks.

Code:
CUDA devices: 1, CUDA driver/runtime: 12.5/12.5
GPU 0: NVIDIA RTX A3000 Laptop GPU, 5.70 GB, 32 CUs, cap 8.6, PCI 1, L2 size: 3072 KB
Total GPUs for work: 1

MAIN MODE

Solving public key
X: 145D2611C823A396EF6712CE0F712F09B9B4F3135E3E0AA3230FB9B6D08D1E16
Y: 667A05E9A1BDD6F70142B66558BD12CE2C0F9CBC7001B20C8A6A109C80DC5330
Offset: 0000000000000000000000000000004000000000000000000000000000000000

Solving point: Range 135 bits, DP 16, start...
SOTA method, estimated ops: 2^67.202, RAM for DPs: 96468992.188 GB. DP and GPU overheads not included!
Estimated DPs per kangaroo: 13171233041.067.
GPU 0: allocated 1118 MB, 196608 kangaroos.
GPUs started...
MAIN: Speed: 969 MKeys/s, Err: 0, DPs: 141K/2589569785738K, Time: 0d:00h:00m, Est: 2027075d:23h:50m
MAIN: Speed: 959 MKeys/s, Err: 0, DPs: 288K/2589569785738K, Time: 0d:00h:00m, Est: 2048213d:09h:16m
MAIN: Speed: 957 MKeys/s, Err: 0, DPs: 435K/2589569785738K, Time: 0d:00h:00m, Est: 2052493d:20h:58m
MAIN: Speed: 957 MKeys/s, Err: 0, DPs: 582K/2589569785738K, Time: 0d:00h:00m, Est: 2052493d:20h:58m
MAIN: Speed: 955 MKeys/s, Err: 0, DPs: 724K/2589569785738K, Time: 0d:00h:00m, Est: 2056792d:06h:58m
MAIN: Speed: 955 MKeys/s, Err: 0, DPs: 871K/2589569785738K, Time: 0d:00h:01m, Est: 2056792d:06h:58m
MAIN: Speed: 954 MKeys/s, Err: 0, DPs: 1018K/2589569785738K, Time: 0d:00h:01m, Est: 2058948d:06h:10m

member
Activity: 124
Merit: 37
I am trying to get the code to run on my humble GeForce 1060.
I have made etar's modifications and I get this:
Code:
CUDA devices: 1, CUDA driver/runtime: 12.2/12.0
GPU 0: NVIDIA GeForce GTX 1060 6GB, 5.93 GB, 10 CUs, cap 6.1, PCI 83, L2 size: 1536 KB
Total GPUs for work: 1

BENCHMARK MODE

Solving point: Range 78 bits, DP 16, start...
SOTA method, estimated ops: 2^39.202, RAM for DPs: 0.547 GB. DP and GPU overheads not included!
Estimated DPs per kangaroo: 157.013.
GPU 0, cuSetGpuParams failed: invalid argument!
GPU 0 Prepare failed
GPUs started...
BENCH: Speed: 125829 MKeys/s, Err: 0, DPs: 0K/9646K, Time: 0d:00h:00m, Est: 0d:00h:00m

How can I fix the "GPU 0, cuSetGpuParams failed: invalid argument!" error?

Thank you.
newbie
Activity: 22
Merit: 1
Hello, would anyone be able to adapt it to the RTX 20xx series and compile it for Windows?

Yes, some people have already taken my code, which is optimized for 40xx, compiled it on 1xxx/20xx/30xx cards, and said that it's slow. Completely unexpected behavior Cheesy
However, you can find instructions on how to compile in this thread.

I really appreciate it. I prioritize faster results over raw speed, as they are not the same thing.
Thanks Roll Eyes
?
Activity: -
Merit: -
Hello, would anyone be able to adapt it to the RTX 20xx series and compile it for Windows?

Yes, some people have already taken my code, which is optimized for 40xx, compiled it on 1xxx/20xx/30xx cards, and said that it's slow. Completely unexpected behavior Cheesy
However, you can find instructions on how to compile in this thread.
member
Activity: 127
Merit: 32
Hello, would anyone be able to adapt it to the RTX 20xx series and compile it for Windows?
?
Activity: -
Merit: -
I discovered a problem.
I didn't modify your source code; I ran the program you compiled directly.
However, after running for a while, a large number of errors occurred:
DPs buffer overflow,some points lost, increase DP value!
DPs buffer overflow,some points lost, increase DP value!
DPs buffer overflow,some points lost, increase DP value!
At the beginning, it was normal, but the error message above kept appearing. May I ask why?

Yes, if the parameters are not optimal, in some cases it will show you a warning.
In this case you should increase the "-dp" option value; the DP database is growing and the CPU cannot add that many DPs every second.
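To get a feel for the numbers: assuming the usual convention that a point counts as "distinguished" when it has -dp trailing zero bits, each extra DP bit halves how many DPs per second the host has to receive and store (at the cost of a bit more wasted work per collision). A rough illustrative calculation:

Code:
speed_keys_per_s = 7859e6                    # e.g. the RTX 4090 figure from the log above
for dp in (16, 18, 20):
    dps_per_s = speed_keys_per_s / 2 ** dp   # expected DPs produced per second
    print(dp, f"{dps_per_s:,.0f} DPs/s")
# 16 -> ~120,000 DPs/s, 20 -> ~7,500 DPs/s: each extra bit halves the host-side load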
newbie
Activity: 6
Merit: 0

No, it's "cudaStreamSetAttribute".
Be careful with modifications if you don't know exactly what you are doing. The algorithm is not as straightforward as the classic one; for example, if you break the loop handling it will work for small ranges but fail for large ranges.



I discovered a problem.
I didn't modify your source code; I ran the program you compiled directly.
However, after running for a while, a large number of errors occurred:

DPs buffer overflow,some points lost, increase DP value!
DPs buffer overflow,some points lost, increase DP value!
DPs buffer overflow,some points lost, increase DP value!

At the beginning, it was normal, but the error message above kept appearing. May I ask why?
?
Activity: -
Merit: -
Hello, can you tell me in which file the L2 setting can be found and what has to be deactivated?
Thank you.
I found it in GpuKang.cpp.
Is that the right place?

No, it's "cudaStreamSetAttribute".
Be careful with modifications if you don't know exactly what you are doing. The algorithm is not as straightforward as the classic one; for example, if you break the loop handling it will work for small ranges but fail for large ranges.