What algorithm is this ? | Bitcointalksearch.org

678AFDB0

newbie

Activity: 12

Merit: 3

Quote from: ymgve2 on February 06, 2023, 11:41:56 AM

The way you parallelize these kinds of searches is to work with several different private key starting points in parallel, not trying to do several things at once at the low level. A fast FPGA implementation would have several of these curve adders alongside each other, each set up to work on different inputs.

Hello,

Thanks for all the tips! Yes, i also think this is the way. Currently running 9 in parallel per chip and need a huge heatsink, even for slow 50Mhz clock.

NotATether

legendary

Activity: 1568

Merit: 6660

bitcoincleanup.com / bitmixlist.org

Quote from: 678AFDB0 on February 06, 2023, 03:53:22 AM

In the diagram i have posted, i don't see much room for changing the algorithm to execute in parallel, except maybe running the inversion logic in parallel with the last multiplication, but that gives only ~10% efficiency boost due to inversion being much slower than multiplication.

For the Y calculation, since you are not doing any addition/subtraction, you can leave out the modulus until the very end.

You can also try computing multiple points in parallel.

And I'm not sure if factoring will work, but just as how in regular numbers we can compute a factor tree, for numbers representing point coordinates, we can have them in terms of G, G*2, G*3, G*5 and so on. I'm not exactly sure how this would be implemented, but if computers were able to find a prime number much bigger than 2^256, then it should be reasonably possible to enumerate all of the prime numbers from 1 to 2^256 and calculate the products with G, and make some kind of factor tree out of that.

The idea is you'd no longer have to do the algorithm for P, but you could do it for all of its factors instead, and multiply all of the results together at the end.

ymgve2

full member

Activity: 162

Merit: 230

Do NOT use ChatGPT for crypto code, FFS... That snippet is like 10% of the code required to do point addition. This is the code I used some years ago (slow, but readable)

Code:

def modinv(x, n):
    return pow(x, n-2, n)

class Point(object):
    def __init__(self, x, y, inf=False):
        self.x = x
        self.y = y
        self.inf = inf

def curve_add(p, q, N):
    if p.inf:
        return q
        
    if q.inf:
        return p
    
    if p.x == q.x:
        if p.y == q.y:
            d1 = (3 * p.x * p.x) % N
            d2 = (2 * p.y) % N
        else:
            return Point(-1, -1, True)
    else:
        d1 = (q.y - p.y) % N
        d2 = (q.x - p.x) % N

    d2i = modinv(d2, N)
    d = (d1 * d2i) % N
        
    resx = (d * d - p.x - q.x) % N
    resy = (d * (p.x - resx) - p.y) % N
    
    return Point(resx, resy)

def scalar_mul(scalar, p, N):
    t = p
    res = None
    while scalar != 0:
        if scalar & 1 == 1:
            if res is None:
                res = t
            else:
                res = curve_add(res, t, N)
        
        t = curve_add(t, t, N)
        
        scalar = scalar >> 1
        
    return res

The modular inversion can be implemented in a lot of different ways, but as I recall not in any parallel ways. But it is also not a "search".

The way you parallelize these kinds of searches is to work with several different private key starting points in parallel, not trying to do several things at once at the low level. A fast FPGA implementation would have several of these curve adders alongside each other, each set up to work on different inputs.

(As an aside, the modular inversion is by far the slowest part of the process, so most speed focused code use a trick - if you want to find the inverse of A and B, you can calculate the inverse I=1/(A*B), then the inverse of A becomes B*I and the inverse of B becomes A*I - this can be extended to calculate the inverse of any number of integers with only a single inversion)

678AFDB0

newbie

Activity: 12

Merit: 3

Quote from: BlackHatCoiner on February 05, 2023, 12:18:38 PM

Quote from: 678AFDB0 on February 05, 2023, 08:18:51 AM

Is generating public key x point from private key more costly than the euclidean inversion used in this type of point addition ?

Yes. Elliptic curve multiplication is generally considered more computationally expensive than any point addition, because it involves lots of double-and-add operations.

Quote from: 678AFDB0 on February 05, 2023, 08:18:51 AM

It seems to take absolutely forever compare to the multiplication.

Unless I misunderstood the question, neither the former nor the latter are very complex algorithmically.

Hey,

Thanks for your input! My point was that it seems to me that the inversion is some type of search algorithm and is well suited for a
CPU where the instructions are executed in serial fashion, and maybe can't be executed in parallel to take full advantage of the
FPGA architecture.

In the diagram i have posted, i don't see much room for changing the algorithm to execute in parallel, except maybe running the inversion logic in parallel with the last multiplication, but that gives only ~10% efficiency boost due to inversion being much slower than multiplication.

Is the inversion part of the scalar multiplication for generation x from k as well ?

According to our dear AI friend, this is what the process is:

Code:

// Initial point (x1,y1) and scalar value (k)
reg [255:0] x1, y1, k;

// Resulting point (x2,y2)
reg [255:0] x2, y2;

// Loop through each bit in the scalar value
for (int i = 0; i < 256; i++)
{
    // If the current bit in the scalar is 1, add the initial point to the result
    if (k[i])
    {
        // Use the secp256k1 elliptic curve equation to calculate the new x and y coordinates
        x2 = (x1^2 * 3 + 7) % p;
        y2 = (y1^2) % p;
    }

    // Double the initial point
    x1 = (x1^2) % p;
    y1 = (y1^2 * 2) % p;
}

So which part is the inversion ? '% p' ?

Sorry for the lame questions.

BlackHatCoiner

legendary

Activity: 1512

Merit: 7340

Farewell, Leo

Quote from: 678AFDB0 on February 05, 2023, 08:18:51 AM

Is generating public key x point from private key more costly than the euclidean inversion used in this type of point addition ?

Yes. Elliptic curve multiplication is generally considered more computationally expensive than any point addition, because it involves lots of double-and-add operations.

Quote from: 678AFDB0 on February 05, 2023, 08:18:51 AM

It seems to take absolutely forever compare to the multiplication.

Unless I misunderstood the question, neither the former nor the latter are very complex algorithmically.

678AFDB0

newbie

Activity: 12

Merit: 3

Quote from: ymgve2 on February 02, 2023, 11:18:52 AM

It seems to be plain standard point addition between a point and the base point (basically the same as doing +1 to the private key): https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication#Point_addition

And most vanity generators do work in the same way as this - they take a random private key, generate the public key (a costly operation), then continuously add the base point to the public key (a cheap operation) and hash the result.

Thank you!

Is generating public key x point from private key more costly than the euclidean inversion used in this type of point addition ? It seems
to take absolutely forever compare to the multiplication.

There don't seems to be a way to pipeline the operation as is more or less serial state machine(need next inversion for new point). With
private key to x and lots of logic at least high throughput could be achieved.

ymgve2

full member

Activity: 162

Merit: 230

It seems to be plain standard point addition between a point and the base point (basically the same as doing +1 to the private key): https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication#Point_addition

And most vanity generators do work in the same way as this - they take a random private key, generate the public key (a costly operation), then continuously add the base point to the public key (a cheap operation) and hash the result.

678AFDB0

newbie

Activity: 12

Merit: 3

Quote from: mausuv on February 02, 2023, 12:14:00 AM

Quote from: 678AFDB0 on February 01, 2023, 04:06:54 AM

Quote from: mausuv on February 01, 2023, 01:22:04 AM

Quote from: 678AFDB0 on January 31, 2023, 03:22:05 PM

It synthesizes ok on Altera Cyclone 4, takes around 15k gates, but bit slow. At 50 Mhz clock, it takes ~18uS to add a point and hashing takes 3uS.

https://github.com/fpgaminer/fpgaminer-vanitygen
how to run .v # fpgaminer_vanitygen_top.v
how to compile linex
tell me

You need to install Quartus Prime, then follow the 13 step guide.

But keep in mind this is mostly useful to learn Verilog and play with your FPGA board. For practical purposes it is best
to run VanitySearch on your CPU, it will be x10000 faster.

i try to install Quartus Prime lot of error my linex

send me your fpgaminer-vanitygen compile file please Embarrassed

# uplode your file this site https://www.transfernow.net/en send link please or gdrive , other link

Hey,

Sorry, but this is not my project, i just found it on GitHub and was wondering what is the name of the algorithm, that's it.

mausuv

jr. member

Activity: 70

Merit: 1

Quote from: 678AFDB0 on February 01, 2023, 04:06:54 AM

Quote from: mausuv on February 01, 2023, 01:22:04 AM

Quote from: 678AFDB0 on January 31, 2023, 03:22:05 PM

It synthesizes ok on Altera Cyclone 4, takes around 15k gates, but bit slow. At 50 Mhz clock, it takes ~18uS to add a point and hashing takes 3uS.

https://github.com/fpgaminer/fpgaminer-vanitygen
how to run .v # fpgaminer_vanitygen_top.v
how to compile linex
tell me

You need to install Quartus Prime, then follow the 13 step guide.

But keep in mind this is mostly useful to learn Verilog and play with your FPGA board. For practical purposes it is best
to run VanitySearch on your CPU, it will be x10000 faster.

i try to install Quartus Prime lot of error my linex

send me your fpgaminer-vanitygen compile file please Embarrassed

# uplode your file this site https://www.transfernow.net/en send link please or gdrive , other link

678AFDB0

newbie

Activity: 12

Merit: 3

Quote from: mausuv on February 01, 2023, 01:22:04 AM

Quote from: 678AFDB0 on January 31, 2023, 03:22:05 PM

It synthesizes ok on Altera Cyclone 4, takes around 15k gates, but bit slow. At 50 Mhz clock, it takes ~18uS to add a point and hashing takes 3uS.

https://github.com/fpgaminer/fpgaminer-vanitygen
how to run .v # fpgaminer_vanitygen_top.v
how to compile linex
tell me

You need to install Quartus Prime, then follow the 13 step guide.

But keep in mind this is mostly useful to learn Verilog and play with your FPGA board. For practical purposes it is best
to run VanitySearch on your CPU, it will be x10000 faster.

mausuv

jr. member

Activity: 70

Merit: 1

Quote from: 678AFDB0 on January 31, 2023, 03:22:05 PM

It synthesizes ok on Altera Cyclone 4, takes around 15k gates, but bit slow. At 50 Mhz clock, it takes ~18uS to add a point and hashing takes 3uS.

https://github.com/fpgaminer/fpgaminer-vanitygen
how to run .v # fpgaminer_vanitygen_top.v
how to compile linex
tell me

678AFDB0

newbie

Activity: 12

Merit: 3

It synthesizes ok on Altera Cyclone 4, takes around 15k gates, but bit slow. At 50 Mhz clock, it takes ~18uS to add a point and hashing takes 3uS.

678AFDB0

newbie

Activity: 12

Merit: 3

Hello,

I was doing research on various Vanity search algorithms and stumbled upon this repo:

https://github.com/fpgaminer/fpgaminer-vanitygen

I did a state diagram to understand it better:

https://postimg.cc/YG010qSK

as the math behind it is well above my pay grade.

So my question is - what algorithm is that ? Is this a part of OpenSSL standard elliptic curve functions ? Or something custom ?

Unlike most Vanity search programs that usually generate random k, then produce x and hash it, this algorithm seems to be using
only public keys (x and y are needed for next iteration) and finds the next public key without dealing with private keys.

Thank you!

Topic: What algorithm is this ? (Read 329 times)