Science Fair Project to trap Bitcoin private keys using Kangaroos! - page 4.

BurtW

legendary

Activity: 2646

Merit: 1138

All paid signature campaigns should be banned.

Thanks for all your input. We are reading it over carefully to see what we can use in our project and will get back to you.

We have already rewritten the following functions to make them as fast as possible:

static void secp256k1_ecmult_gen(const secp256k1_ecmult_gen_context *ctx, secp256k1_gej *r, const secp256k1_scalar *gn)
SECP256K1_INLINE static int secp256k1_scalar_reduce(secp256k1_scalar *r, unsigned int overflow)
static int secp256k1_scalar_add(secp256k1_scalar *r, const secp256k1_scalar *a, const secp256k1_scalar *b)

We are working on rewriting:

static void secp256k1_gej_add_ge(secp256k1_gej *r, const secp256k1_gej *a, const secp256k1_ge *b)

Our internal function

Code:

#define ec_point_mul_and_add(r,a,n)                                                        \
{                                                                                          \
    secp256k1_gej rj;                                                                      \
    secp256k1_ecmult_gen(&ecgrp->ecmult_gen_ctx, &rj, n);  /* rj = n*G, highly optimized */ \
    secp256k1_gej_add_ge_var(&rj, &rj, a, NULL); /* rj = rj + a, optimization is work in progress */ \
    secp256k1_ge_set_gej_var(r, &rj);            /* r = rj converted to affine */           \
}

We heavily rely on and use the following observation to inform our parameter choices:

Code:

If the PRF is defined as follows:

    x = 1 << (X % m)

The PRF will generate one of the numbers: 2**0, 2**1, .., 2**(m-1) with equal probability

So the average hop distance as a function of m can be calculated as follows:

    ahd = (1/m)(2**0) + (1/m)(2**1) + .. + (1/m)(2**(m-1))
        = (1/m)[(2**0) + (2**1) + .. + (2**(m-1))]
        = (1/m)[(2**m) - 1]
        = ((2**m) - 1)/m

Therefore the entire distance hopped by the kangaroo, given nn hops, can be estimated as:

    c = nn(ahd)
      = (nn/m)((2**m) - 1)
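The closed form above can be checked numerically. The following is a minimal sketch, not the project's code: it assumes X % m is uniformly distributed, and the names average_hop_distance and prf are made up for this illustration.

```python
import random

def average_hop_distance(m):
    """Closed form ((2**m) - 1)/m for the mean of 2**0 .. 2**(m-1)."""
    return ((1 << m) - 1) / m

def prf(X, m):
    """Hop size selected by the PRF from the post: 1 << (X % m)."""
    return 1 << (X % m)

m = 16

# Exact mean over one full cycle of residues 0 .. m-1.
exact_mean = sum(prf(X, m) for X in range(m)) / m

# Simulated walk: nn hops with random 64-bit inputs standing in for X
# (assumption: X % m is uniform, which is what the derivation requires).
rng = random.Random(1)
nn = 200_000
distance = sum(prf(rng.getrandbits(64), m) for _ in range(nn))

# Estimate from the post: c = nn * ahd = (nn/m)((2**m) - 1)
estimate = nn * average_hop_distance(m)

print("exact mean     :", exact_mean)
print("closed form    :", average_hop_distance(m))
print("simulated c    :", distance)
print("estimated c    :", estimate)
```

The simulated total distance should land close to the estimate; the spread shrinks as nn grows.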

Telariust

jr. member

Activity: 38

Merit: 18

~~The claimed average of "2w^(1/2) expected group operations" is not achievable with m = (w^(1/2))/2 (the mean jump size from the articles by Pollard).~~
~~Empirical tests have shown that 2w^(1/2) is achievable at m * 8.~~
~~Please check on your implementation what the speed will be if the MID/MAX step size is increased 8 times.~~
Thanks, no need.
Everything is normal after the addition.

Code:

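# get JmaxofSp: return the count i of leading dS entries whose mean jump size is nearest to the optimal one
# (flag_verbose is a global verbosity level set elsewhere in the script)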
def getJmaxofSp(optimalmeanjumpsize, dS):
	if flag_verbose > 0: 
		print('[optimal_mean_jumpsize] %s' % optimalmeanjumpsize)

	sumjumpsize = 0

	for i in range(1,len(dS)):

		#sumjumpsize = (2**i)-1
		#sumjumpsize += 2**(i-1)
		sumjumpsize += dS[i-1]

		now_meanjumpsize	= int(round(1.0*(sumjumpsize)/(i)))

		#next_meanjumpsize	= int(round(1.0*(sumjumpsize+2**i)/(i+1)))
		next_meanjumpsize	= int(round(1.0*(sumjumpsize+dS[i])/(i+1)))

		if flag_verbose > 1: 
			print('[meanjumpsize#Sp[%d]] %s(now) <= %s(optimal) <= %s(next)' % (i, now_meanjumpsize, optimalmeanjumpsize, next_meanjumpsize ))


		if  optimalmeanjumpsize - now_meanjumpsize <= next_meanjumpsize - optimalmeanjumpsize : 
			if flag_verbose > 0: 
				print('[meanjumpsize#Sp[%d]] %s(now) <= %s(optimal) <= %s(next)' % (i, now_meanjumpsize, optimalmeanjumpsize, next_meanjumpsize ))

			if flag_verbose > 0: 
				print('[JmaxofSp] Sp[%s]=%s nearer to optimal mean jumpsize of Sp set' % (i, now_meanjumpsize))

			return i

	print("\n[FATAL_ERROR] JmaxofSp not defined!\n"); exit(-1)

#################
about speed

Quote from: Firebox on September 10, 2019, 04:01:32 AM

Quote from: BurtW on September 09, 2019, 10:56:42 AM

We have updated the OP with our latest results. Here is a summary of the results of runs for tests 0 through 49:...

That is already a good result!

no, absolutely not

August 26, 2019

Code:

pkey    Standard OpenSSL library     secp256k1 (S 8X32, F 10X26)
bits   hops/second   test run time   hops/second   test run time
====   ===========================   =========================== 
  31   1942.378857   28.4496 secs    5520.634781   10.0240 secs

Standard OpenSSL library averaged 1897 hops per second
secp256k1 (S 8X32, F 10X26) averaged 5258 hops per second

September 09, 2019 (secp256k1)

Code:

test       time         hps       hops
----  --------------  -------  ----------
31    9.2328 secs    11,884  109705

Already better, but..

it is very slow for 1-core C/C++ with openssl/secp256k1; it should be more!
(Even if you built the early classic 1-Wild algorithm with 3.28w^(1/2), this does not explain a x10 speed drop.)

Quote from: arulbero on February 27, 2019, 02:25:46 PM

Are you using affine or jacobian coordinates for the points?

~~42: "Answer to the Ultimate Question of Life, the Universe, and Everything"~~

secp256k1 has the functions J+J->J and J+A->J, but no function A+A->A (is this really a library optimized precisely for bitcoin?)
(It is simple: in real use you often need to multiply a random point rather than add, and Jacobian is better for multiplication.)
J+J->J (12M, 4S) - bad choice
J+A->J (8M, 3S) - already better
(look eprint.iacr.org/2016/103.pdf)
You can take the code from break_short by Jochen Hoenicke; it fits perfectly.
This code uses J+A->J and converts only X to affine, xJ->xA (this is the most we can get from secp256k1 as it is),
where A is the S-table with pre-computed G, 2G, 4G, 8G, 16G..
..you used a pre-computed table, yes? Don't tell me that you calculate every Si with a multiplication every time, no, no, please, no! X%mask instead of pow2(X%k)?
(..by the way, that would explain your terrible speed..)
..and use the old alternative functions without protection against side-channel attacks, +10-20% speed:
secp256k1_gej_add_ge_var() instead of secp256k1_gej_add_ge()
(Warning: _var() results need to be normalized before use in some functions.. my early mistake, "leading zeros", github.com/bitcoin-core/secp256k1/issues/601)
Ideally, new functions need to be written:
A+A->A (1I, 2M, 1S) - the best choice, because in the end we need X in affine form as a unique representation of each point to add to the storage hashtable
(..you used a hashtable, yes?)
After that you no longer need to spend 8M, 3S, only about 20M for each 1I; that is a +35% speed-up,
like that
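The (1I, 2M, 1S) count for A+A->A can be read off the textbook affine addition formula. Below is a minimal Python sketch of that formula over the secp256k1 field; it is not secp256k1 library code, the function names are made up for this illustration, and it skips the special cases (equal x coordinates, point at infinity) that a real implementation must handle.

```python
# secp256k1 field prime and generator point (affine coordinates).
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def affine_add(p1, p2):
    """A+A->A for two points with different x: 1I + 2M + 1S."""
    (x1, y1), (x2, y2) = p1, p2
    inv = pow((x2 - x1) % P, -1, P)     # 1I (the expensive step)
    lam = (y2 - y1) * inv % P           # 1M
    x3 = (lam * lam - x1 - x2) % P      # 1S
    y3 = (lam * (x1 - x3) - y1) % P     # 1M
    return x3, y3

def affine_double(p1):
    """Affine doubling, needed here only to build test points."""
    x1, y1 = p1
    inv = pow(2 * y1 % P, -1, P)
    lam = 3 * x1 * x1 * inv % P
    x3 = (lam * lam - 2 * x1) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return x3, y3

def on_curve(pt):
    """Check y^2 = x^3 + 7 (mod P)."""
    x, y = pt
    return (y * y - x * x * x - 7) % P == 0

G2 = affine_double(G)
G3 = affine_add(G, G2)
# 5G computed two different ways must agree: (3G + 2G) and (G + 4G).
G5_a = affine_add(G3, G2)
G5_b = affine_add(G, affine_double(G2))
print(on_curve(G3), G5_a == G5_b)
```

The result is already affine, so its X can go straight into the hashtable without a separate Jacobian-to-affine conversion.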

########
I think I know what the problem is.
This is a very obvious fact for me; I should have started with it.
(I should have guessed it back when ECMULT_WINDOW_SIZE came up.)

Do not multiply points inside the loop!
Be warned: point multiplication is very expensive, more expensive than the slowest point addition.
It should not be used at all in loops that run for many millions of iterations.
You must calculate all possible displacement points before the loop starts.
Then, inside the loop, just add them.

I think you are using a mask
https://bitcointalksearch.org/topic/m.52068840
..so you can't store several million pre-calculated points in memory. That explains your choice.

Quote

x = PRF(X)
= 1 << (X % m)
...
ec_point_mul_and_add(r,a,n)

Not a mask, I see.. then it is simple: you tried to build the classic a*b+c pattern.

Quote

..rewritten the following functions..

I tell you: take the loop from break_short as it is (with the standard functions), rewrite it for kangaroos, and benchmark it.
If after that you still want to rewrite something, then OK (but I think you will not).
I think the "secret sauce" will not materialize; it will turn out to be an exotic, slow implementation based on multiplication.
I would like to be mistaken, but my experience says that you are losing time.
..How do I know about multiplication?.. I checked it myself,.. and from the topic about why brainflayer is slow and why it cannot be made faster,.. and it was on one of the pages of the LBC topic.
This is not the most famous fact, so it is not surprising that you did not know about it; no fault of yours.

This is not random and not brainflayer; it is a pseudo-random walk, so we can pre-compute our jump sizes (offset points) and use point addition instead of point multiplication.
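To make the point concrete, here is a minimal Python sketch comparing a walk that multiplies on every hop with a walk that only adds points from a table computed once. It is an illustration under assumptions, not the code from break_short or from the OP: it uses plain affine arithmetic, a tiny table (m = 8), the hop rule 1 << (X % m) with X taken as the point's x coordinate, and made-up function names.

```python
P = 2**256 - 2**32 - 977  # secp256k1 field prime
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(p1, p2):
    """Affine point addition; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        if (y1 + y2) % P == 0:
            return None
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P   # doubling
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P

def mul(k, pt):
    """Double-and-add scalar multiplication (many additions per call)."""
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

m = 8
# Computed once, before the loop: G, 2G, 4G, .., (2**(m-1))G
jump_points = [mul(1 << i, G) for i in range(m)]

def walk_with_multiplication(start, hops):
    pt = start
    for _ in range(hops):
        pt = add(mul(1 << (pt[0] % m), G), pt)   # n*G + pt on every hop
    return pt

def walk_with_table(start, hops):
    pt, distance = start, 0
    for _ in range(hops):
        i = pt[0] % m
        pt = add(jump_points[i], pt)             # one addition per hop
        distance += 1 << i
    return pt, distance

start_key = 123456789
start = mul(start_key, G)
slow_end = walk_with_multiplication(start, 50)
fast_end, travelled = walk_with_table(start, 50)
print(slow_end == fast_end, travelled)
```

Both walks visit the same points, but the table version does one point addition per hop, and the travelled distance is tracked as a plain integer.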

Firebox

jr. member

Activity: 59

Merit: 3

Quote from: BurtW on September 09, 2019, 10:56:42 AM

We have updated the OP with our latest results. Here is a summary of the results of runs for tests 0 through 49:...

That is already a good result!
When will the first release be?

BurtW

legendary

Activity: 2646

Merit: 1138

All paid signature campaigns should be banned.

We have updated the OP with our latest results. Here is a summary of the results of runs for tests 0 through 49:

Code:

test       time         hps       hops       compares    mem(bytes)    total time     time hopping   not hopping
----  --------------  -------  ----------  ------------  ----------  --------------  --------------  -----------
   0    0.3237 msecs   10,504  1           3             1032                   323              95          228
   1    0.2853 msecs   11,062  1           3             1032                   285              90          194
   2    0.2785 msecs   11,111  1           3             1032                   278              90          188
   3    1.8730 msecs   11,271  8           10            1032                 1,873             709        1,163
   4    1.8351 msecs   11,147  11          13            1032                 1,835             986          848
   5    2.1243 msecs   11,123  15          17            1032                 2,124           1,348          775
   6    5.2260 msecs   11,069  52          54            1032                 5,226           4,697          528
   7    3.9406 msecs   11,065  32          34            1032                 3,940           2,891        1,048
   8    4.6425 msecs   10,174  45          47            1032                 4,642           4,423          219
   9   46.8674 msecs   11,052  510         512           1032                46,867          46,146          721

  10   84.2980 msecs   10,674  893         895           1032                84,298          83,661          636
  11  122.2013 msecs   11,662  1413        1415          1032               122,201         121,162        1,039
  12  257.2438 msecs   11,579  2976        2978          1032               257,243         257,019          224
  13  514.6517 msecs   11,402  5840        5842          1032               514,651         512,196        2,455
  14  501.1961 msecs   11,791  5901        5903          1032               501,196         500,466          729
  15    1.1839 secs    11,886  14026       14028         1032             1,183,908       1,180,079        3,829
  16    3.0370 secs    11,631  35249       35251         1032             3,037,005       3,030,586        6,419
  17    5.6526 secs    11,230  63475       63477         1032             5,652,638       5,652,025          613
  18   14.0759 secs    11,847  166753      166755        1032            14,075,930      14,075,708          222
  19   15.5404 secs    11,922  185259      185261        1032            15,540,417      15,539,472          945

  20  742.1446 msecs   11,882  8804        5677          32088              742,144         740,944        1,200
  21  475.1973 msecs   11,881  5627        1903          32696              475,197         473,623        1,574
  22  561.2366 msecs   11,555  6359        3169          33912              561,236         550,304       10,932
  23  428.6246 msecs   11,867  5044        1167          36344              428,624         425,047        3,577
  24    2.2156 secs    11,842  26026       22985         41208            2,215,606       2,197,825       17,780
  25  350.9793 msecs   11,864  4025        695           50936              350,979         339,275       11,704
  26  650.9386 msecs   11,695  7326        4443          70392              650,938         626,434       24,504
  27   11.2562 secs    11,793  132214      128675        109304          11,256,223      11,211,378       44,845
  28    4.5728 secs    11,819  52883       49679         187128           4,572,770       4,474,294       98,476
  29    4.2294 secs    11,907  48300       45216         342776           4,229,449       4,056,522      172,927

  30   10.6122 secs    11,759  124766      56010         62808           10,612,244      10,610,509        1,735
  31    9.2328 secs    11,884  109705      41191         63416            9,232,760       9,231,168        1,592
  32   14.3117 secs    11,922  170592      103053        64632           14,311,717      14,309,552        2,165
  33   55.8702 secs    11,902  664893      596368        67064           55,870,151      55,862,085        8,066
  34   22.1413 secs    11,923  263922      195453        71928           22,141,326      22,135,223        6,103
  35    1.0531 mins    11,927  753508      683623        81656           63,186,883      63,175,281       11,602
  36   28.3459 secs    11,924  337734      269198        101112          28,345,873      28,323,662       22,210
  37    9.2387 mins    11,860  6573832     6505951       140024         554,319,152     554,275,660       43,492
  38   11.2025 secs    11,926  132534      64338         217848          11,202,535      11,113,380       89,154
  39    4.5557 mins    11,925  3257543     3189102       373496         273,342,973     273,170,580      172,393

  40   13.1216 mins    11,927  9389808     7285993       124248         787,294,602     787,289,468        5,133
  41   10.3313 mins    11,929  7394352     5292081       124856         619,880,117     619,878,329        1,787
  42    3.8015 mins    11,935  2722138     620876        126072         228,090,413     228,087,570        2,842
  43    5.7648 mins    11,924  4124314     2019820       128504         345,888,060     345,884,231        3,828
  44    3.2043 mins    11,930  2293481     193018        133368         192,258,803     192,252,406        6,397
  45   33.9166 mins    11,919  24254078    22153676      143096       2,034,994,703   2,034,982,887       11,815
  46   47.9810 mins    11,922  34321486    32221012      162552       2,878,859,972   2,878,832,117       27,854
  47    2.4000 hours   11,918  102968649   100868429     201464       8,639,864,920   8,639,820,958       43,962
  48    4.1320 hours   11,924  177371081   175268406     279288      14,875,131,910  14,875,045,149       86,760
  49   23.9987 hours   11,914  1029325281  1027225432    434936      86,395,345,966  86,395,170,751      175,214
      ==============   ======                                        ==============  ==============  ===========
                       11,630                                       118,106,695,147 118,105,560,480    1,134,666

Where

Code:

test          test number
time          run time needed to find the private key
hps           hops per second (averaged 11,630 on my laptop with a highly modified secp256k1 library)
hops          total number of hops it took to find the private key
compares      total number of point comparisons it took to find the private key
mem           amount of memory dynamically allocated during the run, in bytes
total time    total run time in microseconds
time hopping  total time spent hopping in microseconds
not hopping   the overhead (time not hopping) in microseconds
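As a cross-check of the columns, here is a small Python sketch using the row for test 31. Two things in it are inferred from the table values rather than stated in the post: that the three rightmost columns are in microseconds (0.3237 msecs is listed as 323), and that hps is hops divided by time hopping. The function names are made up for this illustration.

```python
def hops_per_second(hops, time_hopping_us):
    """hps inferred as hops divided by the hopping time in seconds."""
    return hops / (time_hopping_us / 1_000_000)

def overhead_us(total_time_us, time_hopping_us):
    """The 'not hopping' column: total time minus time hopping."""
    return total_time_us - time_hopping_us

# Values copied from the row for test 31.
row31 = {"hops": 109705, "total_time": 9_232_760, "time_hopping": 9_231_168}

hps = hops_per_second(row31["hops"], row31["time_hopping"])
overhead = overhead_us(row31["total_time"], row31["time_hopping"])
print(int(hps), overhead)
```

The same two relations can be applied to any other row of the table.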

racminer

member

Activity: 245

Merit: 17
