static void secp256k1_ecmult_gen(const secp256k1_ecmult_gen_context *ctx, secp256k1_gej *r, const secp256k1_scalar *gn)
...
#define ec_point_mul_and_add(r,a,n) {
.. why is the name of this function unfamiliar to me (even though I have used multiplication)? And why did I myself never feel the need to write a*b+c?..
I looked at my source code - I always use
// R = na*A + ng*G
// secp256k1_ecmult(*ctx, gej *r, const gej *a, const scalar *na, const scalar *ng);
1899. Mr. Duell's most famous attributed utterance is that "everything that can be invented has been invented."
I bet your experiments with ECMULT_WINDOW_SIZE didn't change anything, because secp256k1_ecmult_gen() doesn't use it.
I used secp256k1_ecmult() precisely because it is the one accelerated by ECMULT_WINDOW_SIZE.
Moreover, secp256k1_ecmult() uses the endomorphism to speed up multiplication (but only if you use both parts of na*A + ng*G, not just one of them).
That is also why _ecmult() is faster than _gen().
https://github.com/bitcoin-core/secp256k1/pull/41
https://github.com/bitcoin-core/secp256k1/pull/210#issuecomment-96145070
https://github.com/bitcoin-core/secp256k1/pull/210#issuecomment-96166989

Simple init for secp256k1_ecmult():
//secp256k1_context *ctx = secp256k1_context_create(SECP256K1_CONTEXT_NONE);
secp256k1_context *ctx = secp256k1_context_create(SECP256K1_CONTEXT_SIGN | SECP256K1_CONTEXT_VERIFY); // SIGN builds ecmult_gen_ctx, VERIFY builds ecmult_ctx
// ecmult vars
secp256k1_scalar scalarK; secp256k1_scalar_set_int(&scalarK, 123456);
secp256k1_scalar scalar0; secp256k1_scalar_set_int(&scalar0, 0);
secp256k1_gej point_0gej; secp256k1_gej_set_infinity(&point_0gej);
// jacobian base point G
secp256k1_gej secp256k1_gej_const_g;
secp256k1_gej_set_ge(&secp256k1_gej_const_g, &secp256k1_ge_const_g);
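Just to show how these pieces fit together, a minimal sketch (the names TestGej/TestGe and the affine conversion are mine, not part of the snippet above; the internal headers have to be included the same way as in compare-test.c):
// R = k*G via the ng path of secp256k1_ecmult(), i.e. 0*P0 + k*G,
// so the precomputed G tables (ECMULT_WINDOW_SIZE) are actually used
secp256k1_gej TestGej;
secp256k1_ecmult(&ctx->ecmult_ctx, &TestGej, &point_0gej, &scalar0, &scalarK);
// convert the Jacobian result back to affine if a plain point is needed
secp256k1_ge TestGe;
secp256k1_ge_set_gej(&TestGe, &TestGej);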
#################
I was interested in comparing these functions.
1 core, i7-6820
secp256k1 (for now, built without endomorphism)
make clean
./autogen.sh
./configure --enable-ecmult-static-precomputation --with-bignum=gmp --with-scalar=64bit --with-field=64bit --with-asm=x86_64
make
default ECMULT_WINDOW_SIZE = 15
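(If you want to play with the window size: in sufficiently recent trees it can be set at configure time, which I'm assuming here; in older ones you have to change the constant in the source by hand.)
./configure --with-ecmult-window=20 --enable-ecmult-static-precomputation --with-bignum=gmp --with-scalar=64bit --with-field=64bit --with-asm=x86_64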
########
Test #1: multiplication, incrementing (sequential) scalars, computing 1 million pubkeys in a loop:
for(uint64_t i = 1; i<1000000 ; ++i){
secp256k1_scalar_set_uint64(&scalarK, i); // sequential points
MULT_FUNCTION_HERE(&context, point_jacobian, &scalarK)
}
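Concretely, that loop instantiated with the fastest variant measured below, plus a simple wall-clock measurement, looks roughly like this (a sketch: clock_gettime()/printf() and the includes are mine, and secp256k1_scalar_set_uint64() is the same helper as in the loop above, not an upstream libsecp256k1 function):
#include <stdint.h>
#include <stdio.h>
#include <time.h>
...
struct timespec t0, t1;
clock_gettime(CLOCK_MONOTONIC, &t0);
for (uint64_t i = 1; i < 1000000; ++i) {
    secp256k1_scalar_set_uint64(&scalarK, i);  // sequential scalars 1, 2, 3, ...
    // R = 0*P0 + k*G, the "0*P0 + k*G" variant from the results below
    secp256k1_ecmult(&ctx->ecmult_ctx, &TestGej, &point_0gej, &scalar0, &scalarK);
}
clock_gettime(CLOCK_MONOTONIC, &t1);
double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
printf("1M keys in %.1f sec; %.3f Mk/s\n", sec, 1.0 / sec);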
Benchmark results:
//// Multiply: R = q*A (in constant-time)
secp256k1_ecmult_const(&TestGej, &secp256k1_ge_const_g, &scalarK, 256);
1M at 58,0 sec; 0,017Mk/s
constant-time? hoho)
//// Multiply: R = a*G (with the generator)
//// To harden against timing attacks .. Break up the multiplicand into groups of..
secp256k1_ecmult_gen(&ctx->ecmult_gen_ctx, &TestGej, &scalarK);
1M at 26,5 sec; 0,037Mk/s
.. against timing attacks? Yes, so for raw speed it would need a rewrite; the hardening itself is the right thing for the library.
//// R = na*A + ng*G
//// secp256k1_ecmult(*ctx, gej *r, const gej *a, const scalar *na, const scalar *ng);
//secp256k1_ecmult(&ctx->ecmult_ctx, &TestGej, &secp256k1_gej_const_g, &scalarK, NULL); // NULL not recommended!
secp256k1_ecmult(&ctx->ecmult_ctx, &TestGej, &secp256k1_gej_const_g, &scalarK, &scalar0); // k*G + 0*G
1M at 8,2 sec; 0,121Mk/s
Same function, different calling pattern:
//// R = na*A + ng*G
//secp256k1_ecmult(&ctx->ecmult_ctx ,&TestGej, NULL, &scalar0, &scalarK); // NULL not recommended!
secp256k1_ecmult(&ctx->ecmult_ctx, &TestGej, &point_0gej, &scalar0, &scalarK); // 0*P0 + k*G
1M at 3,5 sec; 0,285Mk/s
Now you can see how effective ECMULT_WINDOW_SIZE is (because we are computing sequential points).
secp256k1_ecmult() (the "0*P0 + k*G" variant) is the best choice: 26,5/3,5 ≈ x7.5 faster than secp256k1_ecmult_gen() for sequential points.
########
Test #2: multiplication, pseudo-random scalars, computing 1 million pubkeys in a loop:
for(uint64_t i = 1; i<1000000 ; ++i){
secp256k1_scalar_set_uint64(&scalarK, TestGej.x.n[0]); // pseudo random points
MULT_FUNCTION_HERE(&context, point_jacobian, &scalarK)
}
Benchmark results:
secp256k1_ecmult_const(&TestGej, &secp256k1_ge_const_g, &scalarK, 256);
1M at 58,5 sec; 0,017Mk/s
same as in Test #1
secp256k1_ecmult_gen(&ctx->ecmult_gen_ctx, &TestGej, &scalarK);
1M at 26,5 sec; 0,037Mk/s
same as in Test #1
secp256k1_ecmult(&ctx->ecmult_ctx, &TestGej, &secp256k1_gej_const_g, &scalarK, &scalar0); // k*G + 0*G
1M at 16,4 sec; 0,061Mk/s
different now: slower
same function, another
secp256k1_ecmult(&ctx->ecmult_ctx, &TestGej, &point_0gej, &scalar0, &scalarK); // 0*P0 + k*G
1M at 9,5 sec; 0,105Mk/s
different now: slower
Efficiency is lower; ECMULT_WINDOW_SIZE helps less here (because we are computing pseudo-random points).
secp256k1_ecmult() with the "0*P0 + k*G" variant is still the best choice: 26,5/9,5 ≈ x2.8 faster than secp256k1_ecmult_gen() for pseudo-random points.
########
..point multiplication is very expensive, far more expensive than even the slowest point addition.
Test #3: point addition, computing 1 million pubkeys in a loop:
for(uint64_t i = 1; i<1000000 ; ++i){
ADD_FUNCTION_HERE(&result_point_jacobian, point1_jacobian, point2_affine)
}
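For example, instantiated with the fastest variant below, this walks sequential pubkeys by addition instead of multiplication, P_{i+1} = P_i + G (a sketch; TestGej is assumed to already hold the starting point k*G):
for (uint64_t i = 1; i < 1000000; ++i) {
    // add the affine generator G to the running Jacobian point
    secp256k1_gej_add_ge_var(&TestGej, &TestGej, &secp256k1_ge_const_g, NULL);
}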
Benchmark results:
secp256k1_gej_add_ge(&TestGej, &TestGej, &secp256k1_ge_const_g);
1M at 0sec 337msec; 2,9Mk/s
secp256k1_gej_add_ge_var(&TestGej, &TestGej, &secp256k1_ge_const_g, NULL);
1M at 0sec 285msec; 3,5Mk/s
3,5Mk/s, Karl!
secp256k1_gej_double_var(&TestGej, &TestGej, NULL);
1M at 0sec 183msec; 5,4Mk/s
(this one is point doubling, rarely useful here)
secp256k1_gej_add_ge_var() is the best choice (x33.3 faster than secp256k1_ecmult() and x92.9 faster than secp256k1_ecmult_gen(), compared against the Test #2 pseudo-random timings)
https://github.com/Telariust/pollard-kangaroo-c99/blob/master/compare-test.c