
Topic: lightweight database, for brute force using publickeys-32Mk =3.81MB(secp256k1) - page 7. (Read 2625 times)

full member
Activity: 1162
Merit: 237
Shooters Shoot...

I store every key, 2^25 keys, represented with a “0” or a “1”.


Ok, then you are using the idea of this thread.




But how does this help you do a search for a key?
What search would you run with this script?

The OP's script (and my modified script) basically stores “wilds”, a known pubkey’s offset keys.

And the search script can land in any key space, do 64 key computations, and find the key if the key is in that range.

The idea is:

If you know that a public key P is in your db, of course you need only 64 key computations. The same goes for my script.
(I haven't written a "search" script so far; I have only made a script to build the database as small as possible.)


The general case is:

1) I have a public key P. I don't know the private key, but I know it is in the range [1, 2^64] (for example). But this range is too big; I can't store or build such a huge database.

There are 2 main possibilities:

a) I generate a database of 2^32 public keys [from 1*G to 2^32*G] (baby steps) and I store all of them, then

I compute P1 = P - 2^32*G and verify if it is in my db, otherwise I compute P2 = P - 2*(2^32*G) and so on (giant steps).

This is the baby-step giant-step algorithm: about 2^32 computations, but a huge database.

b) I generate a database of 2^32 public keys [from 1*G to 2^32*G], but I store only 1 key out of every 64 (to save space).

Then I compute P1 = P - 2^32*G and I verify if it is in my db.
I have to perform 64 subtractions from P1: P1 - G, P1 - 2*G, ..., P1 - 64*G,

to be sure whether P1 is in the interval [1 -> 2^32] or not.

If neither P1 nor the other 64 keys are in my db, I compute P2 = P - 2*(2^32*G) and so on.
Understood.

I have a database with 63,570,000,000 keys stored in it, with a size of only 7.7GB. This one was created using the BitArray function, but it has flaws when doing a search, so I am not using it at the moment.

But when I run the search script, it uses the equivalent in RAM; roughly 7.7GB of RAM is used during the search.

But I can make jumps the size of 63,570,000,000 (almost 2^36), do 64 computations, and know within a second whether the key is in that range or not. So, in under a second, it jumps ~2^36 keys and checks for a match. And this is in Python; it could be much faster in C or C++, and a lot faster with GPU support.
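For illustration, a minimal sketch of this 1-bit-per-key scheme (mine, not the poster's actual script; the choice of the y-coordinate parity as the stored bit and the small 2^20 range are my assumptions):

Code:
#!/usr/bin/env python3
# Minimal sketch of the 1-bit-per-key database idea (illustration only).
# Assumption: the stored bit is the parity of the y coordinate; the actual
# scripts in this thread may fingerprint keys differently.
import secp256k1 as ice

N_KEYS = 2**20      # 1 bit of information per key (a packed file would be N_KEYS/8 bytes)

def key_bit(k):
    P = ice.scalar_multiplication(k)    # 65-byte uncompressed point
    return P[64] & 1                    # parity of y = 1-bit fingerprint

# Build: one bit per private key 1..N_KEYS, kept as a '0'/'1' string for clarity.
bits = ''.join(str(key_bit(k)) for k in range(1, N_KEYS + 1))

# Search: 64 consecutive fingerprints form a pattern that is unique in the
# bitstring with overwhelming probability (2^20 positions vs 2^64 patterns).
# For clarity this derives the pattern from a known scalar; a real search
# derives it from the target public key by repeated subtraction of G,
# since the scalar is unknown.
def locate(first_scalar):
    pattern = ''.join(str(key_bit(first_scalar + j)) for j in range(64))
    pos = bits.find(pattern)
    return None if pos == -1 else pos + 1   # bit index -> private key

print(locate(123456))   # -> 123456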
legendary
Activity: 1932
Merit: 2077

I store every key, 2^25 keys, represented with a “0” or a “1”.


Ok, then you are using the idea of this thread.




But how does this help you do a search for a key?
What search would you run with this script?

The OP's script (and my modified script) basically stores “wilds”, a known pubkey’s offset keys.

And the search script can land in any key space, do 64 key computations, and find the key if the key is in that range.

The idea is:

If you know that a public key P is in your db, of course you need only 64 key computations. The same goes for my script.
(I haven't written a "search" script so far; I have only made a script to build the database as small as possible.)

Your idea: store one bit of every key.

My idea: store several bits of some keys.



The general case is:

1) I have a public key P. I don't know the private key, but I know it is in the range [1, 2^64] (for example). But this range is too big; I can't store or build such a huge database.

There are 2 main possibilities:

a) I generate a database of 2^32 public keys [from 1*G to 2^32*G] (baby steps) and I store all of them, then

I compute P1 = P - 2^32*G and verify if it is in my db, otherwise I compute P2 = P - 2*(2^32*G) and so on (giant steps).

This is the baby-step giant-step algorithm: about 2^32 computations, but a huge database.

b) I generate a database of 2^32 public keys [from 1*G to 2^32*G], but I store only 1 key out of every 64 (to save space).

Then I compute P1 = P - 2^32*G and I verify if it is in my db.
Now I have to perform 64 subtractions from P1: P1 - G, P1 - 2*G, ..., P1 - 64*G, because P1 could be in the interval [1*G -> (2^32)*G] but not in my db (I don't store all the keys). These 64 subtractions are the price we have to pay for a smaller database;

they are needed to be sure whether P1 is in the interval [1 -> 2^32] or not.

If neither P1 nor the other 64 keys are in my db, I compute P2 = P - 2*(2^32*G) and so on.
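To make the search half of this concrete, here is a minimal sketch (mine; as said above, no search script has been written). It assumes the 8-byte-fingerprint database produced by the arulbero_pub_key.py script quoted later on this page, and it assumes the ice wrapper exposes point_subtraction() alongside scalar_multiplication():

Code:
#!/usr/bin/env python3
# Sketch of a giant-step search over the sparse database from case (b).
# Assumptions (mine): the db holds the last 8 bytes of the x coordinate of
# the keys 1*G, 65*G, 129*G, ... (as the arulbero_pub_key.py script writes
# them), and the ice wrapper exposes point_subtraction().
import secp256k1 as ice

SPAN = 32_000_000    # keys covered by the database
STEP = 64            # 1 stored key out of every 64

# Load the fingerprints: 8-byte slice -> the private key that was stored.
db = {}
with open("public_keys_database.bin", "rb") as f:
    i = 0
    while (chunk := f.read(8)):
        db[chunk] = 1 + i * STEP     # stored keys were 1, 65, 129, ...
        i += 1

G = ice.scalar_multiplication(1)

def search(P, max_giant_steps):
    """Giant steps of SPAN, then up to 64 baby subtractions per landing."""
    for g in range(max_giant_steps + 1):
        Q = P if g == 0 else ice.point_subtraction(P, ice.scalar_multiplication(g * SPAN))
        for d in range(STEP):        # the 64 subtractions: the price of sparsity
            fp = Q[25:33]            # last 8 bytes of the x coordinate
            if fp in db:
                return g * SPAN + db[fp] + d   # recovered private key
            Q = ice.point_subtraction(Q, G)
    return None

# Example: a key somewhere in [1, 4*SPAN]
print(search(ice.scalar_multiplication(2 * SPAN + 12345), 3))   # -> 64012345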
full member
Activity: 1162
Merit: 237
Shooters Shoot...
Can you give me a concrete example of how or what keys you would store (making the filesize smaller) and how you could use it in a search?

My example to you is:

If I generate 2^25 keys in the database, its size will be 16MB.


2^25 keys -> 16MB means :

you store 4 bits for each key?

2^25 * 4 bits = 2^27 bits = 2^24 bytes = 16MB. 

Or you store only some keys?
I store every key, 2^25 keys, represented with a “0” or a “1”.
full member
Activity: 1162
Merit: 237
Shooters Shoot...
This script generates a database of 32 million keys (3.9 MB) in 3.5 seconds,

it stores only 1 key out of every 64, and only 8 bytes per key


Code:
#!/usr/bin/env python3
# 2023/Dec/03, arulbero_pub_key.py
import secp256k1 as ice

#############################################################################
# Set the number of public keys to generate and other parameters
#############################################################################
start_key = 1
num_public_keys = 32000000
bits_to_store = 64
bytes_to_store = bits_to_store//8
rate_of_key_to_generate = 64
rate_of_key_to_store = rate_of_key_to_generate

interval_to_generate = range(start_key, num_public_keys+1, rate_of_key_to_generate)

interval_to_store = range(start_key, num_public_keys+1, rate_of_key_to_store)

binary_mode = 1

#private_keys = list(interval_to_generate)

#############################################################################


if (binary_mode == 1):

    f = open("public_keys_database.bin", "wb")  # binary mode

    ########### generation #################
    public_keys = []

    for i in interval_to_generate:                  # 1 key out of every 64
        P = ice.scalar_multiplication(i)            # 65-byte uncompressed point
        public_keys.append(P[33-bytes_to_store:33].hex())  # last 8 bytes of x

    ########### writing the db ###############
    for i in range(0, len(interval_to_store)):
        h = int(public_keys[i], 16)
        f.write(h.to_bytes(bytes_to_store, byteorder='big'))

    f.close()

else:

    f = open("public_keys_database.txt", "w")

    ########### generation #################
    public_keys = []

    for i in interval_to_generate:
        P = ice.scalar_multiplication(i)
        public_keys.append(P[33-bytes_to_store:33].hex())

    ########### writing the db ###############
    for i in range(0, len(interval_to_store)):
        h = public_keys[i]
        f.write(h)
    f.close()

If you want the data in human-readable (hex text) form instead, switch to "binary_mode = 0".

With 2^32 keys (4 billion), it takes about 4 minutes to generate a db of 257 MB (1 key out of every 128, 64 bits per key).
But how does this help you do a search for a key?
What search would you run with this script?

The OP's script (and my modified script) basically stores “wilds”, a known pubkey’s offset keys.

And the search script can land in any key space, do 64 key computations, and find the key if the key is in that range.
legendary
Activity: 1932
Merit: 2077
Can you give me a concrete example of how or what keys you would store (making the filesize smaller) and how you could use it in a search?

My example to you is:

If I generate 2^25 keys in the database, its size will be 16MB.


2^25 keys -> 16MB means :

you store 4 bits for each key?

2^25 * 4 bits = 2^27 bits = 2^24 bytes = 16MB. 

Or you store only some keys?
legendary
Activity: 1932
Merit: 2077
This script generates a database of 32 million keys (3.9 MB) in 3.5 seconds,

it stores only 1 key out of every 64, and only 8 bytes per key


Code:
#!/usr/bin/env python3
# 2023/Dec/03, arulbero_pub_key.py
import secp256k1 as ice

#############################################################################
# Set the number of public keys to generate and other parameters
#############################################################################
start_key = 1
num_public_keys = 32000000
bits_to_store = 64
bytes_to_store = bits_to_store//8
rate_of_key_to_generate = 64
rate_of_key_to_store = rate_of_key_to_generate

interval_to_generate = range(start_key, num_public_keys+1, rate_of_key_to_generate)

interval_to_store = range(start_key, num_public_keys+1, rate_of_key_to_store)

binary_mode = 1

#private_keys = list(interval_to_generate)

#############################################################################


if (binary_mode == 1):

    f = open("public_keys_database.bin", "wb")  # binary mode

    ########### generation #################
    public_keys = []

    for i in interval_to_generate:                  # 1 key out of every 64
        P = ice.scalar_multiplication(i)            # 65-byte uncompressed point
        public_keys.append(P[33-bytes_to_store:33].hex())  # last 8 bytes of x

    ########### writing the db ###############
    for i in range(0, len(interval_to_store)):
        h = int(public_keys[i], 16)
        f.write(h.to_bytes(bytes_to_store, byteorder='big'))

    f.close()

else:

    f = open("public_keys_database.txt", "w")

    ########### generation #################
    public_keys = []

    for i in interval_to_generate:
        P = ice.scalar_multiplication(i)
        public_keys.append(P[33-bytes_to_store:33].hex())

    ########### writing the db ###############
    for i in range(0, len(interval_to_store)):
        h = public_keys[i]
        f.write(h)
    f.close()

If you want the data in human-readable (hex text) form instead, switch to "binary_mode = 0".

With 2^32 keys (4 billion), it takes about 4 minutes to generate a db of 257 MB (1 key out of every 128, 64 bits per key).
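A quick way to sanity-check the binary database (my snippet, not part of the posted script): record i of the file should hold the last 8 bytes of the x coordinate of key 1 + i*64.

Code:
# Sanity check for public_keys_database.bin (my snippet): record i of the
# file holds the last 8 bytes of the x coordinate of key 1 + i*64.
import secp256k1 as ice

with open("public_keys_database.bin", "rb") as f:
    data = f.read()

i = 1000                                   # pick any record index
k = 1 + i * 64                             # the private key it encodes
fp = ice.scalar_multiplication(k)[25:33]   # last 8 bytes of x

assert data[i*8 : i*8 + 8] == fp
print("record", i, "matches key", k)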
full member
Activity: 1162
Merit: 237
Shooters Shoot...
arulbero

I understand what you are saying as far as time and space and the examples given.

Can you give me a concrete example of how or what keys you would store (making the filesize smaller) and how you could use it in a search?

My example to you is:

If I generate 2^25 keys in the database, its size will be 16MB.
Now I can start at any point and skip 2^25 points.
If I start at 0, I jump 2^25 points and do 64 consecutive subs and if the key is in that range, the script will find it.

So basically, 2^25 keys = 16MB: jump 2^25 points, compute the 64 sub loops, and it's a 100% find if the key is in that range. If not, jump to the next 2^25 point: 0+2^25, 2^25+2^25, etc.

Each point landed on, only requires 64 computations.

I generated 40,000,000,000 keys in a database and it took just under a second to do the 64 computations and check the DB for a match.
legendary
Activity: 1932
Merit: 2077
The problem is definitely in how bytes(BitArray(bin=binary)), i.e. how BitArray stores the strings.

I have successfully implemented a spinoff script that works 100% of the time with random or increment.

However, 32 Million Keys creates a 15.62MB file/database; not as good as your original 3.81MB file/database, but everything works.

Would love to shrink the database more


I see that this script creates a db with 1 bit for each key, to reduce the db size by a factor of 256.

32 million keys = 2^25 keys x 1 bit = 2^25 bits  = 4 MB

instead of 32 million keys = 2^25 keys x 256 bits = 2^33 bits  = 1GB


If the goal is to cut the size of the database by a factor of 256,

instead of generating all the keys (from 1 to 2^25), you can simply generate half a million keys and store 8 bytes each.


If you generate only 1 key out of every 64: 1*2^6, 2*2^6, 3*2^6, 4*2^6, 5*2^6, ..., 2^19*2^6,

and store the last 64 bits (8 bytes) of each key,

your database will contain:

0.5 million keys = 2^19 keys * 8 bytes = 2^22 bytes = 4MB.



CONS:

maybe the search will be slower, because for each of the 64 keys you have to search through the entire database, instead of searching for a single string made of all 64 keys. But you can perform the searches in parallel and scan the database only once.
And you don't have to search through 2^31 strings of 64 bits, only through 2^19 windows of 64 bits.



PROS:

1) you generate the database in less time

2) you can use a simple array, with no need for a bits->bytes conversion

3) you can stretch this idea and obtain an even smaller db if you accept doing more computation:
        a) only when you do the search
        b) when you create the db and when you do the search
        


a) generate only 1 key out of every 128 (and store 64 bits each) -> size of db = 2 MB;

you spend half the time to produce a db of half the size;

in the search script, you have to check 128 consecutive keys instead of 64 (double the work).





b) generate all the keys, and store only the keys with the first 8 bits = 0

in a range of 2^25 keys, there are about 2^17 keys with this property.

You store only 7 bytes (56 bits) of these special keys (in this way the probability of a collision remains the same).

Now you have a database of 2^17 keys x 7 bytes =  about 896 kB.

In this db you have to add the private keys too (2^17 * 4 bytes = 512 kB) for a total of 1408 kB.

In the search script, you now have to produce about 256 consecutive keys, until you find one with the first 8 bits = 0 (4x the search work), but:

- the file to scan is less than 1/4 of 4 MB
- among the keys you generate, only about 1 in 256 has the first 8 bits equal to zero, so only that key needs to be looked up in the db, and the search will be faster

------------------------------------------------------------------------------------------------------------------------------------

How does this approach scale?

Let's say we want to store the information about 2^32 keys (4 billion);
usually we would need to store 2^32 keys x 32 bytes = 2^37 bytes = 128 GB.

Instead:

generate all the keys, and store only the keys with the first 16 bits = 0


in an interval of 2^32 keys, there are about 2^16 keys with this property.

You store 8 bytes (64 bits) of these special keys (in this way you have 64 + 16 = 80 bits of information for each key stored).

Now you have a database of 2^16 keys x 8 bytes = about 512 kB, plus 2^16 x 4 bytes for the private keys, for a total of 768 kB.

You have to compute about 2^16 = 64k keys from your public key to find a public key with the first 16 bits = 0, but it takes less than 0.05 seconds to generate 64000 consecutive keys.

The tradeoff between time and space is: if you want to halve the number of computations in the search, you have to double the size of the db (and vice versa).

If you accept a db size of 6 MB (= 768 kB x 8 ), then you have to store 2^19 keys (with the first 13 bits = 0) and you can compute only 2^13 = 8k keys in the search.
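For illustration, a minimal sketch of this distinguished-point variant in the small case (my code, under the same point_subtraction() assumption as earlier; 2^20 range and 8 zero bits here so it runs quickly):

Code:
#!/usr/bin/env python3
# Sketch of the distinguished-point db: keep only the keys whose x
# coordinate starts with 8 zero bits, storing the next 56 bits plus the
# private key. Assumption (mine): ice.point_subtraction() is available.
import secp256k1 as ice

RANGE = 2**20       # 2^25 in the post; smaller here for a quick run
ZERO_BITS = 8       # expect about RANGE / 2**ZERO_BITS stored keys

db = {}             # 7-byte fingerprint -> private key
for k in range(1, RANGE + 1):
    x = ice.scalar_multiplication(k)[1:33]   # 32-byte x coordinate
    if x[0] == 0:                            # first 8 bits are zero
        db[x[1:8]] = k                       # next 56 bits + the private key

G = ice.scalar_multiplication(1)

def search(P):
    """Walk P, P-G, P-2G, ... to the next distinguished point (~256 steps)."""
    Q = P
    for d in range(64 * 2**ZERO_BITS):       # generous safety bound
        x = Q[1:33]
        if x[0] == 0:                        # distinguished point reached
            k = db.get(x[1:8])
            return k + d if k is not None else None
        Q = ice.point_subtraction(Q, G)
    return None

print(search(ice.scalar_multiplication(777777)))   # -> 777777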
full member
Activity: 1162
Merit: 237
Shooters Shoot...
The problem is definitely in how bytes(BitArray(bin=binary)), i.e. how BitArray stores the strings.

I have successfully implemented a spinoff script that works 100% of the time with random or increment.

However, 32 Million Keys creates a 15.62MB file/database; not as good as your original 3.81MB file/database, but everything works.

Would love to shrink the database more, but the BitArray is messing things up writing or reading.

Nonetheless, your idea is great and I learned something from it.

Hopefully you can figure out the BitArray error and I can test your script(s) once again.

full member
Activity: 1162
Merit: 237
Shooters Shoot...
@macduglasx, if the power suddenly goes out while creating the DB, is the DB still valid or must it be recreated?
The keys in the DB are valid. If you were trying to create 10M and power went out at 6M, you’ll have 6M valid keys.
jr. member
Activity: 43
Merit: 1
@macduglasx, if the power suddenly goes out while creating the DB, is the DB still valid or must it be recreated?
full member
Activity: 1162
Merit: 237
Shooters Shoot...
Quote
You are right, no match.
It must be a problem with the reading of bytes.
You definitely found a bug.
Because it is in the DB but the script does not find it.
I'll fix it.

Edit:
The curious part is that if we start at 6 we get 3092000006
and match.
Any update? I've been trying tweaks to the search script, but I have not been successful.
jr. member
Activity: 43
Merit: 1
OP doesn't understand the program/script.

I used your single key check and it didn't even find a match.

Check it yourself:

With 4,000,000 keys in the database, I ran this for a single key check:

pk = 3093472814 - 62500

No results.

But I think I figured it out.

Whatever pk is landed on, whether random or increment, it has to be a multiple of the collision margin,

because if you run a single key check using:

pk= 3093472814 - 3999936
pk= 3093472814 - 640
pk= 3093472814 - 64

you will get a match.

I'm even more lost now lol.


You just miss your key when you subtract 4,000,000 in your DB.
member
Activity: 873
Merit: 22
$$P2P BTC BRUTE.JOIN NOW ! https://uclck.me/SQPJk
OP doesn't understand the program/script.

I used your single key check and it didn't even find a match.

Check it yourself:

With 4,000,000 keys in the database, I ran this for a single key check:

pk = 3093472814 - 62500

No results.

But I think I figured it out.

Whatever pk is landed on, whether random or increment, it has to be a multiple of the collision margin,

because if you run a single key check using:

pk= 3093472814 - 3999936
pk= 3093472814 - 640
pk= 3093472814 - 64

you will get a match.

I'm even more lost now lol.



You are right, no match.
It must be a problem with the reading of bytes.
You definitely found a bug.
Because it is in the DB but the script does not find it.
I'll fix it.

Edit:
The curious part is that if we start at 6 we get 3092000006
and match.

Good day.

What is the "magic" of your algorithm?

Have you thought about the backward operation? Take the 3 MB base, check the whole base, and generate 2^60 pubkeys? I call such an algo "multipliers".

P.S. It looks like your DB can not only archive pubkeys, but possibly generate pubkeys too, huh? It looks like, to generate 32M keys, you need only 3.81 MB :-))

Thank you

I like your thread; I think it is the best of the whole past year.

member
Activity: 239
Merit: 53
New ideas will be criticized and then admired.
OP doesn't understand the program/script.

I used your single key check and it didn't even find a match.

Check it yourself:

With 4,000,000 keys in the database, I ran this for a single key check:

pk = 3093472814 - 62500

No results.

But I think I figured it out.

Whatever pk is landed on, whether random or increment, it has to be a multiple of the collision margin,

because if you run a single key check using:

pk= 3093472814 - 3999936
pk= 3093472814 - 640
pk= 3093472814 - 64

you will get a match.

I'm even more lost now lol.



You are right, no match.
It must be a problem with the reading of bytes.
You definitely found a bug.
Because it is in the DB but the script does not find it.
I'll fix it.

Edit:
The curious part is that if we start at 6 we get 3092000006
and match.
full member
Activity: 1162
Merit: 237
Shooters Shoot...
OP doesn't understand the program/script.

I used your single key check and it didn't even find a match.

Check it yourself:

With 4,000,000 keys in the database, I ran this for a single key check:

pk = 3093472814 - 62500

No results.

But I think I figured it out.

Whatever pk is landed on, whether random or increment, it has to be a multiple of the collision margin,

because if you run a single key check using:

pk= 3093472814 - 3999936
pk= 3093472814 - 640
pk= 3093472814 - 64

you will get a match.

I'm even more lost now lol.

full member
Activity: 1162
Merit: 237
Shooters Shoot...
If you store 2 bits instead of 1 bit per key, would that help to find a key?
It can find a key when using random and/or incrementing by 1; I am trying to use an increment other than 1.
full member
Activity: 1162
Merit: 237
Shooters Shoot...
I even tried your way but with an increment other than 1.
All pks print correctly but the key is not found.

Code:
start= 0
end=   3093472814
for i in range(start, end, 4000000):
    print(i)


I am still baffled lol.

Do you want to look for 4000000, 8000000, 12000000?
I am incrementing by 4,000,000 because that's how many keys are in the database.

Answer me this,
if I have generated 4,000,000 keys in the database, 1:4000000 as you call it,
if the target pubkey's pk is:
3093472814
and this pk is landed on during a search:
3092000000
Would the key not be found?
3093472814 - 3092000000 = 1472814 (1,472,814); which is within 1:4000000

What am I missing man? Lol.

Would the key not be found?!

copper member
Activity: 1330
Merit: 899
🖤😏
If you store 2 bits instead of 1 bit per key, would that help to find a key?
member
Activity: 239
Merit: 53
New ideas will be criticized and then admired.
I even tried your way but with an increment other than 1.
All pks print correctly but the key is not found.

Code:
start= 0
end=   3093472814
for i in range(start, end, 4000000):
    print(i)


I am still baffled lol.

Do you want to look for 4000000, 8000000, 12000000?