Pages:
Author

Topic: VanitySearch (Yet another address prefix finder) - page 60. (Read 32966 times)

legendary
Activity: 1948
Merit: 2097
About difficulty, your program shows slightly (negligible) different results respect of vanitygen:


Code:
./vanitygen 1Bit
Difficulty: 77178


./VanitySearch -stop -u -t 7 -gpu  1Bit
Start Tue Mar 12 18:36:06 2019
Difficulty: 78509Search: 1Bit


./vanitygen 1Bitcoin
Difficulty: 873388193410

./VanitySearch -stop -u -t 7 -gpu  1Bitcoin
Start Tue Mar 12 18:35:46 2019
Difficulty: 888446610539Search: 1Bitcoin

For the output, a space between Difficulty and Search would be better  Cheesy
legendary
Activity: 1948
Merit: 2097
Hello,
 
I added a note in the readme about attacking full address. It may be not clear. I would like a simple and understandable text about his.
Thanks to help me to make it clear.

Quote
Please don't use VanitySearch to attack a list of complete addresses. It is very unlikely that you find a collision. The time displayed indicates the time needed to reach the displayed probability of the most probable prefix in the list. In case of having n complete addresses in the input file, simply divide this time by the number of entries to get an aproximative idea of the time needed to reach the displayed probability (in fact it is longer). Even with a file containing 1 billion of addresses, using a very competitive hardware, the time needed to reach a probability of 50% will be much longer than the age of the universe. Note that the birthday paradox cannot be applied here as we look for fixed addresses and there is no trick possible (as for Pollard rho method on points coordinates) to simulate random walks because addresses are hashed.



You compute the probability this way?

I  use the concept of the target set T of the addresses we want to find. The target set of the addresses with prefix 1A has many (1/22* 2^160) elements, the target set of the addresses with prefix 1A11XxW8qtLWXoxJrjkHuvKDgjnJ1Kfjss has instead only 1 element.

Your example (1 billion of addresses) is like to find a collision with an address with a very long prefix. Just so you know, the target set of LBC has size of 2^24 addresses (about 16 million).



How does vanitygen compute the probability?

Let's do an example with 1A prefix. Let T be the target set of the addresses 1Axxxxx. To get on average 1 match we have to generate 22 addresses.

Smaller is the target, bigger is the difficulty to hitting it and viceversa.

Let me rephrase this: difficulty * size of target = search space, in our case a small difficulty and therefore a big target ( it's easy to hit this target!):
Code:
22 * 63703009616853067642911677093369144589991624155  = 2^160

number of keys we have to use (= numbers of addresses we have to generate to get on average 1 match) * number of addresses in T = 2^160 possible addresses.

Every time we try a new private key, we have 1 chance over 22 to hit our target set T (the probability is 1/22). At every roll then we don't hit the set T with probability 1 - 1/22. If we use k private keys, we don't hit T with probability (21/22)^k.
The probability of hitting T in k trials is then:

P = 1 - (21/22)^k

https://en.wikipedia.org/wiki/Geometric_distribution
Quote
The geometric distribution gives the probability that the first occurrence of success requires k independent trials, each with success probability p. If the probability of success on each trial is p, then the probability that the kth trial (out of k trials) is the first success is

P(X=k) = p(1-p)^(k-1)  for k = 1, 2, 3, ....

The cumulative distribution function is P(X<=k) = 1 - (1-p)^k  (it is the probability that X will take a value less than or equal to k ).


If we want to have :

a 50% chance, 1 - (21/22)^k = 0.50 --> k = log(0.50)/log(21/22) = 14.9 tries (not 11!)

a 64% chance, 1 - (21/22)^k = 0.64  --> k = log(0.36)/log(21/22) = 22.0 tries

a 75% chance, 1 - (21/22)^k = 0.75  --> k = log(0.25)/log(21/22) = 29.8 tries (not 11!)

a 90% chance, 1 - (21/22)^k = 0.90  --> k = log(0.10)/log(21/22) = 49.5 tries

a 95.5% chance, 1 - (21/22)^k = 0.955  --> k = log(0.045)/log(21/22) = 66.7 tries (3 times 22!!)

a 99% chance,  1 - (21/22)^k = 0.99  --> k = log(0.01)/log(21/22) = 99 tries

and so on (100% only for k -> infinity).


On average it takes 22 tries to get 1 match, but if you do 22 tries only once you will have only a 64% chance to get a match!

And vanitygen computes right the probability to find a match in the particular sequence you are running, vanitygen doesn't compute anything "on average".

sr. member
Activity: 462
Merit: 701
Hello,
 
I added a note in the readme about attacking full address. It may be not clear. I would like a simple and understandable text about his.
Thanks to help me to make it clear.

Quote
Please don't use VanitySearch to attack a list of complete addresses. It is very unlikely that you find a collision. The time displayed indicates the time needed to reach the displayed probability of the most probable prefix in the list. In case of having n complete addresses in the input file, simply divide this time by the number of entries to get an aproximative idea of the time needed to reach the displayed probability (in fact it is longer). Even with a file containing 1 billion of addresses, using a very competitive hardware, the time needed to reach a probability of 50% will be much longer than the age of the universe. Note that the birthday paradox cannot be applied here as we look for fixed addresses and there is no trick possible (as for Pollard rho method on points coordinates) to simulate random walks because addresses are hashed.
sr. member
Activity: 462
Merit: 701
Next Step:
- Optimize CPU/GPU exchange
- Add missing ECC optimizations (some symmetries and endomorphism)
- Add support for GPU funnel shift that should speed up SHA (but I need to find a board with compute capability >3.5, mine is 3.0).

Did you implement already all the steps 1, 2, 3 or there is still space to further improvements?

- Support for funnel shift no yet done.
- p-iG/p+iG done.
- k.(x,y)/-k.(x,-y) done.
- Endomorphism is in progress.
- CPU/GPU exchange done but still need improvement (difficult to find good compromises with multi prefixes search)
legendary
Activity: 1948
Merit: 2097
Next Step:
- Optimize CPU/GPU exchange
- Add missing ECC optimizations (some symmetries and endomorphism)
- Add support for GPU funnel shift that should speed up SHA (but I need to find a board with compute capability >3.5, mine is 3.0).

Did you implement already all the steps 1, 2, 3 or there is still space to further improvements?
jr. member
Activity: 82
Merit: 1
Code:
G:\vanitysearch>vanitysearch -stop -t 0 -gpu -gpuId 0,1,2,3,4,5,6 1Testtttt
Start Tue Mar 12 20:03:18 2019
Difficulty: 2988734397852221Search: 1Testtttt
Base Key:E04D2ABC020297FCE97546517F7F2B190EFE4A6055B293615B326B03F1857ACB
Number of CPU thread: 0
GPU: GPU #6 GeForce GTX 1060 3GB (9x128 cores) Grid(72x128)
GPU: GPU #0 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)y][0]
GPU: GPU #5 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #3 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #2 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #1 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #4 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
1571.848 MK/s (GPU 1571.848 MK/s) (2^34.87) [P 0.00%][50.00% in 15.3d][0]

 Smiley
sr. member
Activity: 462
Merit: 701
Thank you very much to all of you for testing this software and helping me to make it better Smiley
legendary
Activity: 1948
Merit: 2097
@arulbero:
I see that you make a test with a Quadro M2200, it was on Linux ? If yes, would it possible that you try to compile from the last source and execute VanitySearch -check to see if it works on your config. Thanks Wink

It works on Linux with CUDA 8.0.

Amazing improvements :

./VanitySearch -stop -t 6 -gpu  1Happgggy
130 MKeys/s   Shocked


./VanitySearch -stop -t 8  1Happgggy
13.7 MKeys/s



Code:
----:~/VanitySearch$ ./VanitySearch -check
GetBase10() Results OK
Add() Results OK : 121.951 MegaAdd/sec
Mult() Results OK : 24.213 MegaMult/sec
Div() Results OK : 4.132 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() : 342.810 KiloInv/sec
IntGroup.ModInv() : 8.944 MegaInv/sec
ModMulK1() : 12.643 MegaMult/sec
ModSqrt() OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(64x128)
Seed: 591012
82.215 MegaKey/sec
ComputeKeys() found 530 items , CPU check...
GPU/CPU check OK
sr. member
Activity: 462
Merit: 701
I reached the performance of the good old oclvanitygen on my GeForce GTX 645 (3x192 cores) with my Core [email protected] alone Wink

Code:
C:\C++\VanitySearch\x64\Release>VanitySearch.exe -u 1Happy
Start Mon Mar 11 14:47:44 2019
Difficulty: 264104224Search: 1Happy
Base Key:4F3C51AA76D9FFDD605B50174A0F2E54A79B58434DD5A0FBED72DCB9EBA69855
Number of CPU thread: 8
9.489 MK/s (GPU 0.000 MK/s) (2^25.77) [P 19.46%][50.00% in 00:00:13][0]

Code:
C:\C++\VanityGen>oclvanitygen.exe 1Happy
Difficulty: 259627881
[9.28 Mkey/s][total 56623104][Prob 19.6%][50% in 13.3s]

C:\C++\VanityGen>vanitygen64.exe 1Happy
Difficulty: 259627881
[1.31 Mkey/s][total 3227136][Prob 1.2%][50% in 2.3min]
sr. member
Activity: 462
Merit: 701
The i option works very well at home ... I dream of compatibility with CUDA 8.0 to enjoy my old GPU GT520M  Grin

I'll work on this ASAP.

on the other hand I do not know what I'm doing wrong it only records the results of the first pattern

May be because you didn't use the -stop option. In that case all matching addresses are recorded and if one is much more probable than others, then you will have a lot of them in the output.
By using the -stop option, only one record per prefix entry are saved and the program ends when all prefixes have been found.

good job Jean_Luc  Smiley

Thanks Wink


member
Activity: 131
Merit: 32
versions 1.2 to 1.3 looked on my old CPU 1.8 MK / s
For version 1.6 it was already a nice increase in speed, a nice optimization Jean_Luc nice!

Tested Jean_Luc nice, on my old gear the improvement is 65% CPU .... for the GPU I can not enjoy it for the moment CUDA 8.0
Start Wed Mar  6 15:55:46 2019
Search: 1testr
Difficulty: 15318045009
Base Key:5D48B5A686EF3CCD828F2B23DBD365564D4193F3DC5EA98EB696641F8C8CFC17
Number of CPU thread: 4
3.016 MK/s (GPU 0.000 MK/s) (2^28.15) [P 1.92%][50.00% in 00:57:02][0]

With this version 1.7 the increase of the speed on my old material is still impressive + 30%
result version 1.7

Start Mon Mar 11 13:38:57 2019
Difficulty: 15318045009Search: 1testr
Base Key:EF61AC731BD4EAA239646EC88F3F3538D39BBA9B2A8C580276CB9AFAE849ECFE
Number of CPU thread: 4
4.395 MK/s (GPU 0.000 MK/s) (2^26.09) [P 0.47%][50.00% in 00:45:09][0]

The i option works very well at home ... I dream of compatibility with CUDA 8.0 to enjoy my old GPU GT520M  Grin

Start Mon Mar 11 13:53:03 2019
Ignoring prefix "1CPuID" (0, I, O and l not allowed)
Search: 3 prefixes (Lookup size 3)
Base Key:C24307039526A5A5EA9DA60EB6C67A3E9F60BC32BA44E8337171A53751AA3A12
Number of CPU thread: 4
4.192 MK/s (GPU 0.000 MK/s) (2^26.91) [P 37.96%][50.00% in 00:00:14][0]

on the other hand I do not know what I'm doing wrong it only records the results of the first pattern
good job Jean_Luc  Smiley

sr. member
Activity: 462
Merit: 701
The new release (1.7) is ready.
I had to revue the inline decision of the compiler (Linux) to make the old hardware work but still not very clear.
I hope that it won't alter the performance for recent GPU. On my hardware, the reviewing on inline decision has improved a bit performance.
Thanks to test it Wink
sr. member
Activity: 462
Merit: 701
Hello,

Yes , the SLarkBoy's config is impressive. As for stivensons , I would suggest to free some CPU cores to see if the performance are better using the -t option.

Some news:

I will probably publish a new release today or tomorrow with a slight performance increase and the possibility to search several prefixes in one go. It was difficult to find a solution to not alter the performance due to the overhead of lookup tables but I manage to find good compromises.

However, I'm facing an issue with the Linux release and I don't really know at the moment from where it comes. It seems that it is related to my old hardware or to the old SKD (I'm not sure). The same code works perfectly on windows (CUDA SDK 10) but not on my Linux config (CUDA SDK 8.0). It compiles without errors or warnings in both cases, but the generated seems wrong and returns wrong results on Linux. If I remove some part of it, it works again, it seems that I reached a limit somewhere. I have to figure out where...

@arulbero:
I see that you make a test with a Quadra M2200, it was on Linux ? If yes, would it possible that you try to compile from the last source and execute VanitySearch -check to see if it works on your config. Thanks Wink

EDIT:
Concerning the issue, I'm speaking of the GPU code, the CPU code works great in both cases.
member
Activity: 131
Merit: 32
Nice speed  Grin

Code:
C:\VanitySearch>VanitySearch.exe -stop -gpu -gpuId 0,1,2,3 1SLarkBoyKEK
Start Sun Mar 10 22:26:19 2019
Difficulty: 583137945833742401536Search: 1SLarkBoyKEK
Base Key:7098934A348028B578A730116289AC3A6BB56AFF8664117F5CE69920A360A4E9
Number of CPU thread: 31
GPU: GPU #0 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
GPU: GPU #3 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
GPU: GPU #2 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
GPU: GPU #1 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
7260.449 MK/s (GPU 7212.931 MK/s) (2^36.56) [P 0.00%][50.00% in 1765.33y][0]


C:\VanitySearch>VanitySearch.exe -stop -gpu -gpuId 0,1 1SLarkBoyKEK
Start Mon Mar 11 02:56:57 2019
Difficulty: 583137945833742401536Search: 1SLarkBoyKEK
Base Key:B97C76053D7951498A9122DEBC2951EB803A51C309326D1C321DF35DF5FB79EE
Number of CPU thread: 19
GPU: GPU #0 GeForce GTX 1080 Ti (28x128 cores) Grid(224x128)
GPU: GPU #1 GeForce GTX 1080 Ti (28x128 cores) Grid(224x128)
1293.435 MK/s (GPU 1255.415 MK/s) (2^32.96) [P 0.00%][50.00% in 9909.36y][0]
ah yes impressive speed but the tesla config is crazy  Shocked ! it also allows to see that despite a software optimizes and the top of the technology it is still not possible to obtain a vanity address of 12 patterns or more in a reasonable delay alas  Undecided
it can already have good fun to create beautiful addresses of vanity if you do not climb so high in the number of pattern remaining within a reasonable time ...
member
Activity: 114
Merit: 11
Nice speed  Grin

Code:
C:\VanitySearch>VanitySearch.exe -stop -gpu -gpuId 0,1,2,3 1SLarkBoyKEK
Start Sun Mar 10 22:26:19 2019
Difficulty: 583137945833742401536Search: 1SLarkBoyKEK
Base Key:7098934A348028B578A730116289AC3A6BB56AFF8664117F5CE69920A360A4E9
Number of CPU thread: 31
GPU: GPU #0 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
GPU: GPU #3 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
GPU: GPU #2 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
GPU: GPU #1 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
7260.449 MK/s (GPU 7212.931 MK/s) (2^36.56) [P 0.00%][50.00% in 1765.33y][0]


C:\VanitySearch>VanitySearch.exe -stop -gpu -gpuId 0,1 1SLarkBoyKEK
Start Mon Mar 11 02:56:57 2019
Difficulty: 583137945833742401536Search: 1SLarkBoyKEK
Base Key:B97C76053D7951498A9122DEBC2951EB803A51C309326D1C321DF35DF5FB79EE
Number of CPU thread: 19
GPU: GPU #0 GeForce GTX 1080 Ti (28x128 cores) Grid(224x128)
GPU: GPU #1 GeForce GTX 1080 Ti (28x128 cores) Grid(224x128)
1293.435 MK/s (GPU 1255.415 MK/s) (2^32.96) [P 0.00%][50.00% in 9909.36y][0]
jr. member
Activity: 82
Merit: 1
Hello,

I published a new release (1.6).
No new feature, just performance increase (16% GPU, 50% CPU on my hardware).
The performance increase are mainly due to a best ECC calculations ( many thanks to arulbero Wink )
It affects less the GPU because the GPU has no SIMD instructions to speed up the SHA, so the resource goes mainly to it and much less to ECC calculations.

Next Step:
- Add support for multi prefix search and (-i input.txt)
- Optimize CPU/GPU exchange
- Add missing ECC optimizations (some symmetries and endomorphism)
- Add support for GPU funnel shift that should speed up SHA (but I need to find a board with compute capability >3.5, mine is 3.0).

Thanks for testing it Smiley

I almost reached the same performance with my CPU alone (Intel Core i7-4770 3.4GHz) than oclvanitygen with my GPU (GTX 645) Cheesy
but still 10 days of calculation to reach to prefix I want.


good speed increase  Wink

Code:
G:\vanitysearch>vanitysearch -stop -t 0 -gpu -gpuId 0,1,2,3,4,5,6 1Testtttt
Start Thu Mar  7 22:17:16 2019
Search: 1Testtttt
Difficulty: 2988734397852221
Base Key:AC4A942372FF30E640421B959E6BE9EA97DC872B03041A59F73D9C19A5902F7B
Number of CPU thread: 0
GPU: GPU #2 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #1 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)y]
GPU: GPU #6 GeForce GTX 1060 3GB (9x128 cores) Grid(72x128)
GPU: GPU #0 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #3 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #5 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #4 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
1043.526 MK/s (GPU 1043.526 MK/s) (2^34.80) [P 0.00%][50.00% in 23.1d]
legendary
Activity: 1948
Merit: 2097
I published a new release (1.6).
No new feature, just performance increase (16% GPU, 50% CPU on my hardware).
The performance increase are mainly due to a best ECC calculations ( many thanks to arulbero Wink )
It affects less the GPU because the GPU has no SIMD instructions to speed up the SHA, so the resource goes mainly to it and much less to ECC calculations.

On my pc:

VanitySearch -stop -u -t 1 1tryme --> 1,2 MKeys/s

my ecc library  --> 2,0 MKeys/s  (17 M Public keys/s)

Now (Intel(R) Xeon(R) CPU E3-1505M v6 @ 3.00GHz):
 

VanitySearch -stop -u -t 1 1tryme --> 2,078 MKeys/s

VanitySearch -stop  -t 1 1tryme --> 2,771 MKeys/s

VanitySearch -stop  -t 8 1tryme --> 10,758 MKeys/s

EDIT:


Search: 1Happpppy
Difficulty: 51529903411245
Base Key:89D6DCD4B58447BB26F7FAFC99C12612B4ADB97E8A0CC5133253E3CB74B6734E
Number of CPU thread: 6
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(64x128)
98.840 MK/s (GPU 88.068 MK/s) (2^31.39) [P 0.01%][50.00% in 4.3d]


For a comparison with Bitcrack:

./cuBitCrack  -b 128 -t 256 -p 256 1FshYsUh3mqgsG29XpZ23eLjWV8Ur3VwH
Quadro M2200     568/4038MB | 1 target 61.75 MKey/s (807,927,808 total) [00:00:21]


 Cheesy
member
Activity: 131
Merit: 32
Tested Jean_Luc nice, on my old gear the improvement is 65% CPU .... for the GPU I can not enjoy it for the moment CUDA 8.0
Start Wed Mar  6 15:55:46 2019
Search: 1testr
Difficulty: 15318045009
Base Key:5D48B5A686EF3CCD828F2B23DBD365564D4193F3DC5EA98EB696641F8C8CFC17
Number of CPU thread: 4
3.016 MK/s (GPU 0.000 MK/s) (2^28.15) [P 1.92%][50.00% in 00:57:02][0]
sr. member
Activity: 462
Merit: 701
Hello,

I published a new release (1.6).
No new feature, just performance increase (16% GPU, 50% CPU on my hardware).
The performance increase are mainly due to a best ECC calculations ( many thanks to arulbero Wink )
It affects less the GPU because the GPU has no SIMD instructions to speed up the SHA, so the resource goes mainly to it and much less to ECC calculations.

Next Step:
- Add support for multi prefix search and (-i input.txt)
- Optimize CPU/GPU exchange
- Add missing ECC optimizations (some symmetries and endomorphism)
- Add support for GPU funnel shift that should speed up SHA (but I need to find a board with compute capability >3.5, mine is 3.0).

Thanks for testing it Smiley

I almost reached the same performance with my CPU alone (Intel Core i7-4770 3.4GHz) than oclvanitygen with my GPU (GTX 645) Cheesy
but still 10 days of calculation to reach to prefix I want.
sr. member
Activity: 462
Merit: 701
Can VanitySearch look for more than 1 vanity prefix at a time?

Not yet, I will add this in the next release.
As I said, in a previous post, this feature need a refurbishment of the code.
This refurbishment should also allow an optimization of data transfer between GPU and CPU.

Some news:
After very interesting exchanges with arulbero (by PM), we should see significant performance increase in the next release.
Pages:
Jump to: