Topic: VanitySearch (Yet another address prefix finder) - page 56. (Read 32966 times)

legendary
Activity: 2758
Merit: 6830
Is this really legal in Asian countries like India???
Why would a Bitcoin address generator be illegal anywhere?
newbie
Activity: 4
Merit: 0
Hello,

I would like to present a new Bitcoin address prefix finder called VanitySearch. It is very similar to Vanitygen.
The main differences from Vanitygen are that VanitySearch does not use the heavy OpenSSL library for CPU calculations and that the kernel is written in CUDA in order to take full advantage of inline PTX assembly.
On my Intel Core i7-4770, VanitySearch runs ~4 times faster than vanitygen64 (1.32 Mkey/s -> 5.27 Mkey/s).
On my GeForce GTX 645, VanitySearch runs ~1.5 times faster than oclvanitygen (9.26 Mkey/s -> 14.548 Mkey/s).
If you want to compare VanitySearch and Vanitygen results, use the -u option to search for uncompressed addresses.
VanitySearch may not compute a good grid size for your GPU, so try several values with the -g option to find the best performance.
Using compressed addresses is roughly 20% faster.

VanitySearch is available from https://github.com/JeanLucPons/VanitySearch

There are still lots of improvements to make.
Feel free to test it and to submit issues.

Thanks.
Sorry for my bad English.
Jean-Luc

Is this really legal in Asian countries like India???
sr. member
Activity: 462
Merit: 701
Linux binaries are available for download here (experimental).
They are compiled with CUDA SDK 10.
Thanks for testing them ;)

http://zelda38.free.fr/VanitySearch/
sr. member
Activity: 462
Merit: 701
Hello,

It ran, but just closed after finding it.
Did it write the private keys to a file?
I am confused.

To output the key in a file, use the -o option.
Code:
VanitySearch -stop -gpu -o key.txt 1stortz

Many thanks, stivensons, for the report :)
jr. member
Activity: 82
Merit: 1
If you post a Windows release, I can test it too :)

You can test with the release you have.
You can try:
Code:
VanitySearch -gpuId 0 -check 
VanitySearch -gpuId 6 -check (On the 3GB)
Thanks ;)

Tomorrow, I will try to set up CUDA SDK 10 on recent hardware (Linux) and see if I can reproduce the issue.

CUDA 10

Code:
G:\vanitysearch>vanitysearch   -gpuId 0 -check
GetBase10() Results OK
Add() Results OK : 567.189 MegaAdd/sec
Mult() Results OK : 38.169 MegaMult/sec
Div() Results OK : 4.410 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() : 281.352 KiloInv/sec
IntGroup.ModInv() : 8.365 MegaInv/sec
ModMulK1() : 10.770 MegaMult/sec
ModSqrt() OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
Seed: 1853432973
296.742 MegaKey/sec
ComputeKeys() found 1947 items , CPU check...
GPU/CPU check OK

Code:
G:\vanitysearch>vanitysearch   -gpuId 6 -check
GetBase10() Results OK
Add() Results OK : 556.067 MegaAdd/sec
Mult() Results OK : 35.273 MegaMult/sec
Div() Results OK : 4.104 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() : 260.561 KiloInv/sec
IntGroup.ModInv() : 7.773 MegaInv/sec
ModMulK1() : 9.881 MegaMult/sec
ModSqrt() OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #6 GeForce GTX 1060 3GB (9x128 cores) Grid(72x128)
Seed: 2205931314
260.131 MegaKey/sec
ComputeKeys() found 1752 items , CPU check...
GPU/CPU check OK
jr. member
Activity: 40
Merit: 15
I tried your program with the parameters as shown in the sample + my username
Code:
-stop -gpu 1stortz

It ran, but just closed after finding it.
Did it write the private keys to a file?
I am confused.
sr. member
Activity: 462
Merit: 701
If you post a Windows release, I can test it too :)

You can test with the release you have.
You can try:
Code:
VanitySearch -gpuId 0 -check 
VanitySearch -gpuId 6 -check (On the 3GB)
Thanks ;)

Tomorrow, I will try to set up CUDA SDK 10 on recent hardware (Linux) and see if I can reproduce the issue.
jr. member
Activity: 82
Merit: 1
OK, thanks. Could you try running cuda-memcheck on the release version?


If you post a Windows release, I can test it too :)

legendary
Activity: 1948
Merit: 2097
OK, thanks. Could you try running cuda-memcheck on the release version?



Code:
~/VanitySearch-1.8$ /usr/local/cuda-8.0/bin/cuda-memcheck --tool memcheck VanitySearch -g 1 -check
========= CUDA-MEMCHECK
GetBase10() Results OK
Add() Results OK : 123.457 MegaAdd/sec
Mult() Results OK : 23.148 MegaMult/sec
Div() Results OK : 5.208 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() : 341.317 KiloInv/sec
IntGroup.ModInv() : 9.130 MegaInv/sec
ModMulK1() : 12.968 MegaMult/sec
ModSqrt() OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(1x128)
Seed: 223215
95.697 KiloKey/sec
ComputeKeys() found 26 items , CPU check...
Expected item not found 3412bb65 cb39a716 67dcd486 209b19df c65e364c
Expected item not found fefea644 d535267a 46308e46 c579e91b 0aad3ee2
Expected item not found 3412726b 9830f325 9c5f0d95 a99e2a9b 6c473922
Expected item not found 341292e1 b4a39d2c 59e34f3d 38725b42 dfc2e801
Expected item not found fefeba57 c1209e3d 1b79200c b9529018 de0e35e4
Expected item not found fefe4aaa 34f02402 4ed76c83 a1d60efc 8c79f7a6
Expected item not found fefe8742 63e9b7bc b13a08f1 28229fd8 30987ed3
CPU found 22 items
========= ERROR SUMMARY: 0 errors
sr. member
Activity: 462
Merit: 701
OK, thanks. Could you try running cuda-memcheck on the release version?
legendary
Activity: 1948
Merit: 2097
I committed a new Makefile with a debug option.

Code:
make clean
make gpu=1 debug=1 all

In debug mode, no inlining is done.

Obviously, it is much slower, so launch:

Code:
pons@linpons:~/VanitySearch$ ./VanitySearch -g 1 -check


Code:
./VanitySearch -g 1 -check
GetBase10() Results OK
Add() Results OK : 108.696 MegaAdd/sec
Mult() Results OK : 10.684 MegaMult/sec
Div() Results OK : 1.656 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() Results OK : 132.041 KiloInv/sec
IntGroup.ModInv() Results OK : 2.222 MegaInv/sec
ModMulK1() Results OK : 3.661 MegaMult/sec
ModMulK1order() Results OK : 1.700 MegaMult/sec
ModSqrt() Results OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(1x128)
Seed: 888394
193.110 KiloKey/sec
ComputeKeys() found 26 items , CPU check...
GPU/CPU check OK


Code:
~/VanitySearch$ /usr/local/cuda-8.0/bin/cuda-memcheck --tool memcheck VanitySearch -g 1 -check
========= CUDA-MEMCHECK
GetBase10() Results OK
Add() Results OK : 109.890 MegaAdd/sec
Mult() Results OK : 10.695 MegaMult/sec
Div() Results OK : 1.818 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() Results OK : 130.572 KiloInv/sec
IntGroup.ModInv() Results OK : 2.182 MegaInv/sec
ModMulK1() Results OK : 3.602 MegaMult/sec
ModMulK1order() Results OK : 1.684 MegaMult/sec
ModSqrt() Results OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(1x128)
Seed: 781110
15.061 KiloKey/sec
ComputeKeys() found 26 items , CPU check...
GPU/CPU check OK
========= ERROR SUMMARY: 0 errors

Code:
~/VanitySearch$ /usr/local/cuda-8.0/bin/cuda-memcheck --tool memcheck VanitySearch -g 32 -check
========= CUDA-MEMCHECK
GetBase10() Results OK
Add() Results OK : 80.000 MegaAdd/sec
Mult() Results OK : 10.030 MegaMult/sec
Div() Results OK : 1.883 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() Results OK : 130.924 KiloInv/sec
IntGroup.ModInv() Results OK : 2.221 MegaInv/sec
ModMulK1() Results OK : 3.659 MegaMult/sec
ModMulK1order() Results OK : 1.704 MegaMult/sec
ModSqrt() Results OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(32x128)
Seed: 639838
59.308 KiloKey/sec
ComputeKeys() found 721 items , CPU check...
GPU/CPU check OK
========= ERROR SUMMARY: 0 errors
sr. member
Activity: 462
Merit: 701
I committed a new Makefile with a debug option.

Code:
make clean
make gpu=1 debug=1 all

In debug mode, no inlining is done.

Obviously, it is much slower, so launch:

Code:
pons@linpons:~/VanitySearch$ ./VanitySearch -g 1 -check
sr. member
Activity: 462
Merit: 701
Could you try this:
Code:
pons@linpons:~/VanitySearch$ /usr/local/cuda/bin/cuda-memcheck --tool memcheck VanitySearch -g 1 -check

On my Linux machine it does not work (the hardware is too old), but on Windows it ends like this.

Code:
C:\C++\VanitySearch\x64\ReleaseSM30>cuda-memcheck --tool memcheck VanitySearch.exe -g 1 -check
...
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 GeForce GTX 645 (3x192 cores) Grid(1x128)
Endianness: Little
Seed: 1006346800
401.220 KiloKey/sec
ComputeKeys() found 46 items , CPU check...
GPU/CPU check OK
========= ERROR SUMMARY: 0 errors
legendary
Activity: 1948
Merit: 2097
Just as an experiment:
Try reducing the number of threads per block from 128 to 64.
If that works, double the number of blocks per grid using -g.

GPUEngine.h:28

Code:
#define NB_TRHEAD_PER_GROUP 64

There is a typo in the code ;)

The errors remain.
sr. member
Activity: 462
Merit: 701
Just as an experiment:
Try reducing the number of threads per block from 128 to 64.
If that works, double the number of blocks per grid using -g.

GPUEngine.h:28

Code:
#define NB_TRHEAD_PER_GROUP 64

There is a typo in the code ;)
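
For anyone following along, the idea behind that suggestion is that the total number of GPU threads is blocks per grid times threads per block, so halving the block size while doubling -g keeps the amount of parallel work the same. A minimal C++ sketch of that arithmetic; the Grid struct and the helper names are made up for illustration and are not part of VanitySearch:

```cpp
#include <cstdint>

// Hypothetical helper type: a launch configuration as printed in the logs,
// e.g. "Grid(80x128)" means 80 blocks of 128 threads each.
struct Grid {
  uint32_t blocks;
  uint32_t threads_per_block;
};

// Total number of GPU threads launched for a given grid.
constexpr uint32_t total_threads(Grid g) {
  return g.blocks * g.threads_per_block;
}

// Halve the block size and double the block count, so that the total
// thread count stays the same (assumes an even threads_per_block).
constexpr Grid halve_block_size(Grid g) {
  return Grid{g.blocks * 2, g.threads_per_block / 2};
}
```

For example, Grid(80x128) from the GTX 1060 6GB log is 10240 threads, and halve_block_size gives 160x64 with the same total.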
sr. member
Activity: 462
Merit: 701
OK, that confirms what I was thinking.
It seems that this code is now near the limit of what CUDA (or nvcc) can do.
Maybe CUDA SDK 10 can help.
I'll try (for other users as well) to make things work with CUDA 10 under Linux.
I'll also try to reduce the code size.
legendary
Activity: 1948
Merit: 2097

I would try:
 __noinline__ _ModMult,
__noinline__  ModNeg256


Problem solved!!!

Code:
__device__ __noinline__ void ModNeg256(uint64_t *r, uint64_t *a) {
__device__ __noinline__ void ModNeg256(uint64_t *r) {
__device__ __noinline__ void ModSub256(uint64_t *r, uint64_t *a, uint64_t *b) {
__device__ __noinline__ void ModAdd256(uint64_t *r, uint64_t *b) {
__device__ __noinline__ void ModSub256(uint64_t *r, uint64_t *b) {
__device__ __noinline__ void _ModMult(uint64_t *r, uint64_t *a, uint64_t *b) {
__device__ __noinline__ void _ModMult(uint64_t *r, uint64_t *a) {

Code:
./VanitySearch -g 16 -check
GetBase10() Results OK
Add() Results OK : 256.410 MegaAdd/sec
Mult() Results OK : 21.186 MegaMult/sec
Div() Results OK : 4.785 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() Results OK : 327.826 KiloInv/sec
IntGroup.ModInv() Results OK : 8.977 MegaInv/sec
ModMulK1() Results OK : 12.876 MegaMult/sec
ModMulK1order() Results OK : 6.280 MegaMult/sec
ModSqrt() Results OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(16x128)
Endianness: Little
Seed: 120744
85.474 MegaKey/sec
ComputeKeys() found 394 items , CPU check...
Expected item not found fefea433 b7c86941 0c9e4746 90f5216a 5c48b7db (thread=1534, incr=510, endo=1)
CPU found 395 items
GPU: point   correct [70/70]
GPU: endo #1 correct [75/76]
GPU: endo #2 correct [69/69]
GPU: sym/point   correct [58/58]
GPU: sym/endo #1 correct [58/58]
GPU: sym/endo #2 correct [64/64]

The speed is now about 145 Mkey/s vs 162 Mkey/s.

Thanks!!!

EDIT:

I think I have to add some more __noinline__...

Code:
ComputeKeys() found 380 items , CPU check...
Expected item not found 3412f1c5 0b1d320d 010f9de1 08deea41 d42a2b22 (thread=1070, incr=-514, endo=0)
Expected item not found fefe61da f2af1a6e c20ea91b 56ebc050 be432b01 (thread=1922, incr=511, endo=1)
CPU found 378 items
GPU: point   correct [67/67]
GPU: endo #1 correct [54/55]
GPU: endo #2 correct [56/56]
GPU: sym/point   correct [67/68]
GPU: sym/endo #1 correct [69/69]
GPU: sym/endo #2 correct [63/63]
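
A small helper like the one below can tally these mismatch lines by category. It is only a sketch: the Category enum and classify() are hypothetical names, and it assumes that a negative incr marks the symmetric (negated y) check, which is what the two "Expected item not found" lines above and the sym/point and endo #1 counters suggest.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Category names follow the "GPU: ... correct [a/b]" summary lines in the log.
enum class Category { Point, Endo1, Endo2, SymPoint, SymEndo1, SymEndo2, Unknown };

// Classify one "Expected item not found ... (thread=T, incr=I, endo=E)" line.
// Assumption (inferred from the log, not from documentation): incr < 0 means
// the mismatch came from the symmetric check, endo is 0, 1 or 2.
Category classify(const std::string& line) {
  std::size_t pos = line.find("(thread=");
  if (pos == std::string::npos) return Category::Unknown;

  int thread = 0, incr = 0, endo = 0;
  int parsed = std::sscanf(line.c_str() + pos,
                           "(thread=%d, incr=%d, endo=%d)",
                           &thread, &incr, &endo);
  if (parsed != 3 || endo < 0 || endo > 2) return Category::Unknown;

  static const Category table[2][3] = {
      {Category::Point, Category::Endo1, Category::Endo2},
      {Category::SymPoint, Category::SymEndo1, Category::SymEndo2}};
  return table[incr < 0 ? 1 : 0][endo];
}
```

Fed the two lines above, it reports one sym/point and one endo #1 mismatch, matching the [67/68] and [54/55] counters.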
sr. member
Activity: 462
Merit: 701
After the mark, the calculations are 50% wrong.
On my two configurations, everything works fine.
It really looks like the weird problem I had last time.

_GetHash160Comp is OK; it is also tested on its own by the check function.
_ModMult is heavily used during the ECC calculation.
CHECK_POINT() works 100% in the first case.

I would try:
 __noinline__ _ModMult,
__noinline__  ModNeg256
Remove the whole lookup32 test in CheckPoint() (not used here)

I will add more info...


Code:
__device__ __noinline__ void CheckHashComp(prefix_t *prefix, uint64_t *px, uint64_t *py,
  int32_t incr, uint32_t tid, uint32_t *lookup32, uint32_t *out) {

  uint32_t   h[20];
  uint64_t   pe1x[4];
  uint64_t   pe2x[4];

  _GetHash160Comp(px, py, (uint8_t *)h);
  CHECK_POINT(h, incr, 0);    // <-- 100% OK up to here, which means that (px,py) is good
  _ModMult(pe1x, px, _beta);
  _GetHash160Comp(pe1x, py, (uint8_t *)h); // <-- 50% wrong from here
  CHECK_POINT(h, incr, 1);
  _ModMult(pe2x, px, _beta2);
  _GetHash160Comp(pe2x, py, (uint8_t *)h);
  CHECK_POINT(h, incr, 2);

  ModNeg256(py);

  _GetHash160Comp(px, py, (uint8_t *)h);
  CHECK_POINT(h, -incr, 0);
  _GetHash160Comp(pe1x, py, (uint8_t *)h);
  CHECK_POINT(h, -incr, 1);
  _GetHash160Comp(pe2x, py, (uint8_t *)h);
  CHECK_POINT(h, -incr, 2);

}
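
As background for why one point yields six hash checks here: on a curve of the form y^2 = x^3 + 7, multiplying x by a cube root of unity beta (mod p) maps a curve point to another curve point, and negating y does too. Below is a toy C++ sketch over the tiny field F_31 instead of the real 256-bit secp256k1 field; P, BETA, BETA2, Point, is_on_curve() and six_candidates() are illustrative names and values of mine, not VanitySearch code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <utility>

// Toy parameters (assumptions for illustration, NOT secp256k1):
// the curve y^2 = x^3 + 7 over the small prime field F_31.
constexpr uint32_t P = 31;      // toy field prime, P % 3 == 1
constexpr uint32_t BETA = 5;    // cube root of unity: 5^3 = 125 = 1 (mod 31)
constexpr uint32_t BETA2 = 25;  // BETA^2 mod 31

using Point = std::pair<uint32_t, uint32_t>;  // affine (x, y)

// True if (x, y) satisfies y^2 = x^3 + 7 (mod P).
bool is_on_curve(const Point& pt) {
  uint32_t x = pt.first % P, y = pt.second % P;
  return (y * y) % P == ((x * x % P) * x + 7) % P;
}

// From one curve point, derive the six candidates that get hashed:
// the point, its two endomorphism images, and the y-negated version of each.
std::array<Point, 6> six_candidates(const Point& pt) {
  assert(is_on_curve(pt));
  uint32_t x = pt.first, y = pt.second;
  uint32_t e1 = (x * BETA) % P;   // mirrors _ModMult(pe1x, px, _beta)
  uint32_t e2 = (x * BETA2) % P;  // mirrors _ModMult(pe2x, px, _beta2)
  uint32_t ny = (P - y) % P;      // mirrors ModNeg256(py)
  return std::array<Point, 6>{Point{x, y},  Point{e1, y},  Point{e2, y},
                              Point{x, ny}, Point{e1, ny}, Point{e2, ny}};
}
```

Starting from (1, 15), all six candidates satisfy the curve equation, which is why a wrong _ModMult or ModNeg256 result would break only the endo and sym checks while the plain point check stays at 100%.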
sr. member
Activity: 462
Merit: 701
OK, thanks, I will investigate.
That's quite strange: the point seems OK, but all the sym and endo results are sometimes right and sometimes wrong.
Funny bug :)
legendary
Activity: 1948
Merit: 2097
Are the results consistent?
I mean, for instance, is GPU: point   correct [252/252] always at 100%?


I just tried a couple more times and yes, the results seem to be consistent:

Code:
CPU found 1591 items
GPU: point   correct [269/269]
GPU: endo #1 correct [176/268]
GPU: endo #2 correct [159/271]
GPU: sym/point   correct [124/240]
GPU: sym/endo #1 correct [186/280]
GPU: sym/endo #2 correct [150/263]


CPU found 1529 items
GPU: point   correct [266/266]
GPU: endo #1 correct [131/204]
GPU: endo #2 correct [167/277]
GPU: sym/point   correct [117/251]
GPU: sym/endo #1 correct [159/249]
GPU: sym/endo #2 correct [165/282]

With the -g option:
Code:
./VanitySearch -g 32 -check

.......
Expected item not found fefebfce 53aa8cce 65882d23 4011c288 eb720401 (thread=4072, incr=714, endo=2)
Expected item not found fefee494 9ce9c032 3ba7fa2d f43b9cfe deeb468e (thread=4075, incr=-469, endo=0)
Expected item not found fefe11e2 3c549d6b 110a42e1 d3f22532 4439072e (thread=4092, incr=-888, endo=0)
CPU found 751 items
GPU: point   correct [139/139]
GPU: endo #1 correct [80/130]
GPU: endo #2 correct [58/117]
GPU: sym/point   correct [52/118]
GPU: sym/endo #1 correct [89/131]
GPU: sym/endo #2 correct [68/116]