Topic: VanitySearch (Yet another address prefix finder) - page 57. (Read 32072 times)

sr. member
Activity: 462
Merit: 701
Yes, 11111 is quite difficult (leading 0).
Maybe there is also a bug with this particular case. I'll check that.

I restored the old group size.
Could you update your git repo and try again?
Many thanks for your help.
legendary
Activity: 1932
Merit: 2077
Same error again...

Very strange, look at the time:  5.00779e+06y to get the 50% ...
Code:
:~/VanitySearch$ ./VanitySearch -stop  -t 7 -gpu  11111
Start Sat Mar 16 15:26:49 2019
Difficulty: 33822000858357062172672
Search: 11111
Base Key:F2A543066C493024B85715EFFE2486460C80009A931F6D8B1614FC20573DC40B
Number of CPU thread: 7
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(64x128)
148.447 MK/s (GPU 130.846 MK/s) (2^30.47) [P 0.00%][50.00% in 5.00779e+06y][0]
Pub Addr: 111113auryoueJApjac3VvMcmPcL1W5Co
Prv Addr: 5KKbZb2q6ds9MKJ7RP5pdx7bFmazSU7arhkBetsUYBEJyNKHRQj
Prv Key : 0xC63F8239D2073D242F6263099770C17426D95822647B78C56E5A72AC6542C113
Check   : 1MRXFX8Vmw6n4GLSWVr6Jp8WGjLm74Cuj4
Check   : 1DJ7SW3Tp6wkFvUGhAfLNcrCy5pNSQTEtp (comp)
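
As a sanity check on the estimate printed above, the sketch below recomputes it in Python. It assumes the usual model in which each key matches independently with probability 1/Difficulty, so that P = 1 - exp(-keys/Difficulty). The function names and the 365-day year are my own assumptions, not taken from the VanitySearch source. With the difficulty and key rate shown, it gives about 5.0e+06 years for 50%, so the displayed time is consistent with the displayed difficulty, which suggests the odd part is the immediate hit rather than the time estimate.

```python
import math

def probability(keys_tested, difficulty):
    """Chance of at least one match after keys_tested tries (exponential model)."""
    return 1.0 - math.exp(-keys_tested / difficulty)

def seconds_to_probability(target, difficulty, rate, keys_tested=0.0):
    """Remaining seconds until the cumulative probability reaches target."""
    keys_needed = -math.log(1.0 - target) * difficulty
    return max(keys_needed - keys_tested, 0.0) / rate

# Values copied from the run above.
difficulty = 33822000858357062172672
rate = 148.447e6  # keys per second (148.447 MK/s)

years = seconds_to_probability(0.5, difficulty, rate) / (365 * 86400)
print(f"50% after about {years:.5e} years")
```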
sr. member
Activity: 462
Merit: 701
Ok

Could you also try adding __noinline__ to the 2 _ModMult functions in GPU/GPUEngine.cu:

Code:
Line 510: __device__ __noinline__ void _ModMult(uint64_t *r, uint64_t *a, uint64_t *b) {
Line 560: __device__ __noinline__ void _ModMult(uint64_t *r, uint64_t *a) {

It seems that I have reached a limit with CUDA...
I had a similar problem with the last release...
And no warning at all!
legendary
Activity: 1932
Merit: 2077
Same error:

Code:
/usr/local/cuda-8.0/bin/nvcc -maxrregcount=0 --ptxas-options=-v --compile --compiler-options -fPIC -ccbin g++ -m64 -O2 -I/usr/local/cuda-8.0/include -gencode=arch=compute_50,code=sm_50 -o obj/GPU/GPUEngine.o -c GPU/GPUEngine.cu
ptxas info    : 0 bytes gmem, 33320 bytes cmem[3]
ptxas info    : Compiling entry function '_Z9comp_keysjPtPjPmS0_' for 'sm_50'
ptxas info    : Function properties for _Z9comp_keysjPtPjPmS0_
    32936 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 126 registers, 360 bytes cmem[0], 116 bytes cmem[2]
ptxas info    : Function properties for _Z10CheckPointPjiiPtjS_S_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z11_GetHash160PmS_Ph
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z13CheckHashCompPtPmS0_ijPjS1_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z15CheckHashUncompPtPmS0_ijPjS1_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z15_GetHash160CompPmS_Ph
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z7_ModInvPm
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z9CheckHashjPtPmS0_ijPjS1_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
Making VanitySearch...
g++ obj/Base58.o obj/IntGroup.o obj/main.o obj/Random.o obj/Timer.o obj/Int.o obj/IntMod.o obj/Point.o obj/SECP256K1.o obj/Vanity.o obj/GPU/GPUGenerate.o obj/hash/ripemd160.o obj/hash/sha256.o obj/hash/sha512.o obj/hash/ripemd160_sse.o obj/hash/sha256_sse.o obj/GPU/GPUEngine.o -lpthread -L/usr/local/cuda-8.0/lib64 -lcudart -o VanitySearch
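
To compare the stack frames reported by ptxas with the stack limit, the verbose output can be scanned with a few lines of Python. This is only a rough sketch: the sample text is copied from the log above, STACK_LIMIT is the 49152 value discussed in this thread, and the function name is hypothetical. The stack frame reported by ptxas is a compile-time figure for each function, so fitting under the limit does not by itself prove that the kernel is correct.

```python
import re

# Two function entries copied from the ptxas log above.
PTXAS_SAMPLE = """\
ptxas info    : Function properties for _Z9comp_keysjPtPjPmS0_
    32936 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z7_ModInvPm
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
"""

STACK_LIMIT = 49152  # bytes, the value passed to cudaDeviceSetLimit in this thread

def stack_frames(ptxas_output):
    """Map each mangled function name to its reported stack frame size in bytes."""
    frames = {}
    current = None
    for line in ptxas_output.splitlines():
        match = re.search(r"Function properties for (\S+)", line)
        if match:
            current = match.group(1)
            continue
        match = re.search(r"(\d+) bytes stack frame", line)
        if match and current is not None:
            frames[current] = int(match.group(1))
            current = None
    return frames

for name, size in stack_frames(PTXAS_SAMPLE).items():
    status = "fits in" if size <= STACK_LIMIT else "exceeds"
    print(f"{name}: {size} bytes, {status} the {STACK_LIMIT}-byte limit")
```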
sr. member
Activity: 462
Merit: 701
Ok
Could you try editing GPUEngine.cu to change stackSize to 49152 (48K) at line 1371?
I doubled the group size and I missed this.
Does it improve anything?

Code:
  size_t stackSize = 49152;
  err = cudaDeviceSetLimit(cudaLimitStackSize, stackSize);
  if (err != cudaSuccess) {

legendary
Activity: 1932
Merit: 2077
Code:
~$ git clone https://github.com/JeanLucPons/VanitySearch.git
Cloning into 'VanitySearch'...
remote: Enumerating objects: 147, done.
remote: Counting objects: 100% (147/147), done.
remote: Compressing objects: 100% (89/89), done.
remote: Total 472 (delta 90), reused 101 (delta 58), pack-reused 325
Receiving objects: 100% (472/472), 287.95 KiB | 299.00 KiB/s, done.
Resolving deltas: 100% (301/301), done.
~$ cd VanitySearch
~/VanitySearch$ git pull
Already up-to-date.


Code:
~/VanitySearch$ make clean
Cleaning...
~/VanitySearch$ make gpu=1 all
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/Base58.o -c Base58.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/IntGroup.o -c IntGroup.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/main.o -c main.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/Random.o -c Random.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/Timer.o -c Timer.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/Int.o -c Int.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/IntMod.o -c IntMod.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/Point.o -c Point.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/SECP256K1.o -c SECP256K1.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/Vanity.o -c Vanity.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/GPU/GPUGenerate.o -c GPU/GPUGenerate.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/hash/ripemd160.o -c hash/ripemd160.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/hash/sha256.o -c hash/sha256.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/hash/sha512.o -c hash/sha512.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/hash/ripemd160_sse.o -c hash/ripemd160_sse.cpp
g++ -DWITHGPU -m64  -Wno-write-strings -O2 -I. -I/usr/local/cuda-8.0/include -o obj/hash/sha256_sse.o -c hash/sha256_sse.cpp
/usr/local/cuda-8.0/bin/nvcc -maxrregcount=0 --ptxas-options=-v --compile --compiler-options -fPIC -ccbin g++ -m64 -O2 -I/usr/local/cuda-8.0/include -gencode=arch=compute_50,code=sm_50 -o obj/GPU/GPUEngine.o -c GPU/GPUEngine.cu
ptxas info    : 0 bytes gmem, 33320 bytes cmem[3]
ptxas info    : Compiling entry function '_Z9comp_keysjPtPjPmS0_' for 'sm_50'
ptxas info    : Function properties for _Z9comp_keysjPtPjPmS0_
    32936 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 126 registers, 360 bytes cmem[0], 116 bytes cmem[2]
ptxas info    : Function properties for _Z10CheckPointPjiiPtjS_S_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z11_GetHash160PmS_Ph
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z13CheckHashCompPtPmS0_ijPjS1_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z15CheckHashUncompPtPmS0_ijPjS1_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z15_GetHash160CompPmS_Ph
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z7_ModInvPm
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for _Z9CheckHashjPtPmS0_ijPjS1_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
Making VanitySearch...
g++ obj/Base58.o obj/IntGroup.o obj/main.o obj/Random.o obj/Timer.o obj/Int.o obj/IntMod.o obj/Point.o obj/SECP256K1.o obj/Vanity.o obj/GPU/GPUGenerate.o obj/hash/ripemd160.o obj/hash/sha256.o obj/hash/sha512.o obj/hash/ripemd160_sse.o obj/hash/sha256_sse.o obj/GPU/GPUEngine.o -lpthread -L/usr/local/cuda-8.0/lib64 -lcudart -o VanitySearch


Code:
~/VanitySearch$ ./VanitySearch -check
GetBase10() Results OK
Add() Results OK : 158.730 MegaAdd/sec
Mult() Results OK : 25.063 MegaMult/sec
Div() Results OK : 4.566 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() : 337.671 KiloInv/sec
IntGroup.ModInv() : 9.041 MegaInv/sec
ModMulK1() : 12.934 MegaMult/sec
ModSqrt() OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(64x128)
Seed: 543374
107.806 MegaKey/sec
ComputeKeys() found 1499 items , CPU check...
Expected item not found fefe44b8 49f68f1f 096013ff e41e6d50 08263cf2
Expected item not found 3412f45b 16a98f1d bb7ee0f4 0278da41 07cb3ee6
Expected item not found fefe8197 02abccb1 39fcfe5c 7aa3a90a 33963bf2
Expected item not found fefedadb ac585a72 40bd3909 ecb1d075 e20484b9
Expected item not found 341249c5 f84bed21 07fb6121 631ee85d e5ef57c8
Expected item not found fefefa4a 23595a11 852a65fb 50ce146e 4540d8e8
Expected item not found fefe9430 b605f7f4 89981d6d 7b7a5088 a66b1f00
Expected item not found fefe9c16 400e3e2a 48c89b5a 3e52f969 918b7959
Expected item not found fefe5d8a 6d85d7df 48167af2 d4fc3a4d efa938e8
Expected item not found 3412485f e021867b f1397c5f ca679dba 77534406
Expected item not found 3412c6cd aecad837 247d170b f2eb91f4 81651450
Expected item not found 3412f092 9a76aa08 cb118f9e d0a04cc4 d03f89b6
Expected item not found 341299c8 ee36c5d2 17302781 6f96b5ab 71866549
Expected item not found 3412f7f1 1ce0cb01 8f52536b 55c48bbb 2f3b72f2
Expected item not found 3412b56c 3002219b bcce9cde 5e06bb74 5bd598c4
Expected item not found fefe8771 ac005e51 7cb4b3c4 f9a61416 e2e836bf
Expected item not found fefe8615 7ed9f4aa f16a2ceb ad9ffe89 5354f3f9
Expected item not found fefe25e1 afd03d00 c0c479c6 cd713bae b2e5cfa7
Expected item not found fefe334e 61a2a96e 0af5f876 357dab48 f67f214b
Expected item not found fefec35a 2ef5d92a b92dbe26 3c797bc6 5f35b539
Expected item not found fefe9221 f20eb58f 8a6f69c1 785431a7 d0730d12
Expected item not found 3412f913 ab809651 91539b93 232da11a ef41b61c
........
sr. member
Activity: 462
Merit: 701
@arulbero

Is your git clone up to date?
git pull

Did you clean before building?
make clean
and make gpu=1 all

On my config the -check is OK.
It looks like the problem I had last time, when the GPU code was wrongly generated.

mmm....
legendary
Activity: 1932
Merit: 2077
Definitely there is something wrong.


Code:
~/VanitySearch$ ./VanitySearch -stop  -t 7 -gpu  1WantF
Start Sat Mar 16 12:50:16 2019
Difficulty: 15318045009
Search: 1WantF
Base Key:2AC8B932E4C55F53C4EC771105BB1647610917E7BE4DFA0606FD0F12F260CF12
Number of CPU thread: 7
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(64x128)
159.335 MK/s (GPU 144.687 MK/s) (2^35.80) [P 97.97%][99.00% in 00:01:08][0]
Pub Addr: 1WantFvhmcdyuQ8thJfcVCMSQk1CjKpEZ
Prv Addr: 5KHCP5S2MW1vUBaiZz4VTcAMypVMxg6kLy79nKYttFbixnaxVbV
Prv Key : 0xC0CCA06A1AA34E266E561B97342FC6C17054CF0CB9D423D037C1A0C66DC68CC9
Check   : 1M5b4XcRva6QC8XLLFYEWYqmDr7MYNLcfd
Check   : 1CCWKXGcp8T7wvy8n9HJYve4WQYj93Srtx (comp)

If I check this private key with my own software, I can't get "1WantF":
Code:
genadd.py 0xC0CCA06A1AA34E266E561B97342FC6C17054CF0CB9D423D037C1A0C66DC68CC9
Private key : c0cca06a1aa34e266e561b97342fc6c17054cf0cb9d423d037c1a0c66dc68cc9
Public key  : c30ab092497a5e929b48f8da738fd2898ba2b1b40e6a819ae0d6da6a4c64e365 b5252302d125296a23e38997e155c31e3b0c4994c93b39155696d5db85d87ded
 
PrKey WIF u.: 5KHCP5S2MW1vUBaiZz4VTcAMypVMxg6kLy79nKYttFbixnaxVbV
Address u.  : dc409cb063463acaba996650713fca025e843631
Address u.  : 1M5b4XcRva6QC8XLLFYEWYqmDr7MYNLcfd

PrKey WIF c.: L3gVESZSyV54zyhpUarQcSCKbncBWjaNK8e9m2sn1DSGCqWqif8Q
Address c.  : 7ad66f0aa3bacef69b5871c4bbc460db29ca6b4f
Address c.  : 1CCWKXGcp8T7wvy8n9HJYve4WQYj93Srtx
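
The last step of that check (hash160 to address) can be reproduced with the Python standard library alone. This is a sketch of standard Base58Check encoding with version byte 0x00, not the genadd.py script, which is not shown here; the function names are mine. The two hash160 values are the ones printed above.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check(version, payload):
    """Encode version byte + payload + 4-byte double-SHA256 checksum in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # Each leading zero byte is written as a leading '1'.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

def p2pkh_address(hash160_hex):
    """Address for a 20-byte hash160 given as hex (version byte 0x00)."""
    return base58check(0x00, bytes.fromhex(hash160_hex))

uncompressed = p2pkh_address("dc409cb063463acaba996650713fca025e843631")
compressed = p2pkh_address("7ad66f0aa3bacef69b5871c4bbc460db29ca6b4f")
for addr in (uncompressed, compressed):
    print(addr, "matches 1WantF:", addr.startswith("1WantF"))
```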
sr. member
Activity: 462
Merit: 701
The increase of stivensons seems rather normal, but 350% seems strange!
Maybe there is a problem...
Does it work fine?
If you search for the same prefix several times and note each time the percentage at which the prefix is found, the average should be around 50%.
Is that the case?
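
A small simulation shows why the average should be near 50%. It is a sketch under the assumption that the number of keys tested before the first hit is roughly exponentially distributed with mean equal to the difficulty, and that the displayed percentage is P = 1 - exp(-keys/Difficulty); the names are hypothetical.

```python
import math
import random

def percent_at_find(difficulty, rng):
    """Displayed probability at the moment a simulated search finds its prefix."""
    # Keys tested before the first hit, drawn from an exponential
    # distribution with mean = difficulty (approximation of a geometric law).
    keys = rng.expovariate(1.0 / difficulty)
    return 1.0 - math.exp(-keys / difficulty)

rng = random.Random(1)  # fixed seed so the run is repeatable
samples = [percent_at_find(15318045009, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(f"average percentage at find: {100 * mean:.1f}%")
```

Under this model the percentage at find time is uniformly spread between 0% and 100%, so a single run tells little; only the average over many runs is expected to sit near 50%.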
legendary
Activity: 1932
Merit: 2077
From 132 MKey/s to 162 MKey/s (+22%) ;)


EDIT:

But

Code:
./VanitySearch -check
GetBase10() Results OK
Add() Results OK : 123.457 MegaAdd/sec
Mult() Results OK : 24.272 MegaMult/sec
Div() Results OK : 5.495 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() : 340.092 KiloInv/sec
IntGroup.ModInv() : 9.236 MegaInv/sec
ModMulK1() : 13.040 MegaMult/sec
ModSqrt() OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(64x128)
Seed: 238931
144.693 MegaKey/sec
ComputeKeys() found 1560 items , CPU check...
Expected item not found fefe63f6 5ec693df 4cf9dadf 8b133b8f 8ccb0b70
Expected item not found 3412d19f 24a9ac14 6df4d827 b4d00364 ffdc11fa
Expected item not found 3412cd98 63fb6b03 eb3ddc78 3567f880 f98116de
Expected item not found fefe9309 d74017a9 6fc7c59e 4b7a2e63 a5a7abe7
Expected item not found 3412c765 0461b6b2 45570bb9 bbf80b0a 76c044ca
Expected item not found fefe99cf 982b08f5 a1dbde42 3f1f1f7a 2605fc83
Expected item not found 3412af23 b7a2cc9b 23525699 c12e9fff aa796ff1
Expected item not found 34126e76 729dc708 3d230591 1c140750 124f346a
Expected item not found fefe34a8 660725ae 40199cc0 dcf71566 23a797f3
Expected item not found fefee4ef a6247e15 b1717fd0 7a2b9635 1ec512bd
Expected item not found 3412488a 65f18231 c989f62b a3e4cece 6dbadd3e
Expected item not found fefe07b3 49378b5b 906ba7e7 887cc096 46914976
Expected item not found fefebf36 c23ee151 84bf8019 ddbe8489 18e77a6d
Expected item not found fefe85d7 9fa6f63a 7852ffec 3ad06c98 3d69a9c2
Expected item not found 34127f2c 4739d46d c501b09c d8b0d575 9e4877df
Expected item not found 341226ac f6746e8b c914a6a5 59b6947d 2f91d039
Expected item not found fefe1422 7692c225 d825ad31 2c8d22d8 d28cc7a3
Expected item not found fefe36a1 5d81f5c0 2122069d abe39143 d907eb6b
Expected item not found fefec283 cd96406d d29ab56e 0557767e 73144a83
Expected item not found 3412dbb7 ad500fd0 2cf59df0 cf0b6000 ffe47a23
Expected item not found 34128eca 162c602e ecfeacda a765e8a8 2bcbd091
Expected item not found 341207a9 ce5bfe83 2788e24a f29e5787 e909b931
Expected item not found fefe346f 5071aae8 dc717b8a 7c51fb02 cc06327e
Expected item not found fefe2b49 b25a7c52 66a4ddd7 893af5ee d5e659de
Expected item not found 3412ac9d 1910b3ee a9d770c3 ff81c805 048c78cd
Expected item not found fefefa6c 7af23a65 6f2c255f ee59f412 8e08c8ca
Expected item not found 341214d1 f3c70a70 4aa28f70 afed5d8b baa5eccd
Expected item not found 3412b4c3 a97478f6 acf9234d 5c73bd94 854bbcb3
Expected item not found 341271ce 858d15ab 2b19b081 16c158b0 e7beb38d
Expected item not found 34128660 fd689f46 c36fa455 700bdf04 9b660db5
Expected item not found 3412828e 0a9eced3 af87044f e471ce01 6930171b
Expected item not found fefeb5e0 9742288e 1d8c0f13 f7dabe34 4e5a1e03
Expected item not found fefeaf1c d1be1ea2 73d07ebb e1215670 79246fbd
Expected item not found 341296c9 33487919 0635c476 d878fe39 7a1ffe9e
Expected item not found 341293e4 db0eecd3 ec46040d c3332486 df884a7a
Expected item not found 34124652 f63f2749 e6a9d5de a37ce326 2121510f
Expected item not found fefe00b7 3729c679 252fcc5a 7822be32 8c5bcbdf
Expected item not found fefeef83 6a2217d1 a95874a4 b0c1a28f 8e753c1a
..............
member
Activity: 117
Merit: 32
 Shocked Beautiful increase stivensons me compared to the first version on only CPU go 10 times faster!
I’m not going to slow down but still amazing

By comparing the respective vanitygen/ cubitcrack speeds (compressed only) / vanitysearch on my CPU only

Vanitygen : Difficulty: 15318045009
[331.74 Kkey/s][total 2865408][Prob 0.0%][50% in 8.9h]

cubitcrack: GeForce GT 520M  531/1024MB | 1 target 3.18 MKey/s (17,301,504 total) [00:00:03]

vanitysearch: Start Sat Mar 16 12:13:44 2019
Difficulty: 15318045009
Search: 1testr
Base Key:30704D4B3275DE9A2D2F8B9AD01DB22F5694E0A58F0C0A0E21D1B1A110305637
Number of CPU thread: 4
15.261 MK/s (GPU 0.000 MK/s) (2^26.12) [P 0.48%][50.00% in 00:14:34][0]
jr. member
Activity: 82
Merit: 1
Code:
G:\vanitysearch>vanitysearch -o done6.txt -t 0 -gpu -gpuId 0,1,2,3,4,5,6 1C3J4uW
Start Sat Mar 16 17:37:13 2019
Difficulty: 15318045009
Search: 1C3J4uW
Base Key:91DCE68637F7B992C3F0C6927E5DD81121A319C8018CE3CF8703CA9AD27ECD6B
Number of CPU thread: 0
GPU: GPU #6 GeForce GTX 1060 3GB (9x128 cores) Grid(72x128)
GPU: GPU #5 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #4 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #0 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #2 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #3 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
GPU: GPU #1 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
2155.100 MK/s (GPU 2155.100 MK/s) (2^38.99) [P 100.00%][99.00% in 00:00:00][32]
member
Activity: 117
Merit: 32
Wow, I don't know if it's normal, but on my machine the new release gives a speed increase of 350%!

Start Sat Mar 16 11:37:12 2019
Difficulty: 15318045009
Search: 1testr
Base Key:14BD6650FC4CCE72930A6395ABCB9B716C7986E0F331C5DB14A7A8BB940B4AA0
Number of CPU thread: 4
15.273 MK/s (GPU 0.000 MK/s) (2^26.46) [P 0.60%][50.00% in 00:11:29][0]
sr. member
Activity: 462
Merit: 701
Hello,

The new release 1.8 is available for download.
~20% global speed increase on my hardware.
Thanks for testing and reporting issues ;)

https://github.com/JeanLucPons/VanitySearch/releases
legendary
Activity: 1932
Merit: 2077
A 20% speed increase on my cpu too, from 13.7 to 16.4 MKeys/s.

On GPU from 130 to 132 MKeys/s.
member
Activity: 117
Merit: 32
OK, actually I thought it was weird too :D
sr. member
Activity: 462
Merit: 701
I haven't published the release as an executable download yet; if you want to test it, you have to clone the git repository and compile it yourself.
The new release is coming, I'm currently working on the GPU code ;)
member
Activity: 117
Merit: 32
Versions 1.2 to 1.3 ran at about 1.8 MK/s on my old CPU.
Version 1.6 was already a nice increase in speed, a nice optimization, Jean_Luc!

Tested, Jean_Luc, nice: on my old gear the CPU improvement is 65%. For the GPU, I cannot enjoy it for the moment (CUDA 8.0).
Start Wed Mar  6 15:55:46 2019
Search: 1testr
Difficulty: 15318045009
Base Key:5D48B5A686EF3CCD828F2B23DBD365564D4193F3DC5EA98EB696641F8C8CFC17
Number of CPU thread: 4
3.016 MK/s (GPU 0.000 MK/s) (2^28.15) [P 1.92%][50.00% in 00:57:02][0]

With version 1.7 the speed increase on my old hardware is still impressive: +30%.
Result for version 1.7:

Start Mon Mar 11 13:38:57 2019
Difficulty: 15318045009
Search: 1testr
Base Key:EF61AC731BD4EAA239646EC88F3F3538D39BBA9B2A8C580276CB9AFAE849ECFE
Number of CPU thread: 4
4.395 MK/s (GPU 0.000 MK/s) (2^26.09) [P 0.47%][50.00% in 00:45:09][0]

The i option works very well for me... I dream of compatibility with CUDA 8.0 so I can use my old GT520M GPU :D

Start Mon Mar 11 13:53:03 2019
Ignoring prefix "1CPuID" (0, I, O and l not allowed)
Search: 3 prefixes (Lookup size 3)
Base Key:C24307039526A5A5EA9DA60EB6C67A3E9F60BC32BA44E8337171A53751AA3A12
Number of CPU thread: 4
4.192 MK/s (GPU 0.000 MK/s) (2^26.91) [P 37.96%][50.00% in 00:00:14][0]

On the other hand, I do not know what I'm doing wrong: it only records the results for the first pattern.
Good job, Jean_Luc :)



Good job, Jean_Luc,
but this time I don't get a speedup on my equipment; it's the same.

Start Fri Mar 15 12:03:33 2019
Difficulty: 15318045009
Search: 1testr
Base Key:EABBFA78AB6FB34A0D274DF6A909167A0CC8A231DE815525743C84097A632B86
Number of CPU thread: 4
4.290 MK/s (GPU 0.000 MK/s) (2^25.94) [P 0.42%][50.00% in 00:44:45][0]
sr. member
Activity: 462
Merit: 701
Hello,

I finished the implementation of the endomorphisms and their symmetries (CPU only).
The code is committed to GitHub for those who want to test.
On my hardware, I observe a ~20% speed increase (compressed addresses); the hash functions (SSE) now take 76% of the CPU.
GPU implementation is coming...

Many thanks again to arulbero for these precious tips concerning symmetries, and to all of you for helping to make this software better ;)
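
For readers who want to see what the endomorphism and the symmetry give, here is a minimal Python sketch on secp256k1. It is not the VanitySearch code: the affine arithmetic is deliberately naive, and the constants beta and lam are derived at run time as nontrivial cube roots of unity modulo p and n rather than copied from any source. For a key k with point (x, y), it lists six key/point pairs, using (beta*x, y) = lam*(x, y) and (x, -y) = (n - k)*G, so one point computation yields several candidates to hash.

```python
# secp256k1 parameters: field prime, group order, generator point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Affine point addition on y^2 = x^3 + 7; None is the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Plain double-and-add scalar multiplication (slow, for illustration)."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def cube_root_of_unity(m):
    """A nontrivial cube root of 1 modulo a prime m with m % 3 == 1."""
    g = 2
    while True:
        r = pow(g, (m - 1) // 3, m)
        if r != 1:
            return r
        g += 1

beta = cube_root_of_unity(P)
lam = cube_root_of_unity(N)
# Pick the root modulo p that pairs with lam, so lam*(x, y) == (beta*x, y).
if mul(lam, G)[0] != beta * G[0] % P:
    beta = beta * beta % P

def six_candidates(k):
    """Six (private key, point) pairs obtained from a single point k*G."""
    x, y = mul(k, G)
    out = []
    for i in range(3):
        ki = k * pow(lam, i, N) % N
        xi = x * pow(beta, i, P) % P
        out.append((ki, (xi, y)))          # endomorphism applied i times
        out.append((N - ki, (xi, P - y)))  # its symmetric (negated) point
    return out

for key, (x, y) in six_candidates(12345):
    print(hex(key)[:18], hex(x)[:18])
```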
sr. member
Activity: 462
Merit: 701
It seems to me that ModInv should be much lower than 50% of ModMulK1. Are you sure you wouldn't gain a significant advantage from using more than 256 elements for each batch? Why not try with 1024 or 4096?

Maybe there is a confusion between IntGroup::ModInv (256*3 ModMulK1) and Int::ModInv (the true ModInv).
Look only at the column on the right (Self CPU).

ModInv takes ~2% (using compressed addresses), so if I double the group size I can expect a ~1% speed increase for the CPU release. I did the test on 1 core and, as expected, the key rate goes from 3.4 MKey/s to 3.44 MKey/s. Of course, for other applications where you do not need to hash, you can expect a more significant speed increase.
I attach a new CPU profile with SSE disabled (-nosse option) and using compressed addresses; this profile should be close enough to the GPU profile, since there is no SIMD instruction on the GPU to speed up hash functions.



Here ModInv falls to 1%.

For VanitySearch, having a smaller group size is better (this is one reason why I worked a lot on this DRS62 ModInv implementation). I can double the size of the group (I will definitely do it) but not more. The GPU kernel processes one group per thread and sends the hash160 back to the CPU. If the group size is too large, memory transfer and allocation become a problem. Divide and rule ;)
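
The "256*3 ModMulK1" cost matches the usual batch inversion trick (often attributed to Montgomery): one real modular inversion plus about three multiplications per element. The sketch below assumes that this is what IntGroup::ModInv does, which is not shown in this thread; the function name and the use of Python's built-in pow for the single inversion are illustrative only.

```python
P = 2**256 - 2**32 - 977  # secp256k1 field prime

def batch_inverse(values, p=P):
    """Invert every element modulo p using a single modular inversion."""
    n = len(values)
    # prefix[i] holds the product of values[0..i-1]; prefix[0] is 1.
    prefix = [1] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] * v % p
    # The only true inversion: inverse of the product of all elements.
    inv = pow(prefix[n], -1, p)
    out = [0] * n
    for i in range(n - 1, -1, -1):
        out[i] = prefix[i] * inv % p  # inverse of values[i]
        inv = inv * values[i] % p     # now the inverse of prefix[i]
    return out

group = list(range(1, 257))  # a group of 256 nonzero field elements
inverses = batch_inverse(group)
print(all(v * w % P == 1 for v, w in zip(group, inverses)))
```

With this scheme, doubling the group size halves the share of the single true inversion per key, which fits the estimate above: ModInv at about 2% of the time leaves only about 1% to gain.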

It's amazing how much progress is being made on this software so quickly.  Great work!

Thanks ;)