Vanitygen: Vanity bitcoin address generator/miner [v0.22] - page 179.

Jonathan Ryan Owens

donator

Activity: 392

Merit: 252

Quote from: defxor on August 09, 2011, 04:31:58 AM

Quote from: Jonathan Ryan Owens on August 09, 2011, 04:09:50 AM

1. compiling and running oclvanitygen on one of my rigs

Linux? You should be fine just running

Code:

make vanitygen oclvanitygen
./oclvanitygen -d0 -i 1bounty

(play with the options for optimum performance)

Quote

2. talking me through options for importing private keys, that I might pass on to my users.

See the pywallet thread - https://bitcointalksearch.org/topic/pywallet-22-manage-your-wallet-update-required-34028

Code:

$ ./pywallet.py --importprivkey=5privatekeystuffyougetfromvanitygen

It's much simpler than you might think, just try it. Give yourself the reward when you've succeeded ;)

git clone https://github.com/samr7/vanitygen.git

defxor

hero member

Activity: 530

Merit: 500

Quote from: Jonathan Ryan Owens on August 09, 2011, 04:09:50 AM

1. compiling and running oclvanitygen on one of my rigs

Linux? You should be fine just running

Code:

make vanitygen oclvanitygen
./oclvanitygen -d0 -i 1bounty

(play with the options for optimum performance)

Quote

2. talking me through options for importing private keys, that I might pass on to my users.

See the pywallet thread - https://bitcointalksearch.org/topic/pywallet-22-manage-your-wallet-update-required-34028

Code:

$ ./pywallet.py --importprivkey=5privatekeystuffyougetfromvanitygen

It's much simpler than you might think, just try it. Give yourself the reward when you've succeeded ;)

Jonathan Ryan Owens

donator

Activity: 392

Merit: 252

I find this all very interesting. How long would it take to find a 5 - 6 character pattern match and/or regex on a rig that does 1 - 1.1 GH/s?

Furthermore, which one of you can compile this to run on an up to date linuxcoin rig?

Bounty.. I'd like to pregenerate receiving addresses for my server, bitcoinduit.com. I'm adding a bounty feature.

I'd like to be able to advertise 1bounty or 1BoUNTy. I want to pregenerate them when a game starts, and add them to rounds once they've reached a particular threshold. The private key will be part of the game, though I haven't completely decided how yet. I do know that the "jackpot / bounty" will be paid to the 1bounty addresses, and the private key will be awarded. There's no really convenient way to import private keys, but I have to imagine that there will be more options soon, and in the meantime, if they want their bitcoin, they'll probably have to learn some command line fu?

The part I'll need help with is this:

1. compiling and running oclvanitygen on one of my rigs
2. talking me through options for importing private keys, that I might pass on to my users.

I appreciate the help, and yours will be rewarded.

Regards,
Jonathan

samr7

full member

Activity: 140

Merit: 430

Firstbits: 1samr7

Quote from: Rassah on August 09, 2011, 12:35:47 AM

Safe mode does the same thing (never finds anything)
Running oclvanitygen -d0 1 finds something instantly
Running oclvanitygen -d0 -vkr -w128 "^1B" returns
...
followed by more GPU idles. Not sure if that CPU matcher is right. Maybe that -r isn't working for me?

Ah! Perhaps it's repeating the same address indefinitely, or maybe for all results of a particular batch. Try:

-d0 -vkr -w128 "^1"

Rassah

legendary

Activity: 1680

Merit: 1035

Safe mode does the same thing (never finds anything)
Running oclvanitygen -d0 1 finds something instantly
Running oclvanitygen -d0 -vkr -w128 "^1B" returns

Quote

Device: Cypress
Vendor: Advanced Micro Devices, Inc. (1002)
Driver: CAL 1.4.1417 (VM)
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP-SDK-v2.4 (650.9)
Max compute units: 14
Max workgroup size: 256
Global memory: 838860800
Max allocation: 209715200
OpenCL compiler flags: -DDEEP_PREPROC_UNROLL -DVERY_EXPENSIVE_BRANCHES -DDEEP_VLIW -DAMD_BFI_INT
Loading kernel binary ce56d53554537b697380a368231a1646.oclbin
Grid size: 896x512
Modular inverse: 3584 threads, 128 ops each
WARNING: Using CPU pattern matcher
GPU idle: 99.34%
GPU idle: 99.75%

followed by more GPU idles. Not sure if that CPU matcher is right. Maybe that -r isn't working for me?

samr7

full member

Activity: 140

Merit: 430

Firstbits: 1samr7

Quote from: Rassah on August 09, 2011, 12:01:15 AM

Though I'm noticing for simple keys, like 1pop for example, vanitygen64 finds a key almost instantly, while oclvanitygen seems to flash estimated time remaining, and then sit there, just counting up the total. What's preventing it from finding the key I wonder? Is it only useful for more difficult keys?

Oh no, it definitely shouldn't be doing this.

Ideally it would have a comprehensive test suite that you could run and validate that the OpenCL kernel is running correctly. Today, there is something less comprehensive. You can try:

oclvanitygen -d0 -vkr -w128 "^1B"

It should scroll a list of addresses fairly quickly. If any of them don't start with 1B, then it's producing incorrect addresses on the GPU. If this test passes, something is wrong with the GPU prefix checker.
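The check described above can be automated once the addresses have been pulled out of the scrolling output. This is only a minimal sketch: the function name is made up for illustration, the sample strings are placeholders rather than real addresses, and extracting addresses from the oclvanitygen output is left out because the exact output format is not shown in this thread.

```python
def addresses_missing_prefix(addresses, prefix="1B"):
    """Return every address that does NOT start with the expected prefix."""
    return [a for a in addresses if not a.startswith(prefix)]


# Placeholder strings for illustration only, not real bitcoin addresses.
sample = ["1Bexample000", "1Bexample111", "1Cexample222"]

bad = addresses_missing_prefix(sample, "1B")
if bad:
    print("GPU produced addresses that do not match:", bad)
else:
    print("All addresses start with the prefix")
```

An empty result would correspond to the test passing, which per the post above points at the GPU prefix checker rather than at address generation.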

Edit: Does it also fail to find any matches to 1Pop in safe mode?

Rassah

legendary

Activity: 1680

Merit: 1035

Quote from: samr7 on August 08, 2011, 02:41:56 PM

New version posted. This includes a lot of changes.

Oclvanitygen #pragma unroll has been removed, for Rassah. If you experienced a crash with a Radeon card, but safe mode (-S) worked, this should fix the problem.

Runs great without crashing now! Thanks!
Though I'm noticing for simple keys, like 1pop for example, vanitygen64 finds a key almost instantly, while oclvanitygen seems to flash estimated time remaining, and then sit there, just counting up the total. What's preventing it from finding the key I wonder? Is it only useful for more difficult keys?
I ran oclvanitygen -d0 -v 1Pop and output is

Quote

Prefix difficulty: 77178 1Pop
Difficulty: 77178
Device: Cypress
Vendor: Advanced Micro Devices, Inc. (1002)
Driver: CAL 1.4.1417 (VM)
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP-SDK-v2.4 (650.9)
Max compute units: 14
Max workgroup size: 256
Global memory: 838860800
Max allocation: 209715200
OpenCL compiler flags: -DDEEP_PREPROC_UNROLL -DVERY_EXPENSIVE_BRANCHES -DDEEP_VLIW -DAMD_BFI_INT
Loading kernel binary ce56d53554537b697380a368231a1646.oclbin
Grid size: 1792x1024
Modular inverse: 3584 threads, 512 ops each
Using OpenCL prefix matcher
GPU idle: 11.30%
GPU idle: 1.78%
GPU idle: 1.26%
[19.42 Mkey/s][total 776208384]
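To put the numbers in that output in perspective, here is a rough sketch of how many matches one would expect after that many keys. It assumes each key tried is an independent 1-in-difficulty chance, which is a simplification and not taken from the vanitygen source; the function names are made up for illustration.

```python
import math


def expected_matches(keys_tried, difficulty):
    """Average number of matches if each key hits with probability 1/difficulty."""
    return keys_tried / difficulty


def chance_of_no_match(keys_tried, difficulty):
    """Approximate probability of zero matches: (1 - 1/D)^n is close to exp(-n/D)."""
    return math.exp(-keys_tried / difficulty)


total = 776208384   # "total" from the output above
difficulty = 77178  # "Difficulty" from the output above

print("expected matches:", expected_matches(total, difficulty))
print("chance of no match:", chance_of_no_match(total, difficulty))
```

Under that assumption, a run with this total and no result is far outside what chance would explain, which fits the suspicion that the matcher is not working.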

samr7

full member

Activity: 140

Merit: 430

Firstbits: 1samr7

Quote from: dinox on August 08, 2011, 05:05:46 PM

One question tho: Why do I get 2*0.9MKeys/s when throwing two threads of oclvanitygen at the same GPU, and only 1.6MKeys/s when running one thread?

Edit: Adding one more thread (for a total of 3) gives me 2M+, but 4 threads only gets about 1.6M.

Interesting!

If you run with -v, does it report high GPU idle numbers?

This could be an issue with poor multithreading in oclvanitygen. Currently the CPU based work is done in one thread, the OpenCL work is dispatched to the GPU by a separate thread, and there is too much synchronization between the two, to the point of adding a lot of turnaround latency to the dispatcher. It's not surprising that multiple instances improve it. But I'm not sure why three instances would show even more improvement, unless Apple is doing something very interesting like allowing simultaneous NDRange jobs to run on the GPU at the same time.

samr7

full member

Activity: 140

Merit: 430

Firstbits: 1samr7

Quote from: ctoon6 on August 08, 2011, 03:18:11 PM

x86: 800-850
x64: 1300-1350

That's an impressive amount of hardware you threw at it! What CPU gets you 1300??

Quote from: jackjack on August 08, 2011, 04:21:30 PM

Maybe you should analyse the base58 key, and print the hex private key if the base58 key is not valid

This is a good idea, running the encoded private key back through the decoder for verification. I'll try making an online assertion test out of it. At the very least, it's going into the offline test suite.
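A minimal sketch of that kind of round-trip check, written in Python rather than as part of vanitygen itself. It assumes the uncompressed private key format (0x80 version byte, 32-byte key, 4-byte double-SHA256 checksum, Base58 encoded); the function names are made up for illustration and this is not the project's test code.

```python
import hashlib

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data):
    """Base58-encode bytes, keeping leading zero bytes as '1' characters."""
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, r = divmod(n, 58)
        out = B58[r] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(text):
    """Base58-decode a string; raises ValueError on an invalid character."""
    n = 0
    for ch in text:
        n = n * 58 + B58.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def checksum(payload):
    """First four bytes of SHA256(SHA256(payload))."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def encode_privkey(key32):
    """Encode a 32-byte private key in the assumed uncompressed format."""
    payload = b"\x80" + key32
    return b58encode(payload + checksum(payload))


def check_privkey(encoded):
    """Return the hex private key if the encoded form is valid, else None."""
    if len(encoded) != 51 or not encoded.startswith("5"):
        return None
    try:
        raw = b58decode(encoded)
    except ValueError:
        return None
    if len(raw) != 37 or raw[0] != 0x80:
        return None
    payload, chk = raw[:-4], raw[-4:]
    return payload[1:].hex() if checksum(payload) == chk else None


key = bytes(range(1, 33))        # arbitrary demo key, never use for funds
encoded = encode_privkey(key)
print(encoded, check_privkey(encoded) == key.hex())
```

The length and leading-5 tests mirror the warning in the release notes; the checksum comparison is what catches a key that was mangled somewhere in the encoder.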

Quote from: defxor on August 08, 2011, 07:45:18 PM

Unfortunately it's not friendly towards my MBP any longer :/ First time I tried oclvanitygen with default settings after having downloaded it I got a segfault after some error output about running out of CL resources. It was repeatable. I then tried to "fix it" by playing around with the parameters. -w256 made no difference, and -w256 -g256x512 caused oclvanitygen (the CL compiler I assume, the only thing that got displayed was the difficulty) to hang my graphics card completely. Had to hard reboot.

After the reboot, even the default parameters hung the computer. Got two error lines but it never reached the segfault. I've now hard rebooted again and sorry to say won't try to get you the debug output ;)

Or is this where I should try the safe switch?

The meaning of -w changed to be a bit more intuitive. It specifies the number of addresses to calculate per hardware thread in a work unit. It constrains the grid size and saves you from having to manually tweak it. Try -w1 and increase in powers of two.

If it still doesn't work, you can get the old behavior with -t256 -g256x512.
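For the "start at -w1 and increase in powers of two" approach, a small sketch that only prints the command lines to try, one at a time, by hand. The helper name and the default upper limit are made up for illustration; the -d0, -v and -w flags and the "1" pattern are the ones used elsewhere in this thread. It deliberately does not launch anything, given that a bad work size hung the machine above.

```python
def w_sweep_commands(max_w=256, pattern="1"):
    """Build oclvanitygen command lines with -w doubling from 1 up to max_w."""
    commands = []
    w = 1
    while w <= max_w:
        commands.append(f"oclvanitygen -d0 -v -w{w} {pattern}")
        w *= 2
    return commands


for cmd in w_sweep_commands():
    print(cmd)
```

Running each line in turn and noting the reported key rate and GPU idle figure should show where larger values stop helping, or where things start to go wrong.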

You make a good point about safe mode. Its only effect right now is to work around issues with optimizations, but given that the work unit problem caused your computer to crash, the work unit problem deserves equal attention and safe mode should also constrain it.

defxor

hero member

Activity: 530

Merit: 500

Quote from: samr7 on August 08, 2011, 02:41:56 PM

New version posted. This includes a lot of changes.

Oclvanitygen work size autoconfiguration now sets more reasonable limits for smaller GPUs.
Numerous performance tweaks to oclvanitygen, including AMD BFI_INT instruction patching.

Unfortunately it's not friendly towards my MBP any longer :/ First time I tried oclvanitygen with default settings after having downloaded it I got a segfault after some error output about running out of CL resources. It was repeatable. I then tried to "fix it" by playing around with the parameters. -w256 made no difference, and -w256 -g256x512 caused oclvanitygen (the CL compiler I assume, the only thing that got displayed was the difficulty) to hang my graphics card completely. Had to hard reboot.

After the reboot, even the default parameters hung the computer. Got two error lines but it never reached the segfault. I've now hard rebooted again and sorry to say won't try to get you the debug output ;)

Or is this where I should try the safe switch?

(0.16 still works as before)

dinox

full member

Activity: 126

Merit: 100

Quote from: samr7 on August 08, 2011, 02:41:56 PM

New version posted. This includes a lot of changes.

Bugfix for private key encoder. It was possible for it to output a shorter-than-expected private key. If you have a private key that does not start with a 5, or is less than 51 characters long, don't use it. If you have such a private key with a bitcoin balance, contact me for assistance.
Oclvanitygen #pragma unroll has been removed, for Rassah. If you experienced a crash with a Radeon card, but safe mode (-S) worked, this should fix the problem.
Oclvanitygen work size autoconfiguration now sets more reasonable limits for smaller GPUs.
Numerous performance tweaks to oclvanitygen, including AMD BFI_INT instruction patching.
Windows binary package now includes x64 build: vanitygen64.exe. This binary runs about 50% faster, if you have a 64-bit edition of Windows.

I never knew just how slow the 32-bit Windows build was until recently, and must apologize for not posting a 64-bit build sooner. 64-bit arithmetic makes a huge difference. The Windows binaries for 0.17 are also built and linked with more MSVC optimizations enabled, and now include statically-linked OpenSSL. Despite all of this, the x64 Windows build seems to run about 10% slower than an equivalent Linux build.

Upgrade to OS X Lion fixed my problems, getting 1.6MKeys/s now with my Radeon 6490M.

One question tho: Why do I get 2*0.9MKeys/s when throwing two threads of oclvanitygen at the same GPU, and only 1.6MKeys/s when running one thread?

Edit: Adding one more thread (for a total of 3) gives me 2M+, but 4 threads only gets about 1.6M.

ctoon6

sr. member

Activity: 350

Merit: 251

6870
gpu 940mhz

23.5-23.7 Mkeys/s

Code:

@echo off
oclvanitygen.exe -ik -p 0 -d 0 -o ocl 1
pause

jackjack

legendary

Activity: 1176

Merit: 1280

May Bitcoin be touched by his Noodly Appendage

Quote from: samr7 on August 08, 2011, 02:41:56 PM

Bugfix for private key encoder. It was possible for it to output a shorter-than-expected private key. If you have a private key that does not start with a 5, or is less than 51 characters long, don't use it. If you have such a private key with a bitcoin balance, contact me for assistance.

Maybe you should analyse the base58 key, and print the hex private key if the base58 key is not valid

ctoon6

sr. member

Activity: 350

Merit: 251

x86: 800-850
x64: 1300-1350

Code:

@echo off
vanitygen64.exe -qik -o file3 1ctoon6
pause

pretty good stuff

are Mkeys/s 1024 based or just 1000

and are hashes the same way, always wondered

1090T
8gigs of ram
windows 7 ultimate x64

samr7

full member

Activity: 140

Merit: 430

Firstbits: 1samr7

New version posted. This includes a lot of changes.

Bugfix for private key encoder. It was possible for it to output a shorter-than-expected private key. If you have a private key that does not start with a 5, or is less than 51 characters long, don't use it. If you have such a private key with a bitcoin balance, contact me for assistance.
Oclvanitygen #pragma unroll has been removed, for Rassah. If you experienced a crash with a Radeon card, but safe mode (-S) worked, this should fix the problem.
Oclvanitygen work size autoconfiguration now sets more reasonable limits for smaller GPUs.
Numerous performance tweaks to oclvanitygen, including AMD BFI_INT instruction patching.
Edit: Meaning of oclvanitygen -w flag has changed: It now specifies number of work items per hardware thread per job, and constrains the grid size. -w1 now specifies a smaller overall work size. The previous effect can be done with -t.
Windows binary package now includes x64 build: vanitygen64.exe. This binary runs about 50% faster, if you have a 64-bit edition of Windows.

I never knew just how slow the 32-bit Windows build was until recently, and must apologize for not posting a 64-bit build sooner. 64-bit arithmetic makes a huge difference. The Windows binaries for 0.17 are also built and linked with more MSVC optimizations enabled, and now include statically-linked OpenSSL. Despite all of this, the x64 Windows build seems to run about 10% slower than an equivalent Linux build.

samr7

full member

Activity: 140

Merit: 430

Firstbits: 1samr7

Quote from: Rassah on August 06, 2011, 12:42:01 PM

It worked with the -S flag. Getting 3.12Mkey/s on my 900Mhz 5830. I hope I'm not losing much speed. Still says [50% in 1.8d] for the address I chose with Difficulty: 888446610538 (better than ~50 days that I was getting with my CPU)

That sounds about right for safe mode. If you manage to get it to run with the optimizations turned on, with a 900 clock, you should be able to get >20Mkey/s out of it.
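As a rough illustration of what the extra speed would buy, here is a sketch of the time to reach a 50% chance of a match at a given key rate. It assumes each key is an independent 1-in-difficulty try, so the number of keys for probability p is about -D * ln(1 - p). This is a simple model with a made-up function name, and it is not meant to reproduce the tool's own on-screen estimate exactly.

```python
import math


def seconds_to_probability(difficulty, keys_per_second, p=0.5):
    """Seconds until the chance of at least one match reaches p."""
    keys_needed = -difficulty * math.log(1.0 - p)
    return keys_needed / keys_per_second


difficulty = 888446610538  # from the post quoted above

for rate in (3.12e6, 20e6):  # safe mode rate vs. the hoped-for >20 Mkey/s
    days = seconds_to_probability(difficulty, rate) / 86400
    print(f"{rate / 1e6:.2f} Mkey/s: 50% chance in about {days:.1f} days")
```

Since the time scales inversely with the key rate, going from about 3 Mkey/s to about 20 Mkey/s cuts the wait by the same factor.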

On the topic of crashing, I managed to get similar behavior to what you described, with a Win7 x64 VM. Obviously not with a GPU. But, attempting to compile any sort of OpenCL kernel with a #pragma unroll would cause a crash. This affected both KernelAnalyzer and oclvanitygen on the CPU device. It initially had Catalyst 11.6 and APP SDK 2.4. Upgrading didn't fix it. I didn't connect it to your problem until a couple days ago when it needed to be reinstalled. Lo and behold, with a clean install of APP SDK 2.5, #pragma unroll no longer causes everything to crash! Now, to get it back to the broken state.

Rassah

legendary

Activity: 1680

Merit: 1035

Quote from: samr7 on August 05, 2011, 12:23:15 AM

Quote from: Rassah on August 04, 2011, 11:06:58 PM

Code:

Driver: CAL 1.4.1417 (VM)

Well that's embarrassing, Radeon 5830 is supposed to be one of the best tested cards! Unfortunately I'm not sure how well tested it is with Catalyst 11.6 on either Windows or Linux, which would appear to be the driver you're using?

I'm going to look into this.

There are two things you can try:

Safe mode (-S flag), which disables loop unrolling optimizations, but also kills performance.
Different versions of the Catalyst driver. My test environment is 11.5 on Linux.

It worked with the -S flag. Getting 3.12Mkey/s on my 900Mhz 5830. I hope I'm not losing much speed. Still says [50% in 1.8d] for the address I chose with Difficulty: 888446610538 (better than ~50 days that I was getting with my CPU)

ctoon6

sr. member

Activity: 350

Merit: 251

Quote from: samr7 on August 06, 2011, 07:58:26 AM

Quote from: ctoon6 on August 06, 2011, 07:26:06 AM

i put the pthreadVC2.lib file in C:\pthreads-win32 from making pthreads

Quote

the vanitygen make file is the original. i am very confused

Your pthreads dir certainly looks complete. Try renaming c:\pthreads-win32 to c:\pthreads-w32.

If you're up for it, send me a /msg on freenode, same nickname.

that worked

now i get

Code:

LINK : fatal error LNK1181: cannot open input file 'C:\pthreads-w32\pthread.lib'

this is probably from me not knowing how to change it to use the other lib file.

samr7

full member

Activity: 140

Merit: 430

Firstbits: 1samr7

Quote from: ctoon6 on August 06, 2011, 07:26:06 AM

i put the pthreadVC2.lib file in C:\pthreads-win32 from making pthreads

Quote

the vanitygen make file is the original. i am very confused

Your pthreads dir certainly looks complete. Try renaming c:\pthreads-win32 to c:\pthreads-w32.

If you're up for it, send me a /msg on freenode, same nickname.

ctoon6

sr. member

Activity: 350

Merit: 251

i put the pthreadVC2.lib file in C:\pthreads-win32 from making pthreads and it complains about

Code:

vanitygen.c(24) : fatal error C1083: Cannot open include file: 'pthread.h': No such file or directory

Code:

 Directory of C:\pthreads-win32
12/19/2006  10:00 PM            15,359 FAQ
08/06/2011  04:46 AM              include
08/06/2011  04:46 AM              lib
12/21/2006  10:00 PM            89,608 libpthreadGC2.a
12/21/2006  10:00 PM            89,614 libpthreadGCE2.a
12/19/2006  10:00 PM                99 MAINTAINERS
12/21/2006  10:00 PM               132 md5.sum
12/21/2006  10:00 PM            38,603 NEWS
12/19/2006  10:00 PM               142 PROGRESS
12/19/2006  10:00 PM            43,162 pthread.h
12/21/2006  10:00 PM            60,273 pthreadGC2.dll
12/21/2006  10:00 PM           112,556 pthreadGCE2.dll
12/21/2006  10:00 PM            86,070 pthreadVC2.dll
08/06/2011  05:18 AM            75,986 pthreadVC2.lib
12/21/2006  10:00 PM            77,879 pthreadVCE2.dll
12/21/2006  10:00 PM            29,400 pthreadVCE2.lib
12/21/2006  10:00 PM            86,071 pthreadVSE2.dll
12/21/2006  10:00 PM            29,400 pthreadVSE2.lib
12/19/2006  10:00 PM             4,844 sched.h
12/19/2006  10:00 PM             4,429 semaphore.h
12/19/2006  10:00 PM             7,904 WinCE-PORT

the vanitygen make file is the original. i am very confused

Topic: Vanitygen: Vanity bitcoin address generator/miner [v0.22] - page 179. (Read 1153876 times)