Author

Topic: Vanitygen: Vanity bitcoin address generator/miner [v0.22] - page 184. (Read 1153743 times)

full member
Activity: 140
Merit: 430
Firstbits: 1samr7
  • nVidia GTS 250 + Core i5 750 @2.67 GHz: 1.54 Mkey/s, 110% CPU / how to measure GPU?

The GTS 250 beats my core i5 by a factor of 3.

That's awesome!!

The GPU idle statistic is a mess right now, but to get it, use "-v".
hero member
Activity: 756
Merit: 502
  • nVidia GTX 285 + C2D 6600: 3.5 Mkey/s, 100% CPU / 90% GPU
  • AMD 5830 + Sempron 140: 5.5 Mkey/s, 100% CPU / 60% GPU


Thank you for your dedicated hard work. Benchmarked this on Ubuntu 9.04 with the latest release-version nVidia driver for 32-bit Linux.

  • nVidia GTS 250 + Core i5 750 @2.67 GHz: 1.54 Mkey/s, 110% CPU / how to measure GPU?

The GTS 250 beats my core i5 by a factor of 3.
hero member
Activity: 695
Merit: 502
PGP: 6EBEBCE1E0507C38
got this without using vanitygen

1YCqF14gcummp4Gj4GGdqXFLn7qAssvju

any fighter pilots making a movie?
legendary
Activity: 1400
Merit: 1005
Nice!  Good to hear of the progress on using video cards for the search...  definitely a vast improvement over using the CPU.  Maybe I'll finally find 1SgtSpike!
full member
Activity: 140
Merit: 430
Firstbits: 1samr7
To report on the current state of oclvanitygen, I've made some progress optimizing it for various GPUs.  So far it has been depending on the CPU to compare addresses with prefix targets.  The performance with higher-end GPUs has progressed to the point that the CPU has become the bottleneck, and the prefix checking will certainly need to be offloaded next.  And don't even think about using regular expressions!

Obligatory performance numbers, searching for a single case-sensitive prefix:

  • nVidia GTX 285 + C2D 6600: 3.5 Mkey/s, 100% CPU / 90% GPU
  • AMD 5830 + Sempron 140: 5.5 Mkey/s, 100% CPU / 60% GPU

I found a really cool address with this thing.  As it would turn out though, somebody already claimed it on 1stbits two days ago. :P  Regardless, 7- and 8-character prefixes are no longer out of reach for a single gaming system.
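A rough back-of-envelope sketch of why 7- and 8-character prefixes become reachable at these rates. It assumes the leading "1" is free and that every further case-sensitive character multiplies the search space by 58 (the size of the Base58 alphabet). That is only an order-of-magnitude approximation: the real difficulty varies with the specific prefix, and the function name here is made up for illustration.

```python
def expected_seconds(prefix_len, keys_per_second):
    """Very rough expected search time for a case-sensitive prefix.

    Assumption (not from vanitygen itself): the first character "1" is
    free and each remaining character costs a factor of 58.
    """
    free_chars = prefix_len - 1
    return 58 ** free_chars / keys_per_second


if __name__ == "__main__":
    # Rates are the two benchmark figures quoted in the post above.
    for rate in (3.5e6, 5.5e6):
        for length in (7, 8):
            hours = expected_seconds(length, rate) / 3600
            print(f"{length}-char prefix at {rate / 1e6} Mkey/s: ~{hours:.1f} h")
```

Treat the printed figures as ballpark estimates only; a prefix can easily be several times easier or harder than this model suggests.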
legendary
Activity: 1400
Merit: 1005
How risky do you guys consider it to be for me to download the executable from the first post and run it on my machine?
It is trivial for someone to make a slightly-modified executable that appeared in all respects to work correctly but such that given the account name, they could deduce the private key. That said, I think it's highly unlikely that anyone has bothered.

You can actually check for such a thing. Unless someone specifically thought that someone would test for this, the most obvious way to do that would leave a tell. Create a specific vanity key, say "1Dag". Then create that same vanity key again. And then do it one more time. If any of your keys are the same, you have a sabotaged binary. If not, then you either don't have a sabotaged binary or the sabotage is very subtle. (The obvious way to sabotage it would be to use a defined pattern of keys rather than a random one.)
Oh, I don't doubt there's ways to subtly hide all sorts of malicious activities.  I'm just talking from the standpoint of samr7's reputation here.  I trust him, based on the work that he has released, the manner in which he released it, and the respect with which he conducts himself on this forum.  Perhaps I am too trusting of people, but I get no indication of a bad apple.  And since he's the one releasing the source code and builds, I also trust those builds.
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
How risky do you guys consider it to be for me to download the executable from the first post and run it on my machine?
It is trivial for someone to make a slightly-modified executable that appeared in all respects to work correctly but such that given the account name, they could deduce the private key. That said, I think it's highly unlikely that anyone has bothered.

You can actually check for such a thing. Unless someone specifically thought that someone would test for this, the most obvious way to do that would leave a tell. Create a specific vanity key, say "1Dag". Then create that same vanity key again. And then do it one more time. If any of your keys are the same, you have a sabotaged binary. If not, then you either don't have a sabotaged binary or the sabotage is very subtle. (The obvious way to sabotage it would be to use a defined pattern of keys rather than a random one.)
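The repeat-and-compare check described above could be sketched like this. It assumes you have already run the generator several times for the same prefix and copied the resulting private keys out by hand; the function name and the sample strings are placeholders, not real keys or real vanitygen output. As the post says, a clean result only rules out the most obvious kind of sabotage.

```python
from collections import Counter


def repeated_keys(private_keys):
    """Return the private keys that occur more than once.

    A non-empty result means the generator produced the same key on
    separate runs, which is the tell described in the post.
    """
    counts = Counter(key.strip() for key in private_keys)
    return sorted(key for key, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Placeholder strings standing in for keys from three separate runs.
    runs = ["key-from-run-1", "key-from-run-2", "key-from-run-1"]
    duplicates = repeated_keys(runs)
    print("suspicious repeats:" if duplicates else "no repeats found", duplicates)
```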
legendary
Activity: 1400
Merit: 1005
How risky do you guys consider it to be for me to download the executable from the first post and run it on my machine?
Considering the work that samr7 has put into optimizing the code (and seeing those optimizations actually help), I would say zero risk.
newbie
Activity: 42
Merit: 0
How risky do you guys consider it to be for me to download the executable from the first post and run it on my machine?

I hope it's not too risky ;)

I have been running it on my Windows machine for a few days with no complaints.
hero member
Activity: 616
Merit: 500
Firstbits.com/1fg4i :)
How risky do you guys consider it to be for me to download the executable from the first post and run it on my machine?
full member
Activity: 140
Merit: 430
Firstbits: 1samr7
Version 0.13 is up.  The only major change is to display hints when impossible address prefixes are entered, suggested by a user via email.  It's not worth downloading if you already have 0.12.

Under the hood, the source tree has been reorganized a bit, and a new OpenCL version, oclvanitygen, is now present.

Regarding the current state of oclvanitygen:
  • It isn't built by default; you need to run: make oclvanitygen.  Build it on Windows at your own peril.
  • It isn't optimized at all.  Specifically, it can't outperform the CPU with AMD hardware, and while it is faster with nVidia hardware, the profiler claims 25% occupancy.
hero member
Activity: 756
Merit: 502
Hmm, these nVidia GPUs might not be particularly fast at those jobs. Last time I implemented a 1024-bit (non-modular) bignum multiplication on an nVidia GTX 260 using CUDA, it turned out to be only as fast as two CPU cores combined.

One possible avenue to explore would be using Intel AVX instructions ("Sandy Bridge"). In principle these would allow performing eight modular multiplications or additions in parallel. However, you really need some up-to-date Intel silicon and an OpenSSL library with support for AVX.

full member
Activity: 140
Merit: 430
Firstbits: 1samr7
I've implemented this in Python with hashlib as the only dependency and yes, it's indeed working. I want to translate this to an OpenCL kernel, but I figured out we have a core problem to solve first.

The problem is that address generation is pipelined, i.e. you can't multi-thread the problem, since the next worker needs the result of the previous worker to start computing. Let's say you have some workers and a base public key. Then worker 1 does public key += base_ec_point, and worker 2 has to wait for this until it can do public key += base_ec_point, and so on... How do you solve this?

The trick is to remember that: pub_key + (N+M)*base_ec_point == pub_key + N*base_ec_point + M*base_ec_point

Parallelism is possible by computing:
  • A row of sequential EC points (pub_key + k*base_ec_point) for k=0,1,2,...N
  • A column of EC points i*N*base_ec_point for i=0,1,2,...
...and then add each possible combination of row and column member in parallel.
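A minimal Python sketch of the row/column idea, for illustration only. It is not the OpenCL kernel discussed here: it uses plain affine point addition on secp256k1, the helper names are invented, and the row runs over k = 0..row_len-1 so that no point is produced twice. Every row + column sum is independent of the others, which is what makes that last step parallelisable.

```python
# secp256k1 field prime and generator G (standard public parameters).
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def ec_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def ec_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def grid_points(pub_key, row_len, col_len):
    """Return grid[i][k] == pub_key + (i*row_len + k)*G."""
    # Row: sequential points pub_key + k*G, built one addition at a time.
    row = [pub_key]
    for _ in range(row_len - 1):
        row.append(ec_add(row[-1], G))
    # Column: multiples of row_len*G, starting from the point at infinity.
    step = ec_mul(row_len, G)
    column = [None]
    for _ in range(col_len - 1):
        column.append(ec_add(column[-1], step))
    # Each sum below depends only on one row and one column entry,
    # so on a GPU every cell could be handled by its own work item.
    return [[ec_add(r, c) for r in row] for c in column]
```

The private key for cell (i, k) is the starting private key plus i*row_len + k.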

The trickiest part of implementing this in OpenCL is the bigint arithmetic.  I'm actually testing a kernel right now.  It's monstrous, easily dwarfing the various miner kernels in compiled size, and only moderately fast -- on my GTX 285, it can produce ~1.1Mkey/sec.
full member
Activity: 126
Merit: 100
I haven't got everything about ECC clear yet, but from what I have understood, you can do

Public_key = some_int*base_EC_point
Priv_key = some_int

Then

New_pub_key = Public_key + another_int*base_EC_point
New_priv_key = priv_key + another_int

Now if you set another_int=1 you only have to do one add each iteration, and if there is a match another add to get the private key. This should speed up the calc a lot if my maths is correct.
You're sort of correct, but samr7 has been doing this since his earliest released version. It's one of the reasons his tool is rather fast :)

I've corrected some stuff in green. The base_ec_point is a parameter that is known to all parties; it's defined in the protocol/cryptology parameters. It's parameter G.
Thanks!

I've implemented this in Python with hashlib as the only dependency and yes, it's indeed working. I want to translate this to an OpenCL kernel, but I figured out we have a core problem to solve first.

The problem is that address generation is pipelined, i.e. you can't multi-thread the problem, since the next worker needs the result of the previous worker to start computing. Let's say you have some workers and a base public key. Then worker 1 does public key += base_ec_point, and worker 2 has to wait for this until it can do public key += base_ec_point, and so on... How do you solve this?
hero member
Activity: 714
Merit: 504
^SEM img of Si wafer edge, scanned 2012-3-12.
I haven't got everything about ECC clear yet, but from what I have understood, you can do

Public_key = some_int*base_EC_point
Priv_key = some_int

Then

New_pub_key = Public_key + another_int*base_EC_point
New_priv_key = priv_key + another_int

Now if you set another_int=1 you only have to do one add each iteration, and if there is a match another add to get the private key. This should speed up the calc a lot if my maths is correct.
You're sort of correct, but samr7 has been doing this since his earliest released version. It's one of the reasons his tool is rather fast :)

I've corrected some stuff in green. The base_ec_point is a parameter that is known to all parties; it's defined in the protocol/cryptology parameters. It's parameter G.
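To make the corrected version concrete, here is a small Python sketch of the "one add per iteration" search with another_int = 1. It is an illustration and not vanitygen's code: the function names are invented, the match test is an arbitrary predicate on the point (real use would hash and Base58-encode it into an address), and wrap-around at the curve order is ignored.

```python
# secp256k1 field prime and generator G (standard public parameters).
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def ec_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def ec_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def incremental_search(start_priv, matches, max_steps):
    """Walk pub, pub+G, pub+2G, ... and return (priv, pub) on a match."""
    pub = ec_mul(start_priv, G)      # the only full multiplication
    for offset in range(max_steps):
        if matches(pub):
            # One integer add recovers the matching private key.
            return start_priv + offset, pub
        pub = ec_add(pub, G)         # one point addition per candidate
    return None
```

For example, `incremental_search(1000, lambda pt: pt[0] % 16 == 0, 500)` walks at most 500 consecutive keys using a toy match condition on the x-coordinate.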
member
Activity: 112
Merit: 10
Firstbits: 1yetiax
This goes way back to the beginning of the month when there was no vanitygen. http://blockexplorer.com/address/1HnNPy6wtM8pG9TpF1dapvuLTPpdHwiYek

Somebody had figured this out earlier, activated the addresses in Firstbits with a tx of 1 Satoshi each, did not release the code, and now keeps all the addresses to sell them later (although it remains to be seen whether people would buy private keys from somebody).

That's why I can't be 1yeti and that's why we can't have nice things!
legendary
Activity: 1974
Merit: 1030
Oh my goodness, and I thought my 45 vanity addresses were far too many...
pc
sr. member
Activity: 253
Merit: 250
It would appear that vanitygen is now the preferred tool of transaction spammers:

Yeah, between vanitygen and firstbits, there's been quite a bit of trying to get some nice addresses. I've certainly thrown a couple addresses into the chain myself just to get the vanity firstbits address, but nothing quite like that… But at least that transaction has a fee to pay for it, so it could almost be considered the system working just fine.
staff
Activity: 4284
Merit: 8808

It would appear that vanitygen is now the preferred tool of transaction spammers:

http://blockexplorer.com/tx/60d7988fd2bc22ce764f9651b20fc3e7418ab6ab57c7057a16dfedd22e837b11

full member
Activity: 126
Merit: 100
I tried to find example code showing how EC multiplication and addition work but couldn't find any. I think we should grab some and translate it to an OCL kernel for running this on a GPU instead.

I've been messing around with this.  More than 1/3 of the CPU time spent by the current algorithm is in Montgomery multiplication -- to produce each output address, it must do 14 of them.  I have an OpenCL kernel that can correctly perform the Montgomery multiplication.  Unfortunately, the performance so far isn't very encouraging.
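For anyone unfamiliar with the operation mentioned here, this is a minimal Python sketch of Montgomery multiplication over the secp256k1 field prime. It is not the OpenCL kernel from this post; the choice R = 2^256 and all helper names are assumptions made for illustration. The point of the technique is that the reduction step needs only multiplications, masks and shifts, with no division by the modulus.

```python
P = 2**256 - 2**32 - 977          # secp256k1 field prime
R_BITS = 256                      # assumed Montgomery radix R = 2**256
R = 1 << R_BITS
R_MASK = R - 1
P_NEG_INV = (-pow(P, -1, R)) % R  # -P^-1 mod R, precomputed once


def redc(t):
    """Montgomery reduction: return t * R^-1 mod P for 0 <= t < R*P."""
    m = ((t & R_MASK) * P_NEG_INV) & R_MASK
    u = (t + m * P) >> R_BITS     # exact division by R
    return u - P if u >= P else u


def to_mont(a):
    """Convert a into Montgomery form (a * R mod P)."""
    return (a << R_BITS) % P


def from_mont(a_bar):
    """Convert back from Montgomery form."""
    return redc(a_bar)


def mont_mul(a_bar, b_bar):
    """Multiply two values that are both in Montgomery form."""
    return redc(a_bar * b_bar)
```

A chain of multiplications stays in Montgomery form throughout, so the conversions in and out are paid only once at the ends.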

I haven't got everything about ECC clear yet, but from what I have understood, you can do

Public_key = some_int*Priv_key

Then

New_pub_key = Public_key + another_int*another_ec_point
New_priv_key = priv_key + another_int

Now if you set another_int=1 you only have to do one add each iteration, and if there is a match another add to get the private key. This should speed up the calc a lot if my maths is correct.