Ok, as promised.
So first of all: Why? Why pick on Brain Wallets?
I find large data problems and their solutions fascinating. I also love the design of the Bitcoin network, because it relies on the fact that randomly generated numbers are 'good enough' for a public addressing scheme. Brain Wallets are a kink in that otherwise unblemished idea, and while there have been people who experimented with these techniques before, I wanted to take it to the next level and see what could actually be done in terms of exposing weaknesses, so I'd have a better understanding of what it would take to hack random brain wallets. I can tell you that despite finding a few, I didn't take any of the coins. The amounts were small and they're probably someone's mining earnings, and the amount you'd get by taking them is about 1/100000th the cost of repeating the process yourself. I've only provided the code; actually performing the experiment as described is left as an exercise for the reader.
So, without further ado, here's the code:
https://github.com/TSavo/BitKeyGrinder

What this code does is use Hadoop to break a large list of possible brain wallet passphrases into smaller chunks, and distribute those chunks for transformation into their equivalent brain wallet public keys. There's another function for loading the block chain and scanning it for outputs that match the results of the map/reduce job.
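For the curious, the derivation itself is simple enough to sketch in a few lines. This is only an illustration of the brain wallet scheme taken all the way to an address (which is what you'd actually look for in the blockchain), not code from the repo, and it assumes the third-party Python 'ecdsa' package plus an OpenSSL build that still exposes RIPEMD-160 through hashlib:

import hashlib
import ecdsa  # pip install ecdsa

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check(payload: bytes) -> str:
    # Append a 4-byte double-SHA256 checksum, then base58-encode.
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    full = payload + checksum
    n = int.from_bytes(full, "big")
    encoded = ""
    while n > 0:
        n, rem = divmod(n, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # Leading zero bytes become leading '1' characters.
    pad = len(full) - len(full.lstrip(b"\x00"))
    return "1" * pad + encoded

def brainwallet_address(passphrase: str) -> str:
    # Private key: just the SHA-256 of the passphrase. That's the whole weakness.
    priv = hashlib.sha256(passphrase.encode("utf-8")).digest()
    # Public key: the expensive secp256k1 point multiplication.
    sk = ecdsa.SigningKey.from_string(priv, curve=ecdsa.SECP256k1)
    pub = b"\x04" + sk.get_verifying_key().to_string()  # uncompressed form
    # Address: base58check(0x00 || RIPEMD160(SHA256(pubkey))).
    # Note: 'ripemd160' may be unavailable on some newer OpenSSL builds.
    h160 = hashlib.new("ripemd160", hashlib.sha256(pub).digest()).digest()
    return base58check(b"\x00" + h160)

if __name__ == "__main__":
    print(brainwallet_address("correct horse battery staple"))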
This is necessary because the function to compute the public key from the private one is prohibitively expensive... much more expensive than any of the hashing functions involved. On my Intel i5, I could crunch about 130 keys a second, which meant I could do a 100 megabyte dictionary file in about a day and a half. Using Amazon's Elastic MapReduce and the code linked above, I searched 15 gigabytes worth of publicly available dictionary files in about the same amount of time. Finding the keys in the blockchain took less than an hour on my i5 once everything had been downloaded again.
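If you're wondering what the distributed part looks like, the mapper side is almost trivial. The real job in the repo runs against Hadoop proper; this is just a hypothetical Hadoop Streaming mapper in Python, reusing the derivation sketch above (saved as brainwallet.py), to show the shape of the work being farmed out: each mapper turns candidate passphrases on stdin into address/passphrase pairs, which the later blockchain-scanning step can match against.

import sys
from brainwallet import brainwallet_address  # the sketch above, saved as brainwallet.py

# Hadoop Streaming mapper: one candidate passphrase per input line,
# emit "address<TAB>passphrase" so matches can be joined downstream.
for line in sys.stdin:
    passphrase = line.rstrip("\n")
    if passphrase:
        print(f"{brainwallet_address(passphrase)}\t{passphrase}")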
So, did I find anything?
Yeah, I did. It wasn't much though. Less than 1 BTC total. I left them there. And if you want to spend the money to go crunch 15 gigabytes of dictionary files to find that less than 1 BTC, good luck.
But let this be a cautionary tale: I churned through 15 gigabytes of dictionary files in one day without exceeding my EC2 limit of 20 instances. If someone wanted to, they could start doing this full time, permuting three- and four-word combinations and looking for brain wallets. Would it be economically viable? Nope. Would it potentially yield results? It would depend on how much CPU power you had... Food for thought.
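To give a sense of the combinatorics, here's a rough sketch of that kind of candidate generation. The wordlist and the counts are mine, not figures from the experiment: a 10,000-word list already gives 10^12 ordered three-word candidates, which is why raw CPU, not cleverness, is the limiting factor.

import itertools

def three_word_candidates(wordlist):
    # Yield every ordered three-word passphrase from the wordlist (N^3 of them).
    for combo in itertools.product(wordlist, repeat=3):
        yield " ".join(combo)

words = ["correct", "horse", "battery", "staple"]  # stand-in for a real wordlist
for phrase in itertools.islice(three_word_candidates(words), 5):
    print(phrase)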
Questions, comments, bit tips, and screams of agony are always welcome. I am a starving artist, and all tips will go to destroying my liver faster, and the occasional Bitcoin coding frenzy.
-Kevlar