I finished up my blockchain parser and produced some interesting data output that I would like to share, in the hope that (a) other people find it useful, (b) other people do some interesting analysis, graphing, or reporting on it, and (c) I get some bitcoin tips for my effort.
The parser that produced this data is completely open source and consists of a *single* C++ file of only about 4,000 lines of code, which I wrote over Christmas break.
Here are the links to the data.
The top 150,000 bitcoin addresses by balance and sorted by balance.
https://drive.google.com/file/d/0BwdyTvSh6bUkc2V3RUprclhCSUU/edit?usp=sharing
The top 150,000 bitcoin addresses by balance and sorted by age in days.
https://drive.google.com/file/d/0BwdyTvSh6bUkNWF1ZzdmQVhVOXM/edit?usp=sharing
Some summary statistics about the blockchain measured over time month by month:
https://drive.google.com/file/d/0BwdyTvSh6bUkTm56bll1czk5MjA/edit?usp=sharing
The source code that produced this data can be found here:
https://code.google.com/p/blockchain/
To parse the blockchain you need only two source files:
BlockChain.h
https://code.google.com/p/blockchain/source/browse/trunk/BlockChain.h
BlockChain.cpp
https://code.google.com/p/blockchain/source/browse/trunk/BlockChain.cpp
The source that demonstrates how to use the parser is at:
https://code.google.com/p/blockchain/source/browse/trunk/main.cpp
I have not tested this version on Linux yet, but I will try to get to that soon. If someone else wants to make sure the Linux version still builds, let me know and I will give you access to the project. If you only want to parse the blockchain, that should work on a 32-bit machine. However, if you want to process all transactions in the blockchain to produce these reports, you need a 64-bit machine, because the parser allocates about 10 GB of memory to assemble all of the data.
The blog post I made associated with this can be found here:
http://codesuppository.blogspot.com/
Over the next few days I plan to do some documentation and cleanup and a few other things.
At any rate, I hope you guys find the data useful.
The comma-separated-value text files contain the following information about each bitcoin address:
*Days since this address was used for a spend output.
*The current value at this address.
*The first time this address was ever used.
*The last time bitcoins were sent to this address.
*The last time bitcoins were spent from this address (if ever).
*The total number of bitcoins spent from this address.
*The total number of bitcoins received at this address.
*The total number of transactions on this address.
*The public-key of the address.
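
For anyone who wants to load these files into their own tools, here is a rough sketch of reading one row in C++. It is only an illustration and not code from the parser: it assumes the nine columns appear in the order listed above, that fields are separated by plain commas with no quoting, and that a missing value (such as an address that never spent) shows up as an empty field. The names AddressRecord, splitCsvLine, and parseRecord are made up for this example, and every field is kept as text because the exact column formats are not described here.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: one text field per column in the list above.
// The column order is an assumption; check it against the actual file.
struct AddressRecord {
    std::string daysSinceLastSpend;
    std::string balance;
    std::string firstUsed;
    std::string lastReceived;
    std::string lastSpent;        // assumed empty if the address never spent
    std::string totalSpent;
    std::string totalReceived;
    std::string transactionCount;
    std::string publicKey;
};

// Split one line on commas. Assumes no quoted fields or embedded commas.
std::vector<std::string> splitCsvLine(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, ',')) {
        fields.push_back(field);
    }
    // std::getline does not report an empty field after a trailing comma,
    // so add it back to keep the column count right.
    if (!line.empty() && line.back() == ',') {
        fields.emplace_back();
    }
    return fields;
}

// Fill 'out' from one line; returns false if the line does not have
// exactly nine fields (for example, a header or a damaged row).
bool parseRecord(const std::string& line, AddressRecord& out) {
    const std::vector<std::string> f = splitCsvLine(line);
    if (f.size() != 9) {
        return false;
    }
    out = AddressRecord{f[0], f[1], f[2], f[3], f[4], f[5], f[6], f[7], f[8]};
    return true;
}
```

To use it, read the file line by line with std::getline on a std::ifstream, call parseRecord on each line, and skip any line for which it returns false. Converting the text fields to numbers or dates is left out on purpose, since that depends on how the columns are actually formatted.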