Pages:
Author

Topic: Parse the blockchain for all addresses with positive balance (Read 4922 times)

newbie
Activity: 14
Merit: 0

Sure, that would be awesome. Is this something hosted on one of your servers or something i can host locally? i would really feel bad eating up your server resources running a bunch of complex queries over time.

I have MySQL on our local server but will probably convert it to standard sql using SSMA if it is something we can host locally as all the information we have databased already is housed on the SQL EE server that we have. My web server and sql server have hundreds of gigs of ECC ram and multiple processors so i dont think it will be too taxing resource wise.

I really appreciate your help, if you need some help writing the documentation side of things, although not a developer, i am VERY good at tinkering and figuring things out so as i use it i can help with some documentation and instructions over time. All the software i have now was things i chopped and screwed to make work the way i want over time, i cannot code, but i can read code (Java, PHP, C#, C++) well enough to understand what is going on and how to make it work or how to copy paste a bunch of crap together to get the job done, not much experience with Python yet, but this would be a good learning experience.

I will be releasing the source. It runs locally and connects to a local coin daemon. Db, RPC connections and some misc. parameters are set in a config file.
You should be able to adapt it to use other databases as I used raw SQL queries (through PyMySql) rather then a library like SQLalchemy, but some change in the coding might be involved.

Thanks dreamwatcher,

i will keep my eye out for the source code and get to work on getting it all installed as soon as i get it. Once i get it SQL friendly and working through a few large querys to ensure it works i will send back a branch for standard SQL DB's in the event it proves useful to you in the future.
legendary
Activity: 1064
Merit: 1000

Sure, that would be awesome. Is this something hosted on one of your servers or something i can host locally? i would really feel bad eating up your server resources running a bunch of complex queries over time.

I have MySQL on our local server but will probably convert it to standard sql using SSMA if it is something we can host locally as all the information we have databased already is housed on the SQL EE server that we have. My web server and sql server have hundreds of gigs of ECC ram and multiple processors so i dont think it will be too taxing resource wise.

I really appreciate your help, if you need some help writing the documentation side of things, although not a developer, i am VERY good at tinkering and figuring things out so as i use it i can help with some documentation and instructions over time. All the software i have now was things i chopped and screwed to make work the way i want over time, i cannot code, but i can read code (Java, PHP, C#, C++) well enough to understand what is going on and how to make it work or how to copy paste a bunch of crap together to get the job done, not much experience with Python yet, but this would be a good learning experience.

I will be releasing the source. It runs locally and connects to a local coin daemon. Db, RPC connections and some misc. parameters are set in a config file.
You should be able to adapt it to use other databases as I used raw SQL queries (through PyMySql) rather then a library like SQLalchemy, but some change in the coding might be involved.
newbie
Activity: 14
Merit: 0
Coming late to the thread on this one, but after reading through it sounds interesting.

While not a 'coder', I've got some background in data processing and would be interested in helping out the OP, if help is wanted.  Where are you planning on running the data collector?

baldpope,

I have a pretty substantial amount of old servers and data miners laying around from previous work as a data center manager. i was able to keep a lot of stuff that the companies were throwing away. So for now i am running the data collection locally on my own local network, although i have several static IP's and T1 & T3 lines so i could easily make it accessible remotely. Feel free to message me in your spare time, i am always looking for a helping hand from someone with experience. especially in data processing as that is the bulk of what the project is about.
full member
Activity: 144
Merit: 100
Coming late to the thread on this one, but after reading through it sounds interesting.

While not a 'coder', I've got some background in data processing and would be interested in helping out the OP, if help is wanted.  Where are you planning on running the data collector?
newbie
Activity: 14
Merit: 0
The CCE database loader builds a database that has much of that information. The web server portion of CCE limits queries for ease of use and to avoid overloading the server.

I am making a previous version of the loader (3.0) available to barwizi. I want to clean up the code a bit and write some basic instructions, but I can make it available to you also.

The basic version I am providing will build a database with tables for:
blocks, txins, txouts, and address balances. You can use the coin daemons block notify to trigger an update. During the update the loader also checks for orphan blocks within the last 250 blocks and corrects blocks/tx and address balances when an orphan is detected.
It is written in Python and uses Mysql.

If you want to check it out,I will shoot you a PM when ready with the private Github information.

Sure, that would be awesome. Is this something hosted on one of your servers or something i can host locally? i would really feel bad eating up your server resources running a bunch of complex queries over time.

I have MySQL on our local server but will probably convert it to standard sql using SSMA if it is something we can host locally as all the information we have databased already is housed on the SQL EE server that we have. My web server and sql server have hundreds of gigs of ECC ram and multiple processors so i dont think it will be too taxing resource wise.

I really appreciate your help, if you need some help writing the documentation side of things, although not a developer, i am VERY good at tinkering and figuring things out so as i use it i can help with some documentation and instructions over time. All the software i have now was things i chopped and screwed to make work the way i want over time, i cannot code, but i can read code (Java, PHP, C#, C++) well enough to understand what is going on and how to make it work or how to copy paste a bunch of crap together to get the job done, not much experience with Python yet, but this would be a good learning experience.
legendary
Activity: 1064
Merit: 1000
The CCE database loader builds a database that has much of that information. The web server portion of CCE limits queries for ease of use and to avoid overloading the server.

I am making a previous version of the loader (3.0) available to barwizi. I want to clean up the code a bit and write some basic instructions, but I can make it available to you also.

The basic version I am providing will build a database with tables for:
blocks, txins, txouts, and address balances. You can use the coin daemons block notify to trigger an update. During the update the loader also checks for orphan blocks within the last 250 blocks and corrects blocks/tx and address balances when an orphan is detected.
It is written in Python and uses Mysql.

If you want to check it out,I will shoot you a PM when ready with the private Github information.
newbie
Activity: 14
Merit: 0
I am curious as to why you are trying to parse the block chain file instead of querying the coin daemon for the information

When I wrote the CCE 3 block explorer software, I completely abandoned the ABE style of parsing the block chain files as the daemon will give you all the information you need without having to adjust to the different block structures of various chains.

I might be willing to give you limited license to the CCE 3 database loader, if you would give me a better idea of what you are ultimately trying to do.

Thanks dreamwatcher,

Ultimately i am working on a FULL bitcoin stats website that can pretty much tell you anything anybody would want to know about bitcoin at any given time. There are a lot of sites that give you a specific data set for a specific inquiry but really none that will give you everything all in one place.

Seeing as the one thing that a lot of people i have found ask for but is not readily available at any given website is "a list of all bitcoin addresses with a positive balance"  i wanted to start there. With that information there are several sub sets of info that can be derived from that. Beyond being able to list every bitcoin address with a positive balance, i can then publish a statistic in number form that is easy to track. ( X# of bitcoin addresses in use today) and also (X# of bitcoin addresses in use yesterday) as well as (average # of bitcoin addresses in use this week, month, year, etc.) Also with that i can create an interface (php website) which links every one of those Xmillion addresses to our insight server (like blockchain) so people can view the statistics of each of them.

For now that is Ultimately what i am working on, in the total picture there is much more which i will list below, but if you were only interested in what i want the data i was asking for, that is why.

In overall other statistics that i will work on finding ways to gather in the end game are. Address with the Most TX's for the day, week, month, year, all time. # of addresses that have not been touched in a month, 3 months, 6 months, 1 year, and longest untouched addresses (no TX out).

And in the VERY, VERY, VERY long term, probably the last part of the project, i will work on a way to calculate the average transaction value for the hour, day, week, month, year, and all time. This will be last as it involves the most amount of work, coding, math, and formulation to achieve.

But for now, i just want to start with finding a way to get a list of all addresses with unspent outputs carved down to a .txt file 1 addy per line that i can import into the SQL database, then i will work on a way of updating it with each block with the on block notify feature so it will be a live statistic and output.
legendary
Activity: 996
Merit: 1013
on closer inspection, it seems all the PoS blocks in the chain look like that. Help please.

Try inserting a break statement after the nonstandard case.

Code:
case "nonstandard":
            $mysqli->query("insert into outputs (block_num, block_hash, transaction_hash, sequence, address, balance) values (".$i.", '".$block["hash"]."', '".$txid."', ".$tx_out["n"].", '** ".$scripttype." **', ".$tx_out["value"].")");
     break;

Where did you get this php code?

Quote
Would have posted in the Alt section but it was either gonna drown in the torrent of mostly meaningless posts or not get an answer. Would be nice if they added development subforum there.

Seems to have ended up here in the sewers anyway..
+1 for the subforum suggestion.
legendary
Activity: 1064
Merit: 1000
I am curious as to why you are trying to parse the block chain file instead of querying the coin daemon for the information

When I wrote the CCE 3 block explorer software, I completely abandoned the ABE style of parsing the block chain files as the daemon will give you all the information you need without having to adjust to the different block structures of various chains.

I might be willing to give you limited license to the CCE 3 database loader, if you would give me a better idea of what you are ultimately trying to do.
legendary
Activity: 882
Merit: 1000
blk-files have very simple and easy-to-parse structure
Have a look at my C++/Qt code. Sorry, I do not speak PHP

Code:
#include 
#include

#include "BlockChain.h"
#include "Util.h"
#include "Chainer.h"
#include "Target.h"

BlockChain::BlockChain ( QObject* parent ) : QFile ( parent ), blkFile ( 0 )
{
  QTimer::singleShot ( 0, this, SLOT ( start ( ) ) );
}
//--------------------------------------------------------------
void BlockChain::start ( )
{
  setFileName ( blkFileName ( blkFile++ ) );
  if ( !open ( QIODevice::ReadOnly ) )
  {
    _trace ( QString ( "cant open [%1]" ).arg ( fileName ( ) ) );
    getParent ( ).block ( QByteArray ( ) ); // this is finish signal
    deleteLater ( );                        // delete self
  }
  else
  {
    _trace ( QString ( "processing [%1]" ).arg ( fileName ( ) ) );
    QTimer::singleShot ( 0, this, SLOT ( next ( ) ) );
  }
}
//--------------------------------------------------------------
void BlockChain::next ( )
{
  if ( pos ( ) < size ( ) )
  {
    quint32 magic;
    quint32 size ( read ( (char*)&magic, 4 ) );
    xassert ( ( ( magic == MAGIC_ID ) || !magic ) && ( size == 4 ) );
    if ( magic )
    {
      read ( (char*)&size, 4 );
      xassert ( size > HEADER_SIZE && size <= MAX_BLOCK_SIZE );
      getParent ( ).block ( read ( size ) );             // process block data here
      QTimer::singleShot ( 0, this, SLOT ( next ( ) ) ); // schedule next block in file
      return;
    }
  }
  close ( );
  QTimer::singleShot ( 0, this, SLOT ( start ( ) ) );     // schedule next file
}
//--------------------------------------------------------------
const QString BlockChain::blkFileName ( const int i )
{
  return
    ( i < 10 ) ? QString ( DATA_ROOT "\\blk0000%1.dat" ).arg ( i ) :
    ( i < 100 ) ? QString ( DATA_ROOT "\\blk000%1.dat" ).arg ( i ) :
    QString ( DATA_ROOT "\\blk00%1.dat" ).arg ( i );
}


Cool, C++ i can do. mind giving me a link to the whole project?
legendary
Activity: 1260
Merit: 1019
blk-files have very simple and easy-to-parse structure
Have a look at my C++/Qt code. Sorry, I do not speak PHP

Code:
#include 
#include

#include "BlockChain.h"
#include "Util.h"
#include "Chainer.h"
#include "Target.h"

BlockChain::BlockChain ( QObject* parent ) : QFile ( parent ), blkFile ( 0 )
{
  QTimer::singleShot ( 0, this, SLOT ( start ( ) ) );
}
//--------------------------------------------------------------
void BlockChain::start ( )
{
  setFileName ( blkFileName ( blkFile++ ) );
  if ( !open ( QIODevice::ReadOnly ) )
  {
    _trace ( QString ( "cant open [%1]" ).arg ( fileName ( ) ) );
    getParent ( ).block ( QByteArray ( ) ); // this is finish signal
    deleteLater ( );                        // delete self
  }
  else
  {
    _trace ( QString ( "processing [%1]" ).arg ( fileName ( ) ) );
    QTimer::singleShot ( 0, this, SLOT ( next ( ) ) );
  }
}
//--------------------------------------------------------------
void BlockChain::next ( )
{
  if ( pos ( ) < size ( ) )
  {
    quint32 magic;
    quint32 size ( read ( (char*)&magic, 4 ) );
    xassert ( ( ( magic == MAGIC_ID ) || !magic ) && ( size == 4 ) );
    if ( magic )
    {
      read ( (char*)&size, 4 );
      xassert ( size > HEADER_SIZE && size <= MAX_BLOCK_SIZE );
      getParent ( ).block ( read ( size ) );             // process block data here
      QTimer::singleShot ( 0, this, SLOT ( next ( ) ) ); // schedule next block in file
      return;
    }
  }
  close ( );
  QTimer::singleShot ( 0, this, SLOT ( start ( ) ) );     // schedule next file
}
//--------------------------------------------------------------
const QString BlockChain::blkFileName ( const int i )
{
  return
    ( i < 10 ) ? QString ( DATA_ROOT "\\blk0000%1.dat" ).arg ( i ) :
    ( i < 100 ) ? QString ( DATA_ROOT "\\blk000%1.dat" ).arg ( i ) :
    QString ( DATA_ROOT "\\blk00%1.dat" ).arg ( i );
}
newbie
Activity: 14
Merit: 0
I was talking about bitcoin, and i'm getting a little closer to a solution, i have gotten Insight up and running, and now i just need to figure out how to access the info in Level DB as at face value it seems that that info should already be parsed in there some where.
legendary
Activity: 882
Merit: 1000
Code:
    [blockhash] => 0fcc89dcc850cbf7d7149b8f1534b2464a7902991d19a9367dae0751b3a25be0

Are we talking about bitcoin or altcoin?

Altcoin.

Would have posted in the Alt section but it was either gonna drown in the torrent of mostly meaningless posts or not get an answer. Would be nice if they added development subforum there.
legendary
Activity: 1260
Merit: 1019
Code:
    [blockhash] => 0fcc89dcc850cbf7d7149b8f1534b2464a7902991d19a9367dae0751b3a25be0

Are we talking about bitcoin or altcoin?
legendary
Activity: 882
Merit: 1000
on closer inspection, it seems all the PoS blocks in the chain look like that. Help please.
legendary
Activity: 882
Merit: 1000
I am trying to do the same, i got my hands on a php script but it is choking up on some blocks.

Code:
foreach ($tx["vout"] as $tx_out)

        {

          $scripttype=$tx_out["scriptPubKey"]["type"];

          switch ($scripttype)

          {

          case "pubkeyhash":

          case "pubkey":

          case "scripthash":

            $mysqli->query("insert into outputs (block_num, block_hash, transaction_hash, sequence, address, balance) values (".$i.", '".$block["hash"]."', '".$txid."', ".$tx_out["n"].", '".$tx_out["scriptPubKey"]["addresses"][0]."', ".$tx_out["value"].")");

            break;

          case "scriptPubKey":

          case "multisig":

            $mysqli->query("insert into outputs (block_num, block_hash, transaction_hash, sequence, address, balance) values (".$i.", '".$block["hash"]."', '".$txid."', ".$tx_out["n"].", '** ".$scripttype." **', ".$tx_out["value"].")");

            break;

          case "nonstandard":

            $mysqli->query("insert into outputs (block_num, block_hash, transaction_hash, sequence, address, balance) values (".$i.", '".$block["hash"]."', '".$txid."', ".$tx_out["n"].", '** ".$scripttype." **', ".$tx_out["value"].")");

          default:

            throw new Exception("don't know how to handle ".$scripttype." scripts in transaction ".$txid);

            continue;

It gives me this error

Code:
PHP Fatal error:  Uncaught exception 'Exception' with message 'don't know how to handle nonstandard scripts in transaction 521fdd0f3bd589de759af578cf6bfc595854ba578eb709fe7f7f42d43b2c7960' in /home/build/nrsbala/update-database.php:91
Stack trace:
#0 {main}
  thrown in /home/build/blockparser/update-database.php on line 91


for this block :-

Code:
Array
(
    [hex] => 010000008b636053010000000000000000000000000000000000000000000000000000000000000000ffffffff0c02a25a0105062f503253482fffffffff010000000000000000000000000000
    [txid] => 521fdd0f3bd589de759af578cf6bfc595854ba578eb709fe7f7f42d43b2c7960
    [version] => 1
    [time] => 1398825867
    [locktime] => 0
    [vin] => Array
        (
            [0] => Array
                (
                    [coinbase] => 02a25a0105062f503253482f
                    [sequence] => 4294967295
                )

        )

    [vout] => Array
        (
            [0] => Array
                (
                    [value] => 0
                    [n] => 0
                    [scriptPubKey] => Array
                        (
                            [asm] =>
                            [hex] =>
                            [type] => nonstandard
                        )

                )

        )

    [blockhash] => 0fcc89dcc850cbf7d7149b8f1534b2464a7902991d19a9367dae0751b3a25be0
    [confirmations] => 46496
    [blocktime] => 1398825867
)

I need help figuring a way for these txs to be handled. Already i can see the problems with the block, but i need it to be able to handle these without coughing up.
full member
Activity: 157
Merit: 100
In the end i am going to try and hire someone to code a simple program that just does this then parses each new block #onblocknotify to keep the list updated, but until that day comes i would like to find the fastest way to do it manually.

Please if you know the program that does it, also include the command line to get it to give the desired result as my coding abilities are pretty much chop&screw.

flatfly gave a reference and you should probably go ahead and hire someone: since there's no "command line to get it to give the desired result as my coding abilities are pretty much chop&screw."
staff
Activity: 4326
Merit: 8951
What are you trying to accomplish— at a macroscopic level?  We might have better advice if we knew what you were trying to accomplish.
newbie
Activity: 14
Merit: 0
This is the most commonly used one. (Linux only!)
Instructions are included in the repo.

I have no experience with it, though.


https://github.com/znort987/blockparser

Thanks flatfly, My experience with anything linux is 0 and i wouldn't have the slightest idea on where to begin or how to make it work.
legendary
Activity: 1120
Merit: 1016
090930
This is the most commonly used one. (Linux only!)
Instructions are included in the repo.

I have no experience with it, though.


https://github.com/znort987/blockparser
Pages:
Jump to: