How do API services or block explorers like blockr.io, bchain.info, blockchain.info, blocktrail.com and so on, efficiently parse the blockchain for *any* address or transaction?
They start with the first block and keep track via an external database.
I've got some full nodes running. Since Bitcoin Core 0.10, where you can import watch-only addresses, it's relatively easy to import a random address and watch its balance and history. It's not clear to me how people did this before 0.10 - I understand the data is in the blockchain, obviously, but I wouldn't know how to parse it.
getbalance was added in 0.10.0 but it does not return correct values for arbitrary addresses on any of my nodes. I suspect this is because the node running 24/7 has no wallet and the other one does not maintain a full transaction index. I suspect that getbalance only works on addresses you have imported (whether watch-only or via private key) and that 0.10.0 isn't keeping track of all balances in the background.
But in order to come up with any address' balance or tx history in a matter of seconds, how is that done?
You have to trade calculation time for storage. If you expect to make these requests on a regular basis it makes sense to create a database that does the work once for all addresses, stores the results and gets updated with every new block found.
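To make that concrete, here is a minimal sketch of such an index in Python. It is not how any particular explorer does it; the block format (a list of dicts with "txid", "inputs" and "outputs") is a simplified stand-in I made up for illustration, and a real indexer would decode raw blocks and persist to disk instead of using in-memory dicts.

```python
class AddressIndex:
    """Toy address index, updated once per block (hypothetical block format)."""

    def __init__(self):
        self.utxos = {}     # (txid, output_index) -> (address, value)
        self.balances = {}  # address -> balance of unspent outputs
        self.history = {}   # address -> list of txids touching it

    def _record(self, address, txid):
        # Remember that txid touched address, without listing it twice in a row.
        txs = self.history.setdefault(address, [])
        if not txs or txs[-1] != txid:
            txs.append(txid)

    def apply_block(self, block):
        for tx in block:
            # Inputs spend earlier outputs: debit the address that owned them.
            for prev_txid, prev_index in tx["inputs"]:
                address, value = self.utxos.pop((prev_txid, prev_index))
                self.balances[address] -= value
                self._record(address, tx["txid"])
            # Outputs create new spendable entries: credit the receiving address.
            for n, (address, value) in enumerate(tx["outputs"]):
                self.utxos[(tx["txid"], n)] = (address, value)
                self.balances[address] = self.balances.get(address, 0) + value
                self._record(address, tx["txid"])

    def balance(self, address):
        # Addresses never seen on chain have no entry and simply read as zero.
        return self.balances.get(address, 0)

    def transactions(self, address):
        return list(self.history.get(address, []))


# Two made-up blocks: a coinbase paying "alice", then alice paying "bob".
index = AddressIndex()
index.apply_block([{"txid": "t1", "inputs": [], "outputs": [("alice", 50)]}])
index.apply_block([{"txid": "t2", "inputs": [("t1", 0)],
                    "outputs": [("bob", 20), ("alice", 30)]}])

print(index.balance("alice"), index.transactions("alice"))
print(index.balance("carol"), index.transactions("carol"))
```

After the initial pass over the chain, each lookup is a dictionary access, which is the storage-for-time trade described above. The sketch ignores chain reorganisations, which a real index would have to undo.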
Is there a script or library of sorts to efficiently traverse the raw blockchain data for any particular address or txid?
AFAIK the best you can get from Bitcoin Core is to enable txindex (txindex=1 in bitcoin.conf), which allows you to look up any TX by its txid. It does not index by address.
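With txindex enabled, a txid lookup goes through the node's JSON-RPC interface using getrawtransaction. A rough Python sketch follows; the URL, user name and password are placeholders I made up (8332 is the default mainnet RPC port), the txid is a dummy value, and call_node is a hypothetical helper, not part of Bitcoin Core.

```python
import base64
import json
import urllib.request


def build_rpc_request(method, params, request_id=0):
    """Serialise a JSON-RPC call in the shape bitcoind accepts."""
    return json.dumps({
        "jsonrpc": "1.0",
        "id": request_id,
        "method": method,
        "params": params,
    })


def call_node(payload, url="http://127.0.0.1:8332/",
              user="rpcuser", password="rpcpassword"):
    """POST the payload to a running node (placeholder URL and credentials)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        data=payload.encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": "Basic " + token},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Dummy txid for illustration only; substitute a real one.
txid = "00" * 32
# The second parameter (1) asks for decoded JSON instead of raw hex.
payload = build_rpc_request("getrawtransaction", [txid, 1])
print(payload)
# result = call_node(payload)  # needs a running node with txindex=1
```

This only answers "what is in this transaction". For "which transactions touch this address" you still need a separate index like the one described earlier.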
Does it require a special version of Bitcoin Core with extra tracking information or something? Do they actually keep a dedicated database for this, explicitly listing ALL addresses and txids?
Note that this is not related to any 'wallet', or specific addresses I already own or know. I'm trying to understand how this is done for random addresses or txs. That includes non-existent addresses or txids (where the result is 'does not exist' or 'never used' or balance zero).
Given the number of possible addresses, I suspect that addresses that have never appeared on the blockchain simply have no database entry.