
Topic: [ANNOUNCE] Abe 0.7: Open Source Block Explorer Knockoff - page 43. (Read 220986 times)

hero member
Activity: 481
Merit: 529
But then Abe can import; however, once it's done skipping rows, it fails because it tries to insert a block at id 1. If I rerun it, it tries 2, 3, 4, etc.
There must be a problem with the identifier sequences.  You can save the database by setting them to correct values.

For portability, Abe supports several methods of ID generation (called "sequences" in some DBMSs).  You can find out which implementation it chose with:
Code:
mysql> select configvar_value from configvar where configvar_name = 'sequence_type';
I assume this value is 'mysql' in your case.  The 'mysql' sequence implementation associates with each sequenced table an empty table that has just one column (an auto_increment).  For example, the next `block_seq`.`id` becomes the next `block`.`block_id`.  Apparently, the dump/load process did not preserve the tables' internal counters.
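Roughly, allocating a new ID from such a sequence works like this (a paraphrased sketch, not the exact code in DataStore; `store.cursor` stands in for a DB-API cursor):
Code:
def new_id(store, key):
    # key = 'block' means: insert an empty row into block_seq so that
    # MySQL assigns the next auto_increment value; that value becomes
    # the new block.block_id.
    store.cursor.execute("INSERT INTO %s_seq () VALUES ()" % key)
    return store.cursor.lastrowid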

This script might fix things for you.
Code:
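-- For each table: insert a row carrying the current MAX id so the
-- AUTO_INCREMENT counter advances past it, then delete the row.
-- The counter keeps its new value, so the sequence resumes at MAX+1.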
INSERT INTO block_seq (id) SELECT MAX(block_id) FROM block;
DELETE FROM block_seq;
INSERT INTO magic_seq (id) SELECT MAX(magic_id) FROM magic;
DELETE FROM magic_seq;
INSERT INTO policy_seq (id) SELECT MAX(policy_id) FROM policy;
DELETE FROM policy_seq;
INSERT INTO chain_seq (id) SELECT MAX(chain_id) FROM chain;
DELETE FROM chain_seq;
INSERT INTO datadir_seq (id) SELECT MAX(datadir_id) FROM datadir;
DELETE FROM datadir_seq;
INSERT INTO tx_seq (id) SELECT MAX(tx_id) FROM tx;
DELETE FROM tx_seq;
INSERT INTO txout_seq (id) SELECT MAX(txout_id) FROM txout;
DELETE FROM txout_seq;
INSERT INTO pubkey_seq (id) SELECT MAX(pubkey_id) FROM pubkey;
DELETE FROM pubkey_seq;
INSERT INTO txin_seq (id) SELECT MAX(txin_id) FROM txin;
DELETE FROM txin_seq;
If you have a chance to try it, please let us know the result.
full member
Activity: 127
Merit: 100
Yes, I added it to the bottom of the import script.
Also ran it again after fixing the views, to make it rescan and then fail :/

legendary
Activity: 1386
Merit: 1097
MORA, did you update the file pointer in the DB, as someone suggested here?
full member
Activity: 127
Merit: 100
Silly me, of course it was documented :)

I tried the SQL file from the torrent today, since the VPS was having a hard time catching up.
It didn't work out too well: after importing the SQL, one needs to manually recreate all the views, because they contain a SECURITY DEFINER clause pointing to an invalid user.

But then Abe can import; however, once it's done skipping rows, it fails because it tries to insert a block at id 1. If I rerun it, it tries 2, 3, 4, etc.

I gave up in the end and started the import over again, poor VPS :)
hero member
Activity: 481
Merit: 529
How do you support multiple chains?
I would like to add LTC and NMC to my database, but as I understand it, I would have to run their forks of bitcoind, which would create new directories of blockchain files.
See the comments about "datadir" in the sample abe.conf.
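For illustration only, a second chain's entry might look something like this (paths are hypothetical, and the exact keys and the list of chains Abe knows by name are documented in the sample abe.conf for your version):
Code:
datadir = [
    "/home/abe/.bitcoin",
    { "dirname": "/home/abe/.namecoin", "chain": "Namecoin" }
]
Each entry points Abe at one daemon's block files, so you would run the other chains' daemons alongside bitcoind and list their directories here.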
full member
Activity: 127
Merit: 100
How do you support multiple chains?
I would like to add LTC and NMC to my database, but as I understand it, I would have to run their forks of bitcoind, which would create new directories of blockchain files.
hero member
Activity: 481
Merit: 529
Today I tried to understand Abe's source code, and although I'm still confused, I may understand it a bit more than before. From what I see, Abe parses the blockfile and reconstructs blockchains and transactions in SQL with many checks. What happens when a block stored in the blockfile is orphaned or the blockchain is forked? Does Abe handle such cases correctly? AFAIK the blockfile is just a dumb store of block structures, so Abe should already be doing all the validation itself.
Abe has logic to attach orphaned blocks and reorganize a forked chain.  As far as I know it works, but it is the area I would most like to test when I have time.  Relevant code: adopt_orphans and _offer_block_to_chain in Abe/DataStore.py.
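In rough outline, the idea looks like this (a simplified illustration with invented helper names, not Abe's actual code):
Code:
def offer_block_to_chain(store, b, chain_id):
    prev_id = store.lookup_block_id(b['hashPrev'])  # hypothetical helper
    if prev_id is None:
        # Parent unknown: remember the block as an orphan for now.
        store.save_orphan(b)                        # hypothetical helper
        return
    store.attach_block(b, prev_id, chain_id)        # hypothetical helper
    # Orphans that were waiting on this block can now be adopted, which
    # may cascade and can trigger a chain reorganization.
    for orphan in store.orphans_waiting_on(b['hash']):
        offer_block_to_chain(store, orphan, chain_id)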

Why I'm asking:

I don't like that Abe needs the blockchain stored locally; it makes it far less flexible. For example, running a full Abe installation (bitcoind + database + application) on a VPS is pretty problematic because of memory consumption and required disk I/O (bitcoind itself uses the disk a lot, plus Abe keeps the disk busy with database writes). For running Stratum servers (where I want to use Abe internally, at least for the initial implementation), I need as small a footprint as possible, so that a Stratum server can also run on a cheap VPS.

I already have some experience with the Bitcoin P2P protocol, so I have an idea of patching Abe to load blocks and transactions directly from the network. In that case, Abe would need only a (trusted?) bitcoin node to connect to on port 8333. Unfortunately, my networking code does not do any block/transaction validation; it just receives messages and parses them into Python objects. So my question is: when I feed Abe this deserialized data from the P2P network, will Abe check everything necessary to keep a consistent index in the database?
Abe does not validate blocks beyond what's needed to "checksum" a chain up to a trusted current-block hash.  Complete block validation is very hard and not on my priority list, though I might add hooks to use external logic.  (Wrapping Abe.DataStore.import_block with a subclass might suffice.)
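A minimal sketch of that wrapping idea, assuming you supply a validate_block() of your own (both it and InvalidBlock are hypothetical, not part of Abe):
Code:
from Abe import DataStore

class InvalidBlock(Exception):
    pass

class ValidatingDataStore(DataStore.DataStore):
    # Abe's DataStore methods conventionally call self "store".
    def import_block(store, b, chain_ids=frozenset()):
        # validate_block() is your external validation logic.
        if not validate_block(b):
            raise InvalidBlock(b['hash'])
        return DataStore.DataStore.import_block(store, b, chain_ids)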

I don't expect a problem feeding Abe deserialized data.  You would need a structure like that created by Abe.deserialize.parse_Block with one extra element: 'hash' whose value is the block header hash as a binary string.  The structure is based on Gavin's BitcoinTools.  You would pass that structure "b" to store.import_block(b, frozenset([1])).  (chain_id 1 = main BTC chain)  Abe.DataStore.import_blkdat does this for every block in blk0*.dat that was not previously loaded.
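Putting that together, a feeding function might look like this (an untested sketch; it assumes Abe's BCDataStream class and a double-SHA-256 helper named Abe.util.double_sha256, so verify those names against your checkout):
Code:
from Abe import BCDataStream, deserialize, util

def feed_block(store, raw_block):
    """raw_block: one serialized block as received from the P2P network."""
    ds = BCDataStream.BCDataStream()
    ds.write(raw_block)
    b = deserialize.parse_Block(ds)
    # import_block wants the header hash as a binary string under 'hash';
    # the header is the first 80 bytes of the serialized block.
    b['hash'] = util.double_sha256(raw_block[:80])
    store.import_block(b, frozenset([1]))  # chain_id 1 = main BTC chain
    store.commit()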

One way to shrink the footprint would be to add support for binary SQL types.  Abe supports the SQL 1992 BIT type and tests for it at installation, but the only database that passes the test is SQLite, which is unsuitable for large servers.  On MySQL and all the others, Abe falls back to binary_type=hex and stores scripts and hashes in hexadecimal, wasting half the bytes.  Relevant code is in DataStore: configure_binary_type, _set_sql_flavour (beneath the line "val = store.config.get('binary_type')"), and _sql_binary_as_hex, where Abe translates DDL from standard BIT types to CHARs of twice the length.
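For instance, the effect of the translation is roughly this (illustrative, not a quote from the DDL):
Code:
-- Standard SQL declaration:
block_hash BIT(256)   -- 32 bytes
-- After _sql_binary_as_hex with binary_type=hex:
block_hash CHAR(64)   -- 64 hex digits, twice the storage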

Another improvement would be to remove unneeded features (or, ideally, make them optional) such as the Coin-Days Destroyed calculation (block_tx.satoshi_seconds_destroyed etc.) and the unused pubkey.pubkey column.
legendary
Activity: 1386
Merit: 1097
Hmm, I run it on a (good) VPS; granted, it uses quite a bit of disk space, but other than that it works fine. The Abe block viewer is a bit slow, but I don't think that would change just by moving the bitcoind files off the server?

The key is RAM; it's pretty slow for you because the database doesn't fit in RAM and every request spins the HDD a lot. In an ideal world, the full Abe database would be loaded into memory, which is pretty hard to achieve on a VPS (the MySQL database is actually around 4.5 GB), but at least the database indexes should fit into memory (around 1.5 GB), which is doable. A server with less memory will give poor performance, exactly as you're reporting. Moving bitcoind off the machine can save around 200 MB of RAM and a significant portion of the disk I/O.

Your idea of mounting the blockfile over the network would probably work, you're right. But it is still more of a hack than a real solution; you still need disk access to the blockchain, and handling failover of an NFS mount is much harder than providing a pool of trusted P2P nodes to connect to. If John confirms that my idea of feeding from the P2P network will work, I'll try it. Otherwise I'll set up NFS mounts...


full member
Activity: 127
Merit: 100
I don't like that Abe needs the blockchain stored locally; it makes it far less flexible. For example, running a full Abe installation (bitcoind + database + application) on a VPS is pretty problematic because of memory consumption and required disk I/O (bitcoind itself uses the disk a lot, plus Abe keeps the disk busy with database writes).

Hmm, I run it on a (good) VPS; granted, it uses quite a bit of disk space, but other than that it works fine. The Abe block viewer is a bit slow, but I don't think that would change just by moving the bitcoind files off the server?

Code:
up 2 days, 11 min,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  78 total,   1 running,  77 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1027308k total,  1013608k used,    13700k free,    15900k buffers
Of that, 707 MB is cache, so real memory usage is just 316 MB.

You could split it across several servers, but of course that would not make it cheaper :)
Abe just needs the files; it should not care whether bitcoind is in fact running on the machine, i.e. you could serve them from a network-mounted share.
And Abe will connect to a MySQL server other than localhost without problems.
legendary
Activity: 1386
Merit: 1097
Today I tried to understand Abe's source code, and although I'm still confused, I may understand it a bit more than before. From what I see, Abe parses the blockfile and reconstructs blockchains and transactions in SQL with many checks. What happens when a block stored in the blockfile is orphaned or the blockchain is forked? Does Abe handle such cases correctly? AFAIK the blockfile is just a dumb store of block structures, so Abe should already be doing all the validation itself.

Why I'm asking:

I don't like that Abe needs the blockchain stored locally; it makes it far less flexible. For example, running a full Abe installation (bitcoind + database + application) on a VPS is pretty problematic because of memory consumption and required disk I/O (bitcoind itself uses the disk a lot, plus Abe keeps the disk busy with database writes). For running Stratum servers (where I want to use Abe internally, at least for the initial implementation), I need as small a footprint as possible, so that a Stratum server can also run on a cheap VPS.

I already have some experience with the Bitcoin P2P protocol, so I have an idea of patching Abe to load blocks and transactions directly from the network. In that case, Abe would need only a (trusted?) bitcoin node to connect to on port 8333. Unfortunately, my networking code does not do any block/transaction validation; it just receives messages and parses them into Python objects. So my question is: when I feed Abe this deserialized data from the P2P network, will Abe check everything necessary to keep a consistent index in the database?
hero member
Activity: 481
Merit: 529
John, will Abe check database consistency after starting up from an initial db import, and check that the db and the local blockchain in bitcoind are the same? I mean - isn't blindly using an export from some unknown entity a potential attack vector?

Abe verifies proof of work and, as of 0.6, transaction Merkle trees on import.  Yes, an export/import tool like this should come with caveats about trust.  There's a verify.py script (possibly out of date) that verifies the Merkle roots already loaded, and it would be simple to add proof-of-work checks there or as part of an import tool.  Of course, if it is part of a system for fast loading of a local, known good block chain, it's not so vulnerable.

Edit: By "verifies proof of work" I do not mean checking hashes against the target or difficulty, just verifying that the "previous block hash" is indeed the hash of the previous block's header.  Adding a target check would be nice, though challenging for alternative chains that represent target and proof differently.
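For reference, the Merkle check amounts to recomputing something like this over each block's transaction hashes and comparing with the stored root (an illustrative standalone version, not Abe's code):
Code:
import hashlib

def double_sha256(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(tx_hashes):
    """Bitcoin-style Merkle root over a list of binary transaction hashes."""
    hashes = list(tx_hashes)
    while len(hashes) > 1:
        if len(hashes) % 2:            # odd count: pair the last with itself
            hashes.append(hashes[-1])
        hashes = [double_sha256(hashes[i] + hashes[i + 1])
                  for i in range(0, len(hashes), 2)]
    return hashes[0]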
legendary
Activity: 1386
Merit: 1097
John, will Abe check database consistency after starting up from an initial db import, and check that the db and the local blockchain in bitcoind are the same? I mean - isn't blindly using an export from some unknown entity a potential attack vector?

At least the torrent file has a checksum, so anybody who trusts me can trust the torrent download, too. But it would be nice to know that Abe is checking it by itself...
hero member
Activity: 481
Merit: 529
Yup, or someone with time to spare might write export and import functions, dumping and loading the data in a bitcoin-specific, db-neutral format.  If that runs pretty fast, write a translator from block files to that format, and it might approach the speed of torrent+mysql for the initial load.  The main thing, I suspect, is to create indexes after the tables have data.
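For example (a sketch in MySQL syntax; the index name is made up, and you would do this for each large table):
Code:
-- Load the table data first, then build indexes afterward; bulk index
-- creation is much faster than maintaining indexes row by row:
CREATE INDEX x_txout_pubkey ON txout (pubkey_id);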
full member
Activity: 127
Merit: 100
So those of us who have Abe running on a hosted machine with bandwidth to spare could offer a "live" dump to speed things up :)

Or we could just seed on the torrent.
hero member
Activity: 481
Merit: 529
It's not that the dumps would be different, but the file offset stored in the datadir table would probably not be at a block boundary in the local block file.  The solution is to reset the pointer.  Abe should have a command-line option for this or even do it automatically, but currently we do it with:
Code:
UPDATE datadir SET blkfile_number = 1, blkfile_offset = 0;
The next run will spend a few(?) minutes scanning the block file, skipping blocks already loaded via the dump.

Thanks for the torrent!
legendary
Activity: 1386
Merit: 1097
MORA, actually I don't know how Abe handles bitcoind's blockchain, but I would be really surprised if two initial imports of the blockchain led to two different database structures.
full member
Activity: 127
Merit: 100
Have you tested whether this dump works?
I read somewhere in the documentation that Abe does not handle changes to the block file too well. Are the block files on two systems guaranteed to be identical, i.e. will the data point to the right locations in files generated on a different system?
legendary
Activity: 1386
Merit: 1097
Two weeks ago I installed Abe on one of my VPSes for running an Electrum server. Although it's a quad-core Xeon, it has poor I/O performance, so the initial indexing took around four days. A few days ago, my database crashed and Abe ended up in an inconsistent state for some reason, which forced me to reindex the whole blockchain again (= another four days of waiting).

Because of this experience, I decided to provide a MySQL dump of Abe to the public, as a torrent file. If you want to install Abe, feel free to download the following file; it's a clean blockchain index up to block 160095:  http://mining.bitcoin.cz/media/download/abe-160095.sql.gz.torrent
hero member
Activity: 481
Merit: 529
I plan to release the monitor part and announce the website at the same time, and then after 1-2 weeks of public testing I will release the website code itself.
If you stick to this schedule, there is no problem.

But since the only interface is indeed the database, both parts of the solution can be replaced and still work.
In theory, yes, but I would not be surprised if a court still considered the combination a work based on Abe.  RMS discussed this in 1992 in regard to linking executables and libraries.  I think the AGPL is designed to reproduce the situation in the context of online services.

Quote from: Richard Stallman
What the lawyer said surprised me; he said that judges would consider
such schemes to be "subterfuges" and would be very harsh toward
them.  He said a judge would ask whether it is "really" one program,
rather than how it is labeled.
But it seems to me you intend to stay within the spirit of the license, so I apologize for going off topic. :)
full member
Activity: 127
Merit: 100
Now, it gets a little hairy if you offer a proprietary service based on Abe's tables, and it needs a running Abe to keep those tables up to date.  Maybe the law would consider that a "work based on" Abe even though the service only directly reads the tables.  If in doubt, describe your plan to me.  If I find it in keeping with the spirit of collaboration and the goals of Abe and Bitcoin, I will write a license exception giving it explicit permission to use Abe.

Yes, since it needs the data, one could argue that it's a work based on Abe.
But since the only interface is indeed the database, both parts of the solution can be replaced and still work.

However, I would consider the schema work to be covered by the AGPL, and unique at the time it was published.