
Topic: An optimal engine for UTXO db (Read 3171 times)

legendary
Activity: 2128
Merit: 1073
March 13, 2016, 01:27:54 AM
#42
If it has nothing to do with DBMS, then why did you keep doing the database class thing?
Because the title of this thread is about DBMS a.k.a. DB engines.

Because the data stored is actually about money.

You simply have zero experience dealing with financial data processing, and actually less than zero when it comes to legally responsible, fiduciary-duty-bound handling of other people's money.

Therefore what you are doing is just building educational toys. I have nothing against you designing and playing with those toys. But if you think that you are making something that will become a real financial product, you are setting yourself up for a CIYAM-level disappointment.

https://bitcointalksearch.org/topic/m.13995422

Quote from: CIYAM
The CIYAM project has got a lot of criticism about being extremely hard to understand and I'd like to explain exactly why this is.

Initially the project was a closed-source one originating from Australia (where I grew up), but once I moved to China in late 2006 I decided that, due to the problems of enforcing copyright in China (i.e. no chance), I would need to make the software "hard to understand" (a sort of security-by-obscurity approach).

So when I created the initial open source version of CIYAM it inherited years of such development that I had worked on in China (around 6 years).

I no longer actually support the ideas of patent and copyright so I actually don't care too much now about that - what I worked out how to do was to just make a software system so difficult to understand that those things are irrelevant (no-one has even bothered to try and fork my project and modify it).
sr. member
Activity: 467
Merit: 267
March 13, 2016, 01:05:26 AM
#41
To answer the OP, it is difficult to decide what makes an optimal engine for the UTXO set because, as usual, one has to make design trade-offs.

1. If you are interested in super-fast import, i.e. processing the block chain in the shortest time, you are dealing with a large number of inserts/deletes of key-value pairs where the key contains a crypto hash. The keys are essentially random values and will fall all over the place. The net amount of data fits in an average development machine, so you can work entirely in RAM. However, bitcoin core can't assume that you have several GB of RAM available!

Here, the main problem with leveldb is that its engine uses skip lists, whose average search complexity is O(log n). You'd be better off with a hash table, which is O(1) on average. For the record, skip lists and B-trees are used when we need to retrieve the keys in order, and that's not the case here.
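To make the contrast concrete, here is a minimal sketch (not leveldb's or bitcoin core's actual code, and all names here are hypothetical) of an open-addressing hash table keyed by (txid, vout). Because txids are already uniformly random, a few of their bytes can be used directly as the bucket index, so lookups are O(1) on average with no comparison-based searching at all:

#include <stdint.h>
#include <string.h>

#define UTXO_SLOTS (1u << 20)   /* power of two; a real UTXO set needs a bigger, resizable table */

struct utxo_entry {
    uint8_t  txid[32];          /* all-zero txid marks an empty slot */
    uint32_t vout;
    uint64_t value;
};

static struct utxo_entry utxo_table[UTXO_SLOTS];

static uint32_t utxo_bucket(const uint8_t txid[32], uint32_t vout)
{
    uint32_t h;
    memcpy(&h, txid, sizeof(h));            /* txid bytes are already random */
    return (h ^ vout) & (UTXO_SLOTS - 1);
}

/* Linear-probing lookup: returns the entry, or NULL if not present.
   (Deletion needs tombstones and the table needs growth handling;
   both are omitted to keep the sketch short.) */
static struct utxo_entry *utxo_find(const uint8_t txid[32], uint32_t vout)
{
    static const uint8_t zero[32];
    uint32_t i = utxo_bucket(txid, vout);
    for (;;) {
        struct utxo_entry *e = &utxo_table[i];
        if (memcmp(e->txid, zero, 32) == 0)
            return NULL;                    /* empty slot: key is absent */
        if (e->vout == vout && memcmp(e->txid, txid, 32) == 0)
            return e;
        i = (i + 1) & (UTXO_SLOTS - 1);
    }
}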

Because blocks depend on previous blocks, processing them in parallel isn't easy. I'm not even sure it helps (block validation excluded).

2. If you are building a browser, then you can precalculate the hell out of your data set. The usual database tricks apply: linearize the key space, compress values, use Bloom filters, partition your data, etc. You are working with data that changes only every 10 minutes, so it is essentially static. If you need the unconfirmed transactions, add them on top. This case isn't really what a DB is designed for anyway: a lot of effort goes into concurrency and durability, and when those are not crucial, faster implementations exist, especially if you can choose the hardware.
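As one concrete example of those tricks, a Bloom filter in front of the pre-calculated data set can answer "definitely not present" for most random txid probes without touching the main index. This is only an illustrative sketch with made-up sizing, not code from any particular explorer; real deployments size the bit array and hash count from the record count and the target false-positive rate:

#include <stdint.h>
#include <string.h>

#define BLOOM_BITS   (1u << 27)             /* 128 Mbit = 16 MB filter (illustrative) */
#define BLOOM_HASHES 4

static uint8_t bloom[BLOOM_BITS / 8];

/* A txid is already a uniformly random 256-bit value, so disjoint 32-bit
   slices of it can serve as the k "independent" hash values. */
static uint32_t bloom_slice(const uint8_t txid[32], int k)
{
    uint32_t h;
    memcpy(&h, txid + 4 * k, sizeof(h));
    return h & (BLOOM_BITS - 1);
}

static void bloom_add(const uint8_t txid[32])
{
    for (int k = 0; k < BLOOM_HASHES; k++) {
        uint32_t bit = bloom_slice(txid, k);
        bloom[bit >> 3] |= (uint8_t)(1u << (bit & 7));
    }
}

/* Returns 0 if the txid is definitely absent, 1 if it is possibly present. */
static int bloom_maybe_contains(const uint8_t txid[32])
{
    for (int k = 0; k < BLOOM_HASHES; k++) {
        uint32_t bit = bloom_slice(txid, k);
        if (!(bloom[bit >> 3] & (1u << (bit & 7))))
            return 0;
    }
    return 1;
}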

3. If it's for checking transactions in a full node, there are relatively few of them. Once you have extracted the UTXO set, we are talking about a few million records, which is peanuts by today's standards.

In conclusion, leveldb is fine for its usage in bitcoin core. The choice of a skip-list vs. hash-table engine is questionable, but I don't think it makes a big difference for most users.

Maybe you could describe your usage?
legendary
Activity: 1176
Merit: 1134
March 13, 2016, 12:21:23 AM
#40
If you think I am ignoring overall system speed, then you are not really reading my posts.
I also like kokoto's requirements, and my dataset is designed to get good overall system performance for those, and also to serve block-explorer-level queries from the same dataset, along with lower-level blockchain queries.

Not sure what your point is. Maybe you can try to run a DB system written in JS? I don't know of any that is anywhere near fast enough.

It is not a choice between DB and no DB, but rather between no-DB and nothing, as my requirement is for a system that works from a Chrome app.

James
You are making 2 unrelated mistakes:

1) Mixing up hardware speed with software speed. On fast hardware it is OK to have slow software.

2) Your tests only cover sync speed (both initial sync and re-sync after going temporarily offline). You don't test continuous operation or chain reorganization.

This has nothing to do with using or not using a DBMS. It is a more fundamental question of what your system will be capable of when finished.
 

1) Of course on fast hardware it is OK to have slow software, but slow software on slow hardware is not acceptable. And if you have to write fast software for the slow system anyway, there is no point in writing slow software, is there?

2) Initial sync and re-sync are very important. Continuous operation will run directly out of RAM, combining data from the read-only dataset. And it is not so easy to test things before they are running. I am assuming that chain reorgs won't go past 100 blocks very often. In the event one does, you would have a few minutes of downtime to regenerate the most recent block.

If it has nothing to do with DBMS, then why did you keep doing the database class thing?

As far as the performance of iguana when it is complete, it will be fast at things that are typically very slow, e.g. importprivkey.
legendary
Activity: 2128
Merit: 1073
March 13, 2016, 12:13:25 AM
#39
If you think I am ignoring overall system speed, then you are not really reading my posts.
I also like kokoto's requirements, and my dataset is designed to get good overall system performance for those, and also to serve block-explorer-level queries from the same dataset, along with lower-level blockchain queries.

Not sure what your point is. Maybe you can try to run a DB system written in JS? I don't know of any that is anywhere near fast enough.

It is not a choice between DB and no DB, but rather between no-DB and nothing, as my requirement is for a system that works from a Chrome app.

James
You are making 2 unrelated mistakes:

1) Mixing up hardware speed with software speed. On fast hardware it is OK to have slow software.

2) Your tests only cover sync speed (both initial sync and re-sync after going temporarily offline). You don't test continuous operation or chain reorganization.

This has nothing to do with using or not using a DBMS. It is a more fundamental question of what your system will be capable of when finished.
 
legendary
Activity: 1176
Merit: 1134
March 12, 2016, 11:59:41 PM
#38
Otherwise we would still be using the slowest CPU that will eventually complete the tasks.
This is a self-refuting argument.

Hardware gets faster, therefore people are willing to put up with slower software.

Do not confuse hardware speed with software speed when discussing the speed of the whole system.

Edit: A good example of clear thinking from somebody who isn't doing software development professionally:
I think it's very important to be able to browse through all the records in the shortest possible time.

I disagree; the major requirements are (see the interface sketch after this list):

verify utxo spend
verify utxo amount
remove used txo
add new utxos from block
revert utxos on reorganization
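A rough sketch of how those five requirements map onto an interface (the names and types here are hypothetical, chosen only to show how small the required surface is; this is not iguana's or bitcoin core's actual API):

#include <stdint.h>

struct utxo_key  { uint8_t txid[32]; uint32_t vout; };
struct utxo_data { uint64_t value; uint32_t height; uint8_t script_hash[20]; };

/* verify utxo spend: does this output exist and is it still unspent? */
int utxoset_is_unspent(const struct utxo_key *k);

/* verify utxo amount: fetch the recorded value (and metadata) for validation. */
int utxoset_get(const struct utxo_key *k, struct utxo_data *out);

/* remove used txo: called for each input of a confirmed transaction. */
int utxoset_spend(const struct utxo_key *k, uint32_t spend_height);

/* add new utxos from block: called for each output of each transaction in the block. */
int utxoset_add(const struct utxo_key *k, const struct utxo_data *d);

/* revert utxos on reorganization: undo all changes at or above the given height. */
int utxoset_rollback(uint32_t height);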

If you think I am ignoring overall system speed, then you are not really reading my posts.
I also like kokoto's requirements, and my dataset is designed to get good overall system performance for those, and also to serve block-explorer-level queries from the same dataset, along with lower-level blockchain queries.

Not sure what your point is. Maybe you can try to run a DB system written in JS? I don't know of any that is anywhere near fast enough.

It is not a choice between DB and no DB, but rather between no-DB and nothing, as my requirement is for a system that works from a Chrome app.

James
legendary
Activity: 2128
Merit: 1073
March 12, 2016, 11:37:05 PM
#37
Otherwise we would still be using the slowest CPU that will eventually complete the tasks.
This is a self-refuting argument.

Hardware gets faster, therefore people are willing to put up with slower software.

Do not confuse hardware speed with software speed when discussing the speed of the whole system.

Edit: A good example of clear thinking from somebody who isn't doing software development professionally:
I think it's very important to be able to browse through all the records in the shortest possible time.

I disagree; the major requirements are:

verify utxo spend
verify utxo amount
remove used txo
add new utxos from block
revert utxos on reorganization
legendary
Activity: 1176
Merit: 1134
March 10, 2016, 12:36:02 PM
#36
Who said there was a scaling problem with bitcoin?
Me. And I am ready to repeat it again.
Please be specific.

What part of bitcoin makes it unable to scale?

I have solved the "takes too long to sync" issue. I think with interleaving, a much greater tx capacity is possible. Using BTC as the universal clock for all the other cryptos will let smaller chains deal with lower-value transactions. And now there is a way to use the GPU for blockchain operations.

James
legendary
Activity: 1260
Merit: 1019
March 10, 2016, 08:42:13 AM
#35
Who said there was a scaling problem with bitcoin?
Me. And I am ready to repeat it again.
legendary
Activity: 1176
Merit: 1134
March 10, 2016, 06:55:34 AM
#34
Thanks. I like it.
If you want really fast speed, you can make a CUDA/OpenSSL version and use a dedicated core per bundle, among other optimizations.

With almost all the data being read-only, the biggest challenge of GPU coding (out-of-sync data between cores) is already solved.

It might take a few GPU cards to store the whole dataset, but then any RPC that scans the entire blockchain completes in milliseconds.

Using the CPU for the latest bundle can create a system that keeps up with new blocks.

Who said there was a scaling problem with bitcoin?

James
legendary
Activity: 2053
Merit: 1356
aka tonikt
March 08, 2016, 04:00:54 PM
#33
Thanks. I like it.
legendary
Activity: 1176
Merit: 1134
March 08, 2016, 03:52:26 PM
#32
So say, @james, you only have the block data, from 1 to 400000, stored close by, on an SSD disk or whatever, say a ramdisk.
How long does it "load" before you have all the database content as of block 400000?
Do you mean reprocessing all the raw data from scratch? With 8 cores that would take 30 minutes or less, depending on whether you validate all sigs from block 0 or not, maybe a bit longer.

If you have all the processed bundle files, then in the time it takes to memory-map 200 files, the data set is accessible to the CPU. So that is up to the OS/HDD, but I am seeing it map all the files in a few seconds on a low-end laptop and in less than a second on a VPS server.

Once the bundle files are mapped, block-explorer-level (not just raw blockchain) data is available for all addresses, blocks, and transactions.
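For reference, mapping one read-only bundle file on a POSIX system is only a few calls; the OS pages the data in lazily, which is why mapping ~200 bundles can take just seconds. The function below is a generic sketch (the header layout and file naming of iguana's real bundles are documented on the iguana board, not reproduced here):

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a read-only bundle file; returns the mapping (and its length via lenp),
   or NULL on failure. The mapping remains valid after the fd is closed. */
void *map_bundle(const char *path, size_t *lenp)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return NULL;
    *lenp = (size_t)st.st_size;
    return p;
}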

James
legendary
Activity: 2053
Merit: 1356
aka tonikt
March 08, 2016, 03:18:36 PM
#31
So say, @james, you only have the block data, from 1 to 400000, stored close by, on an SSD disk or whatever, say a ramdisk.
How long does it "load" before you have all the database content as of block 400000?
legendary
Activity: 2053
Merit: 1356
aka tonikt
March 08, 2016, 02:54:56 PM
#30
This is my preferred expression.
Thank you.

I think it's nice and I will look into it more.
legendary
Activity: 1176
Merit: 1134
March 08, 2016, 02:51:40 PM
#29
Assuming you already have the blockchain internally, it might be easier to just directly generate the bundle files. The key is the encoding, and if interop is not a constraint, then you can just use a similar model without making it exactly compatible.
Sounds good, but can you express it in a way that I can understand: what would the software actually need to do?
To encode the tx data in the following format:


struct iguana_txid { bits256 txid; uint32_t txidind,firstvout,firstvin,locktime,version,timestamp,extraoffset; uint16_t numvouts,numvins; } __attribute__((packed));

struct iguana_unspent { uint64_t value; uint32_t txidind,pkind,prevunspentind,scriptoffset; uint16_t hdrsi:14,type:4,vout:14; } __attribute__((packed));

struct iguana_spend { uint32_t spendtxidind,scriptoffset:31,coinbase:1; int16_t prevout; uint16_t numsigs:4,numpubkeys:4,p2sh:1,sighash:4,external:1,sequenceid:2; } __attribute__((packed));

struct iguana_pkhash { uint8_t rmd160[20]; uint32_t pkind,pubkeyoffset; } __attribute__((packed));
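
As a purely illustrative example of how these flat records might be consumed, and assuming (this is my reading of the field names, not a documented guarantee) that a bundle stores its iguana_unspent records in one contiguous array with firstvout/numvouts indexing into it, summing a transaction's output values would look roughly like this:

/* Illustrative only; uses the structs above and the layout assumption stated. */
uint64_t tx_total_output(const struct iguana_txid *tx,
                         const struct iguana_unspent *unspents)
{
    uint64_t total = 0;
    for (uint16_t i = 0; i < tx->numvouts; i++)
        total += unspents[tx->firstvout + i].value;
    return total;
}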

I describe this in the iguana child board in a lot more detail

Since I am not you, I do not know what expression requirements exist to allow proper parsing and semantic understanding by you.

Maybe you can ask the specific questions after reading the iguana threads?

James
legendary
Activity: 2053
Merit: 1356
aka tonikt
March 08, 2016, 02:44:44 PM
#28
Assuming you already have the blockchain internally, it might be easier to just directly generate the bundle files. The key is the encoding, and if interop is not a constraint, then you can just use a similar model without making it exactly compatible.
Sounds good, but can you express it in a way that I can understand: what would the software actually need to do?
legendary
Activity: 1176
Merit: 1134
March 08, 2016, 01:13:25 PM
#27
James, can you please describe how you actually implemented your db?
I'm especially interested in the index part, which (according to you) fits into the CPU's cache.

I've been trying to read through your code, but it's shitloads of code and I don't have that much patience.
I once implemented sipa's secp256k1 functions in Go, so I do have some patience... but it has its limits.

Say I'd like to implement the read-only database you made in another language - what does it take?

You would need a high-end CPU to fit it all in the L caches, and even then not all of the "DB" would fit into CPU cache unless you partitioned the queries for the different datasets across different cores and strongly influenced the contents of each core's cache.

iguana_ramchain.c has most of what is needed, but right now it is a construction zone, as I am changing the data structures to support lossless encoding/decoding, and it is quite tricky to do that with all the constraints.

Assuming you already have the blockchain internally, it might be easier to just directly generate the bundle files. The key is the encoding, and if interop is not a constraint, then you can just use a similar model without making it exactly compatible. Of course, having an independent implementation would be a fantastic cross-check, so if you are up for that we can work out a strict memory layout. Then you will know you made things bug-free if it produces an identical file, and debugging is quite the chore without this...

I am starting to describe the internal data formats in some detail: https://bitco.in/forum/forums/iguana.23/

The codebase is highly tuned and optimized for the parallel sync, and because of this I have each peer's thread create the first-pass file(s). Then helper threads aggregate a bundle and generate the bundle file. ramchain_iterate handles both the raw and expanded forms, so a port of iguana_ramchain.c is probably a good place to start.

The functions ending with PT indicate they run in a peer's thread, and not all functions are needed. The bundlesaveHT function runs from a helper thread and is the one doing the combining.

James
legendary
Activity: 2053
Merit: 1356
aka tonikt
March 08, 2016, 12:56:32 PM
#26
James, can you please describe how you actually implemented your db?
I'm especially interested in the index part, which (according to you) fits into the CPU's cache.

I've been trying to read through your code, but it's shitloads of code and I don't have that much patience.
I once implemented sipa's secp256k1 functions in Go, so I do have some patience... but it has its limits.

Say I'd like to implement the read-only database you made in another language - what does it take?
legendary
Activity: 1176
Merit: 1134
March 08, 2016, 12:29:21 PM
#25
This looks like the right way to go, but...

1) I failed to install the software on my Linux box following the docs, or to build it from the sources...

2) I don't think a TCP-based API is suitable for the needs of the DB we're discussing here.

How do you imagine browsing through the contents of the current 35M+ UTXO records with this TCP-based API?
If you just want an embeddable DB, http://sophia.systems/ is pretty good. I see they released a version 2 with a hybrid in-memory mode, which is good, as my feedback to them was that having everything on HDD made it too slow.

Sophia 1.2 was 10x smaller than BDB and much faster, so the 2.0 version should be that much better.

But nothing would have the performance/footprint profile of the read-only memory-mapped bundles. I just can't double the size of the code just to add a DB, especially as I could use a deterministic reverse hash chain for a wallet's keypairs. Loss of the wallet passphrase would compromise the privkeys, but that seems to be the same outcome as storing independent privkeys in a DB encrypted with the same passphrase.
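For what it's worth, one common reading of a deterministic reverse hash chain (this is only a sketch of the general technique, not iguana's actual scheme) is: iterate a hash over a seed, h[0] = H(seed), h[i] = H(h[i-1]), and hand the values out in reverse order, so a revealed secret only lets you recompute secrets that were already used, never the unused ones. Using OpenSSL's SHA256():

#include <openssl/sha.h>
#include <string.h>

#define CHAIN_LEN 1000   /* number of keypair secrets derivable from one seed */

/* Derive the secret handed out at index 'i' (i = 0 is the first one used),
   i.e. the (CHAIN_LEN-1-i)-th iterate of the seed. */
void reverse_chain_secret(const unsigned char seed[32], unsigned int i,
                          unsigned char out[32])
{
    unsigned char buf[32], tmp[32];
    SHA256(seed, 32, buf);                        /* h[0] */
    for (unsigned int n = 0; n + i + 1 < CHAIN_LEN; n++) {
        SHA256(buf, 32, tmp);                     /* h[n+1] = H(h[n]) */
        memcpy(buf, tmp, 32);
    }
    memcpy(out, buf, 32);
}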

Making slow but steady progress with the debugging and documenting it here: https://bitco.in/forum/threads/debugging-status.950/

James
legendary
Activity: 2053
Merit: 1356
aka tonikt
March 08, 2016, 11:47:31 AM
#24
This looks like the right way to go, but...

1) I failed to install the software on my Linux box following the docs, or to build it from the sources...

2) I don't think a TCP-based API is suitable for the needs of the DB we're discussing here.

How do you imagine browsing through the contents of the current 35M+ UTXO records with this TCP-based API?
legendary
Activity: 1176
Merit: 1134
March 07, 2016, 03:55:27 PM
#23
There are some Intel chips with 128MB of L4 cache (shared CPU/GPU). Obviously the further "away" it is from the cores, the slower it will be, but you would still expect it to be at least twice as fast as RAM.

http://www.extremetech.com/extreme/188776-how-l1-and-l2-cpu-caches-work-and-why-theyre-an-essential-part-of-modern-chips

https://forums.aida64.com/topic/2864-i7-5775c-l4-cache-performance/

The 10x to 40x speed improvement creates very nonlinear performance boosts when the dataset is small enough.

James