Hi Skeeter,
I've played around with parallelizing verification (though not as part of Bitcoin Core), and once past 100k blocks I can pin all eight cores of my CPU constantly. So there is definitely a benefit to parallel verification using a fairly large number of cores. There are a few complications, though. Here are my thoughts:
Transactions refer to earlier transactions' outputs, so it is difficult to verify out-of-order unless you can do fast lookups (meaning memory bandwidth is going to wreck your performance). Further, within a block, transactions are allowed to refer to earlier transactions' outputs, but not vice-versa, so to be consensus-correct you need to verify them in order. You can solve this by doing two passes: one to make sure that all transaction references are valid (this can be done in about 30 minutes on my CPU with my crappy Rust code), and another to actually do the signature verifications. However, this is essentially what Bitcoin Core already does for the majority of the chain (full signature verification could take a couple of days), so you wouldn't gain anything over the existing Bitcoin Core behaviour.
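To make the ordering constraint concrete, here is a minimal sketch of that first reference-checking pass. The Tx/OutPoint/TxOut types are hypothetical stand-ins I made up for this email, not rust-bitcoin or Bitcoin Core APIs, and coinbase handling is elided:

    use std::collections::HashMap;

    // Hypothetical stand-in types; real code would use something like
    // rust-bitcoin's Transaction/OutPoint/TxOut instead.
    #[derive(Clone, PartialEq, Eq, Hash)]
    struct OutPoint { txid: [u8; 32], vout: u32 }
    #[derive(Clone)]
    struct TxOut { value: u64 }
    struct Tx { txid: [u8; 32], inputs: Vec<OutPoint>, outputs: Vec<TxOut> }

    // Pass one: a single in-order scan checking that every input refers
    // to an output that already exists. Outputs only become spendable
    // *after* their transaction is processed, so forward references
    // within a block are rejected, matching consensus. (Coinbase
    // handling is elided.)
    fn check_references(txs: &[Tx], utxos: &mut HashMap<OutPoint, TxOut>) -> bool {
        for tx in txs {
            // Every input must already exist in the output set...
            if !tx.inputs.iter().all(|op| utxos.contains_key(op)) {
                return false;
            }
            // ...and is consumed exactly once, so in-block double
            // spends fail too.
            for op in &tx.inputs {
                utxos.remove(op);
            }
            // Only now do this transaction's outputs become spendable.
            for (i, out) in tx.outputs.iter().enumerate() {
                utxos.insert(OutPoint { txid: tx.txid, vout: i as u32 }, out.clone());
            }
        }
        true
    }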
One thing you could do (and what I do in my own code) is to break consensus-compatibility and verify all transactions in each block in parallel. For blocks with hundreds or thousands of transactions there is a big benefit here. The way to do it (a rough sketch in Rust follows the list) is:
1. Scan the block and blindly add all transaction outputs to your output set in a linear pass. This makes sure that backreferences will work, but it will also allow forward references. (It occurred to me while typing this that it's easy to label the outputs with an index into the block, so you can still easily detect forward refs. Oops, I will fix my code to do this.)
2. Split all the transactions in the block into chunks of N_TRANSACTIONS/N_CORES, and pass each chunk to a new thread for verification. Each thread will need to be able to look up old outputs, so you will need to make sure your output set is immutable during this phase.
3. Scan the block and delete all spent outputs from your output set in a linear pass.
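Concretely, here is how I'd structure those three passes, again with the same hypothetical stand-in types (redeclared so the snippet stands alone). Scoped threads are one way to do the chunking in step 2, not the only way, and the index-labelling fix from step 1 is omitted:

    use std::collections::HashMap;
    use std::thread;

    // Hypothetical stand-in types; real code would use something like
    // rust-bitcoin's Transaction/OutPoint/TxOut instead.
    #[derive(Clone, PartialEq, Eq, Hash)]
    struct OutPoint { txid: [u8; 32], vout: u32 }
    #[derive(Clone)]
    struct TxOut { value: u64 }
    struct Tx { txid: [u8; 32], inputs: Vec<OutPoint>, outputs: Vec<TxOut> }

    impl Tx {
        // Placeholder: a real implementation would execute scripts and
        // verify signatures; here we only check the referenced outputs exist.
        fn verify(&self, utxos: &HashMap<OutPoint, TxOut>) -> bool {
            self.inputs.iter().all(|op| utxos.contains_key(op))
        }
    }

    fn verify_block_parallel(
        txs: &[Tx],
        utxos: &mut HashMap<OutPoint, TxOut>,
        n_cores: usize,
    ) -> bool {
        // Pass 1: linear scan blindly adding every output, so that
        // backreferences resolve. (Forward references are allowed too;
        // the index-labelling fix from step 1 is omitted here.)
        for tx in txs {
            for (i, out) in tx.outputs.iter().enumerate() {
                utxos.insert(OutPoint { txid: tx.txid, vout: i as u32 }, out.clone());
            }
        }

        // Pass 2: split into chunks of ~N_TRANSACTIONS/N_CORES and verify
        // on scoped threads. The shared (immutable) borrow here is what
        // guarantees the output set cannot change during this phase.
        let utxos_ro: &HashMap<OutPoint, TxOut> = utxos;
        let chunk_size = (txs.len() / n_cores).max(1);
        let all_valid = thread::scope(|s| {
            let handles: Vec<_> = txs
                .chunks(chunk_size)
                .map(|batch| s.spawn(move || batch.iter().all(|tx| tx.verify(utxos_ro))))
                .collect();
            handles.into_iter().all(|h| h.join().unwrap())
        });

        // Pass 3: linear scan deleting every output this block spent.
        for tx in txs {
            for op in &tx.inputs {
                utxos.remove(op);
            }
        }
        all_valid
    }

A nice side effect of doing it in Rust is that the immutability requirement in step 2 falls out of the type system: the verification threads hold a shared borrow of the output set, so the compiler rejects any mutation during that phase.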
I expect that by using a concurrent data structure for your output set you can merge some or all of these passes. There is room for experimentation and measurement here.
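As one possibility (an untested assumption on my part), a sharded concurrent map such as the dashmap crate, combined with rayon, turns the insert and delete scans into parallel passes as well, reusing the hypothetical types from the sketches above:

    // Assumed Cargo dependencies: dashmap = "5", rayon = "1".
    use dashmap::DashMap;
    use rayon::prelude::*;

    // Same hypothetical stand-in types as in the earlier sketches.
    #[derive(Clone, PartialEq, Eq, Hash)]
    struct OutPoint { txid: [u8; 32], vout: u32 }
    #[derive(Clone)]
    struct TxOut { value: u64 }
    struct Tx { txid: [u8; 32], inputs: Vec<OutPoint>, outputs: Vec<TxOut> }

    // All three passes become parallel iterations over the block: DashMap
    // shards its locks internally, so concurrent inserts, reads and
    // removes are safe without one big mutex around the whole set.
    fn verify_block_merged(txs: &[Tx], utxos: &DashMap<OutPoint, TxOut>) -> bool {
        // Pass 1, parallelized: concurrent inserts instead of a linear scan.
        txs.par_iter().for_each(|tx| {
            for (i, out) in tx.outputs.iter().enumerate() {
                utxos.insert(OutPoint { txid: tx.txid, vout: i as u32 }, out.clone());
            }
        });
        // Pass 2: placeholder verification, as before (scripts/sigs elided).
        let all_valid = txs
            .par_iter()
            .all(|tx| tx.inputs.iter().all(|op| utxos.contains_key(op)));
        // Pass 3, parallelized: concurrent removal of spent outputs.
        txs.par_iter().for_each(|tx| {
            for op in &tx.inputs {
                utxos.remove(op);
            }
        });
        all_valid
    }

Fully fusing the insert pass into the verification pass would take more care, since a verifying thread could race ahead of the thread inserting the output it needs; that is exactly where the measurement comes in.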
Andrew