Robustly processing transactions from bitcoind

cdhowie

full member

Activity: 182

Merit: 107

Quote from: realnowhereman on May 15, 2011, 03:23:48 AM

That's not true. The client must consider alternate chains as both at least potentially valid, and therefore both copies of that one transaction must be potentially valid as well. Until the network has voted which chain to build on, either one could be "the one".

Your algorithm breaks down because you are requiring a one-to-one link with transactions and blocks, but there is a one-to-many relationship (for a short time at least) between transactions and blocks.

No, that's wrong. The client cannot maintain multiple "block chain candidates," and there is no "voting procedure." The client will always consider one chain to be the main chain, and that is the longest chain, or in the case of two equally-long chains, the one it was aware of first. There is a one-to-one link with transactions and blocks that are in the main chain (the main chain from the perspective of the client, of course).

realnowhereman

hero member

Activity: 504

Merit: 502

Quote from: cdhowie on May 06, 2011, 02:19:43 PM

Quote from: realnowhereman on May 06, 2011, 01:42:41 PM

Therefore you can't have transaction lists include a block hash, as it's possible for them to be in more than one, at least temporarily. Since it is blocks that get confirmed, each of those transaction-block links would also have a different number of confirmations.

Yes, but your client sees only one of those conditions. The client will not consider a transaction to be in two blocks, not in any meaningful way. I'm not quite sure what you are getting at here -- exactly which part of either algorithm I have presented breaks down in the case where an already-processed transaction changes blocks?

That's not true. The client must consider alternate chains as both at least potentially valid, and therefore both copies of that one transaction must be potentially valid as well. Until the network has voted which chain to build on, either one could be "the one".

Your algorithm breaks down because you are requiring a one-to-one link with transactions and blocks, but there is a one-to-many relationship (for a short time at least) between transactions and blocks.

cdhowie

full member

Activity: 182

Merit: 107

Quote from: realnowhereman on May 06, 2011, 01:42:41 PM

In principle, transactions can be in multiple blocks can't they?

If two miners both include the same transcation in a block based on a common ancestor?

If this happens, the block chain split and one of the blocks will be orphaned later when the network decides which chain to take. It's impossible for one block to be a descendant of another and have them both include the transaction; the descendent block would be invalid.

Quote from: realnowhereman on May 06, 2011, 01:42:41 PM

The validity or not of a transaction is independent of whether its in a block or not. Eventually one of those blocks will be built on by the rest of the network and the new "true" block tip decided upon. Strictly, it is blocks that are "confirmed" not transactions.

This statement is correct on its surface, but there is one thing you are not taking into account -- double-spending. Just because you have transaction X that spends coins Z doesn't mean that there isn't another transaction Y floating around the other half of the network spending the same coins Z. Once one of them makes it into a block, clients that receive that block will look at the other transaction and conclude it is invalid since it spends already-spent coins, and drop it (or mark it invalid, but either way it won't be treated as valid). The deeper a transaction is in your client's view of the block chain, the less likely that there is an alternate block-chain in existence that spends the same coins. So six deep was decided as a reasonable amount to consider a transaction "fully confirmed," since it implies that about an hour has passed without seeing an alternate chain. This is more than enough time to reconcile legitimate orphan block chains. (If someone split from the network with the intent to create an alternate block chain that would invalidate a transaction included over six levels deep in the main block chain, they will either need more computing power than the rest of the network or a crazy amount of luck.)

Quote from: realnowhereman on May 06, 2011, 01:42:41 PM

Therefore you can't have transaction lists include a block hash, as it's possible for them to be in more than one, at least temporarily. Since it is blocks that get confirmed, each of those transaction-block links would also have a different number of confirmations.

Yes, but your client sees only one of those conditions. The client will not consider a transaction to be in two blocks, not in any meaningful way. I'm not quite sure what you are getting at here -- exactly which part of either algorithm I have presented breaks down in the case where an already-processed transaction changes blocks?

realnowhereman

hero member

Activity: 504

Merit: 502

In principle, transactions can be in multiple blocks can't they?

If two miners both include the same transcation in a block based on a common ancestor?

The validity or not of a transaction is independent of whether its in a block or not. Eventually one of those blocks will be built on by the rest of the network and the new "true" block tip decided upon. Strictly, it is blocks that are "confirmed" not transactions.

Therefore you can't have transaction lists include a block hash, as it's possible for them to be in more than one, at least temporarily. Since it is blocks that get confirmed, each of those transaction-block links would also have a different number of confirmations.

cdhowie

full member

Activity: 182

Merit: 107

Quote from: BitterTea on May 06, 2011, 01:07:00 PM

I'm not clear the goal of this change, nor what is changing. Is this regarding how a node deals with transactions from orphaned blocks, due to forks?

hmm... Nevermind, after looking at the github page I think I know what this is. Basically, if you have a system (like say Blockexplorer) that needs to look at all blocks/transactions, this is a way to make sure you get every one?

It's more geared towards ecommerce sites or sites that allow users to, say, deposit money into their account (think witcoin or betco.in). It's a way to, with one command and minimal effort, get a list of stuff you haven't seen yet so you can update your database.

BitterTea

sr. member

Activity: 294

Merit: 252

I'm not clear the goal of this change, nor what is changing. Is this regarding how a node deals with transactions from orphaned blocks, due to forks?

hmm... Nevermind, after looking at the github page I think I know what this is. Basically, if you have a system (like say Blockexplorer) that needs to look at all blocks/transactions, this is a way to make sure you get every one?

cdhowie

full member

Activity: 182

Merit: 107

After more conversation in #bitcoin-dev and some hacking, this has evolved into a different patch. The code changes are a bit bigger, but this will simplify reconciliation code tremendously, by requiring only one RPC call to obtain all of the transactions that need to be processed.

cdhowie

full member

Activity: 182

Merit: 107

I've been rolling around several algorithms for processing transactions in a robust way and have had several ideas, none of which were very good. Last night I talked about them in #bitcoin-dev and got some good feedback, and I think I've arrived at one that will work well and require only one minimal change to bitcoind. (In fact, I have already made it and tested the patch.)

The problem is how to reconcile transactions received by bitcoind with the state in another database, without even the possibility of missing a transaction or processing the same transaction twice. This solution takes into account that a previously-processed transaction can be removed from the block chain and re-inserted into another block later, if the bitcoind discovers that another, stronger chain exists. To support this detection, bitcoind must be modified to include the block hash in listtransactions output. Let me explain the process.

First, you must have some table where you store processed transactions. The table should contain bitcoin transaction ID and block ID (hash, not sequence number) columns. Then, you poll bitcoind periodically like so:

Request the last N transactions, where N is an arbitrary number.
Walk the returned list backwards (most recent transaction first) looking for transactions that you consider confirmed (1 or 3 or 6 confirmations, or whatever your requirements indicate). For each transaction:
- Look up the transaction ID in your transaction table. If it does not exist, this is a new transaction; process it and insert the transaction ID and block ID into the table. (Using transactions, of course, to force the processing to be atomic.)
- If the transaction ID does exist, compare the block hash in your table to the block hash in the transaction returned from bitcoind. If they match, then you have processed all new transactions and should now terminate/return. If they do not match, this is a transaction that has been processed already, but has been moved into a new block due to orphaning of the original block you saw the transaction in. Update the block hash ID in the table and continue processing, since it's possible that a new transaction still exists prior to this one in the block chain.
If you reach the beginning of the transaction list then you haven't found any transaction that you've already processed and that has the correct block hash. Increase N (add something to it or multiply it by something) and start the whole process over.

I believe this to be the simplest process that will correctly account for any changes in the block chain. If you can see a problem with this algorithm, or a way that it can be simplified (especially if changes to bitcoind aren't necessary) please let me know.

Topic: Robustly processing transactions from bitcoind (Read 1687 times)