There have been a bunch of discussions on IRC lately with people who want to build non-financial distributed quorum schemes on top of BitCoin, but don't understand how to leverage BitCoins mining ecosystem. The design for this was
laid out by Satoshi some time ago but not in sufficient detail for people to actually implement, so these people usually say they'll just stuff their unrelated data into the financial block chain regardless of what the community wants. It makes nobody happy.
In this post I'll try and lay out the rationale and specific work needed for multiple independent chains that share hash power. I'll also outline how to allow people to pay for things on alternative chains with BitCoins.
RationaleThe block chain is a large, complex data structure shared between many people. Verifying its integrity requires many slow (and thus expensive) operations like ECDSA verifications and disk seeks.
Everyone who takes part in BitCoin today runs a node and thus maintains the whole thing. Whilst in future end-users will probably use more efficient but slightly less secure modes (SPV), merchants and miners will probably always pay the full costs for the upkeep of this shared data structure. All these people today are united by a common interest in a new form of payments. They may or may not also be interested in schemes for distributed DNS, SSL certs, voting, secure timestamping and so on.
So one reason to keep the BitCoin chain for finance is it's unfair to externalize the costs of unrelated schemes onto people who are interested only in payments - the canonical example being merchants. Increasing the cost of accepting BitCoins for goods and services hurts everyone who uses the system by reducing the number of merchants or increasing their costs.
Another reason is that it's just bad engineering to cram every possible distributed quorum into the design constraints of BitCoin. The transaction format BitCoin uses is flexible but ultimately designed for finance. You cannot leave out the "value" field. It's always there, even if it'd make no sense for your particular application.
One final reason is that Satoshi was opposed to putting non-BitCoin related data into the main chain. As creator of the system, his opinion should carry a lot of weight with anyone serious about extending it.
Fortunately we can have our cake and eat it too.
Designing a new systemTo create a new network that uses Nakamoto block chains, you need to start by defining what a transaction in your new network means. BitCoin transactions split and combine value using scripts, but yours don't have to do that. They could be anything. Here's an example of a simple DNS style "transaction" described using
Google protocol buffers syntax.
message NameTx {
required string name = 1;
repeated bytes ip_addresses = 2;
required bytes pubkey = 3;
enum ClaimType {
NEW_NAME,
TRANSFER
}
required ClaimType type = 4;
// Only used if type == TRANSFER. The new owner of the name and signature proving current ownership.
optional bytes dest_pubkey = 5;
optional bytes signature = 6;
}
This is very different to a BitCoin style transaction. There's no concept of inputs, outputs, addresses, values or scripts. It's got what you need for a very simple DNS style scheme (that lacks abuse controls etc) and nothing more. There's also no concept of a "coinbase" transaction.
The block definition is also up to you. It does not have to be formatted in the same way as Satoshi chose, but you need most of the same conceptual parts. You also need to store something else, some data from BitCoin - we'll get to why in a moment. Here's a minimal example of what's needed:
message NameBlock {
required bytes prev_block_hash = 1;
required uint32 difficulty = 2;
required uint32 time = 3;
repeated NameTx transactions = 4;
}
message NameSuperBlock {
required NameBlock name_data = 1;
required BitCoinData bitcoin_data = 2;
}
It's important to observe what we don't have as well as what we do. We
don't have the following fields from Satoshis format:
- version: protobufs handles the versioning of binary data formats for us
- merkle root: you can choose to arrange your transactions into a merkle root if you like, so you can use the same disk space reclamation trick Satoshi did. But it's strictly optional. If you want to simplify things down you can skip it.
- nonce: unnecessary, we'll see why in a moment.
Sharing workThe bitcoin_data field is how you share work with miners today (assuming they opt in by installing your software alongside their bitcoin node). It's defined like this:
message BitCoinData {
// A BitCoin format block header. Because it's in Satoshis custom format we represent this as a byte array.
required bytes header = 1;
// The first transaction from the block.
required bytes coinbase_tx = 2;
// The merkle branch linking the coinbase transaction to the header.
required bytes merkle_branch = 3;
}
You could just store the entire block. But that's inefficient. All you need to share work is the above three things.
Here's a quick note about the last field. A merkle tree is a data structure in which each node in the tree is a hash. The leaf nodes are hashes of the things you want to include in the tree. The interior nodes are hashes of the concatenations of the child nodes.
Wikipedia has more detail if you need it. A merkle branch is a part of a merkle tree which allows you to cryptographically prove that something you're given was in the tree, without needing everything that was in the tree. They are calculated by the CBlock::GetMerkleBranch() function in the BitCoin code.
The merkle branch links the coinbase_tx to the block header, so you can prove it was in that block. The coinbase TX is a regular BitCoin coinbase (that makes new coins and claims fees), except the scriptSig on its input contains an extra entry that todays scriptSigs don't. That extra entry is a hash of the NameBlock structure. The coinbase scriptSig today is formatted like this:
void IncrementExtraNonce(CBlock* pblock, CBlockIndex* pindexPrev, unsigned int& nExtraNonce, int64& nPrevTime)
{
.....
pblock->vtx[0].vin[0].scriptSig = CScript() << pblock->nBits << CBigNum(nExtraNonce);
.....
}
Just the difficulty bits and the "extra nonce". But it's actually allowed to contain anything, as long as it's not too big. So in your new blocks it'd be:
To get the NameBlock hash into BitCoin you'd add a new RPC command, "setextrahashes" or something similar. It'd just update a global list and the IncrementExtraNonce function would include the extra hashes when the scriptSig is built. This is a very simple change to BitCoin.
A more complex change is the other new RPC. Because you have your own chain, it has its own difficulty. Most likely it'll be different to BitCoins own. So you need a way to tell BitCoin "when you find a block that matches an extradifficulty, please let me know" - like a hanging RPC or perhaps BitCoin could make an HTTP request by itself. The "getwork" protocol used by distributed workers might also need to be updated so miners can be given multiple difficulties to check ... or perhaps BitCoin can just do what Slushs pool already does and vend work of min(bitcoin difficulty, extra difficulty) then ignore found blocks that don't match BitCoins own difficulty.
Independent node software
Your new network has its own codebase. It can be written in whatever language you like, use whatever P2P protocol you like, store its data however you like and so on.
When a node on your network receives a message informing it about a new transaction, it verifies that transaction follows the rules of your network. To use our simple DNS example it would verify that if you're claiming a new name, it doesn't already exist and if you're transferring a name that the signatures are correct.
If the transaction is valid, it's added to the current NameBlock message. That message is serialized to binary (protobufs does this for you), hashed and then your node makes an RPC to BitCoin telling it what the current extra hash is. When BitCoin finds a bitcoin block of the right difficulty for your network, it informs your software and passes the block header, coinbase tx and merkle branch to it. Your node combines them together into a BitCoinData message, which is then glued together with the NameBlock. This "superblock" is then broadcast via your independent P2P network.
When a namenode receives a new superblock it does the following things:
- Verifies the NameBlock contents are correct, ie, that the transactions follow the rules.
- Verifies that the NameBlocks prev hash makes it fit somewhere in the chain and that the difficulty is correct.
- Hashes the NameBlock structure and then verifies that this hash appears in the BitCoinData coinbase scriptSig, in the right place.
- Extracts the merkle root of the BitCoin format block from the header and then verifies that the coinbase tx provided did, in fact, exist in that block (using the branch, root, tx and header together).
- Verifies that the hash of the BitCoin format block header is below the difficulty found in the NameBlock structure.
You've now verified that a sufficiently hard proof of work was done over the contents of the NameBlock structure. Your node can now relay the newly found block (or if you don't use a P2P network make it available on your server, etc).
On the other side when BitCoin finds a block that is correct for the BitCoin network and chain, it broadcasts it and obviously the hash of your NameBlock is included. Fortunately this is only an additional 33 bytes overhead so nobody cares about it - the "pollution" of the financial chain is both trivial and constant.
Handling your new block chain
You have to handle re-organizations. This is a standard part of the block chain algorithm. When a re-org occurs you have to verify that the state of your world still makes sense. For example in the new chain maybe somebody else now owns a name you thought you owned. It's up to you how to inform your users of these events and what app specific logic is appropriate here.
You can choose your own parameters for the new chain. As an example, Satoshi chose to target one new block every ten minutes as a tradeoff between latency and wasted work. You could choose something much larger (hours, days) or much faster if you're willing to tolerate more splits due to blocks being found simultaneously. BitCoin retargets the difficulty roughly every two weeks. You could choose some other length of time.
The BlockChain class in BitCoinJ shows how to implement these algorithms - it's not complicated. The code is largely independent of the BitCoin transaction format so it could maybe be reused for new networks.
To store the block chain, Satoshi chose to use Berkeley DB as a way to index from transactions to the blocks they appeared in (amongst other things). You could use any database you liked.
Paying for alternative resources with BitCoins
Note: I rewrote this section later to remove talk of burning coins as it's not necessary
A commonly cited reason for putting DNS related data into BitCoin is the desire to pay for names with BitCoins. You can do this with an independent chain too, because you make the rules and can link the two chains together as you see fit.
To implement this we extend our transaction message like this:
message NameTx {
required string name = 1;
repeated bytes ip_addresses = 2;
required bytes pubkey = 3;
enum ClaimType {
NEW_NAME,
TRANSFER
}
required ClaimType type = 4;
// Only used if type == TRANSFER. The new owner of the name and signature proving current ownership.
optional bytes dest_pubkey = 5;
optional bytes signature = 6;
// Only used if type == NEW_NAME.
optional bytes purchase_pubkey = 7;
optional bytes purchase_signature = 8;
}
To buy a name that costs 10 BTC, the user presses the "buy name" button in their DnsCoin GUI. It talks to their local BitCoin node via RPC and does the following:
- Creates a new BitCoin address (key).
- Sends 10 BTC to the new address (a send-to-self transaction).
- Extracts the key from BitCoin, meaning it's no longer available to spend in your wallet. I think sipa has patches that enable this functionality - it isn't possible with todays software though.
- Stores the key in a DnsCoin data file somewhere.
- Sets the purchase_pubkey field to the public form of the key (ecdsa pubkeys are derived from the private keys).
- Sets purchase_signature to zero, signs that copy of the NameTx, then puts the signature into the purchase_signature field.
The name purchase transaction is now linked to the BitCoin address with 10 BTC, in a crytographically provable way. Other DNS nodes verify that the BitCoin transaction exists and is confirmed before they agree you now own the name.
To release the name you re-import the key back into BitCoin and spend the coins. Because DNS nodes are linked to BitCoin nodes, they will quickly learn that the coins have been spent and will make the name available for purchasing once again.
Other schemes are possible, like pay to miner, if you want your network to interact with BitCoin in a different way.