
Topic: Merkle-tree service for coinbases (Read 1813 times)

legendary
Activity: 1120
Merit: 1152
October 09, 2012, 03:08:57 AM
#9
I've got the beginnings of something usable. It can accept submitted digests, create merkle trees of them, let people sign those trees, and then take the resulting merkle paths and turn them into completed timestamps. Verification is with PGP right now; I'll tackle adding bitcoin soon.

https://github.com/opentimestamps/opentimestamps-server

See README.md for instructions, and doc/design.md for rough design notes.

My apologies for creating Yet Another Binary Serialization.
legendary
Activity: 1120
Merit: 1152
September 27, 2012, 09:24:35 PM
#8
Quote
Please do keep us posted on any experiments.

We need $A_Solution for timestamping data that does not involve storing new data records (transactions) in the main bitcoin block chain.

My mental analogy is to a steam boiler.  Pressure is building from the community, to build all sorts of exciting digital notary / securely timestamped services.  We need a release valve, otherwise the bitcoin blockchain will simply boil over into a very expensive, inefficient method of instant messaging.



Agreed.

At the very least, even if enough pool ops don't get on board, if the first usable software to do this is written as a service rather than as a purpose-built client, hopefully using it as a service will become the more common mode.
legendary
Activity: 1596
Merit: 1100
September 27, 2012, 12:27:05 AM
#7
Please do keep us posted on any experiments.

We need $A_Solution for timestamping data that does not involve storing new data records (transactions) in the main bitcoin block chain.

My mental analogy is to a steam boiler.  Pressure is building from the community, to build all sorts of exciting digital notary / securely timestamped services.  We need a release valve, otherwise the bitcoin blockchain will simply boil over into a very expensive, inefficient method of instant messaging.

legendary
Activity: 1120
Merit: 1152
September 27, 2012, 12:19:39 AM
#6
Quote
Anyway this can all be packaged up into a Python script, and either it'd run on your server (is the webserver enough?) or you could have your merged-mine-proxy contact my server for new PoW instead (after trying it all out on testnet locally, obviously). Either way I intend to write an essentially dead-simple "pool-op" version first, with a minimum of external dependencies (a flat-file database with simple expiration rules, for instance), to act as a proxy between the pool and the more sophisticated merkle servers to be written later.
Unfortunately, merged-mine-proxy blocks during aux chain refreshes, so I think I'll need something local, even if it's just a trivial caching proxy for now. Python3 would be best.

I'll keep that in mind. I'll let you know when I have something concrete to try out.


Credit where credit's due: chronobit https://github.com/goblin/chronobit does timestamping by inserting data into the p2pool share chain, which winds up in p2pool blocks. Unfortunately the share chain is a linear blockchain, so the resulting timestamps are much larger than optimal if you need to store the whole set of data required to verify them. Still, it's where I got the itch to find a better timestamping solution, and the additional side-chain application makes it all the more interesting.
legendary
Activity: 2576
Merit: 1186
September 26, 2012, 11:54:50 PM
#5
Quote
I was looking at your eloipool software, and I see it has a setworkaux RPC method. From what I can tell it simply adds data directly to the coinbase.
Correct.

Quote
So I take it you're additionally running something like merged-mine-proxy from namecoin? Or a helper that calls setworkaux appropriately? I could write a daemon that speaks the RPC interface the merge mining proxy expects, if this is what most pool ops use.
Yes, Eloipool works in parallel with a branch of the original merged-mine-proxy.

Quote
Anyway this can all be packaged up into a Python script, and either it'd run on your server (is the webserver enough?) or you could have your merged-mine-proxy contact my server for new PoW instead (after trying it all out on testnet locally, obviously). Either way I intend to write an essentially dead-simple "pool-op" version first, with a minimum of external dependencies (a flat-file database with simple expiration rules, for instance), to act as a proxy between the pool and the more sophisticated merkle servers to be written later.
Unfortunately, merged-mine-proxy blocks during aux chain refreshes, so I think I'll need something local, even if it's just a trivial caching proxy for now. Python3 would be best.
legendary
Activity: 1120
Merit: 1152
September 26, 2012, 11:47:57 PM
#4
Quote
I can readily add any reasonable merged-mining chain for Eligius. If I understand it correctly, there is no need to add anything new to the bitcoin coinbase itself, just adding a merged-mined node should do it.

Edit: Elaboration on IRC:
Quote
[02:05:25] Luke-Jr: what do you mean by "not in bitcoin coinbase, use a merged mining node"?
[02:06:29] jgarzik: the merged mining data in the coinbase is a merkle root of an "auxiliary data" merkle tree
[02:07:01] jgarzik: so, the timestamping data could just as well be an element in that tree
[02:07:45] which means it wouldn't make the bitcoin coinbase any bigger than it already is (for merged mining), and all the existing software on the pool end can be used
[02:08:02] Luke-Jr: ok, thanks for the explanation.  agreed.
[02:08:28] Luke-Jr: you would need to be able to have a full merkle branch for the data chain
[02:08:51] (not in the chain, but, _somewhere_)
[02:09:04] sure
[02:09:45] you need that for the data anyway, I think?

Thanks! I was looking at your eloipool software, and I see it has a setworkaux RPC method. From what I can tell it simply adds data directly to the coinbase. So I take it you're additionally running something like merged-mine-proxy from namecoin? Or a helper that calls setworkaux appropriately? I could write a daemon that speaks the RPC interface the merge mining proxy expects, if this is what most pool ops use.

If I understand the namecoin and merged-mine-proxy code correctly, what happens is that the proxy calls the namecoind RPC interface, which supports the extension getauxwork. This essentially returns whatever data the alt-chain needs linked to the proof of work, as well as the target difficulty for the alt-chain. In my case that is the top merkle hash, and the target can be hardcoded at zero. The target difficulty can be set by copying the target seen in the Bitcoin block header, divided by 4+n to ensure we don't miss a difficulty change (assuming we don't get told the most up-to-date Bitcoin block regularly; I'll have to check those semantics exactly).

Every time a solution is submitted that meets the target difficulty, it looks like getworkaux is called again, this time with the full information needed to reconstruct the PoW. It's OK if the daemon always claims that the target was met. Internally though it'll check whether a new Bitcoin block might have been generated, and if so take appropriate measures. Chain re-orgs are of course an issue; it's probably best to assume the usual six-blocks-deep rule before removing submitted hashes from the tree we're trying to insert. I'll look into getauxblock later; I think that can be just a pre-defined header, given there isn't a concept of the previous block or anything else. I also gotta figure out what the hell is going on with that slot_id crap...
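As a rough sketch, the submission-side logic described above — always claim the target was met, but only retire a submitted hash once its block is six deep — could look something like this. All names and the overall shape here are my own assumptions for illustration; the real merged-mine-proxy semantics still need checking:

```python
REORG_DEPTH = 6  # the usual six-blocks-deep rule mentioned above


class TimestampDaemon:
    """Hypothetical sketch of the daemon logic described above; names and
    structure are illustrative, not the real merged-mine-proxy interface."""

    def __init__(self):
        self.pending = set()   # digests awaiting deep confirmation
        self.included = {}     # digest -> block height of inclusion

    def submit(self, digest):
        self.pending.add(digest)

    def on_share(self, _proof_of_work):
        # Always tell the proxy its target was met; real filtering
        # happens when we see which blocks actually confirm.
        return True

    def on_block(self, height, confirmed_digests):
        # Record inclusions at this height, then retire any digest
        # that is now buried REORG_DEPTH blocks deep, so a chain
        # re-org can't orphan a timestamp we stopped tracking.
        for d in confirmed_digests:
            if d in self.pending:
                self.included[d] = height
        retired = {d for d, h in self.included.items()
                   if height - h >= REORG_DEPTH}
        self.pending -= retired
        return retired
```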

Anyway this can all be packaged up into a Python script and either it'd be run on your server, (is the webserver enough?) or you can have your merged-mine-proxy contact my server for new PoW instead. (after trying it all out on testnet locally obviously) Either way I intend to essentially a dead-simple "pool-op" version first with the minimum of external dependencies (flat-file database with simple expiration rules for instance) to act as a proxy between the pool and more sophisticated merkle servers to be written later.
legendary
Activity: 2576
Merit: 1186
September 26, 2012, 08:38:55 PM
#3
I can readily add any reasonable merged-mining chain for Eligius. If I understand it correctly, there is no need to add anything new to the bitcoin coinbase itself, just adding a merged-mined node should do it.

Edit: Elaboration on IRC:
Quote
[02:05:25] Luke-Jr: what do you mean by "not in bitcoin coinbase, use a merged mining node"?
[02:06:29] jgarzik: the merged mining data in the coinbase is a merkle root of an "auxiliary data" merkle tree
[02:07:01] jgarzik: so, the timestamping data could just as well be an element in that tree
[02:07:45] which means it wouldn't make the bitcoin coinbase any bigger than it already is (for merged mining), and all the existing software on the pool end can be used
[02:08:02] Luke-Jr: ok, thanks for the explanation.  agreed.
[02:08:28] Luke-Jr: you would need to be able to have a full merkle branch for the data chain
[02:08:51] (not in the chain, but, _somewhere_)
[02:09:04] sure
[02:09:45] you need that for the data anyway, I think?
legendary
Activity: 1596
Merit: 1100
September 26, 2012, 08:24:02 PM
#2

If there is sufficient value attached, or potential for it, pool ops have been interested in the past.

Especially if a lot of well known developers were behind a data timestamping service standard.

legendary
Activity: 1120
Merit: 1152
September 26, 2012, 08:18:40 PM
#1
(putting this in its own thread as I'm not really talking about sidechains anymore)

Quote
We shouldn't add extra data into Bitcoin's blockchain. It grows very fast. Imagine how much data will need to be transmitted when Bitcoin usage reaches 100,000 transactions per day.
Adding extra data to the coinbase transaction found in every block is just fine.  That is how merged mining is accomplished today, and it is a scalable method of adding unlimited amounts of data to each block.

I've heard this argument come up multiple times when stuff like timestamping or side-chains is discussed. The problem is that adding extra data is only possible at all if you are a miner, and, if you want your data added in a reasonable amount of time, a pool op. Given that limitation I suspect both timestamping and sidechain concepts will wind up stuffing data into transactions, bloating up the blockchain. I've mentioned before how this can be done right now, without a special transaction type.


So would pool ops be at all willing to put a merkle tree head into the coinbases they mine?


From their point of view we can make this pretty easy, technically speaking. Let's suppose I wrote some software that would act as a merkle tree server. It'd accept hashes from anyone, build up a full merkle tree, and provide the top hash to pool ops for inclusion. The interface for them can be just an HTTP GET-able top-level hash that they put directly into their coinbase. My software could even look in the coinbases of mined blocks to figure out which ones got into the blockchain.

If they aren't willing to put the hash in the coinbase directly, or want to stuff it into an existing merge-mine merkle tree, they'd have to additionally send back the merkle leaves required to get to the coinbase, either by HTTP RPC back to the sender, or even just by publishing the associated data automatically on their website. For redundancy the latter option is probably better anyway, because that way it's fine if multiple people are operating these servers, with pool ops deciding which servers they're going to accept top-level hashes from.

These merkle servers can submit hashes to each other too: if any server gets a hash into a coinbase, they all do, modulo the longer path to get to the top hash. Equally, nothing is stopping a pool op from running the merkle server themselves.
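The tree-building core of such a server fits in a few lines of Python. A minimal sketch — the function names and the (sibling, side) path layout are my own, not actual server code:

```python
import hashlib


def sha256d(data):
    """The usual double-SHA256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def build_tree(leaves):
    """Build a merkle tree over the submitted digests.

    Returns (root, paths), where paths[i] is the list of (sibling, side)
    pairs needed to recompute the root from leaves[i]; side is 'L' when
    the sibling is hashed in on the left."""
    level = list(leaves)
    pos = {i: i for i in range(len(leaves))}  # leaf -> index in current level
    paths = [[] for _ in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])           # duplicate the odd node out
        for i, p in pos.items():
            sib = p ^ 1                       # sibling index at this level
            paths[i].append((level[sib], 'L' if sib < p else 'R'))
            pos[i] = p // 2
        level = [sha256d(level[j] + level[j + 1])
                 for j in range(0, len(level), 2)]
    return level[0], paths
```

The root is what a pool op would GET and drop into their coinbase; the per-leaf paths are what the server hands back to submitters once the block is mined.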

Of course, this kinda looks like we're allowing arbitrary people to use pool hash power to do their own merge mining. I suggest that before a pool op includes the top-level hash in the coinbase or merge-mine tree, they do a final "poisoning" step: take that hash, convert it to hex, prepend a known and fixed string ("This hash is not a proof of work: "), and hash the whole string with sha256d. Since the poison step is always the same, timestamping and similar uses don't have to record that extra bit of data in signatures, but at the same time it unambiguously distinguishes such merkle trees from legit merge-mined ones.
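The poisoning step is fully specified above, so it can be written down directly:

```python
import hashlib

# Fixed poison prefix from the post; prepended before re-hashing so the
# result can never be mistaken for a legitimate merge-mining commitment.
POISON_PREFIX = b"This hash is not a proof of work: "


def sha256d(data):
    """The usual double-SHA256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def poison(top_hash):
    """Convert the merkle top hash to hex, prepend the fixed string,
    then sha256d the whole thing."""
    return sha256d(POISON_PREFIX + top_hash.hex().encode())
```

Because the prefix is constant, a verifier can reapply it mechanically; nothing extra needs to be stored alongside the signature.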

I've been thinking about this stuff in terms of timestamps and have recently been working on a very compact and extensible way of storing the resulting merkle paths. The goal is to be able to timestamp a file and save a path to one or more notaries who guarantee that a hash existed at a given time, with the option of later adding additional paths to other notary signatures as they become available. A notary could be a PGP signature, a Bitcoin block, a commercial timestamping service, a dead-tree publication, etc. It's a field that doesn't seem to have much open-source software available, probably because the problem isn't really a software problem... but Bitcoin changes that. I've got some Python code (actually Cython) I'm working on in a really early stage; nothing is set in stone yet. One key thing I want to do is ensure that servers running the timestamping version of this software widely witness each other with multiple notary methods; the blockchain is an excellent way of doing that if you include Bitcoin block hashes in your own merkle trees.
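Whatever the serialization ends up looking like, verifying a stored path needs only the file's digest, the sibling list, and a notarized root. A sketch, assuming each path element records the sibling hash and which side it sits on (that layout is my assumption for illustration, not the actual format):

```python
import hashlib


def sha256d(data):
    """The usual double-SHA256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def verify_path(digest, path, notarized_root):
    """Recompute the root from a leaf digest and its merkle path.

    Each path element is (sibling, side), side being 'L' when the
    sibling is hashed in on the left."""
    h = digest
    for sibling, side in path:
        h = sha256d(sibling + h) if side == 'L' else sha256d(h + sibling)
    return h == notarized_root
```

Adding another notary later just means storing a second path from the same digest to that notary's root; the leaf itself never changes.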

Regarding scalability, let's suppose we wound up with a smallish number of people running servers, 8 or 16 or so, that pool ops collectively were querying. If all 16 got into a block, that top part of the merkle tree is 4 levels deep. Let's assume each timestamp server operator is running their server on Amazon EC2's free tier, 15GB of outbound bandwidth a month. With 1KiB out per request they can serve up to 6 timestamps/second (3600 in 10 minutes) without running out of bandwidth. 2^12=4096, so a server's own tree is 12 levels deep, making the full path 16 hashes long: 512 bytes plus block header overhead. Rate-limiting, if even required at all, can reasonably be done with API keys, CAPTCHAs and asking nicely, with the intention being that heavy users run their own servers. (They probably have their own notaries anyway, with bitcoin as a backup verification.) Heck, maybe even monetary payments to deter abuse; if only there were a way to easily accept such payments...
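A quick back-of-the-envelope check of those numbers, using only the figures from the paragraph above:

```python
MONTH_SECONDS = 30 * 24 * 3600       # ~2.59 million seconds
BANDWIDTH_OUT = 15 * 10**9           # EC2 free tier: 15GB out per month
REPLY_SIZE = 1024                    # ~1KiB per served timestamp

rate = BANDWIDTH_OUT / REPLY_SIZE / MONTH_SECONDS
print(round(rate))                   # -> 6 timestamps/second

per_block = round(rate) * 600        # 3600 timestamps per 10-minute block
server_levels = (per_block - 1).bit_length()  # 2^12 = 4096 >= 3600 -> 12
pool_levels = 4                      # 16 cooperating servers
path_bytes = (server_levels + pool_levels) * 32
print(path_bytes)                    # -> 512 bytes, plus header overhead
```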

In any case, if pool ops aren't interested, this kind of thing can equally work by dumping the data into specially marked transactions, although having something centralized as an option would be better, provided enough pool ops are interested that getting your timestamp/sidechain hash in within a reasonable amount of time is possible. The worst case is if timestamping and sidechains are common use cases and everyone rolls their own transaction-based system, bloating up the chain. As it is, with the 1MiB block size limit and an average transaction size of around 256 bytes, blocks aren't going to have more than 4096 transactions in them anyway, so applications trying to limit the size of their trees will often have an incentive to use transactions directly, modulo transaction fees.