Topic: Formalised Bitcoin Protocol Standard (Read 10550 times)

legendary
Activity: 1596
Merit: 1100
March 13, 2013, 04:37:38 PM
#80
(copied from misterbigg's thread, by request)

As Pieter wrote on bitcoin-development list,

Quote
The protocol is whatever the network enforces - and that is some mix of versions of the reference client right now, but doesn't need to remain that way.

I would very much like to have a text book of rules that is authoritative, and every client that follows it would be correct. Unfortunately, that is not how a consensus system works. All (full) clients validate all rules, and all must independently come to the same solution. Consensus is of utmost importance, more than some theoretical "correctness". If we'd have a specification document, and it was discovered that a lot of nodes on the network were doing something different than the document, those nodes would be buggy, but it would be the specification that is wrong.

Or restated: The fundamental problem being solved by bitcoin at a technical level, on a daily basis, is the distributed consensus problem.

We fully support the writing of specifications and documentation, which you can see here
    https://en.bitcoin.it/wiki/Protocol_specification

And changes to the existing protocol are formally documented here,
    https://en.bitcoin.it/wiki/Bitcoin_Improvement_Proposals

Ultimately the operational definition of consensus comes from what the network accepts/expects, not a theoretical paper.  Specification practices are healthy as a manual, human-based method of achieving consensus on network protocol rules.  Alternative client implementations (cf. heterogeneous environments) are another good practice.

But the collective software rules are always the final specification, by definition.  That is what bitcoin does, achieve consensus.

A few other observations:

Gnutella had a business and project environment with co-motivated individuals working on a few key codebases.  The reference codebase in bitcoin, in contrast, has one paid developer (Gavin@BF) and a few part-time, unpaid volunteers.

All the big bitcoin businesses seem to either (a) contribute to BF, (b) use bitcoind without contributing back any testing/dev/specification resources, or (c) do their own thing entirely, not contributing back any testing/dev/specification resources.

Bitcoin is a thing, an invention, not a funded project with a built-in set of professionals paid to ensure full spec/dev/test engineering effort.  If you want something, DO IT.  You cannot expect the engineering resources to do X to magically appear, just because you complained on an Internet forum.

In an unfunded open source project, arguing all day about the lack of full-engineering-team rigor is entirely wasted energy.  Blame the dev team if that is your favorite target, that will not magically create extra time in the day or extra manpower to accomplish these extra tasks being demanded by non-contributors.

The time spent whining about what an unfunded effort fails to do could be better spent, say, creating a test network of full nodes running all known bitcoind versions, 0.3 through present.  And test, test, test changes through that.
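
One way to picture that kind of cross-version testing is a differential harness: hand the same block to every version and flag any version whose verdict differs. The sketch below is only an illustration; the `Validator` struct, the `disagreements` function and the idea of wrapping each bitcoind version behind an `accepts` callback are assumptions, not anything from the reference client.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one client version: given a serialized block,
// it reports whether that version would accept it.
struct Validator {
    std::string name;
    std::function<bool(const std::string&)> accepts;
};

// Returns the names of all validators whose verdict differs from the first
// validator's verdict (used here as the baseline). An empty result means
// every version agreed on this block.
std::vector<std::string> disagreements(const std::vector<Validator>& validators,
                                       const std::string& block) {
    std::vector<std::string> out;
    if (validators.empty()) return out;
    assert(static_cast<bool>(validators.front().accepts));  // baseline must be callable
    const bool baseline = validators.front().accepts(block);
    for (std::size_t i = 1; i < validators.size(); ++i) {
        assert(static_cast<bool>(validators[i].accepts));    // each version must be callable
        if (validators[i].accepts(block) != baseline)
            out.push_back(validators[i].name);
    }
    return out;
}
```

In a real setup each `accepts` callback would presumably submit the block to a running node of that version and report whether it was accepted. A non-empty result is the kind of behaviour split worth investigating before a release.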

legendary
Activity: 2940
Merit: 1090
March 13, 2013, 03:28:17 PM
#79

I'd prefer "The Satoshi client is open source, regularly updated, peer reviewed and provides the basis for a self-certifying ecosystem, but does not itself actually meet the specification [...]


I would prefer it to be called a protocol description rather than specification until the majority of the network meets it.  

Sorry if I seem to be emulating MPOE-PR, but... No.

If it is just a description of what the current mess actually does, then the current mess already meets it, and in fact it would be more a case of it meeting the current mess than the current mess meeting it.

We can even start at the code level as I did with the BDB misconfiguration example above, in which I separated out the specified max block size as being a specification then checked the BDB configuration to see whether it actually properly met that specification.

On the other hand... semantics.

A specification is a description. It probably should first describe requirements before then going on to describe a protocol which it intends should meet the requirements but which could turn out itself to not meet the requirements if it itself turns out to be buggy.

It is styled a "specification" to indicate that it is a description of what it intends ought to be, not necessarily a description of something that has actually yet been accomplished / implemented in actual running code.

-MarkM-
full member
Activity: 166
Merit: 101
March 13, 2013, 03:20:54 PM
#78

I'd prefer "The Satoshi client is open source, regularly updated, peer reviewed and provides the basis for a self-certifying ecosystem, but does not itself actually meet the specification [...]


I would prefer it to be called a protocol description rather than specification until the majority of the network meets it. 
legendary
Activity: 2940
Merit: 1090
March 13, 2013, 02:43:56 PM
#77
From a business perspective, I suppose the bitcoin story sounds a lot better if one can say something like:

There is at least one high level documentation of the protocol (wiki run by the Bitcoin Foundation or else) AND any bitcoin client implementation can be tested for compliance with a single reference implementation called the Satoshi client.
The Satoshi client is open source, regularly updated, peer reviewed and provides the basis for a self-certifying ecosystem.

Through the reference implementation, bitcoin can free online merchants and payment processors from the burden of costly certification requirements imposed by proprietary, legacy payment schemes.

I'd prefer "The Satoshi client is open source, regularly updated, peer reviewed and provides the basis for a self-certifying ecosystem, but does not itself actually meet the specification at this time, due to certain bugs, errors or omissions that have been discovered, and furthermore may contain more such bugs, errors or omissions yet to be discovered. Nonetheless it is currently the most accurate rendition of the spec in implementation form that we have, and work proceeds apace on bringing it fully up to spec and correcting all bugs, errors or omissions in as timely a manner as the need to allow users plenty of time to upgrade to the newer, more-correct versions - and our development and certification budget - allows."

Or something along such lines.

-MarkM-
legendary
Activity: 2940
Merit: 1090
March 13, 2013, 02:29:46 PM
#76
I'll chime in:  Mike Hearn is absolutely right.  

There's nothing wrong with writing "documentation" to help describe at a higher level what is going on.  As a developer of a non-validation client, such documentation has been quite useful.  I fully support "extra stuff that describes at a high level what's going on".  But you must understand that once you put the word "specification" on any such document, you are promising the reader that the document is sufficient for creating a compliant implementation of what is being described.  But with Bitcoin, anything short of the source code itself is not sufficient for implementing it.  And worse, the consequences of not doing so will result in people losing money -- because if there's an inefficiency in the system an attacker will exploit it, guaranteed (especially one where your target uses code that validates differently than much of the rest of the network).  This is why Satoshi did not support alternative implementations.


Write all the "documentation" you want, and put as many comments into the code as you want.  But do not use the word "specification" because nothing except the code can qualify for it.



This still seems like superstition.

You are basically stating that if the specification specifies that on zero day all your money is forfeit, then sorry, all your money is forfeit, look at the specification (the code) and see right there, the exact code the zero day hacker used? It's in the spec, so the hacker was correctly using the system and correctly took all your money exactly as the spec intended him or her to.

Bullshit.

Before even getting to the protocol specification part of the docs, the requirements specification should specify that zero day exploits are not intended to be part of the spec, regardless of which implementation or how many implementations, or how many users use such implementations or how much money rides on them.

So I guess top of the protocol spec should say "first, read the requirements specification carefully..."

-MarkM-
legendary
Activity: 2940
Merit: 1090
March 13, 2013, 02:23:25 PM
#75
Someone should just write a spec and stop debating it.

+1

If you want a spec then write something.  That's what's happening by default anyway on the bitcoin wiki.

But Mike Hearn makes two key points that are nonetheless valid:

  • You can write a spec, but you can't guarantee it actually matches the behaviour of the majority implementation.
  • The consequences of getting details wrong are far more severe than most programmers are used to; there is simply no comparison to, e.g., POP3.

If it were an IETF draft, I would add a section at the top that states "the reference implementation is canonical, this spec is subordinate. any differences are spec bugs."



I would prefer to assert that the reference implementation might not itself actually meet the spec, it is merely the closest thing we have so far to a correct implementation of the spec.

We already do not nitty-gritty verify blocks prior to the most-recent checkpoint, so all we need to be able to derive from pre-checkpoint spans of the blockchain is the unspent outputs, isn't it? Is there anything else in that midden of ancient archaeological relics that we need?

About the negative signatures thing, has OpenSSL already turned them positive by the time they get enshrined in the blockchain, or are they in the blockchain as negative?

-MarkM-
legendary
Activity: 2940
Merit: 1090
March 13, 2013, 01:57:20 PM
#74
and not much easier to read.
You can't possibly be serious.

To make a point I guess everyone familiar with the Bitcoin source can explain what this does and why something like it is needed for a "smart pointer" class?

Code:
class bool_test
{
   public:
   bool_test( ) { }

   private:
   void operator delete( void* );
};

operator bool_test*( ) const
{
   if( !p_instance )
      return 0;
   static bool_test test;
   return &test;
}

(hint: you might want to take a look at http://www.artima.com/cppsource/safebool2.html as even the above doesn't solve all the potential problems for a simple "bool" operator in a smart pointer class in C++)
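
For readers who want the fragment above in context, here is a minimal sketch of how such a conversion operator might sit inside a smart pointer. The class name `ptr` and everything except the `bool_test` idea are made up for illustration. The point of returning a `bool_test*` rather than a `bool` is that `if (p)` still works, while `int n = p;` fails because pointers do not convert to integers, and `delete p;` fails because `bool_test::operator delete` is private.

```cpp
#include <cassert>

// Hypothetical minimal smart pointer, for illustration only.
template <typename T>
class ptr {
    // Dummy type used only as the target of the conversion operator.
    // Its operator delete is private (and never defined), so code outside
    // this class cannot compile `delete p;` on a ptr<T>.
    class bool_test {
        void operator delete(void*);
    };

public:
    explicit ptr(T* p = 0) : p_instance(p) {}
    ~ptr() { delete p_instance; }

    // Non-null when the smart pointer holds an object, null otherwise.
    // The address of a function-local static serves as the "true" value.
    operator bool_test*() const {
        if (!p_instance)
            return 0;
        static bool_test test;
        return &test;
    }

    T& operator*() const {
        assert(p_instance);  // dereferencing an empty ptr is a caller bug
        return *p_instance;
    }

private:
    // Copying is disabled to keep single ownership in this sketch.
    ptr(const ptr&);
    ptr& operator=(const ptr&);

    T* p_instance;
};
```

As the hint says, this does not close every hole: a comparison such as `a == b` between two `ptr<int>` objects still compiles and merely compares the dummy pointers. Since C++11, declaring `explicit operator bool() const` covers the same need directly.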


This mostly shows how a specific implementation of a spec can make it really hard to even figure out what the actual spec is that it is attempting to implement.

If we write an actual spec it does not matter whether verifying some prespectoric span of blocks from the dark ages needs to use some ancient ritual mumbo-jumbo to verify that it is indeed a span of blocks that totally violates the spec due to violating the spec having been superstitiously worshipped as heroic back in the dark ages. The spec will make it clear that in the dark ages, the blockchain was totally broken, which led to a shaman caste that made dark and ugly sacrifices to propitiate the invisible hand that brings the cargo upon which they prospered.

The number of locks in the Berkeley DataBase seems a modern example of the shamanic cargo-cult approach too. When you notice that some to whom the cargo is the all-important thing, the raison d'etre for the cult whether or not for the experiment it purports to administer, seem to consider that to be part of the spec instead of a clear bug, it becomes clear that the implementation cannot be the spec: the implementation clearly specifies in capital letters that blocks can be up to one megabyte in size, but clearly neglected to ascertain exactly how many BDB locks are actually in real life required in order that blocks of that size be incapable of needing more locks than are configured.

There we have a clear case of the spec (the constant, written in capital letters, that specifies the max block size the functions and classes are intended to support) not actually being met by the code intended to meet it.
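
The kind of check being asked for can be written down in a few lines. This is a sketch under loud assumptions: the `Limits` fields, the per-transaction lock count and the function names are all hypothetical, and real Berkeley DB lock usage presumably depends on which database pages get touched rather than on a fixed per-transaction figure. It only shows the shape of the argument: derive the worst case implied by the stated limit, then compare it with what is configured.

```cpp
#include <cassert>
#include <cstddef>

// All fields are illustrative assumptions, not values taken from any
// client or from Berkeley DB.
struct Limits {
    std::size_t max_block_bytes;  // the stated ("capital letters") block size limit
    std::size_t min_tx_bytes;     // smallest transaction the rules would allow
    std::size_t locks_per_tx;     // assumed worst-case locks taken per transaction
};

// Worst-case number of locks a single maximum-size block could need,
// under the simplified per-transaction model above.
std::size_t worst_case_locks(const Limits& l) {
    assert(l.min_tx_bytes > 0);  // avoid dividing by zero
    return (l.max_block_bytes / l.min_tx_bytes) * l.locks_per_tx;
}

// True only if the configured lock budget covers the worst case that the
// stated block size limit implies.
bool limit_is_honoured(const Limits& l, std::size_t configured_locks) {
    return worst_case_locks(l) <= configured_locks;
}
```

If `limit_is_honoured` comes back false, the code has a second, unstated limit that is tighter than the stated one, which is exactly the mismatch described above.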

-MarkM-
hero member
Activity: 836
Merit: 1030
bits of proof
March 13, 2013, 01:05:02 PM
#73
I fully understand all the reasonable arguments made by the devs earlier in this thread, but come on.

We just experienced a chain fork due to a "failure to configure" a 3rd party library (BDB) ...

My point is: however convoluted the must-be-kept-forever bugs and rules are, however difficult the process of extracting the spec from C++ would be, do you REALLY wanna keep running the whole block chain on top of a monoculture of identical clones built from the same convoluted code base ?  

This would be like the entire web running on the same identical version of Apache's httpd: every time a Russian or Chinese hacker discovered a bug, all the websites in the world would become vulnerable to being hacked, defaced or shut down.

Guys ... it's time to grow up and extract the spec from C++, then target that spec for 1.0 release, it's that or keep trusting ever increasing amounts of wealth to an army of "perfect" clones, seriously ...

+1

I went through the pain of extracting the spec from several code bases to pour it into my implementation, that started as a way to study the system for my own intellectual curiosity. After months of work in my free time I decided that it might have value for others and open sourced it.

I received some help, and much more FUD and warnings that I was attempting something undoable by definition. I understand the arguments for caution: replicate every "feature" of the Satoshi hairball, even the untested ones, and look out for the unknown. I took them seriously, worked hard and am not done yet. Yet I remain confident that capturing the behavior of the implementation is possible to the extent that it can be used with the same confidence one is forced to put into any software that manages serious money.

It might be that my implementation is key to nothing more than my own understanding of the depth of the protocol. I wish that the collected knowledge of developers were more accessible to those who come after us. Unfortunately we all work hard toward goals that are apparently not aligned with this, not least because writing specs and tests is neither sexy nor rewarded.

People using an implementation, or even upgrading to a new version of Satoshi, know that there is a risk, and they will not use an implementation or version if they decide the risk is above their appetite for the return it promises. Having standards, specs and tests is not a guarantee, but hey, they are what humans came up with for similarly hard and potentially costly problems.

Can you imagine a bitcoin economy expanding by orders of magnitude, served by a single implementation maintained by a chosen couple in a single version that fits all purposes?

Let's rather find a framework and funding to build the documentation and test suite that enables us to handle this protocol more confidently.


full member
Activity: 209
Merit: 100
March 12, 2013, 07:49:33 PM
#72
I fully understand all the reasonable arguments made by the devs earlier in this thread, but come on.

We just experienced a chain fork due to a "failure to configure" a 3rd party library (BDB) ...

My point is: however convoluted the must-be-kept-forever bugs and rules are, however difficult the process of extracting the spec from C++ would be, do you REALLY wanna keep running the whole block chain on top of a monoculture of identical clones built from the same convoluted code base ?  

This would be like the entire web running on the same identical version of Apache's httpd: every time a Russian or Chinese hacker discovered a bug, all the websites in the world would become vulnerable to being hacked, defaced or shut down.

Guys ... it's time to grow up and extract the spec from C++, then target that spec for 1.0 release, it's that or keep trusting ever increasing amounts of wealth to an army of "perfect" clones, seriously ...


Cheers ...
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
March 12, 2013, 04:01:43 AM
#71
Wow, five days after I wrote this:

The Satoshi client already has this "problem", by the way -- it reacts to unusual conditions by going into safe mode.  If this really were a concern, you should be more worried about people finding bugs in the Satoshi client that put it into safe mode and somehow extorting MtGox.

...it actually happened:

If you're on 0.7 or older, the client will likely tell you that you need to upgrade.

All the 0.7 Satoshi clients just went into safe mode when they saw the 0.8 clients build a longer chain.  And the world didn't end.  Some miners (probably including me) wasted, in aggregate, 625 BTC worth of hashpower.  They'll get over it.

So, I'll say it again regarding alternative fully-validating implementations: no risk except potential loss of revenue by those who mine with it.

Friendly reminder: no matter what client you use there are rare situations which may lead to you being unable to process transactions for a few hours at a time.  If your business can't handle this situation maybe bitcoin isn't right for you.
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
March 12, 2013, 03:59:35 AM
#70
Yeah, but as I said before, there are services where being offline for even a few hours can lead to huge losses.
DeepBit used to be in this category, I assume most large pools are similar.

Hrm, I thought I was pretty clear about the fact that loss of mining revenue was a risk (the only one, in fact).  I've yet to see a non-mining example.
legendary
Activity: 1526
Merit: 1134
March 07, 2013, 06:09:58 AM
#69
Yeah, but as I said before, there are services where being offline for even a few hours can lead to huge losses. DeepBit used to be in this category, I assume most large pools are similar.

Zero-Day exploits also tend to have somewhat short lives, yet they still have value.
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
March 06, 2013, 09:11:53 PM
#68
Some kinds of bug can cause a node to be permanently forked onto a side chain, which is essentially a DoS against that business. If you have a way to DoS a business you can try extorting money from them, things like that. There are a variety of ways.

No, they're only in safe mode until a human intervenes.  You can't deny service, only interrupt it once.

This won't work for extortion -- "send me money or else I'll ____."  Usually "stop flooding you with traffic" or "not flood you again" goes in the blank, which is what makes DDoS extortion work.  But to trigger this condition you need to have found a zero-day behavior-splitting bug.  To use it the splitting transaction must be mined into the chain, which advertises the exploit to the whole world including the alternative implementation's dev team -- it's a self-reporting bug and the exploit is one-time use only.  In other words, antifragile.

The Satoshi client already has this "problem", by the way -- it reacts to unusual conditions by going into safe mode.  If this really were a concern, you should be more worried about people finding bugs in the Satoshi client that put it into safe mode and somehow extorting MtGox.

AFAICT the situation is still no risk except potential loss of revenue by those who mine with it.
sr. member
Activity: 476
Merit: 250
March 06, 2013, 01:24:40 PM
#67
When I started in computing in the seventies, it was mostly trying and failing, after that we got a development standard of 2 shelf meters, after that a binder, after that a thin brochure, and now we are back to trying and failing, which works best, after all.

The wisest man on the board.
legendary
Activity: 1526
Merit: 1134
January 19, 2013, 07:56:00 AM
#66
Some kinds of bug can cause a node to be permanently forked onto a side chain, which is essentially a DoS against that business. If you have a way to DoS a business you can try extorting money from them, things like that. There are a variety of ways.
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
January 19, 2013, 06:45:25 AM
#65
If that's what you meant, then I think it takes the discussion a big step forward: done properly, alternative implementations place at risk only the mining revenue of people who choose to use those implementations for mining.

No. I'm not sure how to make this any clearer.

Perhaps you could give an example scenario where a non-mining user loses money as a result of using a fully-validating client with long invalid branch detection?  That would definitely make it clearer (at least for me).


All participants are put at risk by chain splitting bugs. … if you split a significant part  of mining power off onto a side chain the cost of a double spend drops significantly, which increases the risk for all other participants as well.

So, to restate, it seems like you're claiming that alternative fully-validating implementations create only two risks: (a) miners that choose to use these implementations may lose revenue and (b) the network hashrate might drop, possibly by 50% at the very most.  I actually have reason to believe that the latter will self-correct immediately (and in any case that the hashrate is already plenty high to prevent double-spends, by more than a factor of two) but before proceeding with that I just want to get this straight: are there other "people will lose money" outcomes you are concerned about?  Or is this only about mining revenue and network hashrate?

Sorry if I am being picky and obtuse here.  But I keep seeing these extremely broad comments about "people will lose money"… while there may be some element of truth to that I think the reality is very, very much narrower than that connotes.  So please forgive me for probing the details.
legendary
Activity: 1190
Merit: 1004
January 08, 2013, 09:56:02 AM
#64
I think it is definitely worthwhile for full nodes to have some sort of safe mode triggered by detection of longer chains. It's simple enough, and the problem becomes a disturbance in operation rather than a problem of forking. And indeed mining software is most critically important when it comes to correct validation.
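
A rough sketch of what such a trigger might look like, assuming the node keeps the cumulative work of its best fully validated chain and of the heaviest chain it has seen that contains a block it rejected. The `ForkMonitor` class, the `margin` parameter and the use of a 64-bit integer for work are illustrative assumptions, not how any existing client does it.

```cpp
#include <cstdint>

// Hypothetical monitor: tracks two cumulative-work totals and derives a
// safe-mode flag from them.
class ForkMonitor {
public:
    // margin: how far ahead (in units of work) a rejected chain must be
    // before safe mode is raised.
    explicit ForkMonitor(std::uint64_t margin)
        : margin_(margin), best_valid_(0), best_invalid_(0) {}

    // Record the cumulative work of a chain tip this node fully validated.
    void on_valid_tip(std::uint64_t work) {
        if (work > best_valid_) best_valid_ = work;
    }

    // Record the cumulative work claimed by a chain that contains a block
    // this node rejected.
    void on_invalid_tip(std::uint64_t work) {
        if (work > best_invalid_) best_invalid_ = work;
    }

    // Safe mode holds while a rejected chain leads the best valid chain by
    // more than the margin.
    bool safe_mode() const {
        return best_invalid_ > best_valid_ &&
               best_invalid_ - best_valid_ > margin_;
    }

private:
    std::uint64_t margin_;
    std::uint64_t best_valid_;
    std::uint64_t best_invalid_;
};
```

Because the flag is recomputed from the two totals, it clears on its own once a valid chain takes the lead again. A node could equally latch it until a human intervenes.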
legendary
Activity: 1526
Merit: 1134
January 08, 2013, 09:28:26 AM
#63
If that's what you meant, then I think it takes the discussion a big step forward: done properly, alternative implementations place at risk only the mining revenue of people who choose to use those implementations for mining.

No. I'm not sure how to make this any clearer. All participants are put at risk by chain splitting bugs. As it happens the direct financial risk to miners who use that implementation is the largest, but obviously, if you split a significant part  of mining power off onto a side chain the cost of a double spend drops significantly, which increases the risk for all other participants as well.
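
To put a rough number on that, one can use the simple catch-up estimate from the Bitcoin whitepaper: an attacker with share q of the hashpower, starting z blocks behind, eventually catches up with probability (q/p)^z where p = 1 - q (and with certainty if q >= p). The sketch below is an illustration only; the function names and the example figures are assumptions, and it ignores the whitepaper's fuller Poisson-weighted calculation.

```cpp
#include <cassert>
#include <cmath>

// Attacker's share of the hashpower left on a chain after a fraction
// `split` of the honest hashpower has forked away onto a side chain.
// `attacker` is the attacker's share of total hashpower before the split.
double effective_share(double attacker, double split) {
    assert(attacker > 0.0 && attacker < 1.0);
    assert(split >= 0.0 && split < 1.0);
    const double honest_left = (1.0 - attacker) * (1.0 - split);
    return attacker / (attacker + honest_left);
}

// Whitepaper catch-up estimate: probability that an attacker with share q
// ever catches up from z blocks behind. Equals (q/p)^z for q < p, else 1.
double catch_up_probability(double q, int z) {
    assert(q > 0.0 && q < 1.0 && z >= 0);
    const double p = 1.0 - q;
    if (q >= p) return 1.0;
    return std::pow(q / p, z);
}
```

With a hypothetical attacker holding 10% of all hashpower and z = 6, the estimate is about 1.9e-6 when everyone mines the same chain and about 1.2e-4 if half of the honest hashpower has split away, a 64-fold increase.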

Sounds complicated… simply noticing that more than half the computing power is devoted to a ruleset that doesn't match your own ought to be alarming enough!  I'd worry that attempting automated recovery from such a situation could open new attack vectors.

It's not simple to notice that the ruleset has changed if you aren't actually validating those rules! But if you do notice, then picking the correct chain should be a fairly simple consequence of the proving process.
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
January 08, 2013, 04:33:24 AM
#62
I assume that the kerfuffle over "losing money" isn't about mining revenue.

Miners are the ones most exposed to losing money!

If that's what you meant, then I think it takes the discussion a big step forward: done properly, alternative implementations place at risk only the mining revenue of people who choose to use those implementations for mining.  This is a lot easier to agree with -- the risks of implementation diversity are limited to lost mining revenue and are a mining concern, not a concern for general users.  But I don't think that's the impression people got from this thread so far.


But it's definitely a distinct security class compared to SPV.

There are hybrid proposals floating around for SPV clients that, on receipt of a "proof of problems" from a node would do enough work to fully verify the chain. The proof would contain enough data to help the SPV clients figure out the rule violation without needing a full database.

Sounds complicated… simply noticing that more than half the computing power is devoted to a ruleset that doesn't match your own ought to be alarming enough!  I'd worry that attempting automated recovery from such a situation could open new attack vectors.
legendary
Activity: 1526
Merit: 1134
January 08, 2013, 04:25:24 AM
#61
I assume that the kerfuffle over "losing money" isn't about mining revenue.  Safe mode can be automatically deactivated if a valid chain once again becomes difficultywise-longest.

Miners are the ones most exposed to losing money! At some point DeepBit accidentally got split onto a side chain because of a database corruption and he lost many thousands of dollars.

Quote
I suppose clients that do this are still "second class" in the sense that if you find a chain-splitting bug you can get them to pause until a human intervenes.  But it's definitely a distinct security class compared to SPV.

There are hybrid proposals floating around for SPV clients that, on receipt of a "proof of problems" from a node would do enough work to fully verify the chain. The proof would contain enough data to help the SPV clients figure out the rule violation without needing a full database. It could work, somebody would need to try it.