Topic: Remove 4 Byte for version from header (Read 1504 times)

sr. member
Activity: 392
Merit: 268
Tips welcomed: 1CF4GhXX1RhCaGzWztgE1YZZUcSpoqTbsJ
August 21, 2015, 08:17:01 AM
#20

Quote
Thanks for taking the time, I will try to read up on the matter and not waste anyone's time on it.

Only one question: the "blk*.dat" files are the blockchain, right?

If there is no possibility to save space and put more transactions in one block, why can I zip the file to half the size (without even removing any data or changing the layout)?

That's because they're not just the blockchain. They're a LevelDB database that contains the blocks themselves. The zipping is likely leveldb overhead as leveldb moves data between "levels" of access based on access patterns, and copies remain until compaction.
newbie
Activity: 25
Merit: 0
August 21, 2015, 08:02:17 AM
#19
- I can't believe that. Older blocks are valid because they exist and have been confirmed; everything else would contradict immutability.

You're not understanding the point. Older blocks are valid by virtue of the software using older rules. The version in the header specifies which set of rules to use. A version 1 block with version 1 in the header is still valid, even if version 1 blocks are not valid when labeled with a newer version.

Quote
- The version number is a variable(!), every miner is free to choose whatever she wants. If what you say is true, somebody could create quite a mess by sending an old version number. In such a case we should urgently remove it as a bug.

So? Then their block is validated under those rules, assuming new blocks under old rules are even considered valid after a certain point.

Quote
Just looking at a set of blocks makes me wonder why we can't map them in a way that we just cut off all the leading zeros. ++

That has its own overhead. I'm not sure how it would ever outweigh the drawbacks. 4 bytes isn't that much, unless you're still hand-soldering discrete 74XX76 flip-flops together for memory.

Thanks for taking the time, I will try to read up on the matter and not waste anyone's time on it.

Only one question: the "blk*.dat" files are the blockchain, right?

If there is no possibility to save space and put more transactions in one block, why can I zip the file to half the size (without even removing any data or changing the layout)?
sr. member
Activity: 392
Merit: 268
Tips welcomed: 1CF4GhXX1RhCaGzWztgE1YZZUcSpoqTbsJ
August 20, 2015, 02:20:34 PM
#18
- I can't believe that. Older blocks are valid because they exist and have been confirmed; everything else would contradict immutability.

You're not understanding the point. Older blocks are valid by virtue of the software using older rules. The version in the header specifies which set of rules to use. A version 1 block with version 1 in the header is still valid, even if version 1 blocks are not valid when labeled with a newer version.

Quote
- The version number is a variable(!), every miner is free to choose whatever she wants. If what you say is true, somebody could create quite a mess by sending an old version number. In such a case we should urgently remove it as a bug.

So? Then their block is validated under those rules, assuming new blocks under old rules are even considered valid after a certain point.

Quote
Just looking at a set of blocks makes me wonder why we can't map them in a way that we just cut off all the leading zeros. ++

That has its own overhead. I'm not sure how it would ever outweigh the drawbacks. 4 bytes isn't that much, unless you're still hand-soldering discrete 74XX76 flip-flops together for memory.
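
To make that overhead concrete, here is a minimal sketch of one way to "cut off the leading zeros": a LEB128-style variable-length integer. It is an illustration only and not Bitcoin's actual serialization; `EncodeVarUint` is a hypothetical helper introduced for this example.

```cpp
#include <cstdint>
#include <vector>

// Encode a 32-bit value using 7 payload bits per byte (LEB128-style).
// The high bit of each byte says whether another byte follows, so small
// values shrink, but large values can need more than 4 bytes.
inline std::vector<std::uint8_t> EncodeVarUint(std::uint32_t value) {
    std::vector<std::uint8_t> out;
    do {
        std::uint8_t byte = static_cast<std::uint8_t>(value & 0x7F);
        value >>= 7;
        if (value != 0) byte |= 0x80;  // continuation flag
        out.push_back(byte);
    } while (value != 0);
    return out;
}
```

Under this scheme a version of 3 takes one byte, but a value such as 0x20000007 takes five, because every byte gives up one bit to the continuation flag. The parser also has to handle a field whose length is no longer fixed.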
staff
Activity: 3458
Merit: 6793
Just writing some code
August 20, 2015, 02:04:08 PM
#17
Quote
Let's not argue.

I am looking at some random blocks. What I see is that a single transaction is a
hash -> Don't know how to compress

Impossible without losing data.

Quote
index -> why not sort by time, input etc, something 'drawn' from the data?

What index and what sorting are you talking about?

Quote
We have inputs and outputs. For example, frequently we have inputs from the same origin, why not add them up for storage?

What do you mean by "add them up"? New transactions need to be able to reference the inputs, so simply "adding them up" won't allow clients to create transactions.

Quote
Just looking at a set of blocks makes me wonder why we can't map them in a way that we just cut off all the leading zeros. ++

That could be possible, but how much space will that really save?
staff
Activity: 3458
Merit: 6793
Just writing some code
August 20, 2015, 01:51:12 PM
#16
Quote
Version numbers also help to facilitate miner voting. Miners voted for using BIP66 rules by producing v3 blocks. BitcoinXT nodes vote for BitcoinXT by producing blocks with 0x20000007 set as the version.

- They can also facilitate forking, again, there are clearly different versions being developed right now.

Maybe, we have only a very narrow window open to make such changes to the block layout. Once there are 5 different implementations out there it will be impossible to even change such a small matter.

Clearly you have not read the documentation. Historical blocks with older version numbers are still valid and may be validated under different rules. At a certain point, old version numbers are considered invalid, so new blocks with older versions are not valid and are discarded by the client.

Any version number that is not defined in the client will be validated as the current version, which is why XT blocks can currently be accepted by Bitcoin Core.

Quote
- How come? The header clearly is not.
- Also, from what I read the transactions aren't either.
Thus, most of the volume is uncompressed.

However, you are right, compression would require more CPU cycles, but simple reductions could come at a very low cost of maybe 0.1% performance. Nothing in comparison to the saved bandwidth and HDD space.

The hashes cannot be losslessly compressed because hashes are effectively random. If they are lossily compressed, then data is lost and the hashes won't match.
newbie
Activity: 25
Merit: 0
August 20, 2015, 09:19:34 AM
#15
The protocol would be more complicated.  The protocol must be able to understand all of the blocks in Bitcoin's history.

I agree, that is certainly a challenge.

And I see there is no appetite: https://github.com/bitcoin/bitcoin/issues/2278#issuecomment-13198202
newbie
Activity: 25
Merit: 0
August 20, 2015, 08:57:12 AM
#14
Let's not argue.

I am looking at some random blocks. What I see is that a single transaction is a
hash -> Don't know how to compress
index -> why not sort by time, input etc, something 'drawn' from the data?

We have inputs and outputs. For example, frequently we have inputs from the same origin, why not add them up for storage?

Just looking at a set of blocks makes me wonder why we can't map them in a way that we just cut off all the leading zeros. ++

What about dynamic data types?

I mean that would certainly be a bit trickier, but could increase the capacity of a block many times.
legendary
Activity: 1246
Merit: 1011
August 20, 2015, 08:49:18 AM
#13
Those 4 bytes have barely consumed over a megabyte in the 300k+ blocks we've mined.

It's copied and stored a million times, why not use the space more economically.

Because this entails:
  • The consumption of some development resources.
  • A more complicated protocol.
  • A greater potential for bugs.

The costs of dealing with this problem dwarf the potential gain of sparing all nodes 21MB/century of storage.
- Are developer resources busy arguing about the lack of capacity on each block?
- The protocol will be simpler because there is one field fewer, or what am I missing?
- Less complexity reduces the potential for bugs; the implementation, like every change, presents a possibility for error.

  • Developers are certainly contending different capacity/decentralisation trade-offs and implementations thereof.  In this context, 21MB/century is wholly insignificant.
  • The blocks would be simpler.  The protocol would be more complicated.  The protocol must be able to understand all of the blocks in Bitcoin's history.
newbie
Activity: 25
Merit: 0
August 20, 2015, 08:38:04 AM
#12
Attempting to remove 3 bytes from the block header would likely render all ASIC mining hardware worthless, at a saving of 200kB per year of disk space.

- I am not an expert, but I googled and couldn't find evidence.

(compression seems very underutilized)

Most of the contents of the block chain are hashes (uncompressible) and ECDSA signatures (uncompressible).

- How come? The header clearly is not.
- Also, from what I read the transactions aren't either.
Thus, most of the volume is uncompressed.

However, you are right, compression would require more CPU cycles, but simple reductions could come at a very low cost of maybe 0.1% performance. Nothing in comparison to the saved bandwidth and HDD space.
newbie
Activity: 25
Merit: 0
August 20, 2015, 08:30:24 AM
#11
Version numbers also help to facilitate miner voting. Miners voted for using BIP66 rules by producing v3 blocks. BitcoinXT nodes vote for BitcoinXT by producing blocks with 0x20000007 set as the version.

- They can also facilitate forking, again, there are clearly different versions being developed right now.

Maybe, we have only a very narrow window open to make such changes to the block layout. Once there are 5 different implementations out there it will be impossible to even change such a small matter.
newbie
Activity: 25
Merit: 0
August 20, 2015, 08:16:49 AM
#10
Those 4 bytes have barely consumed over a megabyte in the 300k+ blocks we've mined.

It's copied and stored a million times, why not use the space more economically.

Because this entails:
  • The consumption of some development resources.
  • A more complicated protocol.
  • A greater potential for bugs.

The costs of dealing with this problem dwarf the potential gain of sparing all nodes 21MB/century of storage.

- Are developer resources busy arguing about the lack of capacity on each block?
- The protocol will be simpler because there is one field fewer, or what am I missing?
- Less complexity reduces the potential for bugs; the implementation, like every change, presents a possibility for error.
newbie
Activity: 1
Merit: 0
August 20, 2015, 12:25:10 AM
#9
Attempting to remove 3 bytes from the block header would likely render all ASIC mining hardware worthless, at a saving of 200kB per year of disk space.

(compression seems very underutilized)

Most of the contents of the block chain are hashes (uncompressible) and ECDSA signatures (uncompressible).
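
A rough way to see why hash-like data resists compression is to estimate the entropy of its byte histogram. The sketch below is a simplification under stated assumptions: `ByteEntropy` and `PseudoRandomBytes` are hypothetical helpers, and pseudo-random bytes from `std::mt19937` merely stand in for real hash output. Byte-level entropy is only a coarse proxy for what a real compressor can exploit.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Shannon entropy of the byte histogram, in bits per byte (range 0..8).
// Values near 8 mean a byte-oriented compressor has almost nothing to remove.
inline double ByteEntropy(const std::vector<std::uint8_t>& data) {
    if (data.empty()) return 0.0;
    std::array<std::size_t, 256> counts{};
    for (std::uint8_t b : data) ++counts[b];
    double h = 0.0;
    for (std::size_t c : counts) {
        if (c == 0) continue;
        const double p = static_cast<double>(c) / static_cast<double>(data.size());
        h -= p * std::log2(p);
    }
    return h;
}

// Deterministic pseudo-random bytes, used here as a stand-in for hash output.
inline std::vector<std::uint8_t> PseudoRandomBytes(std::size_t n, std::uint32_t seed = 1) {
    std::mt19937 gen(seed);
    std::vector<std::uint8_t> out(n);
    for (auto& b : out) b = static_cast<std::uint8_t>(gen() & 0xFF);
    return out;
}
```

A large buffer of pseudo-random bytes comes out close to 8 bits per byte, while a buffer of identical bytes comes out at 0, which is the difference between data that cannot be shrunk and data that can.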
staff
Activity: 3458
Merit: 6793
Just writing some code
August 19, 2015, 10:22:34 PM
#8
I think the point of the version numbers is to define which consensus rules a block follows. Older blocks may no longer be considered valid blocks under new rules, but with the version numbers, clients can identify which rules those blocks follow. E.g. version 2 blocks did not necessarily include transactions that followed the BIP66 rules but version 3 does. If the version numbers did not exist, then we would have an issue where some v2 blocks are no longer valid under v3 rules and the clients would get all screwed up because they have historic blocks that don't validate.

Version numbers also help to facilitate miner voting. Miners voted for using BIP66 rules by producing v3 blocks. BitcoinXT nodes vote for BitcoinXT by producing blocks with 0x20000007 set as the version.
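
The voting described above can be sketched as a count over the versions of recent blocks. This is a minimal sketch, not Bitcoin Core's code: `CountAtLeast` and `HasSuperMajority` are hypothetical helpers, and the 750-of-1000 and 950-of-1000 thresholds in the usage note are an assumption based on how the BIP66 rollout is commonly described.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Count how many of the most recent `window` blocks carry a version of at
// least `min_version`. `versions` is ordered oldest to newest.
inline int CountAtLeast(const std::vector<std::int32_t>& versions,
                        std::int32_t min_version, std::size_t window) {
    const std::size_t start = versions.size() > window ? versions.size() - window : 0;
    int found = 0;
    for (std::size_t i = start; i < versions.size(); ++i) {
        if (versions[i] >= min_version) ++found;
    }
    return found;
}

// True if at least `required` of the last `window` blocks signal `min_version`.
inline bool HasSuperMajority(const std::vector<std::int32_t>& versions,
                             std::int32_t min_version, std::size_t window, int required) {
    return CountAtLeast(versions, min_version, window) >= required;
}
```

With 760 of the last 1000 blocks at version 3, `HasSuperMajority(v, 3, 1000, 750)` holds (new rules start being enforced) while `HasSuperMajority(v, 3, 1000, 950)` does not (older-version blocks are not yet rejected). Without a version field in the header there would be nothing for such a count to read.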
legendary
Activity: 1246
Merit: 1011
August 19, 2015, 08:37:33 PM
#7
Those 4 bytes have barely consumed over a megabyte in the 300k+ blocks we've mined.

It's copied and stored a million times, why not use the space more economically.

Because this entails:
  • The consumption of some development resources.
  • A more complicated protocol.
  • A greater potential for bugs.

The costs of dealing with this problem dwarf the potential gain of sparing all nodes 21MB/century of storage.
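
The figures above can be checked with a few lines of arithmetic. This sketch assumes one block every 10 minutes and 365-day years; `FieldBytes` and `BlocksInYears` are hypothetical helpers for illustration.

```cpp
#include <cstdint>

// Bytes consumed by one fixed-size header field across `blocks` blocks.
constexpr std::uint64_t FieldBytes(std::uint64_t blocks, std::uint64_t field_size = 4) {
    return blocks * field_size;
}

// Expected number of blocks in `years` years at 6 blocks per hour.
// Real block intervals drift with hash rate, so this is an approximation.
constexpr std::uint64_t BlocksInYears(std::uint64_t years) {
    return years * 365 * 24 * 6;
}
```

`FieldBytes(300000)` gives 1,200,000 bytes (just over a megabyte for 300k blocks), and `FieldBytes(BlocksInYears(100))` gives 21,024,000 bytes, which matches the 21MB/century figure.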
newbie
Activity: 25
Merit: 0
August 19, 2015, 11:35:00 AM
#6
Ok. At the moment it looks like it's just used to suggest to the client that they may need to update their software;

Quote
    // Check the version of the last 100 blocks to see if we need to upgrade:
    static bool fWarned = false;
    if (!IsInitialBlockDownload() && !fWarned)
    {
        int nUpgraded = 0;
        const CBlockIndex* pindex = chainActive.Tip();
        for (int i = 0; i < 100 && pindex != NULL; i++)
        {
            if (pindex->nVersion > CBlock::CURRENT_VERSION)
                ++nUpgraded;
            pindex = pindex->pprev;
        }
        if (nUpgraded > 0)
            LogPrintf("%s: %d of last 100 blocks above version %d\n", __func__, nUpgraded, (int)CBlock::CURRENT_VERSION);
        if (nUpgraded > 100/2)
        {
            // strMiscWarning is read by GetWarnings(), called by Qt and the JSON-RPC code to warn the user:
            strMiscWarning = _("Warning: This version is obsolete; upgrade required!");
            CAlert::Notify(strMiscWarning, true);
            fWarned = true;
        }
    }
}
https://github.com/bitcoin/bitcoin/blob/87f37e259d6deb52ee464edde7aece687eea97a5/src/main.cpp#L1989

Considering that we are now moving into a phase where different clients mine on the same chain, there is no use for that anymore.
newbie
Activity: 25
Merit: 0
August 19, 2015, 11:28:16 AM
#5
Those 4 bytes have barely consumed over a megabyte in the 300k+ blocks we've mined.

It's copied and stored a million times, why not use the space more economically.

I see 2 options:
- either it's removed by convention (or maybe used for a special transaction; maybe we could randomly gift the privilege of using the space to the node that broadcast the transaction?)
- or compression: just remove it and spoof whatever is required (compression seems very underutilized)
sr. member
Activity: 433
Merit: 267
August 19, 2015, 09:33:39 AM
#4
Ok. At the moment it looks like it's just used to suggest to the client that they may need to update their software;

Quote
    // Check the version of the last 100 blocks to see if we need to upgrade:
    static bool fWarned = false;
    if (!IsInitialBlockDownload() && !fWarned)
    {
        int nUpgraded = 0;
        const CBlockIndex* pindex = chainActive.Tip();
        for (int i = 0; i < 100 && pindex != NULL; i++)
        {
            if (pindex->nVersion > CBlock::CURRENT_VERSION)
                ++nUpgraded;
            pindex = pindex->pprev;
        }
        if (nUpgraded > 0)
            LogPrintf("%s: %d of last 100 blocks above version %d\n", __func__, nUpgraded, (int)CBlock::CURRENT_VERSION);
        if (nUpgraded > 100/2)
        {
            // strMiscWarning is read by GetWarnings(), called by Qt and the JSON-RPC code to warn the user:
            strMiscWarning = _("Warning: This version is obsolete; upgrade required!");
            CAlert::Notify(strMiscWarning, true);
            fWarned = true;
        }
    }
}
https://github.com/bitcoin/bitcoin/blob/87f37e259d6deb52ee464edde7aece687eea97a5/src/main.cpp#L1989
legendary
Activity: 1862
Merit: 1011
Reverse engineer from time to time
August 19, 2015, 09:04:29 AM
#3
Those 4 bytes have barely consumed over a megabyte in the 300k+ blocks we've mined.
sr. member
Activity: 433
Merit: 267
August 19, 2015, 08:32:34 AM
#2
I believe it's just a quick way for unforked software to reject the forked blocks. I think 4 bytes is way too big, but it's a small portion of a block anyway.

BitcoinXT, for example, changes the block version when the fork is activated.
https://github.com/bitcoinxt/bitcoinxt/commit/946e3ba8c7806a66c2b834d3817ff0c986c0811b

Update:
https://en.bitcoin.it/wiki/Protocol_documentation#block

Update2:
Looks like my guess was wrong. I'm not sure where that data is actually used.
Quote
bool CheckBlockHeader(const CBlockHeader& block, CValidationState& state, bool fCheckPOW)
{
    // Check proof of work matches claimed amount
    if (fCheckPOW && !CheckProofOfWork(block.GetHash(), block.nBits, Params().GetConsensus()))
        return state.DoS(50, error("CheckBlockHeader(): proof of work failed"),
                         REJECT_INVALID, "high-hash");

    // Check timestamp
    if (block.GetBlockTime() > GetAdjustedTime() + 2 * 60 * 60)
        return state.Invalid(error("CheckBlockHeader(): block timestamp too far in the future"),
                             REJECT_INVALID, "time-too-new");

    return true;
}
https://github.com/bitcoin/bitcoin/blob/87f37e259d6deb52ee464edde7aece687eea97a5/src/main.cpp#L2549
newbie
Activity: 25
Merit: 0
August 19, 2015, 05:05:00 AM
#1
I can't see any value in that information.

https://en.bitcoin.it/wiki/Protocol_documentation#version

Field Size: 4
Description: version
Data type: int32_t
Comments: Identifies protocol version being used by the node

What would speak against removing the version from the header?

Alternatively, what are the implications of me compiling the current core release with a different version number?
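
For context on where those 4 bytes sit, here is a minimal sketch of the 80-byte block header layout with the version as its first field. The struct and function names (`HeaderSketch`, `PutLE32`, `SerializeHeader`) are hypothetical and chosen for illustration; this is not Bitcoin Core's `CBlockHeader`, and the hash fields are simply copied byte for byte.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Field-by-field view of a block header (illustrative, not Bitcoin Core's type).
struct HeaderSketch {
    std::int32_t version;
    std::array<std::uint8_t, 32> prev_block_hash;
    std::array<std::uint8_t, 32> merkle_root;
    std::uint32_t time;
    std::uint32_t bits;
    std::uint32_t nonce;
};

constexpr std::size_t kHeaderSize = 4 + 32 + 32 + 4 + 4 + 4;  // 80 bytes
static_assert(kHeaderSize == 80, "header is 80 bytes on the wire");

// Write a 32-bit value in little-endian byte order.
inline void PutLE32(std::uint8_t* out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) out[i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Lay the fields out in wire order; the version occupies bytes 0..3.
inline std::array<std::uint8_t, kHeaderSize> SerializeHeader(const HeaderSketch& h) {
    std::array<std::uint8_t, kHeaderSize> out{};
    std::size_t pos = 0;
    PutLE32(&out[pos], static_cast<std::uint32_t>(h.version)); pos += 4;
    std::copy(h.prev_block_hash.begin(), h.prev_block_hash.end(), out.begin() + pos); pos += 32;
    std::copy(h.merkle_root.begin(), h.merkle_root.end(), out.begin() + pos); pos += 32;
    PutLE32(&out[pos], h.time); pos += 4;
    PutLE32(&out[pos], h.bits); pos += 4;
    PutLE32(&out[pos], h.nonce);
    return out;
}
```

Because the version comes first, dropping it would shift the offset of every other field in the data that gets hashed, so anything that assumes the fixed 80-byte layout would have to change along with it.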