Pages:
Author

Topic: Spin-offs: bootstrap an altcoin with a btc-blockchain-based initial distribution - page 9. (Read 53636 times)

full member
Activity: 209
Merit: 100
Yes, this is of course THE QUESTION.

If spin-offs can be successful then there cannot possibly be a better target for a spinoff tracking clone than Ethereum.

I guess the answer to this is only known to DeathAndTaxes ... i.e. how alive is the code base behind the spinoff idea ?

Will there actually be an Aethereum at all ?
full member
Activity: 157
Merit: 100
So the official Ethereum sale has started.  When can people try out the aethereum wallet etc and see how much aethereum they have ?
legendary
Activity: 1036
Merit: 1000
I have the feeling that you are overthinking this....
Just allow PubKeyHash tx only, and use bitcoin signed messages for claiming. And announce with some time so interested parties can momentarily send funds to a PubKeyHash if they have it in some other form.

Can you elaborate on this?
hero member
Activity: 583
Merit: 505
CTO @ Flixxo, Riecoin dev
I have the feeling that you are overthinking this....
Just allow PubKeyHash tx only, and use bitcoin signed messages for claiming. And announce with some time so interested parties can momentarily send funds to a PubKeyHash if they have it in some other form.
full member
Activity: 209
Merit: 100
Mr. Darkmatter,

See some of the reasons for why this distro mechanism is the *least bad one* in the post above
https://bitcointalksearch.org/topic/m.7676923
member
Activity: 64
Merit: 10
This is a very interesting distribution mechanism.  I will be reading more throughout the day.
donator
Activity: 1218
Merit: 1079
Gerald Davis
Gerald, you've really done a nice job parsing the UTXO.  Am I reading your results correctly that 99.9999% of the consolidated claims can be cast as either P2SH or P2PkH?  And that with a 72 MB snapshot file, we can retain ALL of the information related to the wealth distribution?  (Developers can then choose which dust limits they apply and whether to support complex scripts, reducing the size of the snapshot file even further).  

Correct.  It would be roughly 70 MB depending on the specific format used. The raw script and native multisig claims are a rounding error and won't significantly affect the size of the snapshot. There are ~3 million claims so the size will depend on the size of the P2SH and P2PkH claim records.  At a minimum this would be 20 bytes for the identifier plus 2 to 3 bytes avg (varint) to represent the value.  At most it would be 29 bytes per record (1 byte for type, 20 byte identifier, 8 bytes for value).  

One final optimization option would be to refactor all the native multisig outputs to their P2SH equivalents.  Users could still make claims in a trust free manner but the original scripts should probably be retained in a separate file (probably along with all other refactored outputs).  The refactor file wouldn't be needed by clients but it would be useful for web services which assist users in identifying if they have a claim and what steps they need to take to redeem that claim.

For the record here are the only "valid" unspent outputs which are still raw scripts.  This just means I haven't yet been able to provably show them as unspendable or refactor them to either a Pay2PubKeyHash or Native-Multisig claims.  It probably is possible to reduce this list of exceptions further.

Code:
TxId:Index                                                            Type                   Size          Value
----------------------------------------------------------------------------------------------------------------
0ea6987a7dd49887bea87ae2106e4ef99d29799047a8fb10906be93ba0cbc719:0    RawScript               173         100000
66747648ef92d78bb36c267c0f444e6df96e44f463a2f92541483fcb8e882b27:1    RawScript                54          10000
9837a637931f74df1cb52b1045e479e4d7065f72db4d449d732211eb0e5cfd4c:0    RawScript               180          55000
9837a637931f74df1cb52b1045e479e4d7065f72db4d449d732211eb0e5cfd4c:1    RawScript               180          55000
af32bb06f12f2ae5fdb7face7cd272be67c923e86b7a66a76ded02d954c2f94d:0    RawScript                35      100000000
c4b46c5d88327d7af6254820562327c5f11b6ee5449da04b7cfd3710b48b6f55:0    RawScript                35          20000
1890467d5d65519a586ccd16e9e3a893a803467cd28941591e72bc99dec9ac5d:0    RawScript               174         100000
9f17f3ce43019c24baa6d679edfdddeada856f617cd9c1f6008d49be4542b768:0    RawScript               364        1100000
cd2dacbd05389580cb569985b3a8b1db67ea6cc84371223590e241a5026d0a8a:0    RawScript               209          15000
09857a1abf60f04971af09b36be4fb0d0a6143bc0af9c4cfcebf99ed6f97df9a:1    RawScript                60          10000
fb01987b540ec286973aac248fab643de82813af452d958056fee8de9f4535ab:0    RawScript                35          20000
daae4951c39c0198166a1d9e2eef542f17412883710567a22ad8dccbe2476ab9:0    RawScript               180          60000
daae4951c39c0198166a1d9e2eef542f17412883710567a22ad8dccbe2476ab9:1    RawScript               180          60000
702c36851ed202495c2bec1dd0cefb448b50fafd3a5cdd5058c18ca53fc2c3d1:0    RawScript                35          20000
faf8989ed87c5a667a1ead813aea718727e01767c124193297eaf409ff4645e5:1    RawScript                35         500000
75bb6417afc7500a6389201a67bfc2428a1241170a214bbf6833a389191036fe:0    RawScript               182          55000
75bb6417afc7500a6389201a67bfc2428a1241170a214bbf6833a389191036fe:1    RawScript               182          55000

To refactor valid but not non-standard outputs and to remove invalid outputs I first reduced the raw scripts in the UTXO to a set of templates.  Then updated the parser to handle these templates.  The remaining raw script outputs are one of the following templates which the parser handles as "unknown" and keeps as a raw script.

Code:
OP_HASH256  OP_EQUAL
OP_DUP OP_HASH256 OP_EQUAL OP_NOTIF OP_HASH256 OP_EQUAL OP_ENDIF OP_DROP
OP_SHA256 OP_EQUAL OP_SWAP OP_CHECKSIG OP_BOOLOR OP_SWAP OP_CHECKSIG OP_AND OP_VERIF
OP_SHA256 OP_EQUAL OP_SWAP OP_CHECKSIG OP_BOOLOR OP_SWAP OP_CHECKSIG OP_BOOLAND
OP_SIZE OP_WITHIN OP_SWAP OP_SHA256 OP_EQUAL OP_BOOLAND OP_SWAP OP_CHECKSIGVERIFY OP_SWAP OP_CHECKSIG OP_BOOLOR
OP_DUP OP_SIZE OP_WITHIN OP_SWAP OP_SHA256 OP_EQUAL OP_BOOLAND OP_IF OP_DROP OP_ELSE OP_CHECKSIGVERIFY OP_ENDIF OP_CHECKSIG
OP_IF 2 2 OP_CHECKMULTISIGVERIFY OP_ELSE 2 2 OP_CHECKMULTISIGVERIFY OP_ENDIF
OP_DUP OP_HASH160 OP_EQUALVERIFY OP_SWAP OP_DUP OP_HASH160 OP_EQUALVERIFY 2 OP_ROT OP_ROT 2 OP_CHECKMULTISIG
OP_HASH256 OP_EQUALVERIFY OP_DUP OP_HASH160 OP_EQUALVERIFY OP_CHECKSIG




Some of these outputs are "insecure" in that once the true owner tries to spend them (or sign a claim), anyone else could modify the output and steal the coins and/or claim.  One example is  af32bb06f12f2ae5fdb7face7cd272be67c923e86b7a66a76ded02d954c2f94d:0.  The output is encumbered only by a hash (no signature required).  The ScriptSig needed to produce a valid txn is thus just some value which when hashed produces the hash value in the PkScript (output script).  The problem is that when the owner attempts to spend those outputs any other node (including a miner) could modify the txn substituting their own output instead.  Since the required input is just a value with no signature there is no way to prevent this (other than trusting a miner in secret).  I am not exactly sure how they should be handled but I am leaning towards truncating them for security reasons.

Quote
Also note the "dent" in the distribution due to the dust limit.  If the dust-limit had been in effect since 2009, we'd have a much cleaner UTXO set.  (I expect those ~million small claims to the left of the yellow band were probably the result of people experimenting on the main network rather than the testnet).

A large portion of the sub dust txns (especially the 1 satoshi variety) are spam from Satoshi's Dice.  The service used a 1 satoshi (effectively unspendable) output to notify betters of a failed bet.  They were a significant reason for the introduction of the dust output rules.  At one time the 1 satoshi outputs represented almost half of the UTXO.   Actual spendable outputs tend to grow rather slowly so without the dust rules it is entirely possible at some point the overwhelming majority of the UTXO would have been dust spam.  Despite the controversy at the time the dust rules are both logical (what purpose is there in creating outputs which are marked as spendable, stored by all nodes forever, and yet are uneconomical to spend) and necessary to prevent the UTXO from becoming utterly unmanageable.
legendary
Activity: 1162
Merit: 1007
Gerald, you've really done a nice job parsing the UTXO.  Am I reading your results correctly that 99.9999% of the consolidated claims can be cast as either P2SH or P2PkH?  And that with a 72 MB snapshot file, we can retain ALL of the information related to the wealth distribution?  (Developers can then choose which dust limits they apply and whether to support complex scripts, reducing the size of the snapshot file even further).  

For those that might be interested, here is a bar chart of DeathAndTaxes' results.  You can clearly see that the vast majority of the claims represent only a small portion of the total bitcoin wealth; and conversely, a small portion of the claims represents the vast majority of the wealth.  Also note the "dent" in the distribution due to the dust limit.  If the dust-limit had been in effect since 2009, we'd have a much cleaner UTXO set.  (I expect those ~million small claims to the left of the yellow band were probably the result of people experimenting on the main network rather than the testnet).

legendary
Activity: 1036
Merit: 1000
I'd like to refine the notion in the OP of "most efficient coin distribution," why Bitcoin has something vastly closer to it and why other coins can't hope to catch up anytime soon.

I think the answer requires a clear understanding of the crucial role investors play in society, so here I will attempt to explain that for those who've never considered it:

Simply put, a good investor can predict the future better than other people can. If, for example, you're a good investor and you predict a water shortage in a few years you can buy water rights now and have a good chance of selling them later for a profit. However, as a side effect, you bid up the price of water for everyone else, which incentivizes people to conserve water, incentivizes water producers to invest more into water production, motivates new companies to get into the water business, prompts more people to major in related fields in college, etc. The water shortage you correctly predicted will be mitigated by these actions. In this way, "greedy speculators" can even save lives. Shocked

As a good investor, you would likely sell when the water production infrastructure has become too big and society would benefit more from some of the resources/manpower being diverted elsewhere, and by pushing down the price you create exactly the right incentive structure for that. Being as most people aren't good at predicting the future and you are, you help society manage resources more efficiently than it would without you.

Not only do you help society manage resources more efficiently, even to the point of saving lives, but the better you are at that the more power you will gain to do it on a larger and larger scale since you keep getting wealthier and have more power to move markets. Think of how many lives Jim Rogers has probably saved in Africa by being such a successful and large-scale commodities investor! Conversely, if you're bad at predicting the future like most people are, you will tend to lose money and damage society - in the case of water by aggravating the shortages and bankrolling the oversupply gluts - but your ability to affect society will diminish as you lose your wealth and thereby effectively lose the "right" to divert resources in society.

In this way, the freer the market, the more it optimizes toward moving the ability to direct resources into the most capable hands: the most highly skilled investors (future prediction specialists) and the best entrepreneurs - and these people are exactly the people we want getting rich. Without them, we'd all be a lot less rich ourselves, or maybe even dead or never born because the resources for our parents to raise us weren't sufficient due to systemic mismanagement. Notice this is not centralized government management; every argument for "why we need central planners" is answered in the concepts touched on above: there is a huge distributed network of specialists at predicting the future and putting resources to the most efficient known uses. These are your planners, bringing all the benefits of having experts plan stuff, but not central planners in government offices.

Over time (not immediately), this process vests greater control over the societal ledger with those people who benefit society the most with their skills, and of course these skilled people are also motivated to do this because it directly benefits them by making them rich.  

Note that even short-term speculators and day traders - if they are good - also play a valuable role: they dampen short-term and even intraday volatility by buying the dips and selling the peaks. It's the unskilled speculators that create the volatility, but those people tend to lose their shirts pretty fast. That's a major reason why there's less volatility as a market matures, with the unskilled and therefore volatility-increasing speculators getting smaller and smaller while the skilled and therefore volatility-reducing speculators get bigger and bigger, thereby making the market ever smoother. (Note, however, that even though the Bitcoin market is maturing, it is not getting that much less volatile, simply because the market expands exponentially, so that the immature portion of it is always bigger than the older mature portion.)

Now Bitcoin investment is a little more abstract than water investments. We cannot say that investing in Bitcoin before it got scarce helped society avert a water shortage. However, it incentivized miners to get in and secure the network to a level commensurate with the market cap, it motivated people like Tony Gallippi to divert his talents and time toward starting BitPay, and it generally created the incentive structure to support many other similar infrastructural buildouts. Of course it also attracted media and other much-needed public attention, as well as that of academics and of course investors.

It is obvious from reading the speculation forum (and reddit) that only those with the conviction to hodl actually hodl, and those strongly tend to be people who both understand Bitcoin at a deep level (aren't spooked by FUD) and understand the necessary aspects of investing (aren't spooked by volatility because they understand the dynamics of the binary bet and exponential growth). When bitcoin went from under a dollar to over $30, then down to $2 over the course of a single year, the market was shaking out the weak hands - the poor speculators - in a spectacularly violent fashion. But what is this volatility optimizing for - that is, who's this market making rich? Here are some obvious ones:

1) Good hodlers, people who deeply grok Bitcoin and its investment aspects, who will likely dishoard in an organized fashion at peaks, dampening the bubbles and not contributing to the panic knifedowns nor to the FUD-based selloffs

2) Good traders who see the big picture, thereby decreasing the volatility

3) Perhaps warranting their own category are the people who have extremely deep or highly technical understanding of Bitcoin, to the level of what would almost seem to others like "insider information" (though there's nothing bad about this; in fact it's a good thing, if you follow the reasoning outlined above)

The main point is, besides some people who got lucky, the Bitcoin ledger is held mainly by people who most strongly believe in and most intimately understand Bitcoin (and understood it early!) and the relevant investing principles, or those who have actually contributed materially to the ecosystem. These people are far more likely than others to know where resources should be diverted to in the economy for maximum benefit to the community, and the maturity of the Bitcoin ledger ensures that they actually have the ability to command such resources (i.e., they have accumulated a lot of bitcoins to invest in infrastructure, and they have relatively little competition from unskilled investors who fund dead-end projects that waste dev time, etc.).

TL;DR: While the debate over whether the Bitcoin ledger is more or less "fair" than some other distribution may never be settled to everyone's satisfaction, we can clearly see that it is in a very useful sense the most efficient distribution for the general benefit of the community, thanks to the eminently helpful role that skilled investors play in society and how the extreme volatility over the years has continually shaken out the unskilled investors who would waste the community's resources.

If we imagine even a new amazing coin that is magically perfectly distributed to everyone, 100 coins for every person in the world, we can easily see that the initial investment decisions will be terribly skewed since all sorts of unskilled people will be throwing their coins at things and diverting resources and labor into useless, go-nowhere projects. Also, volatility would be way worse, and panic sell-offs due to FUD/misunderstandings would be horrendous. A ledger needs to grow organically through a market process, or else it will face challenges that it would only be ready to face some years later when the market is more solidified in strong hands by the above process.

By the way, are there any other things the Bitcoin ledger has optimized for (any other type of person the history of Bitcoin price movements has made especially rich (or especially shaken out))?
donator
Activity: 1218
Merit: 1079
Gerald Davis
One last update before the weekend.  Compared to the one above I hashed all PubKeys to their PubKeyHashes.  This has the effect of saving 13 bytes per record and it reduces the total number of records by another ~1,600.

Code:
     Compact Distribution     Num Claims             Claim Value
---------------------------------------------------------------------
             Valid Claims      2,981,667      12,879,829.77705084 BTC
           Invalid Claims         16,744           2,610.01397770 BTC
             Total Claims      2,998,411      12,882,439.79102854 BTC

Code:
       Valid Distribution     Num Claims             Claim Value              Pct of Valid        Pct of Total
--------------------------------------------------------------------------------------------------------------
               PubKeyHash      2,962,366      12,862,099.17235302 BTC            99.8623 %           99.8421 %
               ScriptHash         14,252          17,719.13794714 BTC             0.1376 %            0.1375 %
           NativeMultiSig          5,033              10.44409149 BTC             0.0001 %            0.0001 %
                RawScript             16               1.02265919 BTC             0.0000 %            0.0000 %
               
     Invalid Distribution     Num Claims             Claim Value              Pct of Valid        Pct of Total
--------------------------------------------------------------------------------------------------------------
       UnspendableOpFalse              1           2,609.36304319 BTC             0.0203 %            0.0203 %
     UnspendableP2PoolBug              1               0.60280235 BTC             0.0000 %            0.0000 %
 UnspendableInvalidOpcode             20               0.04530933 BTC             0.0000 %            0.0000 %
 UnspendableInvalidPubKey         16,696               0.00242283 BTC             0.0000 %            0.0000 %
     UnspendablePushError              1               0.00040000 BTC             0.0000 %            0.0000 %
     UnspendableZeroValue             25               0.00000000 BTC             0.0000 %            0.0000 %

Code:
           Minimum Claim     Num Claims             Claim Value               Pct of Valid        Pct of Total
--------------------------------------------------------------------------------------------------------------
                0.01 bits      2,981,667      12,879,829.77705084 BTC           100.0000 %           99.9797 %
                0.10 bits      2,510,787      12,879,829.76979222 BTC           100.0000 %           99.9797 %
                1.00 bits      2,295,098      12,879,829.70929688 BTC           100.0000 %           99.9797 %
                5.43 bits      2,261,743      12,879,829.61522343 BTC           100.0000 %           99.9797 %
               10.00 bits      2,245,744      12,879,829.48921353 BTC           100.0000 %           99.9797 %
               54.30 bits      2,152,928      12,879,826.75869942 BTC           100.0000 %           99.9797 %
              100.00 bits      2,010,915      12,879,817.03184494 BTC            99.9999 %           99.9796 %
            1,000.00 bits      1,565,261      12,879,651.11951631 BTC            99.9986 %           99.9784 %
donator
Activity: 1218
Merit: 1079
Gerald Davis
I removed zero value outputs, excluded 20,412 provably unspendable outputs, and consolidated outputs to the same signer into a single claim.  Here are some updated distributions (which if you are like me and like stats may find interesting beyond just this concept).  

All stats are as of  block 305,303 but I doubt it has changed significantly through the most recent block.  I would expect the number of non-standard scripts, native-multisig, and PubKey outputs to decline as they are spent and replaced with an increasing number of P2PkH and P2SH outputs which means that snapshots should become more efficient over time.

Code:
Distribution                      Num Claims             Claim Value         
------------------------------------------------------------------------
Valid Claims                      2,983,463      12,879,829.77705084 BTC
Invalid Claims                       16,744           2,610.01397770 BTC
Total Claims                      3,000,207      12,882,439.79102854 BTC

Code:
          Distribution            Num Claims             Claim Value             Pct of Valid        Pct of Total
-----------------------------------------------------------------------------------------------------------------
     0.01 to       0.10 bits        470,881               0.00725863 BTC             0.0000 %            0.0000 %
     0.11 to       1.00 bits        238,560               0.06057174 BTC             0.0000 %            0.0000 %
     1.01 to       5.43 bits        215,689               0.06049534 BTC             0.0000 %            0.0000 %
     5.44 to      10.00 bits         35,806               0.09588866 BTC             0.0000 %            0.0000 %
    10.01 to      54.30 bits         49,355               0.22008615 BTC             0.0000 %            0.0000 %
    54.31 to     100.00 bits        234,955              12.46195347 BTC             0.0001 %            0.0001 %
   100.01 to   1,000.00 bits        447,226             166.14026722 BTC             0.0013 %            0.0013 %
 1,000.01 to  10,000.00 bits        432,314           1,535.80230367 BTC             0.0119 %            0.0119 %
10,000.01 to 100,000.00 bits        541,229          16,031.22170978 BTC             0.1245 %            0.1244 %
     >0.1 BTC to       1 BTC        275,469          96,829.33301374 BTC             0.7518 %            0.7516 %
      >1  BTC to      10 BTC        203,095         572,033.28805785 BTC             4.4413 %            4.4404 %
     >10  BTC to     100 BTC         98,567       3,531,749.15046627 BTC            27.4208 %           27.4152 %
    >100  BTC to   1,000 BTC         13,169       2,997,803.73298225 BTC            23.2752 %           23.2705 %
  >1,000  BTC to  10,000 BTC          1,415       3,111,492.00522526 BTC            24.1579 %           24.1530 %
>10,0000  BTC to 100,000 BTC             97       2,242,337.41607392 BTC            17.4097 %           17.4062 %
>100,000 BTC                              2         309,838.93715729 BTC             2.4056 %            2.4051 %

Code:
           Minimum Claim          Num Claims             Claim Value             Pct of Valid        Pct of Total
-----------------------------------------------------------------------------------------------------------------
                   0.01 bits      2,983,463      12,879,829.77705084 BTC           100.0000 %           99.9797 % (all valid claims)
                   0.10 bits      2,512,582      12,879,829.76979221 BTC           100.0000 %           99.9797 %
                   1.00 bits      2,296,893      12,879,829.70929687 BTC           100.0000 %           99.9797 %
                   5.43 bits      2,263,537      12,879,829.61522062 BTC           100.0000 %           99.9797 % (new dust treshold)
                  10.00 bits      2,247,538      12,879,829.48921072 BTC           100.0000 %           99.9797 % (new minimum fee)
                  54.30 bits      2,154,634      12,879,826.75708845 BTC           100.0000 %           99.9797 % (old dust treshold)
                 100.00 bits      2,012,583      12,879,817.02725725 BTC            99.9999 %           99.9796 % (new minimum fee)
               1,000.00 bits      1,565,357      12,879,650.88699003 BTC            99.9986 %           99.9784 % (1 mBTC)
              10,000.00 bits      1,133,043      12,878,115.08468636 BTC            99.9867 %           99.9664 %
             100,000.00 bits        591,814      12,862,083.86297658 BTC            99.8622 %           99.8420 %
           1,000,000.00 bits        316,345      12,765,254.52996284 BTC            99.1104 %           99.0903 % (1 BTC)
          10,000,000.00 bits        113,250      12,193,221.24190499 BTC            94.6691 %           94.6499 % (10 BTC)
         100,000,000.00 bits         14,683       8,661,472.09143872 BTC            67.2483 %           67.2347 % (100 BTC)
       1,000,000,000.00 bits          1,514       5,663,668.35845647 BTC            43.9732 %           43.9643 % (1,000 BTC)
      10,000,000,000.00 bits             99       2,552,176.35323121 BTC            19.8153 %           19.8113 % (10,000 BTC)
     100,000,000,000.00 bits              2         309,838.93715729 BTC             2.4056 %            2.4051 % (100,000 BTC)
   1,000,000,000,000.00 bits              0               0.00000000 BTC             0.0000 %            0.0000 % (1,00,000 BTC)

A large portion of the "custom scripts" can be refactored to be a standard template or they are invalid.  I now have only 20 scripts which are potentially valid and can't be reclassified.  Another interesting thing is there are a large number of native-multisig outputs which have one or more invalid PubKeys.  This allows them to be refactored.   I parsed all multisig outputs and removed invalid pubkeys.  If the number of valid keys is less than m in m-of-n then the output is unspendable.  If the number of valid keys is reduced to 1 and the script is 1-of-n then it can be refactored to a Pay2PubKeyHash script.  I haven't hashed PubKeys (to convert them to PubKeyHashes) yet.  Doing so may allow further consolidation of the output set (for keys where there are both PubKeyHash and PubKey outputs for the same key).

The size of the snapshot will depend heavily on the min claim value used. If we assume the average record is 24 bytes (20 byte identifier and 4 byte for value) that would make a full blockchain snapshot ~72MB.  It would be up to the spinoff developer but I believe to reduce size a min claim of 10, 100 or 1,000 bits is defendable.   That would result in a snapshot of 54 MB (25% reduction), 48 MB (33% reduction), and 38 MB (47% reduction) respectively.  Even with a minimum of 1,000 bits the included claims represents > 99.99% of the spendable BTC in the blockchain.

donator
Activity: 1218
Merit: 1079
Gerald Davis
I have been too busy to do as much with this in the past two weeks but I have made some progress on a UTXO parser.  I believe by using PubKey recovery the storage of the snapshot can be greatly simplified.

PubKeyHash outputs are already in hashed format.
PubKey outputs can be reduced to PubKeyHash. (13 bytes per record savings)
Native Multisig outputs can be reduced to a set of PubKeyHashes and identifier will indicate the number of keys (13 to 39 bytes per record savings).
1-of-1 multisig outputs can be reclassified as PubKeyHash.

This also has a minor additional cost savings in combining outputs of the same key in different formats (i.e. payments to PubKey and PubKeyHash of the same key).  The parser is still a work in progress but it looks like the snapshot would be ~50MB (exclude outputs below dust threshold, hash all keys, store values as varint, use identifier to id template, and store unrecognized raw scripts).  Time to parse (with highly unoptimized debug code) is about 30 minutes.  Optimized it should be significantly faster but I was happy to see first iteration was less than an hour.


Breakdown with standard multisig scripts broken out of raw scripts

Code:
Type           Num_Outputs   Value_Outputs        Length
------------------------------------------------------------
PubKeyHash     11,236,706    11,018,045.72853340      20
PubKey           41,665     1,844,047.73760491      33
ScriptHash         19,222        17,719.13794714      20
NativeMultiSig   44,440            12.91792116     122
RawScript             720*        2,614.26902186      68

UXTO Records:         3,297,375
Unique Outputs:      11,342,753
Output Value (BTC):  12,882,439.7910285
* Of the 720 as of yet unidentified raw scripts there are only 259 with a value of >1 satoshi. 



Breakdown of Native Multisig outputs
Code:
Type                 Num_Outputs   Value_Outputs  Length
------------------------------------------------------------
NativeMultiSig 1Of1          375      0.03712087   37
NativeMultiSig 1Of2       26,771      3.08014357   75
NativeMultiSig 1Of3       17,268      0.31637593  198
NativeMultiSig 2Of2           11      0.02995760   97
NativeMultiSig 2Of3           11      9.35391400  139
NativeMultiSig 3Of3            2      0.10000000  105

Unique identifiers (combine all outputs to the same "address")
Code:
Distribution           NumClaims  TotalClaimValue
-----------------------------------------------------
Total Claims           3,013,999  12,882,439.79102850
Claims <  5.43 bits      737,239           0.16200128
Claims < 10.00 bits      753,238           0.28801118
Claims < 54.30 bits      846,142           3.02013345
Claims <100.00 bits      988,422          12.77187202
 
Claims >=   5.43 bits  2,276,760  12,882,439.62902720
Claims >=  10.00 bits  2,260,761  12,882,439.50301730
Claims >=  54.30 bits  2,167,857  12,882,436.77089500
Claims >= 100.00 bits  2,025,577  12,882,427.01915650

In the above distribution no redundancy check was done.  To further compact the output set one could:
1) organize all multi-sig outputs in the same order (sort by PubKey or PubKeyHash)
2) hash PubKeys and compare against existing PubKeyHashes
3) convert all 1 of 1 multisig to Pay2PubKeyHash and check against existing keys.  
4) Check for uncompressed and compressed versions of the same key
5) Remove obviously unspendable outputs

donator
Activity: 1218
Merit: 1079
Gerald Davis
It is somewhat surprising.  I would have imagined that the cold storage of major exchanges would be secured by m of n  P2SH scripts.  We know that none of them do that simply because the total value encumbered by all ScriptHashes & Raw scripts is <0.16% of all BTC.

Haven't had much free time to do more analysis of the blockchain but found an interesting test case in this tx:
https://blockchain.info/tx/5143cf232576ae53e8991ca389334563f14ea7a7c507a3e081fbef2538c84f6e

This tx has (or had as of the snapshot block) 3073 unspent outputs.  It is the largest tx when measured by the number of unspent outputs.
legendary
Activity: 1162
Merit: 1007
Distribution of unspent outputs (haven't combined entries to the same identity).  RawScript includes "native" multisig.  It hasn't been broken out yet.

Code:
Type                 NumOutputs    OutputValue  AvgScriptLen
------------------------------------------------------------
PubKeyHash           11,236,706    11,018,046            20
PubKey                   41,665     1,844,048            33
ScriptHash               19,222        17,719            20
RawScript                45,160         2,627           121

UXTO Records:    3,297,375
Unique Outputs: 11,342,753
Output Value:   12,882,440 BTC

As expected the PubKeyHash, PubKey, and ScriptHash outputs combined make up 99.98% of the outstanding coins.

Nice work!  I don't know if anyone's ever compiled this information before, do you?  

I am actually surprised that there's only 17,719 BTC (0.138%) in P2SH outputs.  I would have thought some very large accounts (e.g., funds held by BitStamp) would be secured using multisig; however, this shows that's probably not the case.  

Here's how it looks on a pie chart:

donator
Activity: 1218
Merit: 1079
Gerald Davis
Distribution of unspent outputs (haven't combined entries to the same identity).  RawScript includes "native" multisig.  It hasn't been broken out yet.

Code:
Type                 NumOutputs    OutputValue  AvgScriptLen
------------------------------------------------------------
PubKeyHash           11,236,706    11,018,046            20
PubKey                   41,665     1,844,048            33
ScriptHash               19,222        17,719            20
RawScript                45,160         2,627           121

UXTO Records:    3,297,375
Unique Outputs: 11,342,753
Output Value:   12,882,440 BTC

As expected the PubKeyHash, PubKey, and ScriptHash outputs combined make up 99.98% of the outstanding coins.
donator
Activity: 1218
Merit: 1079
Gerald Davis
You obviously have way more experience than me with the bitcoin technical details, but I thought the reason there were two forms for compressed public keys 02 and 03 was because there are two possible y values for a given x value.  Aren't you losing a small amount of information this way?  And would there be anything wrong with converting uncompressed public keys into the standard compressed format?

So I can confirm that uncompressed keys are stored in compressed format using 0x04 and 0x05 which work the same as 0x02 and 0x03 do for compressed keys to identify even or odd.

Code:
TxId : Index                                                         Block#          Value               Type  SLen Script
0e3e2357e806b6cdb1f70b54c3a3a17b6714ee1f0e68bebb44a74b1efd512098:0        1     5000000000 PubKeyUncompressed    33 0496b538e853519c726a2c91e61ec11600ae1390813a627c66fb8be7947be63c52
9b0fc92260312ce44e74ef369f5c66bbb85848f2eddd5a7a1cde251e54ccfdd5:0        2     5000000000 PubKeyUncompressed    33 057211a824f55b505228e4c3d5194c1fcfaa15a456abdf37f9b9d97a4040afc073

To decode:
If prefix is 0x04, the next 32 bytes are the x value, compute the even y value and append to produce the 65 byte uncompressed key.
If the prefix is 0x05, the next 32 bytes are the x value, compute the odd y value, append, and change prefix byte to 0x04 to produce the 65 byte uncompressed key.

Compressed -> Uncompressed
046b538e853519c726a2c91e61ec11600ae1390813a627c66fb8be7947be63c52 -> 0496b538e853519c726a2c91e61ec11600ae1390813a627c66fb8be7947be63c52da7589379515d4e0a604f8141781e62294721166bf621e73a82cbf2342c858ee
057211a824f55b505228e4c3d5194c1fcfaa15a456abdf37f9b9d97a4040afc073     -> 047211a824f55b505228e4c3d5194c1fcfaa15a456abdf37f9b9d97a4040afc073dee6c89064984f03385237d92167c13e236446b417ab79a0fcae412ae3316b77

donator
Activity: 1218
Merit: 1079
Gerald Davis
Need some sleep but here is a preview of UXTO Parser.

Code:
E:\Repos\UxtoParser\UxtoParser.Demo\bin\Debug>UxtoParser.Demo.exe decodetx 14015bd586c0c7a28979ca294b114441f23bfc97be17cd6077b9e12e2709fec3
Raw Key:
63c3fe09272ee1b97760cd17be97fc3bf24144114b29ca7989a2c7c086d55b0114

Raw Value:
010c013600946cb2e08075bcbaf157e47bcb67eb2b2339d24268800f514104e6da9c60084b43d28266243c636bcdaf4d8f17b5954e078d2dece7d4659e0dee3419a40b939c24ac813c692a323ca5207a6fb387ffe28e48f706c95dbf46648f210226cb0561011d9045f6371cb09086ba7148d9942328bcf135ccdda921022eac137ab02d826df0af54e92a352945c9892df6cd77f1a7c390fc82c8b0edea53ae8f9d01

Decoded Record
Version: 1
Code: 12 (IsCoinbase?: False  OutZeroSpendable?: False OutOneSpendable?: True BitmaskLen: 1)
Bitmask: 0x01
BlockHeight: 265985

TxId : Index                                                                Value               Type  SLen Script
14015bd586c0c7a28979ca294b114441f23bfc97be17cd6077b9e12e2709fec3:1           6000         PubKeyHash    20 946cb2e08075bcbaf157e47bcb67eb2b2339d242
14015bd586c0c7a28979ca294b114441f23bfc97be17cd6077b9e12e2709fec3:2          12000          RawScript   137 514104e6da9c60084b43d28266243c636bcdaf4d8f17b5954e078d2dece7d4659e0dee3419a40b939c24ac813c692a323ca5207a6fb387ffe28e48f706c95dbf46648561011d9045f6371cb09086ba7148d9942328bcf1dd78cb6edb35ccdda921022eac137ab02d826df0af54e92a352945c9892df6cd77f1a7c390fc82c8b0edea53ae

Code:
E:\Repos\UxtoParser\UxtoParser.Demo\bin\Debug>UxtoParser.Demo.exe decodetx e6556c82129245c5a5bc1e71151588c3c8ef5e86abd41940ad61178f5374a652
Raw Key:
6352a674538f1761ad4019d4ab865eefc8c3881515711ebca5c5459212826c55e6

Raw Value:
010a118edad22b005a25e8152cdeae9419b34595104e874bf93a8f0183a4a7550040d2918f30a7d55e8b7000983b24b1116179e27a838488be7f00415435a335bc5aa1417bcdd8dea10ce725ed7aaf91cd68

Decoded Record
Version: 1
Code: 0x10 (IsCoinbase?: False  OutZeroSpendable?: True OutOneSpendable?: False BitmaskLen: 1)
Bitmask: 0x11
BlockHeight: 305000

TxId : Index                                                                Value               Type  SLen Script
e6556c82129245c5a5bc1e71151588c3c8ef5e86abd41940ad61178f5374a652:0        3662099         PubKeyHash    20 5a25e8152cdeae9419b34595104e874bf93a8f01
e6556c82129245c5a5bc1e71151588c3c8ef5e86abd41940ad61178f5374a652:2        1000003         PubKeyHash    20 40d2918f30a7d55e8b7000983b24b1116179e27a
e6556c82129245c5a5bc1e71151588c3c8ef5e86abd41940ad61178f5374a652:6      120487026         PubKeyHash    20 415435a335bc5aa1417bcdd8dea10ce725ed7aaf
donator
Activity: 1218
Merit: 1079
Gerald Davis
You obviously have way more experience than me with the bitcoin technical details, but I thought the reason there were two forms for compressed public keys 02 and 03 was because there are two possible y values for a given x value.  Aren't you losing a small amount of information this way?  And would there be anything wrong with converting uncompressed public keys into the standard compressed format?

Good point.  Need to dig deeper.  Actually I am pretty sure if uses a prefix of 0x05 for half the compressed uncompressed keys to make that distinction.  I haven't checked the code yet but I think that is how it handles it (and it may explain the mystery of the 0x05 prefix).  You might be right about just compressing all keys and using a 0x02 and 0x03 prefix.   Off the top of my head I can't think of a reason why that wouldn't work but they didn't go that route in the UXTO and I assume there is a reason for that.  Back to digging into the guts of the code again.  

As a side not, the hilarity of "compressed uncompressed keys" Smiley  Please future altcoin developers just use compressed keys.  Period.  The protocol has no understanding of anything but compressed keys.
legendary
Activity: 1162
Merit: 1007
So I started working on a UXTO parser instead of a blockchain parser.

Yes, I've been following.  This is great!  I agree that it's probably best for the parser to work off the UXTO set as well.  

Quote
...one thing I realized is that even if we don't take a hash of the PubKeys we only need to store 33 bytes even for uncompressed PubKeys.

You obviously have way more experience than me with the bitcoin technical details, but I thought the reason there were two forms for compressed public keys 02 and 03 was because there are two possible y values for a given x value.  Aren't you losing a small amount of information this way?  And would there be anything wrong with converting uncompressed public keys into the standard compressed format?

donator
Activity: 1218
Merit: 1079
Gerald Davis
So I started working on a UXTO parser instead of a blockchain parser.  The UXTO contains all the information that is needed to make the snapshot ledger and it is far more compact (~700MB vs 15GB).  The UXTO (contained in chainstate folder = leveldb database) is the set of unspent outputs as of the current block.  To be useful the parser would need to the ability to be created from an arbitrary block but for now I am just working from the current block.

The format is not well documented and rather dense but I am making sense of it.
https://bitcointalksearch.org/topic/m.7243116

Working through the UXTO format in the core client, one thing I realized is that even if we don't take a hash of the PubKeys we only need to store 33 bytes even for uncompressed PubKeys.

For example given a PubKey 0496b538e853519c726a2c91e61ec11600ae1390813a627c66fb8be7947be63c52da7589379515d 4e0a604f8141781e62294721166bf621e73a82cbf2342c858ee

The 0x04 prefix indicates it is uncompressed which means pefix byte followed by 32 bytes for x value and 32 bytes for the y value.  The y values can be computed from the x value and the secp256k1 curve so dropping the y value can save us 32 bytes.  This won't be confused with an actual compressed PubKey as they will have a prefix of 0x02 or 0x03.

Store:
0496b538e853519c726a2c91e61ec11600ae1390813a627c66fb8be7947be63c5201

Compute y value:
da7589379515d4e0a604f8141781e62294721166bf621e73a82cbf2342c858ee

Reconstruct Uncompressed PubKey:
0496b538e853519c726a2c91e61ec11600ae1390813a627c66fb8be7947be63c52da7589379515d 4e0a604f8141781e62294721166bf621e73a82cbf2342c858ee
Pages:
Jump to: