
Topic: Block chain question (Read 2397 times)

legendary
Activity: 1526
Merit: 1134
March 28, 2011, 07:06:48 AM
#17
Still, I don't know how the simplified system can discover payments that are sent to it from afar (as opposed to in person) without downloading full blocks before trimming them.  Otherwise, how does it know that Grandma sent you 50 bitcoins with your eCard on your birthday this year?

You're right, it has to download full blocks. As block sizes increase I suspect we'll have to add support for server-side filtering. Connect to 10 peers, provide a script template and a block number, the peers reply with merkle branches of all matching transactions in blocks >= that number. You can then fetch the headers and the transactions themselves in the usual manner and have the same assurances. The only attack is denial of service, but anyone who can control your internet connection can already do that anyway.
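
To sketch the client's side of that check (the function names are made up for illustration; only the merkle branch arithmetic itself is real): a peer proves a transaction is in a block by handing over a merkle branch, which the client folds up and compares against the merkle root in a header it already has.

Code:
import hashlib

def dsha256(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def branch_root(txid, branch):
    # Fold a merkle branch: each step is (sibling_hash, sibling_is_right).
    h = txid
    for sibling, sibling_is_right in branch:
        h = dsha256(h + sibling if sibling_is_right else sibling + h)
    return h

def accept_match(header_merkle_root, txid, branch):
    # A lying peer can't forge this without breaking SHA-256; the worst it can
    # do is withhold matches, which is just denial of service.
    return branch_root(txid, branch) == header_merkle_root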

I don't think this is a very hard patch to write, but we're nowhere near the point where it's important to do yet.
legendary
Activity: 1526
Merit: 1134
March 28, 2011, 07:02:17 AM
#16
...However, I have no idea what you just said.

Sorry. I'm hoping to record a talk at some point that explains BitCoin from a technical perspective. Maybe some time this quarter.

Quote
Here are my two concerns:
1) Over time, will a client continue to require more and more local storage (or CPU, or any particular resource) in order to operate?  If so, is there any reason why this doesn't pose a problem?

The designs we use today mean that yes, this is true. There are two types of BitCoin implementation possible:

1) Full node. This is the type everyone uses today.

2) Client only. This is the type BitCoinJ is implementing. There is unfinished work to allow Satoshi's software to do both.

In client-only mode the resource requirements are much, much lower than for a full node, but you cannot take part in mining and you cannot verify transactions for yourself. You can talk to other full nodes over the network and observe that, since they accepted a transaction, they must have verified it as being OK, so it's probably safe to rely on it, assuming you really are talking to a variety of nodes and not having your WiFi connection hijacked or something.
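
As a toy illustration of that heuristic (peer_accepts is a stand-in for whatever query the client would make, not a real API), the confidence comes from agreement among several unrelated peers rather than from checking the transaction yourself:

Code:
def probably_valid(txid, peers, peer_accepts, threshold=0.8):
    # peer_accepts(peer, txid) asks one peer whether it has accepted the
    # transaction; treat it as probably fine only if most peers agree.
    votes = sum(1 for peer in peers if peer_accepts(peer, txid))
    return len(peers) > 0 and votes / len(peers) >= threshold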

But even in full mode, BitCoin doesn't scale all that badly. Whilst storage costs constantly increase, this is also true of lots of other web based businesses (think Gmail, YouTube etc). Storage is so cheap it's really not a big deal.

Quote
2) Since we know that miners will require more data to be stored as time goes on, will this disincentivize ordinary users from contributing processing power to the network, and cause it to become less decentralized with time?

Yes and this is already happening. Hardly anyone bothers to do CPU mining because the returns are so low. If you want to mine today, you have to do more than just select a menu item in the software - at minimum you need a GPU and some time to put into finding out how to do it.

However you have to distinguish between decentralization of verification (which will always be possible for you to do cheaply) and decentralization of mining, which is desirable but less of a big issue. To oversimplify, as long as no single miner dominates the network BitCoin is safe. The more complicated story depends on how many confirmations people wait for and how much capacity a bad miner has; see the calculations at the end of Satoshi's paper.
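
Those calculations boil down to something like this (it's the Poisson approximation from the end of the paper, nothing new):

Code:
from math import exp, factorial

def attacker_success_probability(q, z):
    # q = attacker's share of the hash power, z = confirmations waited for
    p = 1.0 - q
    lam = z * (q / p)
    return 1.0 - sum(
        (lam ** k) * exp(-lam) / factorial(k) * (1.0 - (q / p) ** (z - k))
        for k in range(z + 1)
    )

# attacker_success_probability(0.10, 6) is roughly 0.0002, as in the paper.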

Mining is already beyond the point where non-technical people can contribute just by downloading and running a GUI program and will soon be beyond the reach of enthusiasts with gaming rigs too. But that's OK. Ultimately this is still a business with very low barriers to entry and so there should be plenty of competition as mining consortiums try to find the cheapest electricity, the most efficient hardware and so on.

Quote
Mining pools are already necessary for mining to be worthwhile to most ordinary users; is it possible to create a mining pool that would somehow alleviate this problem, making it so that the storage requirements for contributors would not increase with time?

If you mine in a pool you don't need to store the block chain.
hero member
Activity: 616
Merit: 500
Firstbits.com/1fg4i :)
March 25, 2011, 08:25:37 PM
#15
Storage space has grown faster than read/write speeds, if I'm not mistaken.
legendary
Activity: 1708
Merit: 1010
March 25, 2011, 04:11:06 PM
#14
Client mode impls cannot verify transactions even if they download the blocks (or merkle branches) of tx inputs and link them to a place in the chain. The reason is that checking tx validity is insufficient. You need to be sure it's not a double spend too. Unless you are aware of all transactions you can't do that.

If you only have the block headers (~5MB of storage per year of operation) you have to wait for the network to include a tx into a block before you can be sure it's valid.

That depends upon how much existing trust you have with the counterparty.  Credit card transactions of less than $50 at a gas station are regularly approved automatically without even waiting for the credit card network to respond.  The double spend attack is quite difficult, as it requires good timing; and there are other ways besides the blockchain itself to increase your confidence.

Still, I don't know how the simplified system can discover payments that are sent to it from afar (as opposed to in person) without downloading full blocks before trimming them.  Otherwise, how does it know that Grandma sent you 50 bitcoins with your eCard on your birthday this year?
hero member
Activity: 590
Merit: 500
March 25, 2011, 02:49:01 PM
#13
because those resources are growing faster than the requirements are.

While this is most likely something we can rely upon, it seems unwise to build an economy that will not be usable at all in the future without the existence of technology which has not been developed yet.  It's kind of the hacker way to not rush into developing software which is incompatible with older or smaller-scale hardware... in the case of something like Linux, that's prudent caution; in the case of an entire economic system, it seems like imperative wisdom.

(Is it possible to move this conversation to a different category?  I'm not sure if it belongs in Technical Support anymore...)

the current ~50GB/year maximum is perfectly sustainable as a result of the prudent 1MB/block limit (in reality, we're not anywhere close to that.  average block size is about 3.5KB and the full block chain is about 400MB).  when that size gets adjusted to support increased transaction rates, it should also be done prudently.
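
as a quick back-of-the-envelope check of those figures (one block roughly every ten minutes):

Code:
blocks_per_day = 24 * 60 // 10                  # ~144 blocks/day
print(1_000_000 * blocks_per_day * 365 / 1e9)   # ~52.5 GB/year at the 1MB cap
print(3_500 * blocks_per_day * 365 / 1e6)       # ~184 MB/year at ~3.5KB blocks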

once they get the headers-only version of the block chain to work, the growth rate would not be relevant to users, as only miners would require the full block chain and older/slower hardware would not be practical for mining.
full member
Activity: 210
Merit: 106
March 25, 2011, 02:11:33 PM
#12
because those resources are growing faster than the requirements are.

While this is most likely something we can rely upon, it seems unwise to build an economy that will not be usable at all in the future without the existence of technology which has not been developed yet.  It's kind of the hacker way to not rush into developing software which is incompatible with older or smaller-scale hardware... in the case of something like Linux, that's prudent caution; in the case of an entire economic system, it seems like imperative wisdom.

(Is it possible to move this conversation to a different category?  I'm not sure if it belongs in Technical Support anymore...)
hero member
Activity: 590
Merit: 500
March 25, 2011, 02:05:12 PM
#11
Thanks for your input, [mike], I suspect if anyone would have a good answer to this question, you would.

...However, I have no idea what you just said.

Here are my two concerns:
1) Over time, will a client continue to require more and more local storage (or CPU, or any particular resource) in order to operate?  If so, is there any reason why this doesn't pose a problem?
2) Since we know that miners will require more data to be stored as time goes on, will this disincentivize ordinary users from contributing processing power to the network, and cause it to become less decentralized with time?  Mining pools are already necessary for mining to be worthwhile to most ordinary users; is it possible to create a mining pool that would somehow alleviate this problem, making it so that the storage requirements for contributors would not increase with time?

1. because those resources are growing faster than the requirements are.  in the past 10 years, the storage in GB/$ has gone up about 25x.  what i spent for a 120GB drive in 2001, you can now get a 3TB drive for.  144MB/day (51GB/year) would have been a lot in 2001, but now it's practically nothing.  while that rate will grow (if we assumed a visa-class number of transactions, it would be more like 40TB/year), i believe technological advancement will be able to keep ahead of that.
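
one way to land in the ballpark of that 40TB/year figure (the transaction rate and size below are assumptions for illustration, not authoritative numbers):

Code:
tx_per_second = 2_000   # rough visa-scale average rate (assumed)
bytes_per_tx = 600      # rough average transaction size (assumed)
print(tx_per_second * bytes_per_tx * 60 * 60 * 24 * 365 / 1e12)  # ~37.8 TB/year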

2. but as time goes on, that amount of data will become less and less relevant.  copying a floppy disk, or later a CD, to your hard drive used to be a substantial use of space (on the original IBM PC, a single floppy was 1/20th of your hard drive.  on my first computer in 1998, copying a CD to my hard drive took up 1/10th of my hard drive), now it's utterly trivial.
full member
Activity: 210
Merit: 106
March 25, 2011, 01:35:43 PM
#10
Thanks for your input, [mike], I suspect if anyone would have a good answer to this question, you would.

...However, I have no idea what you just said.

Here are my two concerns:
1) Over time, will a client continue to require more and more local storage (or CPU, or any particular resource) in order to operate?  If so, is there any reason why this doesn't pose a problem?
2) Since we know that miners will require more data to be stored as time goes on, will this disincentivize ordinary users from contributing processing power to the network, and cause it to become less decentralized with time?  Mining pools are already necessary for mining to be worthwhile to most ordinary users; is it possible to create a mining pool that would somehow alleviate this problem, making it so that the storage requirements for contributors would not increase with time?
legendary
Activity: 1526
Merit: 1134
March 25, 2011, 06:50:03 AM
#9
Client mode impls cannot verify transactions even if they download the blocks (or merkle branches) of tx inputs and link them to a place in the chain. The reason is that checking tx validity is insufficient. You need to be sure it's not a double spend too. Unless you are aware of all transactions you can't do that.

If you only have the block headers (~5MB of storage per year of operation) you have to wait for the network to include a tx into a block before you can be sure it's valid.
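
Roughly where that figure comes from: one 80-byte header per block, one block about every ten minutes.

Code:
headers_per_year = 6 * 24 * 365       # ~52,560 blocks per year
print(80 * headers_per_year / 1e6)    # ~4.2 MB of headers per year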
legendary
Activity: 1708
Merit: 1010
March 24, 2011, 12:54:44 AM
#8

A generating client must have ready access to all blocks referenced as inputs into all the transactions that it is including into its block, because it must verify that all of those transactions are valid before working upon the block.  Otherwise, if a bad transaction is included in its block, that transaction will invalidate the block and all other nodes will reject it.  Which means that said generating client just wasted a great deal of time and energy on trying to get a block.  Currently, 'ready' access means that a full local copy of the blockchain is prudent, but there is nothing stopping anyone from using some kind of network file system for their blockchain storage.  I/O delays would put such a generating node at a disadvantage.  Perhaps someday the blockchain will be split up into a series of files (perhaps one file per year), allowing the clients to keep the last couple years locally and still have access to the archives on another machine should they need it.


Hmm, this makes me wonder if there could be a point where the generate reward is still large enough, but the chain is bulky enough to make it rational for some miners to not include any tx at all so as not to need the chain.

Which is why some kind of solution such as the local / archived blockchain split that I mentioned will eventually become necessary, in order to keep the chance of collecting transaction fees a valuable enough incentive to keep someone from altering a client to produce empty blocks just for the reward.  A two-year local history wouldn't be any more difficult to keep locally than what we have right now, and would cover 99+% of all transactions that miners are ever going to see, no matter how far into the future we are talking about.  In a couple decades, there might be special clients that exist for the sole purpose of archiving the blockchain, and when the rare event occurs that a transaction is referenced that is older than two years, all of the generators are going to need a copy of that block.  So the specialized archive client would then be handing out copies of that block hundreds of times from RAM.
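
A toy sketch of that local / archive split (the storage layout here is invented; it's not how the current client stores blocks):

Code:
class BlockStore:
    def __init__(self, recent_blocks, archive_fetch):
        self.recent = recent_blocks         # dict: block_hash -> block (last ~2 years)
        self.archive_fetch = archive_fetch  # callable that asks an archive node

    def get_block(self, block_hash):
        if block_hash in self.recent:
            return self.recent[block_hash]
        # rare slow path: an old output was referenced, ask the archive
        return self.archive_fetch(block_hash)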

Alternatively, the pruning process could still be employed by generating clients to great effect, since transactions wouldn't be pruned from the local blockchain until they have already been spent; and if they are not there anymore when a transaction calls for them, then the generating client can assume that the transaction is invalid anyway.
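
The pruning approach is essentially an index of unspent outputs.  A toy sketch, with made-up data structures:

Code:
unspent = {}   # (txid, output_index) -> output

def apply_transaction(txid, inputs, outputs):
    # inputs: list of (prev_txid, prev_index) references this tx spends
    if any(ref not in unspent for ref in inputs):
        return False              # missing means spent, unknown or invalid
    for ref in inputs:
        del unspent[ref]          # pruned: a spent output is never needed again
    for i, out in enumerate(outputs):
        unspent[(txid, i)] = out
    return True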
legendary
Activity: 1246
Merit: 1016
Strength in numbers
March 24, 2011, 12:14:18 AM
#7

A generating client must have ready access to all blocks referenced as inputs into all the transactions that it is including into its block, because it must verify that all of those transactions are valid before working upon the block.  Otherwise, if a bad transaction is included in its block, that transaction will invalidate the block and all other nodes will reject it.  Which means that said generating client just wasted a great deal of time and energy on trying to get a block.  Currently, 'ready' access means that a full local copy of the blockchain is prudent, but there is nothing stopping anyone from using some kind of network file system for their blockchain storage.  I/O delays would put such a generating node at a disadvantage.  Perhaps someday the blockchain will be split up into a series of files (perhaps one file per year), allowing the clients to keep the last couple years locally and still have access to the archives on another machine should they need it.


Hmm, this makes me wonder if there could be a point where the generate reward is still large enough, but the chain is bulky enough to make it rational for some miners to not include any tx at all so as not to need the chain.
legendary
Activity: 1708
Merit: 1010
March 23, 2011, 11:44:24 PM
#6

Quote from: BitCoinJ
BitCoinJ implements (or rather, will implement) the "simplified payment verification" mode of Satoshi's paper. It does not store a full copy of the block chain, rather, it stores what it needs in order to verify transactions with the aid of an untrusted peer node.



The transactions in each block are hashed into a merkle tree, and the merkle tree root becomes part of the block header.  Each header is 80 bytes long, and in the simplified version, a "lightweight" client can verify that a transaction being sent to it is valid by downloading the block(s) containing the input transactions referenced by the transaction to be verified, from another peer that is not the peer the transaction is coming from.  This does not need to be a trusted peer, because the lightweight client can then check the hashes in the merkle tree to verify that each block sent to it is correct and complete.  It could also download another copy of each block as needed from yet another peer, but this probably isn't necessary for the transaction values that Joe Average is likely to ever keep on his smartphone anyway.  If the user has a trusted client, say a full client running on his PC at home, and a continuous Internet connection (which he probably needs with a lightweight client anyway), then even keeping the headers locally isn't necessary, as the trusted client could verify the transactions, acting as a remote app server for the smartphone client.  The simplified client mode doesn't even require that the client download new blocks in full until such time as it needs to verify a transaction, and it can download the headers in groups of up to 500 at a time, so bandwidth usage is greatly reduced as well.
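
Since everything hangs off those 80-byte headers, here is a rough sketch of how a lightweight client could sanity-check a batch of headers it downloads.  It only checks that the headers link together and meet their own claimed difficulty target; it ignores the retargeting rules entirely:

Code:
import hashlib, struct

def dsha256(b):
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def bits_to_target(bits):
    # compact difficulty encoding: top byte is an exponent, the rest a mantissa
    return (bits & 0xFFFFFF) * (1 << (8 * ((bits >> 24) - 3)))

def check_header_chain(headers):
    # headers: raw 80-byte block headers, oldest first
    prev_hash = None
    for raw in headers:
        prev = raw[4:36]                    # bytes 36:68 hold the merkle root
        bits = struct.unpack('<I', raw[72:76])[0]
        if prev_hash is not None and prev != prev_hash:
            return False                    # headers don't link together
        block_hash = dsha256(raw)
        if int.from_bytes(block_hash, 'little') > bits_to_target(bits):
            return False                    # not enough proof of work
        prev_hash = block_hash
    return True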

If I got the details wrong, hopefully Gavin will set me back on the straight & narrow path.

Quote

Edit:  And on that note, would my "problem" eventually become an issue for those machines which are generating new blocks?  Specifically, is the entire chain necessary to generate a new block, and if so, is there some rough upper bound on how quickly the size of the chain might grow?

A generating client must have ready access to all blocks referenced as inputs into all the transactions that it is including into its block, because it must verify that all of those transactions are valid before working upon the block.  Otherwise, if a bad transaction is included in its block, that transaction will invalidate the block and all other nodes will reject it.  Which means that said generating client just wasted a great deal of time and energy on trying to get a block.  Currently, 'ready' access means that a full local copy of the blockchain is prudent, but there is nothing stopping anyone from using some kind of network file system for their blockchain storage.  I/O delays would put such a generating node at a disadvantage.  Perhaps someday the blockchain will be split up into a series of files (perhaps one file per year), allowing the clients to keep the last couple years locally and still have access to the archives on another machine should they need it.
administrator
Activity: 5222
Merit: 13032
March 23, 2011, 08:31:53 PM
#5
So would the size of the stored chain in this simplified scheme be bounded?

Yes. You only need to store the latest block headers, which should never be more than a few MB.

Edit:  And on that note, would my "problem" eventually become an issue for those machines which are generating new blocks?  Specifically, is the entire chain necessary to generate a new block, and if so, is there some rough upper bound on how quickly the size of the chain might grow?

You need access to all unspent transactions to generate blocks (unless you're in a pool). There is currently a maximum block size of 1 MB, so the maximum growth is about 144 MB per day.

Full network nodes will eventually have to be very powerful, but a VISA-size network should be possible.
hero member
Activity: 616
Merit: 500
Firstbits.com/1fg4i :)
March 23, 2011, 08:13:18 PM
#4
The size of the block chain is a bigger issue especially for people who can't afford faster connections and for devices significantly weaker than a desktop PC.
full member
Activity: 210
Merit: 106
March 23, 2011, 04:23:37 PM
#3
Short answer: no.

Longer answer:  it is complicated, and what you need depends on whether or not you're trying to generate new blocks.  To keep it simple, the original client downloads everything.
Ohh, thus:

Quote from: BitCoinJ
BitCoinJ implements (or rather, will implement) the "simplified payment verification" mode of Satoshi's paper. It does not store a full copy of the block chain, rather, it stores what it needs in order to verify transactions with the aid of an untrusted peer node.

So would the size of the stored chain in this simplified scheme be bounded?

Edit:  And on that note, would my "problem" eventually become an issue for those machines which are generating new blocks?  Specifically, is the entire chain necessary to generate a new block, and if so, is there some rough upper bound on how quickly the size of the chain might grow?
legendary
Activity: 1652
Merit: 2301
Chief Scientist
March 23, 2011, 04:14:20 PM
#2
Am I correct in my understanding that:
a) There is a single block chain for the whole world at all times

No.  The end of the chain can, and does, fork, but the forks are short and the network pretty quickly decides on the One True Chain.

Quote
b) The block chain contains a record of every bitcoin transaction that has ever taken place
Yes.
Quote
c) The entire block chain must be downloaded in order for a client to use bitcoin
Short answer: no.

Longer answer:  it is complicated, and what you need depends on whether or not you're trying to generate new blocks.  To keep it simple, the original client downloads everything.
full member
Activity: 210
Merit: 106
March 23, 2011, 04:02:36 PM
#1
It seems silly, but the more I think about this question, the more it bothers me, and I haven't found any clear answer to it online.

Am I correct in my understanding that:
a) There is a single block chain for the whole world at all times
b) The block chain contains a record of every bitcoin transaction that has ever taken place
c) The entire block chain must be downloaded in order for a client to use bitcoin

At this point in time, the wiki notes that it can take "hours" for a client to download the block chain the first time - won't this figure just continue to grow indefinitely as the block chain grows?  Does the size of each block depend on the number of transactions it contains... if so, if bitcoin really "takes off", won't the huge volume of transactions cause the (memory) size of the block chain to grow even faster?  Already it seems to be taking up a few hundred megabytes on my computer, so it seems like this could become a point of impracticality especially for mobile devices...