
Topic: Block chain size/storage and slow downloads for new users - page 15. (Read 228658 times)

newbie
Activity: 38
Merit: 0
I use MultiBit; it makes life a lot easier. Bitcoin-Qt is a pain for adding new wallets or proving keys without going through every single block, which can take days if you're lucky enough that it doesn't crash. Hopefully MultiBit expands to other altcoins soon.
sr. member
Activity: 462
Merit: 250
It always takes me a lot of time, but if I close the wallet only when necessary, then the next time I download the block chain it does not take me so long. I have to admit, though, that the first time it took me 3 days of non-stop downloading...
full member
Activity: 210
Merit: 100
★☆★ 777Coin - The Exciting Bitco
The first time I downloaded the blockchain (the friend who introduced me to Bitcoin insisted I run a full node), it took my computer a little over a day and a half. The last time I had to re-index my blockchain file (a couple of weeks ago) it took over 3 days! Granted, I have a few things running on the machine and it isn't exactly the New Kid on the Block, but sheesh.
newbie
Activity: 12
Merit: 0
N00b question and a comment: now that I have the entire blockchain downloaded and verified on my Windows box, can I simply shoot it over to a Linux box (under the Linux Bitcoin Core software, of course) to get a jump on things, or do I have to download/verify it separately there? Are the blockchain data file formats the same? Thanks.

Maybe I might start liking the UTXO and stuff and disliking Windows instead (the client is running on Windows). I now remember that Windows does a pretty crappy job of managing the disk and the files on it. Once I find the time for this, I will try to run the thing on (modern) Linux and see what happens.

After the past few weeks running Windows 8.1 (NTFS) and Ubuntu 14.04 (ext4), both for the first time, doing some of the same disk intensive tasks, I can virtually guarantee that the Linux filesystem is enormously faster and more efficient. Good luck with your experiments!
newbie
Activity: 8
Merit: 0
Thank you for the clarifications.

(...) At some point client A starts sending you blocks X to Z (...)

... and at this point my client should tell client A to stop sending those blocks and send me a bunch of others instead.

It doesn't validate all of them. The check is done to ensure there has been no database corruption (possibly from the prior shutdown due to a power failure). It only checks a limited number of the most recent blocks. You can adjust from the config file how many blocks to check and how detailed a check to perform. You can even set this to zero blocks if you like.

How about marking the stored block chain as good after the client exits properly, and then at startup looking for this mark and doing the check only if the mark is not there?

There is a cache. It is called the UTXO. Blocks are only used to create and update the UTXO. All validation of new txns and blocks is done against the UTXO. Once a block is written to disk, it isn't used by your client other than for responding to block requests from other peers (or updating the UTXO in a reorg).

Good to know. But why is it then accessing the disk so much? I am using the latest client. True, the speed went waaaaaaaaaaay up from the 0.6.x I was using before, but it still touches the disk quite a bit.

Older blocks are not needed except to provide blocks to peers who are bootstrapping.   Saying you prefer large files over small files in all cases is a dubious request.

Well, maybe it seems dubious in your eyes, but you cannot be sure there isn't a valid reason on my side for the request. It would be nice if I could control that stuff.

Reinventing the wheel? The chainstate is stored in LevelDB, which is accepted as an incredibly lightweight and very fast key-value database. It is doubtful you would design an alternative custom database with similar functionality that outperforms LevelDB. And even if you could, would the development time be worth reinventing the wheel rather than improving the actual client?

Good point, thank you for pointing this out. Another question I would ask is: "how about improving LevelDB so others can benefit from it as well?"

The coinbase txns of all blocks represent <0.003% of the blockchain.  The size is already limited by general limits on the size of ScriptSigs for all transactions.  Seems a dubious use case.

Good to know, thank you. I have not done much in the way of statistics, but

That cache is called the UTXO (the chainstate folder you dislike so much).  Blocks are used to build the UTXO in a trustless manner.  They aren't used to process or validate new blocks and transactions.   The raw blocks are just used to bootstrap new nodes so they too can build the UTXO in a trustless manner.

Maybe I might start liking the UTXO and stuff and disliking Windows instead (the client is running on Windows). I now remember that Windows does a pretty crappy job of managing the disk and the files on it. Once I find the time for this, I will try to run the thing on (modern) Linux and see what happens.

Now I am starting to think that Windows is a pretty crappy system for a task like this. I know its disk cache sucks a lot. I also recently realized that Windows's filesystem sucks a lot compared with Linux filesystems (have you ever tried to defragment NTFS and make sure it really is defragmented? I remember trying that recently, concluding it is pretty much impossible and that NTFS sucks). Now I think the LevelDB guys and you Bitcoin guys did an admirable job of forcing stupid Windows to behave under such a load (20+ GB of growing data plus 0.5 GB of heavily updated data).
donator
Activity: 1218
Merit: 1079
Gerald Davis
I think the problem here is not the size of the blockchain itself. The problem is how the blockchain is handled.

Well, there is room for improvement, but the client doesn't use blocks the way you think it does, and that leads to a lot of incorrect conclusions.

Quote
While using the client I found several problems. The first is that the client wastes bandwidth by downloading blocks it already has. I suspect the P2P protocol has no provision for a node to say "I already have block X, please send me blocks A, B, and Z instead". I can see this problem in the debug.log file, which is littered with "ERROR: ProcessBlock() : already have block X", where X runs consecutively from a certain number for several such messages, then jumps and runs consecutively again.

There are messages for requesting specific blocks and ranges of blocks, and the client uses them. The issue may be a misbehaving client on the other end. Say you request blocks X to Z from client A and get no response, so you request blocks X to Z from client B. Client B responds and you process those blocks. At some point client A starts sending you blocks X to Z, which you now already have. If you find a specific bug where YOUR client is requesting blocks it already has, be sure to report it, but make sure it is actually a bug.
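The re-request scenario described above can be sketched as follows; the class, method names, and return strings are invented for illustration, not Bitcoin Core's actual code:

```python
# Hypothetical sketch: a node requests the same block range from a second
# peer after the first one stalls, then discards the late duplicates from
# the slow peer instead of reprocessing them.

class BlockDownloader:
    def __init__(self):
        self.have = set()        # block hashes already stored locally
        self.in_flight = {}      # block hash -> peer it was requested from

    def request_range(self, peer, hashes):
        """Ask `peer` for every block we don't already have
        (re-requesting reassigns any stalled in-flight blocks)."""
        wanted = [h for h in hashes if h not in self.have]
        for h in wanted:
            self.in_flight[h] = peer
        return wanted

    def on_block(self, peer, block_hash):
        """Process a received block, ignoring duplicates from slow peers."""
        if block_hash in self.have:
            return "ERROR: ProcessBlock() : already have block"
        self.have.add(block_hash)
        self.in_flight.pop(block_hash, None)
        return "accepted"

dl = BlockDownloader()
dl.request_range("A", ["X", "Y", "Z"])   # peer A stalls...
dl.request_range("B", ["X", "Y", "Z"])   # ...so we re-request from peer B
print(dl.on_block("B", "X"))             # accepted
print(dl.on_block("A", "X"))             # ERROR: ProcessBlock() : already have block
```

The late block from peer A produces exactly the kind of "already have block" log line the original post complains about, without any block being downloaded twice on purpose.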

Quote
The second problem is the "Reading the list of blocks" and "Validating blocks" actions, which take a lot of time. My question is why the client needs to "read the list of blocks" and "validate the blocks" every time it starts up. The "read the list of blocks" step does not take that much time, but "validate the blocks" is a 10-minute operation. Once the blocks are validated, why do they need to be revalidated at every program startup?

It doesn't validate all of them. The check is done to ensure there has been no database corruption (possibly from the prior shutdown due to a power failure). It only checks a limited number of the most recent blocks. You can adjust from the config file how many blocks to check and how detailed a check to perform. You can even set this to zero blocks if you like.
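The settings being referred to can go in bitcoin.conf (or be passed as -checkblocks / -checklevel on the command line). The values below are only illustrative, and defaults have varied between client versions:

```
# bitcoin.conf (illustrative values)
checkblocks=50   # verify roughly the 50 most recent blocks at startup
checklevel=1     # 0-4: how thorough each block check is (higher = slower startup)
```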

Quote
The third problem is that the client is "jumping over the data like a goat over a cemetery" while doing these two actions. This is MUCH SLOWER than reading the data in sequence. Why does it need to jump over the data so much? Maybe implement some caching?

There is a cache. It is called the UTXO. Blocks are only used to create and update the UTXO. All validation of new txns and blocks is done against the UTXO. Once a block is written to disk, it isn't used by your client other than for responding to block requests from other peers (or updating the UTXO in a reorg).
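The flow described here, validating a new transaction against the UTXO set alone without touching the block files, can be sketched in Python. The data shapes (outpoint tuples, value dicts) are simplified assumptions, not Bitcoin Core's actual structures:

```python
# Toy illustration: new transactions are validated against the UTXO set
# alone; the raw block files are never consulted.

utxo = {
    ("txid_a", 0): {"value": 50, "script": "pubkey_1"},
    ("txid_b", 1): {"value": 25, "script": "pubkey_2"},
}

def validate_tx(inputs, outputs):
    """Accept a transaction only if every input spends a known unspent
    output and the input value covers the output value."""
    if any(outpoint not in utxo for outpoint in inputs):
        return False                              # unknown or already-spent input
    in_value = sum(utxo[o]["value"] for o in inputs)
    out_value = sum(v for _, v in outputs)
    return out_value <= in_value                  # any remainder is the fee

print(validate_tx([("txid_a", 0)], [("addr", 40)]))   # True
print(validate_tx([("txid_z", 9)], [("addr", 1)]))    # False: unknown outpoint
```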

Quote
The fourth problem is why the program splits the blockchain into 125 MB chunks? That is inefficient in Windows where opening and closing a file is pretty expensive operation. In my blockchain directory the first 10 GB are stored in 5 files (well, in fact 10 because I need to count the revXXXXX files) because they were downloaded by a 0.6.3 BETA client but the remaining 9 GB is spread over 75 files. Is there a way to reconfigure these storage parameters? And once I change them, is there a way to tell the client to repackage the blockchain so it is stored according to my wishes? I prefer "few large files" over "many small files" on Windows because "many small files" is inefficient.

Older blocks are not needed except to provide blocks to peers who are bootstrapping.   Saying you prefer large files over small files in all cases is a dubious request.
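For illustration, the chunking the question complains about works roughly like an append-and-roll-over scheme: blocks append to numbered files and a new file starts once the current one would exceed a size cap. The cap below is an assumption for the sketch, not the client's exact constant:

```python
# Sketch of chunked block storage: blocks append to numbered files
# (blk00000.dat, blk00001.dat, ...) and a new file is started once the
# current one would exceed a size cap.

CAP = 128 * 1024 * 1024  # bytes per block file (illustrative)

def assign_files(block_sizes, cap=CAP):
    """Return the block-file index each block lands in."""
    file_no, used, placement = 0, 0, []
    for size in block_sizes:
        if used + size > cap:      # roll over to the next blkNNNNN.dat
            file_no += 1
            used = 0
        used += size
        placement.append(file_no)
    return placement

# Three ~60 MB blocks of data: the third batch no longer fits in file 0.
print(assign_files([60_000_000, 60_000_000, 60_000_000]))  # [0, 0, 1]
```

Appending to a capped file is cheap and crash-friendly; repackaging everything into one huge file would mean rewriting gigabytes whenever the layout changes.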

Quote
A similar problem exists with the "chainstate" data, which is only 0.5 GB but is littered across 229 files. Well, that might not be your fault, as I understand these files actually belong to some sort of general-purpose database which was recently replaced and actually is much faster now, but I believe this data could be handled more efficiently if it were in a single file (maybe by developing a special-purpose database?).

Reinventing the wheel? The chainstate is stored in LevelDB, which is accepted as an incredibly lightweight and very fast key-value database. It is doubtful you would design an alternative custom database with similar functionality that outperforms LevelDB. And even if you could, would the development time be worth reinventing the wheel rather than improving the actual client?
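To make the "key-value database" point concrete, here is the chainstate access pattern sketched with Python's stdlib `dbm` module standing in for LevelDB; the key/value layout is an invented simplification:

```python
# The chainstate is conceptually a key-value store: outpoint -> unspent
# output data. `dbm` merely stands in for LevelDB to show the access
# pattern; real chainstate keys/values are compact binary encodings.
import dbm
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "chainstate-demo")
with dbm.open(path, "c") as db:
    db[b"utxo:txid_a:0"] = b"value=50;script=pubkey_1"   # new output created
    db[b"utxo:txid_b:1"] = b"value=25;script=pubkey_2"
    del db[b"utxo:txid_b:1"]                             # spending = deleting the key
    print(b"utxo:txid_a:0" in db)    # True
    print(b"utxo:txid_b:1" in db)    # False
```

The many on-disk files the post complains about are how such log-structured key-value stores trade file count for very fast writes and compactions.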

Quote
Also, regarding the size of the blockchain, there are two things that should be done. The first is that the coinbase transaction can be as big as the miner wants (and some coinbase transactions weigh a few tens of KB, storing various stuff; see "Hidden Surprises in the Bitcoin Blockchain" on Google, and especially this blog), so putting a limit on it, for example 128 or even 64 bytes, would be good (but the limit should not be too small, because otherwise we could run into a bunch of blocks with no solution).
The coinbase txns of all blocks represent <0.003% of the blockchain.  The size is already limited by general limits on the size of ScriptSigs for all transactions.  Seems a dubious use case.

Quote
And the second thing would be, when storing the blockchain, to extract the addresses and especially the public keys out of the block data, store them in some sort of index file, and replace them in the block data with indices. That could reduce the size of the stored blockchain pretty significantly.

That cache is called the UTXO (the chainstate folder you dislike so much).  Blocks are used to build the UTXO in a trustless manner.  They aren't used to process or validate new blocks and transactions.   The raw blocks are just used to bootstrap new nodes so they too can build the UTXO in a trustless manner.
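A minimal sketch of "blocks are only used to build the UTXO": replaying a block removes the outputs it spends and adds the ones it creates, after which the raw block is no longer needed for validation. The block and transaction shapes here are simplified assumptions:

```python
# Replaying blocks to build the UTXO set in a trustless manner:
# each block's transactions delete the outputs they consume and
# insert the outputs they create.

def apply_block(utxo, block):
    for tx in block["txs"]:
        for outpoint in tx["spends"]:
            del utxo[outpoint]                  # consumed outputs leave the set
        for i, value in enumerate(tx["creates"]):
            utxo[(tx["txid"], i)] = value       # new outputs join the set
    return utxo

utxo = {}
genesis = {"txs": [{"txid": "g", "spends": [], "creates": [50]}]}
block_1 = {"txs": [{"txid": "t", "spends": [("g", 0)], "creates": [30, 19]}]}
for blk in (genesis, block_1):
    apply_block(utxo, blk)
print(utxo)   # {('t', 0): 30, ('t', 1): 19}
```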

newbie
Activity: 12
Merit: 0
Hi All,
  I am new to bitcoin, new to bitcointalk and to cryptocurrency in general, so let me apologize in advance if this info appears elsewhere.

  I had the slow blockchain update problem (fast computer, high-bandwidth connection), and then I noticed something; maybe it's nothing, but I hope it helps:

  Try opening Bitcoin Core and go to Help / Debug Window / Network Traffic. When Core is first opened, it updates like crazy. After about fifteen minutes the network traffic drops to almost nothing. This is when my node seems "stuck". All I have to do is exit Core and restart it, and as soon as it has an active connection it starts updating like gangbusters again. I had to do this several times to get the whole blockchain, but it was worth it. Maybe it will work for some other folks; I hope so.

  I'm running v0.9.2.1-g354c0f3-beta (64-bit) on Win8.1. I also have an Ubuntu 14.04 box that runs pretty much 24/7, and I'd like to run a full-time node on it, partly because of the concerns voiced by Mike and others regarding the dwindling number of nodes and how that affects the decentralized nature of Bitcoin. Bitcoin seems like a damn good idea, and I plan to hang on for as long as the Powers That Be allow me to.

  Cheers, and Hope This Helps.

newbie
Activity: 8
Merit: 0
I think the problem here is not the size of the blockchain itself. The problem is how the blockchain is handled.

While using the client I found several problems. The first is that the client wastes bandwidth by downloading blocks it already has. I suspect the P2P protocol has no provision for a node to say "I already have block X, please send me blocks A, B, and Z instead". I can see this problem in the debug.log file, which is littered with "ERROR: ProcessBlock() : already have block X", where X runs consecutively from a certain number for several such messages, then jumps and runs consecutively again. I think this wastes not only bandwidth but also time, because the other nodes spend time sending these useless blocks instead of blocks that make progress. This happens pretty frequently at my node, which is connected behind a firewall.

The second problem is the "Reading the list of blocks" and "Validating blocks" actions, which take a lot of time. My question is why the client needs to "read the list of blocks" and "validate the blocks" every time it starts up. The "read the list of blocks" step does not take that much time, but "validate the blocks" is a 10-minute operation. Once the blocks are validated, why do they need to be revalidated at every program startup?

The third problem is that the client is "jumping over the data like a goat over a cemetery" while doing these two actions. This is MUCH SLOWER than reading the data in sequence. Why does it need to jump over the data so much? Maybe implement some caching?

The fourth problem is why the program splits the blockchain into 125 MB chunks. That is inefficient on Windows, where opening and closing a file is a pretty expensive operation. In my blockchain directory the first 10 GB are stored in 5 files (well, in fact 10, because I need to count the revXXXXX files) because they were downloaded by a 0.6.3 BETA client, but the remaining 9 GB is spread over 75 files. Is there a way to reconfigure these storage parameters? And once I change them, is there a way to tell the client to repackage the blockchain so it is stored according to my wishes? I prefer "few large files" over "many small files" on Windows because "many small files" is inefficient.

A similar problem exists with the "chainstate" data, which is only 0.5 GB but is littered across 229 files. Well, that might not be your fault, as I understand these files actually belong to some sort of general-purpose database which was recently replaced and actually is much faster now, but I believe this data could be handled more efficiently if it were in a single file (maybe by developing a special-purpose database?).

Also, regarding the size of the blockchain, there are two things that should be done. The first is that the coinbase transaction can be as big as the miner wants (and some coinbase transactions weigh a few tens of KB, storing various stuff; see "Hidden Surprises in the Bitcoin Blockchain" on Google, and especially this blog), so putting a limit on it, for example 128 or even 64 bytes, would be good (but the limit should not be too small, because otherwise we could run into a bunch of blocks with no solution). And the second thing would be, when storing the blockchain, to extract the addresses and especially the public keys out of the block data, store them in some sort of index file, and replace them in the block data with indices. That could reduce the size of the stored blockchain pretty significantly.
full member
Activity: 126
Merit: 100
The currency appreciation of the space is very large
zvs
legendary
Activity: 1680
Merit: 1000
https://web.archive.org/web/*/nogleg.com
Is blockchain the most secure wallet?

USB flash disk stuffed in your butt crack is probably the most secure wallet..

uSD might be more comfortable, but whatever floats your boat I guess.



If he wanted to store USD and not bitcoins, sure.
member
Activity: 69
Merit: 10
...
What we are going to have to do is require peers to either do something useful, like relay valid fee-paying transactions and valid blocks to us, or expend some kind of limited resource, like perform a proof-of-work or just pay directly via micropayment. That'll make widescale DoS attacks prohibitively expensive, but it also impacts SPV nodes too that don't contribute to the health of the network. Of course, obviously if such an attack happens this code will be written and deployed very quickly, so don't get any ideas...

'Something useful' could be, among other things, being verifiably situated in an underpopulated domain. The domain could be geographical, political, implementational (meaning it works in a particular way, such as implementing an underrepresented overlay messaging protocol), or whatever.

Indeed - that's what we try to achieve with the current system of trying to connect to nodes with IP addresses in a varied set of /16's. Varying implementations is an interesting idea too, although one that's harder to actually verify.

If you can come up with ways to do more than that we'd love to know, but be warned it's a really, really difficult problem.
Very good explanation.
member
Activity: 66
Merit: 10
what about electrum?
legendary
Activity: 4690
Merit: 1276
Is blockchain the most secure wallet?

USB flash disk stuffed in your butt crack is probably the most secure wallet..

uSD might be more comfortable, but whatever floats your boat I guess.

zvs
legendary
Activity: 1680
Merit: 1000
https://web.archive.org/web/*/nogleg.com
Is blockchain the most secure wallet?

USB flash disk stuffed in your butt crack is probably the most secure wallet..
legendary
Activity: 4690
Merit: 1276
what about electrum?

The two solutions (Electrum and SPV) strike me as more or less equivalent. In both situations the user is not a peer in the (supposedly) peer-2-peer network. The user relies on people who are peers, but the 'server' cannot cheat them - or at least not easily, and not at this point. The user base of either solution does not really add much of anything aside from body count.

Seems to me that Bitcoin is on the trajectory of moving toward being not much more or much less 'peer-2-peer' than the mainstream banking system where banks who have the resources peer with one another to create a system (and users have debit cards and such.)  Certainly that would be the case if the transaction rate is increased significantly (and Bitcoin achieves and maintains popularity.)  Some years ago I suggested that Bitcoin marketing starts to move away from the 'peer-2-peer' label as a sales pitch.  This suggestion was met with the expected level of animosity.  Eventually 'the powers that be' seem to have taken my advice to some extent however.  Not that Bitcoin won't still have advantages (chiefly, the potential absence of counter-party risk) but many of the earlier sales pitches will prove to be early hot air.

newbie
Activity: 45
Merit: 0
what about electrum?
legendary
Activity: 1134
Merit: 1002
Is blockchain the most secure wallet?

The bc.i wallet is better than exchange wallets, but not as good as an offline wallet or a paper wallet.
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
Is blockchain the most secure wallet?
No - your computer can be compromised and your blockchain account can be hacked. An offline paper wallet is the most secure.
newbie
Activity: 40
Merit: 0
Is blockchain the most secure wallet?
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
"SPV wallets will always be fast no matter how popular Bitcoin gets." why do you think so?Huh Huh come on, it will be like any other wallets , no difference
Bitcoin Core requires the user to download the full blockchain, which takes a long time. SPV wallets, however, do not need to download full blocks, only the compact block headers.
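A rough back-of-the-envelope for why SPV stays light: a Bitcoin block header is 80 bytes, so even the entire header chain is tiny next to the full blockchain. The block count below is an approximation for mid-2014:

```python
# Estimate the size of the full header chain an SPV wallet downloads.
HEADER_BYTES = 80  # serialized Bitcoin block header size

def header_chain_mb(height: int) -> float:
    """Total header-chain size in MB for a chain of `height` blocks."""
    return height * HEADER_BYTES / 1024 / 1024

# ~310,000 blocks: under 25 MB of headers, versus roughly 20 GB of
# full blocks at the time.
print(round(header_chain_mb(310_000), 1))   # 23.7
```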