
Topic: Block chain size/storage and slow downloads for new users - page 15. (Read 228658 times)

newbie
Activity: 38
Merit: 0
I use MultiBit; it makes life a lot easier. Bitcoin-Qt is a pain for adding new wallets or proving keys without going through every single block, which can take days if you're lucky enough that it doesn't crash. Hopefully MultiBit expands to other altcoins soon.
sr. member
Activity: 462
Merit: 250
It always takes me a lot of time, but if I close the wallet only when necessary, then the next time I download the block chain it does not take me so long. I have to admit, though, that the first time it took me 3 days of non-stop downloading...
full member
Activity: 210
Merit: 100
★☆★ 777Coin - The Exciting Bitco
The first time I downloaded the blockchain (the friend who introduced me to Bitcoin insisted I run a full node), it took my computer a little over a day and a half. The last time I had to re-index my blockchain file (a couple of weeks ago) it took over 3 days! Granted, I have a few things running on the machine and it isn't exactly the New Kid on the Block, but sheesh.
newbie
Activity: 12
Merit: 0
N00b question and a comment: now that I have the entire blockchain downloaded and verified on my Windows box, can I simply shoot it over to a Linux box (under the Linux Bitcoin Core software, of course) to get a jump on things, or do I have to download/verify it separately there? Are the blockchain data file formats the same? Thanks.

Maybe I might start liking the UTXO and stuff and disliking Windows instead (the client is running on Windows). I now remember that Windows does a pretty crappy job of managing the disk and the files on it. Once I find the time for this, I will try to run the thing on (modern) Linux and see what happens.

After the past few weeks running Windows 8.1 (NTFS) and Ubuntu 14.04 (ext4), both for the first time, doing some of the same disk intensive tasks, I can virtually guarantee that the Linux filesystem is enormously faster and more efficient. Good luck with your experiments!
newbie
Activity: 8
Merit: 0
Thank you for the clarifications.

(...) At some point client A starts sending you blocks X to Z (...)

... and at this point my client should tell client A to stop sending those blocks and send me a bunch of others instead.

It doesn't validate all of them. The check is done to ensure there has been no database corruption (possibly from the prior shutdown due to a power failure). It only checks a limited number of the most recent blocks. You can adjust from the config file how many blocks to check and how detailed a check to perform. You can even set this to zero blocks if you like.

How about marking the stored block chain as good after the client exits properly, and then at startup looking for this mark and doing the check only if the mark is not there?

There is a cache. It is called the UTXO. Blocks are only used to create and update the UTXO. All validation of new txns and blocks is done against the UTXO. Once a block is written to disk, it isn't used by your client other than for responding to block requests from other peers (or updating the UTXO in a reorg).

Good to know. But why is it then accessing the disk so much? I am using the latest client. True, the speed went waaaaaaaaaaay up from the 0.6.x I was using before, but it still touches the disk quite a bit.

Older blocks are not needed except to provide blocks to peers who are bootstrapping.   Saying you prefer large files over small files in all cases is a dubious request.

Well, maybe it seems dubious in your eyes, but you cannot be sure there isn't a valid reason on my side for the request. It would be nice if I could control that stuff.

Reinventing the wheel? The chainstate is stored in LevelDB, which is accepted as an incredibly lightweight and very fast key-value database. It is doubtful you would design an alternative custom database with similar functionality that outperforms LevelDB. And even if you could, would the development time be worth reinventing the wheel rather than improving the actual client?

Good point, thank you for pointing this out. Another question I would ask is: "how about improving LevelDB so others can benefit from it as well?"

The coinbase txns of all blocks represent <0.003% of the blockchain.  The size is already limited by general limits on the size of ScriptSigs for all transactions.  Seems a dubious use case.

Good to know, thank you. I have not done much in the way of statistics, but

That cache is called the UTXO (the chainstate folder you dislike so much).  Blocks are used to build the UTXO in a trustless manner.  They aren't used to process or validate new blocks and transactions.   The raw blocks are just used to bootstrap new nodes so they too can build the UTXO in a trustless manner.

Maybe I might start liking the UTXO and stuff and disliking Windows instead (the client is running on Windows). I now remember that Windows does a pretty crappy job of managing the disk and the files on it. Once I find the time for this, I will try to run the thing on (modern) Linux and see what happens.

Now I am starting to think that Windows is a pretty crappy system for a task like this. I know its disk cache sucks a lot. I also recently realized that Windows's filesystem sucks a lot compared with Linux filesystems (have you ever tried to defragment NTFS and make sure it really is defragmented? I remember trying that recently, concluding it is pretty much impossible and that NTFS sucks). Now I think the LevelDB guys and you Bitcoin guys did an admirable job of forcing stupid Windows to behave under such a load (20+ GB of growing data plus 0.5 GB of heavily updated data).
donator
Activity: 1218
Merit: 1079
Gerald Davis
I think the problem here is not the size of the blockchain itself. The problem is how the blockchain is handled.

Well, there is room for improvement, but the client doesn't use blocks the way you think it does, and that leads to a lot of incorrect conclusions.

Quote
While using the client I found several problems. The first is that the client wastes bandwidth by downloading blocks it already has. I suspect the P2P protocol has no provision for a node to say "I already have block X, please send me blocks A, B, and Z instead". I can see this problem in the debug.log file, which is littered with "ERROR: ProcessBlock() : already have block X", where X runs consecutively from a certain number for several such messages, then jumps and runs consecutively again.

There are messages for requesting specific blocks and ranges of blocks, and the client uses them. The issue may be a misbehaving client on the other end. Say you request blocks X to Z from client A and get no response, so you request blocks X to Z from client B. Client B responds and you process those blocks. At some point client A starts sending you blocks X to Z, which you now already have. If you find a specific bug where YOUR client is requesting blocks it already has, be sure to report it, but make sure it is actually a bug.
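The re-request scenario described above can be sketched as follows; the class, method names, and return strings are invented for illustration, not Bitcoin Core's actual code:

```python
# Hypothetical sketch: a node requests the same block range from a second
# peer after the first one stalls, then discards the late duplicates from
# the slow peer instead of reprocessing them.

class BlockDownloader:
    def __init__(self):
        self.have = set()        # block hashes already stored locally
        self.in_flight = {}      # block hash -> peer it was requested from

    def request_range(self, peer, hashes):
        """Ask `peer` for every block we don't already have
        (re-requesting reassigns any stalled in-flight blocks)."""
        wanted = [h for h in hashes if h not in self.have]
        for h in wanted:
            self.in_flight[h] = peer
        return wanted

    def on_block(self, peer, block_hash):
        """Process a received block, ignoring duplicates from slow peers."""
        if block_hash in self.have:
            return "ERROR: ProcessBlock() : already have block"
        self.have.add(block_hash)
        self.in_flight.pop(block_hash, None)
        return "accepted"

dl = BlockDownloader()
dl.request_range("A", ["X", "Y", "Z"])   # peer A stalls...
dl.request_range("B", ["X", "Y", "Z"])   # ...so we re-request from peer B
print(dl.on_block("B", "X"))             # accepted
print(dl.on_block("A", "X"))             # ERROR: ProcessBlock() : already have block
```

The late block from peer A produces exactly the kind of "already have block" log line the original post complains about, without any block being downloaded twice on purpose.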

Quote
The second problem is the "Reading the list of blocks" and "Validating blocks" actions, which take a lot of time. My question is why the client needs to "read the list of blocks" and "validate the blocks" every time it starts up. The "read the list of blocks" step does not take that much time, but "validate the blocks" is a 10-minute operation. Once the blocks are validated, why do they need to be revalidated at every program startup?

It doesn't validate all of them. The check is done to ensure there has been no database corruption (possibly from the prior shutdown due to a power failure). It only checks a limited number of the most recent blocks. You can adjust from the config file how many blocks to check and how detailed a check to perform. You can even set this to zero blocks if you like.
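The settings being referred to can go in bitcoin.conf (or be passed as -checkblocks / -checklevel on the command line). The values below are only illustrative, and defaults have varied between client versions:

```
# bitcoin.conf (illustrative values)
checkblocks=50   # verify roughly the 50 most recent blocks at startup
checklevel=1     # 0-4: how thorough each block check is (higher = slower startup)
```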

Quote
The third problem is that the client is "jumping over the data like a goat over a cemetery" while doing these two actions. This is MUCH SLOWER than reading the data in sequence. Why does it need to jump over the data so much? Maybe implement some caching?

There is a cache. It is called the UTXO. Blocks are only used to create and update the UTXO. All validation of new txns and blocks is done against the UTXO. Once a block is written to disk, it isn't used by your client other than for responding to block requests from other peers (or updating the UTXO in a reorg).
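The flow described here, validating a new transaction against the UTXO set alone without touching the block files, can be sketched in Python. The data shapes (outpoint tuples, value dicts) are simplified assumptions, not Bitcoin Core's actual structures:

```python
# Toy illustration: new transactions are validated against the UTXO set
# alone; the raw block files are never consulted.

utxo = {
    ("txid_a", 0): {"value": 50, "script": "pubkey_1"},
    ("txid_b", 1): {"value": 25, "script": "pubkey_2"},
}

def validate_tx(inputs, outputs):
    """Accept a transaction only if every input spends a known unspent
    output and the input value covers the output value."""
    if any(outpoint not in utxo for outpoint in inputs):
        return False                              # unknown or already-spent input
    in_value = sum(utxo[o]["value"] for o in inputs)
    out_value = sum(v for _, v in outputs)
    return out_value <= in_value                  # any remainder is the fee

print(validate_tx([("txid_a", 0)], [("addr", 40)]))   # True
print(validate_tx([("txid_z", 9)], [("addr", 1)]))    # False: unknown outpoint
```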

Quote
The fourth problem is why the program splits the blockchain into 125 MB chunks? That is inefficient in Windows where opening and closing a file is pretty expensive operation. In my blockchain directory the first 10 GB are stored in 5 files (well, in fact 10 because I need to count the revXXXXX files) because they were downloaded by a 0.6.3 BETA client but the remaining 9 GB is spread over 75 files. Is there a way to reconfigure these storage parameters? And once I change them, is there a way to tell the client to repackage the blockchain so it is stored according to my wishes? I prefer "few large files" over "many small files" on Windows because "many small files" is inefficient.

Older blocks are not needed except to provide blocks to peers who are bootstrapping.   Saying you prefer large files over small files in all cases is a dubious request.
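For illustration, the chunking the question complains about works roughly like an append-and-roll-over scheme: blocks append to numbered files and a new file starts once the current one would exceed a size cap. The cap below is an assumption for the sketch, not the client's exact constant:

```python
# Sketch of chunked block storage: blocks append to numbered files
# (blk00000.dat, blk00001.dat, ...) and a new file is started once the
# current one would exceed a size cap.

CAP = 128 * 1024 * 1024  # bytes per block file (illustrative)

def assign_files(block_sizes, cap=CAP):
    """Return the block-file index each block lands in."""
    file_no, used, placement = 0, 0, []
    for size in block_sizes:
        if used + size > cap:      # roll over to the next blkNNNNN.dat
            file_no += 1
            used = 0
        used += size
        placement.append(file_no)
    return placement

# Three ~60 MB blocks of data: the third batch no longer fits in file 0.
print(assign_files([60_000_000, 60_000_000, 60_000_000]))  # [0, 0, 1]
```

Appending to a capped file is cheap and crash-friendly; repackaging everything into one huge file would mean rewriting gigabytes whenever the layout changes.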

Quote
A similar problem exists with the "chainstate" data, which is only 0.5 GB but is littered across 229 files. Well, that might not be your fault, as I understand these files actually belong to some sort of general-purpose database which was recently replaced and actually is much faster now, but I believe this data could be handled more efficiently if it were in a single file (maybe by developing a special-purpose database?).

Reinventing the wheel? The chainstate is stored in LevelDB, which is accepted as an incredibly lightweight and very fast key-value database. It is doubtful you would design an alternative custom database with similar functionality that outperforms LevelDB. And even if you could, would the development time be worth reinventing the wheel rather than improving the actual client?
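To make the "key-value database" point concrete, here is the chainstate access pattern sketched with Python's stdlib `dbm` module standing in for LevelDB; the key/value layout is an invented simplification:

```python
# The chainstate is conceptually a key-value store: outpoint -> unspent
# output data. `dbm` merely stands in for LevelDB to show the access
# pattern; real chainstate keys/values are compact binary encodings.
import dbm
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "chainstate-demo")
with dbm.open(path, "c") as db:
    db[b"utxo:txid_a:0"] = b"value=50;script=pubkey_1"   # new output created
    db[b"utxo:txid_b:1"] = b"value=25;script=pubkey_2"
    del db[b"utxo:txid_b:1"]                             # spending = deleting the key
    print(b"utxo:txid_a:0" in db)    # True
    print(b"utxo:txid_b:1" in db)    # False
```

The many on-disk files the post complains about are how such log-structured key-value stores trade file count for very fast writes and compactions.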

Quote
Also, regarding the size of the blockchain, there are two things that should be done. The first is that the coinbase transaction can be as big as the miner wants (and some coinbase transactions weigh a few tens of KB, storing various stuff; see "Hidden Surprises in the Bitcoin Blockchain" on Google, and especially this blog), so putting a limit on it, for example 128 or even 64 bytes, would be good (but the limit should not be too small, because otherwise we could run into a bunch of blocks with no solution).
The coinbase txns of all blocks represent <0.003% of the blockchain.  The size is already limited by general limits on the size of ScriptSigs for all transactions.  Seems a dubious use case.

Quote
And the second thing would be, when storing the blockchain, to extract the addresses and especially the public keys out of the block data, store them in some sort of index file, and replace them in the block data with indices. That could reduce the size of the stored blockchain pretty significantly.

That cache is called the UTXO (the chainstate folder you dislike so much).  Blocks are used to build the UTXO in a trustless manner.  They aren't used to process or validate new blocks and transactions.   The raw blocks are just used to bootstrap new nodes so they too can build the UTXO in a trustless manner.
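A minimal sketch of "blocks are only used to build the UTXO": replaying a block removes the outputs it spends and adds the ones it creates, after which the raw block is no longer needed for validation. The block and transaction shapes here are simplified assumptions:

```python
# Replaying blocks to build the UTXO set in a trustless manner:
# each block's transactions delete the outputs they consume and
# insert the outputs they create.

def apply_block(utxo, block):
    for tx in block["txs"]:
        for outpoint in tx["spends"]:
            del utxo[outpoint]                  # consumed outputs leave the set
        for i, value in enumerate(tx["creates"]):
            utxo[(tx["txid"], i)] = value       # new outputs join the set
    return utxo

utxo = {}
genesis = {"txs": [{"txid": "g", "spends": [], "creates": [50]}]}
block_1 = {"txs": [{"txid": "t", "spends": [("g", 0)], "creates": [30, 19]}]}
for blk in (genesis, block_1):
    apply_block(utxo, blk)
print(utxo)   # {('t', 0): 30, ('t', 1): 19}
```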

newbie
Activity: 12
Merit: 0
Hi All,
  I am new to bitcoin, new to bitcointalk and to cryptocurrency in general, so let me apologize in advance if this info appears elsewhere.

  I had the slow blockchain update problem (fast computer, high-bandwidth connection), and then I noticed something; maybe it's nothing, but I hope it helps:

  Try opening Bitcoin Core and go to Help / Debug Window / Network Traffic. When Core is first opened, it updates like crazy. After about fifteen minutes the network traffic drops to almost nothing. This is when my node seems "stuck". All I have to do is exit Core and restart it, and as soon as it has an active connection it starts updating like gangbusters again. I had to do this several times to get the whole blockchain, but it was worth it. Maybe it will work for some other folks; I hope so.

  I'm running v0.9.2.1-g354c0f3-beta (64-bit) on Win8.1. I also have an Ubuntu 14.04 box that runs pretty much 24/7, and I'd like to run a full-time node on it, partly because of the concerns voiced by Mike and others regarding the dwindling number of nodes and how that affects the decentralized nature of Bitcoin. Bitcoin seems like a damn good idea, and I plan to hang on for as long as the Powers That Be allow me to.

  Cheers, and Hope This Helps.

newbie
Activity: 8
Merit: 0
I think the problem here is not the size of the blockchain itself. The problem is how the blockchain is handled.

While using the client I found several problems. The first is that the client wastes bandwidth by downloading blocks it already has. I suspect the P2P protocol has no provision for a node to say "I already have block X, please send me blocks A, B, and Z instead". I can see this problem in the debug.log file, which is littered with "ERROR: ProcessBlock() : already have block X", where X runs consecutively from a certain number for several such messages, then jumps and runs consecutively again. I think this wastes not only bandwidth but also time, because the other nodes spend time sending these useless blocks instead of blocks that make progress. This happens pretty frequently at my node, which is connected behind a firewall.

The second problem is the "Reading the list of blocks" and "Validating blocks" actions, which take a lot of time. My question is why the client needs to "read the list of blocks" and "validate the blocks" every time it starts up. The "read the list of blocks" step does not take that much time, but "validate the blocks" is a 10-minute operation. Once the blocks are validated, why do they need to be revalidated at every program startup?

The third problem is that the client is "jumping over the data like a goat over a cemetery" while doing these two actions. This is MUCH SLOWER than reading the data in sequence. Why does it need to jump over the data so much? Maybe implement some caching?

The fourth problem is why the program splits the blockchain into 125 MB chunks. That is inefficient on Windows, where opening and closing a file is a pretty expensive operation. In my blockchain directory the first 10 GB are stored in 5 files (well, in fact 10, because I need to count the revXXXXX files) because they were downloaded by a 0.6.3 BETA client, but the remaining 9 GB is spread over 75 files. Is there a way to reconfigure these storage parameters? And once I change them, is there a way to tell the client to repackage the blockchain so it is stored according to my wishes? I prefer "few large files" over "many small files" on Windows because "many small files" is inefficient.

A similar problem exists with the "chainstate" data, which is only 0.5 GB but is littered across 229 files. Well, that might not be your fault, as I understand these files actually belong to some sort of general-purpose database which was recently replaced and actually is much faster now, but I believe this data could be handled more efficiently if it were in a single file (maybe by developing a special-purpose database?).

Also, regarding the size of the blockchain, there are two things that should be done. The first is that the coinbase transaction can be as big as the miner wants (and some coinbase transactions weigh a few tens of KB, storing various stuff; see "Hidden Surprises in the Bitcoin Blockchain" on Google, and especially this blog), so putting a limit on it, for example 128 or even 64 bytes, would be good (but the limit should not be too small, because otherwise we could run into a bunch of blocks with no solution). And the second thing would be, when storing the blockchain, to extract the addresses and especially the public keys out of the block data, store them in some sort of index file, and replace them in the block data with indices. That could reduce the size of the stored blockchain pretty significantly.
full member
Activity: 126
Merit: 100
The currency appreciation of the space is very large
zvs
legendary
Activity: 1680
Merit: 1000
https://web.archive.org/web/*/nogleg.com
Is blockchain the most secure wallet?

USB flash disk stuffed in your butt crack is probably the most secure wallet..

uSD might be more comfortable, but whatever floats your boat I guess.



If he wanted to store USD and not bitcoins, sure.
member
Activity: 69
Merit: 10
...
What we are going to have to do is require peers to either do something useful, like relay valid fee-paying transactions and valid blocks to us, or expend some kind of limited resource, like perform a proof-of-work or just pay directly via micropayment. That'll make widescale DoS attacks prohibitively expensive, but it also impacts SPV nodes too that don't contribute to the health of the network. Of course, obviously if such an attack happens this code will be written and deployed very quickly, so don't get any ideas...

'Something useful' could be, among other things, being verifiably situated in an underpopulated domain. The domain could be geographical, political, implementational (meaning it works in a particular way, such as implementing an underrepresented overlay messaging protocol), or whatever.

Indeed - that's what we try to achieve with the current system of trying to connect to nodes with IP addresses in a varied set of /16's. Varying implementations is an interesting idea too, although one that's harder to actually verify.

If you can come up with ways to do more than that we'd love to know, but be warned it's a really, really difficult problem.
Very good explanation.
member
Activity: 66
Merit: 10
what about electrum?
legendary
Activity: 4690
Merit: 1276
Is blockchain the most secure wallet?

USB flash disk stuffed in your butt crack is probably the most secure wallet..

uSD might be more comfortable, but whatever floats your boat I guess.

zvs
legendary
Activity: 1680
Merit: 1000
https://web.archive.org/web/*/nogleg.com
Is blockchain the most secure wallet?

USB flash disk stuffed in your butt crack is probably the most secure wallet..
legendary
Activity: 4690
Merit: 1276
what about electrum?

The two solutions (Electrum and SPV) strike me as more or less equivalent. In both situations the user is not a peer in the (supposedly) peer-2-peer network. The user relies on people who are peers, but the 'server' cannot cheat them - or at least not easily, and not at this point. The user base of either solution does not really add much of anything aside from body count.

Seems to me that Bitcoin is on the trajectory of moving toward being not much more or much less 'peer-2-peer' than the mainstream banking system where banks who have the resources peer with one another to create a system (and users have debit cards and such.)  Certainly that would be the case if the transaction rate is increased significantly (and Bitcoin achieves and maintains popularity.)  Some years ago I suggested that Bitcoin marketing starts to move away from the 'peer-2-peer' label as a sales pitch.  This suggestion was met with the expected level of animosity.  Eventually 'the powers that be' seem to have taken my advice to some extent however.  Not that Bitcoin won't still have advantages (chiefly, the potential absence of counter-party risk) but many of the earlier sales pitches will prove to be early hot air.

newbie
Activity: 45
Merit: 0
what about electrum?
legendary
Activity: 1134
Merit: 1002
Is blockchain the most secure wallet?

The bc.i wallet is better than exchange wallets, but not as good as an offline wallet or a paper wallet.
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
Is blockchain the most secure wallet?
No - your computer can be compromised and your blockchain account can be hacked. An offline paper wallet is the most secure.
newbie
Activity: 40
Merit: 0
Is blockchain the most secure wallet?
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
"SPV wallets will always be fast no matter how popular Bitcoin gets." why do you think so?Huh Huh come on, it will be like any other wallets , no difference
Bitcoin Core requires the user to download the full blockchain, which takes a long time. SPV wallets, however, do not need to download full blocks, only the compact block headers.
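A rough back-of-the-envelope for why SPV stays light: a Bitcoin block header is 80 bytes, so even the entire header chain is tiny next to the full blockchain. The block count below is an approximation for mid-2014:

```python
# Estimate the size of the full header chain an SPV wallet downloads.
HEADER_BYTES = 80  # serialized Bitcoin block header size

def header_chain_mb(height: int) -> float:
    """Total header-chain size in MB for a chain of `height` blocks."""
    return height * HEADER_BYTES / 1024 / 1024

# ~310,000 blocks: under 25 MB of headers, versus roughly 20 GB of
# full blocks at the time.
print(round(header_chain_mb(310_000), 1))   # 23.7
```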