
Topic: Scalability tsunami coming on fast (Read 2342 times)

legendary
Activity: 1072
Merit: 1181
August 15, 2012, 04:27:15 PM
#16
When I downloaded the blockchain for the first time yesterday with bitcoin-qt, I noticed my hard disk making a lot of noise.  iostat showed that it was mostly writes.  strace showed that there were many writev(2)s occurring, presumably due to checkpointing / flushing / syncing the database.  Shouldn't it be possible to use a log structured on-disk format, or just to do transactions such that things only need to be written out every few minutes? 

Berkeley DB is log-structured. During normal operation, it writes the effects of database transactions into log files, and when these log files exceed a given size or age (see the -dblogsize command-line option), they are committed to the database.

The problem is that many parts of the current block chain index are overwritten many many times, so the total amount of log files to write is much more than the actual resulting database size.

LevelDB already helps here, by moving the log-flush code to a separate thread. "Ultraprune" (basically a rewrite of the block validation logic that uses a pruned set of transaction outputs as the main database, instead of an index of all transactions ever) will significantly reduce the amount of data to be written.
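
To illustrate the point about overwrites, here is a minimal Python sketch. It is not Berkeley DB, LevelDB or Bitcoin code; the key names, counts and the `log_records_written` helper are made up for illustration. It counts how many records reach an append-only log when the same few index entries are rewritten over and over, compares that with the number of entries in the final database, and shows how coalescing updates in memory before each flush shrinks the log.

```python
def log_records_written(updates, flush_every):
    """Count log records when updates are coalesced in memory and
    flushed to the log once every `flush_every` updates."""
    pending = {}      # in-memory buffer: key -> latest value
    written = 0       # total records appended to the log
    database = {}     # resulting database contents
    for i, (key, value) in enumerate(updates, start=1):
        pending[key] = value          # a later write replaces an earlier one
        if i % flush_every == 0:
            written += len(pending)
            database.update(pending)
            pending.clear()
    written += len(pending)           # final flush of whatever is left
    database.update(pending)
    return written, len(database)

# Hypothetical workload: 10 index entries, each overwritten 1000 times.
updates = [(f"key{i % 10}", i) for i in range(10_000)]

unbatched = log_records_written(updates, flush_every=1)
batched = log_records_written(updates, flush_every=1_000)
print(unbatched)  # (10000, 10)
print(batched)    # (100, 10)
```

In both runs the final database holds only 10 entries; the difference is entirely in how much log had to be written to get there.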
full member
Activity: 166
Merit: 101
August 15, 2012, 03:53:50 PM
#15
There are at least two big changes coming up that improve things dramatically:

1) LevelDB switches out BDB for a better database, which can (on some types of hardware) give significant speedups


When I downloaded the blockchain for the first time yesterday with bitcoin-qt, I noticed my hard disk making a lot of noise.  iostat showed that it was mostly writes.  strace showed that there were many writev(2)s occurring, presumably due to checkpointing / flushing / syncing the database.  Shouldn't it be possible to use a log structured on-disk format, or just to do transactions such that things only need to be written out every few minutes?  The idea being that if the process or system stops uncleanly, it might have to redo a bit of verification computation next time it's started, but everything would still be consistent and correct for as far as the last checkpoint got to.  Based on my observations of the IO profile of the client, this alone should change its performance from being unacceptable to acceptable (for me, with my spinning disk).

Assuming that the above won't happen for quite a while, I've also ordered an intel SSD, as, again based on my observations about the IO profile, this should greatly improve performance during initial "download" (it's not actually downloading that takes the time, nor verifying, but just the inefficient way that the results of the verification are written out).
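
The "write out only every so often, redo a little work after a crash" idea can be sketched in Python as follows. This is only an illustration under assumptions: the `CheckpointedState` class, the JSON file format and the block interval are hypothetical and have nothing to do with how bitcoin-qt actually stores its index. The state lives in memory and is written to a temporary file that is then renamed over the previous checkpoint, so the file on disk is always a complete snapshot.

```python
import json
import os
import tempfile

class CheckpointedState:
    """Keeps state in memory and writes it to disk only every
    `interval` blocks, replacing the old checkpoint by rename."""

    def __init__(self, path, interval):
        self.path = path
        self.interval = interval
        self.height = 0     # number of blocks processed so far
        self.state = {}     # hypothetical index: key -> value
        if os.path.exists(path):
            # Resume from the last complete checkpoint.
            with open(path) as f:
                saved = json.load(f)
            self.height = saved["height"]
            self.state = saved["state"]

    def process_block(self, changes):
        self.state.update(changes)
        self.height += 1
        if self.height % self.interval == 0:
            self.checkpoint()

    def checkpoint(self):
        directory = os.path.dirname(self.path) or "."
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump({"height": self.height, "state": self.state}, f)
            f.flush()
            os.fsync(f.fileno())      # make sure the data reached the disk
        os.replace(tmp, self.path)    # swap in the new checkpoint


# Usage: process 250 blocks, "crash", then restart from the checkpoint.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "checkpoint.json")
node = CheckpointedState(path, interval=100)
for h in range(250):
    node.process_block({f"tx{h}": h})
del node                              # simulate an unclean stop

restarted = CheckpointedState(path, interval=100)
print(restarted.height)               # 200: the last 50 blocks must be redone
```

The trade-off is the one described above: a longer interval means fewer disk writes but more verification to repeat after an unclean shutdown.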
legendary
Activity: 1904
Merit: 1002
August 12, 2012, 06:31:08 PM
#14
Amazing, I can't wait for a new bitcoin alternative that uses MAVEPAY. Maybe Bitcoin2 from the same devs? Maybe something entirely new.
It's very obvious bitcoin has had its best time, now it's time to move on to something that may actually work.

LOL... I'll buy all your bitcoins for $5 ;)

Best deal ever since they are obviously old garbage.
legendary
Activity: 1937
Merit: 1001
August 12, 2012, 06:22:36 PM
#13
Amazing, I can't wait for a new bitcoin alternative that uses MAVEPAY. Maybe Bitcoin2 from the same devs? Maybe something entirely new.
It's very obvious bitcoin has had its best time, now it's time to move on to something that may actually work.
sr. member
Activity: 336
Merit: 250
August 11, 2012, 06:12:49 PM
#12
Good thread, good thread.

First I've heard of MAVEPAY.
legendary
Activity: 947
Merit: 1042
Hamster ate my bitcoin
August 11, 2012, 06:04:35 PM
#11
There seems to be sort of a deer-in-the-headlights thing going on with the scalability issue.

I just installed a new client last night and it ran for 8 hours, and that is on a high-end machine with a high-bandwidth FIOS connection. The btc network is rapidly reaching the point where new installs will be impractical and this could be the case within the year or even within months.

It seems that the majority of suggestions (and that is what they are at this point, suggestions) to fix this is to use a trusted mechanism which defeats the whole purpose of P2P. Hello, folks, this is not DigiCash and we are not going to bring DigiCash back. I think we should be pretty clear about this: NO TRUSTED NODES. If we install trusted nodes, that will be exactly what the Schumers will go after to shut the network down.

I saw some forum post claiming the bible considered this problem and described how to address it, but I just re-read the bible again and there is no such solution, unless you are referring to the idea of transaction pruning to headers. Transaction pruning will do nothing. It is an arithmetic order solution to an exponentially increasing problem. If you were to implement some elaborate pruning scheme, you would not even notice the difference.

To make this work we need to have a way to P2P verify millions of transactions per day. We need to go beyond the existing technology and come up with something fundamentally new to solve this problem.


Thank you Blinken, my thoughts exactly.

I think we should have a section dedicated to discussing scalability.

The trouble is bitcoin is becoming a cult. It is a sin to question the sacred protocol. If you do you will be derided as either a fool or as an agent of fear, uncertainty and doubt.
legendary
Activity: 1526
Merit: 1134
August 11, 2012, 01:27:23 PM
#10
There are at least two big changes coming up that improve things dramatically:

1) LevelDB switches out BDB for a better database, which can (on some types of hardware) give significant speedups

2) Pieter has implemented something he calls "ultraprune". Despite the name it does not prune (it lays the groundwork for that). It changes the database formats significantly, so the working set can fit entirely in RAM. This makes a massive difference, dropping block chain download time from hours to more like 20 minutes or less.

After that there are still more scalability improvements that can be made. But yes, long term, end users will run SPV clients like MultiBit and the problem will go away entirely.
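
For readers wondering what "a set of unspent outputs as the working set" means in practice, here is a rough Python sketch. It is not Pieter's code: the transaction format (an id, a list of `(txid, index)` inputs and a list of output amounts) and the `apply_block` helper are simplified assumptions, and real validation also checks scripts, signatures and amounts. The point is only that validating a block needs the currently unspent outputs, not an index of every transaction ever made.

```python
def apply_block(utxo, block):
    """Apply a block to the unspent-output set.
    Checks only that every input refers to an unspent output.
    Works on a copy, so a rejected block leaves `utxo` unchanged."""
    new_utxo = dict(utxo)
    for tx in block:
        for outpoint in tx["inputs"]:
            if outpoint not in new_utxo:
                raise ValueError(f"missing or already spent: {outpoint}")
            del new_utxo[outpoint]          # spent outputs leave the set
        for index, amount in enumerate(tx["outputs"]):
            new_utxo[(tx["id"], index)] = amount
    return new_utxo

utxo = {}
block1 = [{"id": "a", "inputs": [], "outputs": [50]}]            # coinbase-like
block2 = [{"id": "b", "inputs": [("a", 0)], "outputs": [30, 20]}]
utxo = apply_block(utxo, block1)
utxo = apply_block(utxo, block2)
print(utxo)   # {('b', 0): 30, ('b', 1): 20}
```

After the second block, transaction "a" no longer appears anywhere in the working set, which is why the set stays small enough to keep in RAM.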
legendary
Activity: 1221
Merit: 1025
e-ducat.fr
August 11, 2012, 08:40:40 AM
#9
I think we should be pretty clear about this: NO TRUSTED NODES. If we install trusted nodes, that will be exactly what the Schumers will go after to shut the network down.

It seems you may be falling into a fallacy, as many other threads do: bitcoin is not about removing the need for trusted nodes (in fact, anyone downloading the block chain and contributing a GPU is a trusted node, earning an amount of trust equal to its GPU power over the total hashing power), it's about decentralized trust.

There will be individual nodes and super nodes (each serving many clients).
Super nodes are not going to make it any easier to attack the network, quite the opposite: they will be backed by a profitable albeit competitive (much more competitive than traditional banks) transaction processing business, affording them adequate technical and legal means.
Super nodes will present a very elusive target to the banking lobby, since customers can easily switch in the event of a shutdown (on what legal grounds, by the way? Can lawmakers forbid sending a signed message over the internet?)
legendary
Activity: 1106
Merit: 1004
August 11, 2012, 07:19:58 AM
#8
Yes you need to have a node to "help" you, but you do not need to trust it (they do not know your private keys).

Well, you do need to trust the server not to omit things from you, and most important, not to collect your financial data (log all your transactions, collecting data about how you use your money and how much you have in total).
I guess that's precisely what the OP wants to avoid (having to trust someone).

You need no more trust in this node than you do with your ISP when you use the Satoshi client.

You can use Tor or other proxy to avoid having your financial data linked to your IP when in P2P mode. Tor's not enough when in client-server mode.
hero member
Activity: 798
Merit: 1000
August 10, 2012, 04:21:14 PM
#7
Using an account ledger with account numbers in lieu of public keys or pubkey hashes, you can get transactions down to around 100 bytes each which is scalable with visa-like transaction volumes on consumer bandwidth. It will also make life much easier if there is a move to post-QC cryptography with huge public keys and signatures. The sigs will still be tough, but at least the pubkeys are taken care of. Ed25519 also works a lot faster than ECDSA in batch verification.
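
As a back-of-the-envelope check on the 100-bytes-per-transaction figure, the raw bandwidth can be worked out in a few lines of Python. The transaction rates below are illustrative assumptions, not measurements, and `bandwidth_mbit_per_s` is a hypothetical helper; the result also ignores protocol overhead and the fact that a p2p node relays transactions to several peers.

```python
def bandwidth_mbit_per_s(tx_per_second, bytes_per_tx=100):
    """Raw transaction bandwidth in megabits per second (10**6 bits)."""
    return tx_per_second * bytes_per_tx * 8 / 1_000_000

# Example rates are assumptions chosen for illustration.
for rate in (1_000, 10_000):
    print(rate, "tx/s ->", bandwidth_mbit_per_s(rate), "Mbit/s")
# 1000 tx/s -> 0.8 Mbit/s
# 10000 tx/s -> 8.0 Mbit/s
```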
sr. member
Activity: 438
Merit: 291
August 10, 2012, 03:41:18 PM
#6

Try out bitcoin spinner:
https://en.bitcoin.it/wiki/BitcoinSpinner

Yes you need to have a node to "help" you, but you do not need to trust it (they do not know your private keys). You need no more trust in this node than you do with your ISP when you use the Satoshi client.

And obviously you can use blockexplorer.com to check your address balances if you are paranoid.

I am 90% convinced that this architecture will be the future for the clients most "users" use, and that only merchants or other businesses that need more control will use the original client.





hero member
Activity: 555
Merit: 654
August 10, 2012, 12:47:13 PM
#5

To make this work we need to have a way to P2P verify millions of transactions per day. We need to go beyond the existing technology and come up with something fundamentally new to solve this problem.

Fundamentally new p2p systems exist: check MAVEPAY. You can get 10K transactions/second with enough bandwidth. It hasn't been implemented, and unfortunately it is completely incompatible with Bitcoin.

I really don't think that 8 hours is too much waiting right now. Probably in 1-2 years the problem must be addressed somehow.

Best regards,
 Sergio.
administrator
Activity: 5222
Merit: 13032
August 10, 2012, 11:38:03 AM
#4
This has always been a problem, and the developers continuously improve it. I first heard about Bitcoin over 2 years ago from a post on a forum complaining about how long it was taking for Bitcoin to download all of the blocks. It took 2-3 days at that time to download all of the blocks.
legendary
Activity: 1106
Merit: 1004
August 10, 2012, 11:07:11 AM
#3
It seems that the majority of suggestions (and that is what they are at this point, suggestions) to fix this is to use a trusted mechanism which defeats the whole purpose of P2P. Hello, folks, this is not DigiCash and we are not going to bring DigiCash back. I think we should be pretty clear about this: NO TRUSTED NODES. If we install trusted nodes, that will be exactly what the Schumers will go after to shut the network down.

You don't need to trust anyone to run a lightweight p2p node.
A lightweight node is one that only keeps block headers.

Currently, the only implementation of a light p2p node I'm aware of is BitcoinJ. And this implementation still downloads the full content of every block IIRC, to check if you have a tx in it. It only stores the headers. This will be a problem once downloading blocks requires more bandwidth than what an average user can have, but we're still far from that (the problem with the main client is inserting the data in the database, not downloading it). And people are already working on implementing Bloom filters that would allow lightweight clients to only query the transaction set that interests them, and still be able to be sure it's valid due to the Merkle root in the block headers.
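
To make the Merkle root part concrete, here is a small Python sketch of how a client holding only a header could check that a transaction belongs to a block. It is an illustration, not BitcoinJ's code: the function names are made up, the leaves are placeholder hashes rather than real transaction ids, and byte-order details of the real protocol are ignored. What it does follow is the Bitcoin tree construction: double SHA-256 over concatenated pairs, duplicating the last hash when a level has an odd count. It needs at least one leaf.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash used in Bitcoin's Merkle tree."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root_and_branch(leaves, index):
    """Return (root, branch), where branch is the list of sibling
    hashes needed to recompute the root from leaves[index]."""
    level = list(leaves)
    branch = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])       # odd count: duplicate the last hash
        branch.append(level[index ^ 1])   # sibling of the current node
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], branch

def verify_branch(leaf, index, branch, root):
    """Light-client side: recompute the root from a leaf and its branch."""
    h = leaf
    for sibling in branch:
        # An odd index means the current node is the right-hand child.
        h = dsha256(sibling + h) if index & 1 else dsha256(h + sibling)
        index //= 2
    return h == root

# Five placeholder "transaction hashes" standing in for real tx ids.
txids = [dsha256(bytes([i])) for i in range(5)]
root, branch = merkle_root_and_branch(txids, 4)
print(verify_branch(txids[4], 4, branch, root))   # True
print(verify_branch(txids[3], 4, branch, root))   # False
```

The branch grows with the logarithm of the number of transactions in the block (3 hashes here for 5 transactions), which is why a filtered client would need far less bandwidth than one that downloads whole blocks.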

Those who should worry about scalability are those who must use a full node: pool operators, solo miners and miners in P2Pool. They'll be paid for it, though.

I saw some forum post claiming the bible considered this problem and described how to address it, but I just re-read the bible again and there is no such solution, unless you are referring to the idea of transaction pruning to headers. Transaction pruning will do nothing. It is an arithmetic order solution to an exponentially increasing problem. If you were to implement some elaborate pruning scheme, you would not even notice the difference.

Pruning might help those who need a full client, but yeah, I agree with your feeling, it's probably not that relevant. At least I believe their main issue will be bandwidth not storage.
legendary
Activity: 1008
Merit: 1023
Democracy is the original 51% attack
August 10, 2012, 10:37:10 AM
#2
It actually used to take far longer to do the first install of the client. The first few times I did it last year, it took 1-2 days of the computer staying on (also with super fast computer/broadband).

The code has already gotten significantly better, and more improvements are in the works. But yes, scalability needs to be continually on the dev priority list and I think it is.
sr. member
Activity: 338
Merit: 253
August 10, 2012, 10:32:50 AM
#1
There seems to be sort of a deer-in-the-headlights thing going on with the scalability issue.

I just installed a new client last night and it ran for 8 hours, and that is on a high-end machine with a high-bandwidth FIOS connection. The btc network is rapidly reaching the point where new installs will be impractical and this could be the case within the year or even within months.

It seems that the majority of suggestions (and that is what they are at this point, suggestions) to fix this is to use a trusted mechanism which defeats the whole purpose of P2P. Hello, folks, this is not DigiCash and we are not going to bring DigiCash back. I think we should be pretty clear about this: NO TRUSTED NODES. If we install trusted nodes, that will be exactly what the Schumers will go after to shut the network down.

I saw some forum post claiming the bible considered this problem and described how to address it, but I just re-read the bible again and there is no such solution, unless you are referring to the idea of transaction pruning to headers. Transaction pruning will do nothing. It is an arithmetic order solution to an exponentially increasing problem. If you were to implement some elaborate pruning scheme, you would not even notice the difference.

To make this work we need to have a way to P2P verify millions of transactions per day. We need to go beyond the existing technology and come up with something fundamentally new to solve this problem.

