The amount that we can scale is limited by the amount of data we can put on the blockchain. Using a lot of payment-channel style optimizations, we can get somewhere between 20k and 1 million users to fit on Sia given the current technology stack. Once we're over 1 million users, we'll have to find something else.
Can you elaborate on this?
Scalability on blockchains is always a tricky issue. In our case, we're bottlenecked by storage proofs. A storage proof takes around 1kb. If we assume that 90% of Sia is going to be storage proofs, that's ~900 storage proofs per block. 1008 blocks in a week means ~900,000 storage proofs per week.
900,000 storage proofs per week does not sound like a lot, especially when files are being uploaded to 200 places at a time. If you assume each file lasts for 2 years though (100 weeks), we're all the way up to 450,000 files. The beta only puts files in 12 places, and they each last 1 week, so that's a total of 75,000 concurrent files for the beta.
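The arithmetic so far can be reproduced with a short sketch. The 1 MB block size and the "one proof per host per file lifetime" rule are assumptions inferred from the figures above (900 proofs per block at 1kb each and 90% of the block), not settings read from Sia itself, and the function names are made up for illustration.

```python
# Back-of-the-envelope throughput, using the figures quoted above.
BLOCK_SIZE_BYTES = 1_000_000   # assumption: ~1 MB blocks, implied by "~900 proofs per block"
PROOF_SIZE_BYTES = 1_000       # "a storage proof takes around 1kb"
PROOF_SHARE_PERCENT = 90       # share of block space spent on storage proofs
BLOCKS_PER_WEEK = 1008         # figure quoted in the post


def proofs_per_week(block_size=BLOCK_SIZE_BYTES, proof_size=PROOF_SIZE_BYTES,
                    share_percent=PROOF_SHARE_PERCENT, blocks=BLOCKS_PER_WEEK):
    """Number of storage proofs the chain can absorb in one week."""
    proofs_per_block = (block_size * share_percent // 100) // proof_size
    return proofs_per_block * blocks


def concurrent_files(weekly_proofs, hosts_per_file, file_lifetime_weeks):
    """Files that can be live at once, assuming one proof per host per file lifetime."""
    return weekly_proofs * file_lifetime_weeks // hosts_per_file


if __name__ == "__main__":
    print(proofs_per_week())                   # exact product, before rounding to ~900,000
    print(concurrent_files(900_000, 200, 100)) # 200 hosts, 2-year files
    print(concurrent_files(900_000, 12, 1))    # beta: 12 hosts, 1-week files
```

The exact weekly figure comes out slightly above 900,000; the post rounds down, which is why the 450,000 and 75,000 numbers use the rounded value.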
We can improve on this number by using payment channels. Sia supports payment channels through file contracts, which means you can store multiple files on the same file contract and keep adding files as you go. Adding files does not require any new transactions on the blockchain. When you first join the network, you'll have to make a bunch of setup transactions with hosts. Assuming each user wants to connect to 200 hosts, that's 200 contracts that will need to be made. We're also going to assume that the average host relationship lasts 6 months. (Instead of having them all expire at the same time though, we'll stagger it, so you'll renew contracts with 4 hosts per week.) It comes down to 200 setup transactions, followed by 4 per week for the rest of the time you are using Sia. A setup transaction is really small, but closing the contract will result in a storage proof. Since you're storing multiple files, the storage proof is going to be bigger. We'll call this whole thing 1.5kb per host per renewal, or about 6kb per week per user.
6kb per week per user, and ~900mb per week of available space on the blockchain. That comes to a total of 150,000 users, each triggering 4 storage proofs on the blockchain per week. Some users might want more storage proofs per week, some might be comfortable with less. Also, the % of storage proofs might be higher or lower than 90%. So we get a full range of 20,000 to 1,000,000 users on the blockchain. If we increase the block size to 20mb, and decrease the storage proof rate per user to 1 per month, we get all the way up to 80,000,000 users.
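As a rough sketch of the user-capacity estimate, the calculation is just the weekly proof space divided by each user's weekly footprint. The function and parameter names are hypothetical; the defaults are the figures from the text (900mb per week, 1.5kb per renewal, 4 renewals per week).

```python
def users_supported(weekly_proof_bytes=900_000_000,
                    bytes_per_renewal=1_500,
                    renewals_per_week=4):
    """Users that fit if each one consumes bytes_per_renewal * renewals_per_week weekly."""
    weekly_bytes_per_user = bytes_per_renewal * renewals_per_week
    return weekly_proof_bytes // weekly_bytes_per_user


if __name__ == "__main__":
    print(users_supported())                      # defaults from the post
    print(users_supported(renewals_per_week=1))   # a user who renews less often
    print(users_supported(renewals_per_week=16))  # a user who wants more frequent proofs
```

The renewal rate is left as an input because the result is very sensitive to it. The 4 per week is taken from the text as given; spreading 200 contracts evenly over 26 weeks would be closer to 8 per week, which would roughly halve the estimate.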
The security of this system depends on a few things. What percentage of hosts stay on the network for 6 months or longer? If it's a high percentage, then it's more secure to do it this way. If most hosts aren't on the network for at least 6 months, then you have to either reduce the number of hosts you use (which reduces security), or use more frequent storage proofs (which decreases the number of users that fit on the network). Hard drives typically last around 18 months though, so it seems reasonable to assume that most hosts will be interested in sticking around for at least 6 months. (Housing leases last 12 months, etc.)
Using 6 month long contracts means that the host is only checked on every 6 months. Hosts also don't get paid until the end of the 6 months. But the long term incentives remain exactly the same as if the time period were shorter. A host who is not storing all of the data will have an amortized profit that is lower than if they were storing all of the data (the penalty from getting caught * likelihood of getting caught > money saved on storage by cheating). From this perspective, things are fully secure.
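To make the incentive inequality concrete, here is a toy model. It is only a sketch under loud assumptions that are not from the post: the proof challenges one uniformly random piece of the data, so a host that dropped a fraction of the data is caught with that same probability; a caught host loses the whole contract payment; and storage cost scales linearly with the data kept. All names and numbers are illustrative.

```python
def expected_profit(payment, full_storage_cost, fraction_dropped):
    """Expected host profit for one contract in the toy model.

    Assumptions (not from the post): the proof checks one uniformly random
    piece of the data, and a failed proof forfeits the entire payment.
    """
    p_caught = fraction_dropped                        # chance the challenged piece is missing
    expected_revenue = (1 - p_caught) * payment        # paid only if the proof succeeds
    storage_cost = full_storage_cost * (1 - fraction_dropped)
    return expected_revenue - storage_cost


if __name__ == "__main__":
    honest = expected_profit(payment=100, full_storage_cost=60, fraction_dropped=0.0)
    for dropped in (0.1, 0.5, 0.9):
        cheat = expected_profit(100, 60, dropped)
        print(dropped, cheat, cheat < honest)
```

In this model cheating loses money whenever the payment exceeds the cost of storing the data, and the contract length does not appear anywhere in the comparison, which is the point made above.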
Does that mean hosts have to wait 6 months before getting paid? It's a bit nicer than that. If people are only making 6 month contracts, then a host will have to wait 6 months before getting their first storage paycheck. Hosts however are going to have many clients, and these clients are going to have expirations that are spread out. Once a host is getting paid, they will be getting paid much more often than every 6 months. Additionally, downloads will cost siacoins. Even if the storage contract has not expired yet, the hosts will be making money by serving downloads. So again it does not seem to be a huge problem.
It's now important to mention that all of these numbers are chosen by the users and by the market. Users can choose, if they desire, to only make contracts with hosts that last 1 week. They will need to spend a lot more on transaction fees though, and this will get expensive once there are many users and the blockchain is always full. And if users feel comfortable only having 50 hosts, and having 2 year contracts with each host, then they will pay substantially less on transaction fees.
But I feel confident that 6 month contracts and 200 hosts is around the optimal number. It'll be easier to know once there are more people on the network. As always, we will start conservatively (when there are fewer users, using more storage proofs is okay because they will still be cheap), and we will switch to more aggressive numbers as it becomes clear that more aggressive numbers are also secure.
=====
To summarize, the way we're setting up the defaults means that the network will have room for around 150,000 users. The settings required to make this happen should be quite secure. By reducing the security settings (which may still be extremely secure), we can get up to around 1,000,000 users. By increasing the block size, we can increase the number of users. In the absolute worst case scenario, we should still be able to support 20,000 users under our current model.
There's another nice feature though. Each user can store an effectively unlimited amount of data. You can fit an arbitrarily large amount of data under each contract, at the cost of 32 bytes of storage proof size every time you double the amount of data. Proofs for 1GB are 960 bytes. Proofs for 1TB are 1280 bytes. Proofs for 1EB are 1920 bytes.
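The three proof sizes quoted here fit a simple rule: 32 bytes times the base-2 logarithm of the data size in bytes. The sketch below is a fit to those numbers, assuming binary units (1GB = 2^30 bytes); it is not Sia's actual proof encoding, and the function name is made up.

```python
HASH_SIZE_BYTES = 32  # one hash added per doubling of the data, per the text


def proof_size_bytes(data_bytes):
    """Approximate proof size: 32 bytes per doubling, counted from a single byte."""
    if data_bytes < 1:
        raise ValueError("data_bytes must be at least 1")
    doublings = (data_bytes - 1).bit_length()  # ceil(log2(data_bytes)) using integers only
    return HASH_SIZE_BYTES * doublings


if __name__ == "__main__":
    for label, size in (("1GB", 2**30), ("1TB", 2**40), ("1EB", 2**60)):
        print(label, proof_size_bytes(size))
```

Because the growth is logarithmic, going from 1GB to 1EB multiplies the data by about a billion while only doubling the proof.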
If users are willing to share storage contracts, scaling increases substantially, because storage contracts can have infinite size. This is one way we might achieve scalability beyond our initial 150k users. Large users are almost exactly the same weight on the blockchain as small users.
We might also be able to do cryptographic layering. For example, Dropbox would only count as 1 user. All of Dropbox! There might be a way to create large auditable services that use the blockchain, and then thousands or millions of people can use these auditable services instead of using the blockchain directly. We're not sure yet how the security would work here, but there's a good chance that this is an avenue for further scalability. (If we can figure that out, scalability will probably go up to several hundred million users.) Solutions such as sidechains, treechains, and other innovations in the space also provide options for scalability.
The scaling I'm telling you about today (150k-ish users) is only the scaling that we already know how to achieve. We are confident that we can use additional tools to push the boundaries even further. One example is storage proof 'skipping'. When a file contract expires, a host can submit a storage proof that, instead of being a full storage proof, contains a signature from the user that says "I tested the host off of the blockchain, and the host passed the test." This reduces the size of the storage proof from 1kb to about 50 bytes (but it does require the user to be online when the contract expires).
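A quick sketch of what skipping could buy. It assumes some fraction of expiring contracts use the ~50 byte signed form while the rest fall back to a full ~1kb proof because the user was offline. The fraction is a made-up parameter for illustration, not a measured number.

```python
FULL_PROOF_BYTES = 1_000    # "1kb" full storage proof
SKIPPED_PROOF_BYTES = 50    # "about 50 bytes" user-signed replacement


def average_proof_bytes(skip_fraction):
    """Average on-chain bytes per expiring contract when skip_fraction of them are skipped."""
    if not 0.0 <= skip_fraction <= 1.0:
        raise ValueError("skip_fraction must be between 0 and 1")
    return (1 - skip_fraction) * FULL_PROOF_BYTES + skip_fraction * SKIPPED_PROOF_BYTES


def capacity_multiplier(skip_fraction):
    """How many times more proofs fit in the same block space."""
    return FULL_PROOF_BYTES / average_proof_bytes(skip_fraction)


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        print(fraction, average_proof_bytes(fraction), capacity_multiplier(fraction))
```

Even in the best case the gain is capped at the ratio of the two proof sizes, and the benefit falls off quickly if many users are offline at expiry.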
=====
Hopefully this explains where my size numbers come from. 20k - 1m users is the number I'm currently using, but by the time we get there, we should have more options for increasing the scalability of the blockchain.
I'm happy to answer any further questions.