Hello Factom community. Back during the factoid crowd sale, I heard about this tech and bought some factoids on a whim (good choice). Now, with the recent surge in price, I have been researching Factom and I have some questions:
1. First I need to get something straight. As far as I can tell, Factom is not for storing data. Is this correct? It seems to me that one could store data in entries, but it would only be stored on the Factom full nodes and it would supposedly be expensive, and so rather than storing data, users are intended to store hashes of data. Please correct/validate/enlighten me.
1.1 I couldn't find in the whitepaper how much space 1 entry credit purchases. How much space does it purchase? I'm trying to figure out how realistic it is to store data.
2. If Factom does not store data, but more likely data hashes, how does Factom secure information? In one of Factom's YouTube videos a use case for Factom is described as preventing the Sony hack; however, I fail to see how Factom could prevent such a hack. Sony's data is still on their servers and if a hacker breached those servers he would still be able to acquire sensitive information and do with it as he pleases.
3. Another question about data. If I lose the data that I have secured with Factom, neither I nor anyone else can prove whether or not it existed. Correct?
4. Related to the last question. As far as I understand it, anyone can submit entries into a chain, correct? If so, data loss becomes an issue again. I watched another YouTube video where Paul Snow is speaking at a conference, in Dubai I think. In this video he explains how Bank of America could have reviewed a Factom chain to ensure all the records were present and correct. I don't understand how Factom can prove whether all the records were there. If all the records were hashed into "Countrywide Mortgage Records" and after review there were hashes that were unaccounted for, how does Bank of America know whether these unaccounted-for hashes are important records or just spam?
5. It seems that if Factom becomes successful and widely used, it will be storing impressive amounts of data. In a world where millions or billions of entries are being submitted every year, I can't help but think that the amount of data a full node will have to store will simply be too much. This is especially true when one considers that Federated Servers are not being rewarded for their storage; they are only being rewarded for further entry submissions. This means that as time passes, the ratio of the amount of data being stored to the reward will get bigger and bigger. While normal data centers, say Amazon Cloud Services, receive constant compensation for the continued use of storage, Factom Federated Servers don't. Please help me better understand the economics of Factom and how this problem (if it even is a problem) will be overcome.
5.1 It also seems inevitable that data centers will be the only ones capable of being full nodes. Is this correct? Is it intended?
6. How come the number of Federated Servers and Audit Servers is fixed? It seems to me that the more there are, the better. So why not just have all full nodes be in the pool of possible Federated and Audit Servers, and then split the group in two according to the mentioned voting system?
7. After reading the whitepaper, it is still unclear to me whether directory blocks are made every minute or every 10 minutes. The whitepaper states they are made every minute, but from examining the protocol through the block explorer, I get the idea they are made every 10 minutes. Please elaborate.
If answers to these questions are posted somewhere else, forgive me; I couldn't find any. I also hope that this isn't overbearing. I just want to better understand how Factom works, as I really hope it succeeds. Thank you for your time.
EDIT: One more question
8. How is the decentralized Federated Server model better than a centralized company that does the same thing? Since everything is validated client-side, couldn't a central entity assemble hashes and stamp them into the blockchain?
1. Factom is geared more toward indexing data than storing it. In some cases they are one and the same, but the system makes no long-term promise about quickly serving your data back to you whenever you request it. Think of it like pruning in Bitcoin. Something the size of an Omni or Counterparty transaction would easily fit in an entry.
Now, if you store a local copy of the data you need, then the Factom data structures give you the ability to provide a proof to peers later on.
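Roughly, in code, that pattern looks like this (my own sketch; how the hash actually gets submitted as an entry depends on whichever client library you use):

```python
import hashlib

def document_hash(path: str) -> str:
    """SHA-256 of a local file; this digest is what would go into the Factom entry."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Publish time: submit document_hash("contract.pdf") as an entry in your chain.
# Proof time: re-hash your local copy and show it matches the digest sitting in the
# timestamped entry. Factom proves *when* that digest existed; you keep the data.
def prove_existence(path: str, recorded_hash: str) -> bool:
    return document_hash(path) == recorded_hash
```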
1.1 The current target price is about $0.001 per KiB, which equates to roughly $1000/GiB. When the system goes decentralized in the future, though, there's no knowing what price the Federated Servers will set.
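To put numbers on that target price (just arithmetic on the $0.001/KiB figure above, not an official fee schedule):

```python
PRICE_PER_KIB_USD = 0.001  # target price quoted above; Federated Servers may change it

def storage_cost_usd(num_bytes: int) -> float:
    """Cost to commit num_bytes of entry data at the current target price."""
    return (num_bytes / 1024) * PRICE_PER_KIB_USD

print(storage_cost_usd(1024))     # a 1 KiB entry  -> $0.001
print(storage_cost_usd(1024**3))  # 1 GiB of data  -> ~$1048.58, i.e. roughly $1000/GiB
```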
2. It has more to do with a holistic approach to access control. If an element of immutable time is required for access (for example, a bulk data request that has to sit on the public record for a while before it is honored), then it gives sysadmins time to respond to certain types of attacks.
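A toy illustration of that "immutable time" gate (purely my own sketch, not anything Factom or Sony has built; the 24-hour policy is made up):

```python
import time

AGE_REQUIRED_SECONDS = 24 * 3600  # hypothetical policy: a request must age 24h on the record

def access_allowed(request_recorded_at, revoked):
    """Grant bulk access only if the publicly recorded request has aged past the
    window (giving admins time to notice it) and nobody has revoked it meanwhile."""
    return (not revoked) and (time.time() - request_recorded_at) >= AGE_REQUIRED_SECONDS
```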
3. Two points.
a. We expect that communities will share their data subsets amongst each other freely. If property records are secured for some country, then it makes sense for various citizens in that country to download and seed that public data. People seed BitTorrent data for free for other members of their communities who want those particular datasets.
b. This is all public data, and someone will record it. Take centuries-old newspapers, for example. They can still be found, just not at the corner shop; you would need to go to a library, or maybe pay a fee. I don't think the data will ever go away as long as there is some chance that it may be valuable, but it may become harder to get.
4. Countrywide would sign the hashes before placing them into Factom; spam would be unsigned and could be ignored.
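A sketch of that filter using Ed25519 keys via PyNaCl (the key distribution and entry layout here are my own illustration, not Countrywide's actual scheme):

```python
from nacl.signing import SigningKey, VerifyKey
from nacl.exceptions import BadSignatureError

# Publisher side (e.g. Countrywide): sign each record hash before entering it.
signing_key = SigningKey.generate()
record_hash = b"...32-byte sha256 of the mortgage record..."  # placeholder
signature = signing_key.sign(record_hash).signature

# Auditor side (e.g. Bank of America): only count entries that verify against the
# publisher's known key; unsigned or badly signed entries are just spam.
publisher_key = signing_key.verify_key  # in practice, distributed out of band

def is_genuine(entry_hash: bytes, entry_signature: bytes, key: VerifyKey) -> bool:
    try:
        key.verify(entry_hash, entry_signature)
        return True
    except BadSignatureError:
        return False
```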
5. Paying for upload bandwidth in a distributed way is a really tricky unsolved problem in general. Both Storj and MaidSafe are exploring solutions to this. Every solution Paul and I explored could be gamed one way or another. If it is as successful as you are describing, then the inflation subsidy would more than pay for upload services. There is no mining in Factom, so inflation does not get dissipated in electricity bills.
I made some back-of-the-envelope numbers, which Paul will present in Miami, about how the data is segregated. I guess it isn't really an announcement, just an analysis, so I'll share it here. There are lots of wild guesses in it, but it gives you an idea of how it will scale.
This is data per year.
Assume:
50 million shipping containers, each with 10 entries per year
117 million mortgages with 12 entries per year
5 million mortgage originations/title transfers with 30 entries per year
300 million health records with 10 entries per year
1 stock exchange with 1500 companies with 10 million entries per day each
Assume 1 KB per entry.
The total data across all the layers per year comes to 26.4 terabytes.
The shared part (blue in the chart) is the directory blocks. This comes to 313 gigabytes.
The Entry Credit payment overhead (yellow) exists only to prevent spam in the present. Most people can discard it and just store the balance info (think UTXO set vs. full chain in Bitcoin). Notice that factoids do not even show up on the chart; those are just a way to get entry credits in the first place, and there would be something like a 10,000-100,000:1 ratio of EC commits to factoid transfers.
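As a toy illustration of "keep the balances, drop the history" (my own sketch, with made-up record shapes):

```python
def fold_ec_commits(commits):
    """Collapse a stream of (ec_address, delta) records into a compact balance map,
    so the raw commit history can be discarded -- the same spirit as keeping
    Bitcoin's UTXO set instead of the full chain."""
    balances = {}
    for ec_address, delta in commits:
        balances[ec_address] = balances.get(ec_address, 0) + delta
    return balances

# Two purchases and three entry commits net out to a single balance entry.
print(fold_ec_commits([("EC1...", 100), ("EC1...", -1), ("EC1...", -1),
                       ("EC1...", -1), ("EC1...", 50)]))
# {'EC1...': 147}
```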
Only a very small subset of the red Entry Block data is needed by any one application to prove its state; your application would only need a small sliver of the red.
If the annual user data is 19.5 terabytes, you would only need to sift through the roughly 300 gigabytes of blue directory block data to find the data for your application.
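To make that concrete, here is a hypothetical sketch of the lookup pattern; `get_directory_block` and `get_entry_block` stand in for whatever API your node exposes, and the block shape is made up:

```python
MY_CHAIN_ID = "your application's chain id"  # hypothetical

def entries_for_my_chain(get_directory_block, get_entry_block, heights):
    """Walk only the small shared directory layer and fetch the entry blocks
    that belong to MY_CHAIN_ID, ignoring everyone else's terabytes."""
    for height in heights:
        dblock = get_directory_block(height)           # tiny, shared by everyone
        for chain_id, eblock_keymr in dblock["entries"]:
            if chain_id == MY_CHAIN_ID:
                yield get_entry_block(eblock_keymr)    # only your sliver of the red
```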
5.1 Data in Factom is separated into chains, which are a clean way to segregate applications and the data you are storing. I imagine that in the future you will only be storing and uploading the data which is important to your application. There are also plans to segment the network, so that peers form subnets which only relay some of the data.
6. Because of Sybil attacks. We get advantages from having a predefined authority set. As that set gets bigger, it gets harder to manage. Also, there need to be few enough of them that voting one out actually matters; see Dunbar's number.
7. Yes, sorry, we tweaked the protocol between the whitepaper's publication and launch. Directory blocks are indeed every 10 minutes. We have a few other mistakes in the paper:
https://github.com/FactomProject/FactomDocs/blob/master/Factom_Whitepaper_Errata.md
8. That's where we started out, but then that one party could censor an individual while leaving the rest of the network working. Would all of Bitcoin switch over to a new network just because the miners were censoring one particular person? The next step is to allow free entry, but then, to prove the negative, an application would need to download data from all possible sources, making spam trivial.
More thoughts here:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-March/007721.html