Topic: Proposal: Datacoin - A decentralized data storage cryptocurrency (Read 1188 times)

sr. member
Activity: 490
Merit: 250
Hi, I would like to work with you. I think I have some theoretical cryptography skills to add. I would like to ask how, or if, your storage will be mathematically proved. As described it seems very exploitable, which wouldn't be good if the concept came to be worth lots of money. I think you need a form of proof of storage, but I'm not sure you have it yet. That's OK, I already did that work, so I can help you out there.

Also, the current "datacoin" is not like your proposal because it is very limited.  Don't be discouraged by that at all.  It is simply a way to prove a file existed at some time, not a way to have distributed hosting of a file.  

this was me

vintagetrex, inventor of the decentralized intellectual property system, a free market solution to compensating inventors
newbie
Activity: 4
Merit: 0
Hi, I would like to work with you. I think I have some theoretical cryptography skills to add. I would like to ask how, or if, your storage will be mathematically proved. As described it seems very exploitable, which wouldn't be good if the concept came to be worth lots of money. I think you need a form of proof of storage, but I'm not sure you have it yet. That's OK, I already did that work, so I can help you out there.

Also, the current "datacoin" is not like your proposal because it is very limited.  Don't be discouraged by that at all.  It is simply a way to prove a file existed at some time, not a way to have distributed hosting of a file.  
legendary
Activity: 1008
Merit: 1000
I mined on two PCs. Is there an easier way to consolidate all the balances into one account, instead of importing the private key for each block individually?
sr. member
Activity: 249
Merit: 250
Lots of potential names: stashcoin, storecoin, gigacoin, diskcoin, drivecoin, infocoin
member
Activity: 82
Merit: 13
Good idea, but if it is used like Dropbox, its size could grow to hundreds or even thousands of TB :)

It should scale extremely well, even to petabytes of data (as long as the miners have enough storage to meet the demands of the users).

Yeah, I understand. But I just want to clarify that the Datacoin developer (I mean the Datacoin that is running now) plans to implement some kind of 'individual chains', and other features could also be implemented in it, so 'proof of stock', as you called it, is also planned.

Right, I will have to change the name to make it less confusing.

Okay, so we'll be competitors :)

I think it's inevitable that people will make things in this space, so we will probably see even more similar technologies from other people.
sr. member
Activity: 350
Merit: 250
DTC unofficial team
Okay, so we'll be competitors :)
member
Activity: 82
Merit: 13
Right, I will have to change the name to make it less confusing.
sr. member
Activity: 350
Merit: 250
DTC unofficial team
Yeah, I understand. But I just want to clarify that the Datacoin developer (I mean the Datacoin that is running now) plans to implement some kind of 'individual chains', and other features could also be implemented in it, so 'proof of stock', as you called it, is also planned.
member
Activity: 82
Merit: 13
It should scale extremely well, even to petabytes of data (as long as the miners have enough storage to meet the demands of the users).
sr. member
Activity: 350
Merit: 250
DTC unofficial team
Good idea, but if it is used like Dropbox, its size could grow to hundreds or even thousands of TB :)
sr. member
Activity: 392
Merit: 250
Suggestion for the title

 "Datacoin - Mine with your Hard Disk Free Space. The first Cloud Storage Coin"
member
Activity: 82
Merit: 13
Datacoin
A decentralized data storage cryptocurrency


Foreword
Something like this may already exist, but I couldn't seem to find anything.
None of this is implemented yet; I would first like to get some feedback and find any potential problems with the system as I have described it.

EDIT: I noticed there is another cryptocurrency called Datacoin, so maybe this one should be renamed.


Description
Datacoin is a cryptocurrency that acts as a decentralized cloud data storage service. Rather than requiring miners to use compute resources to find proof-of-work, miners store pieces of users' data and solve blocks with proof-of-stock. In other words, miners have more chance of finding a block by having large amounts of storage, not large amounts of compute power. Additionally, users can locate miners who are storing the data they need via a distributed hash table system.

Think of Datacoin as the child of BitTorrent and Bitcoin.

Datacoin is
  • Fault-tolerant - Data stored in the Datacoin network is redundant, so it is hard to lose
  • Highly-available - Like BitTorrent, data retrieval is fast since you can download different parts from different hosts
  • Trustless - Users don't have to worry about illegal or copyrighted material being removed or spied on
  • Anonymous - Like Bitcoin, if used correctly nobody has to know what you are up to (while this is not true of most cloud storage services)
  • Fairly-priced - Hosting prices would remain relatively cheap, just like Bitcoin transaction fees

Potential Use Cases:
  • Personal cloud storage - Datacoin can be used in a Dropbox-like way, but with a lower cost and without having to trust that Dropbox won't snoop on your files
  • CDN - Users can upload content to Datacoin, then allow the network to serve the data to large quantities of other people
  • Backups - Datacoin could possibly be a more cost-effective and fault-tolerant platform to back up data to than services like Amazon S3


Technical Overview

Buckets
Instead of addresses, Datacoin has Buckets. They work similarly to Bitcoin addresses in that they are associated with a private key, have a public hash as an identifier, and have a balance that can be spent by the owner. However, Buckets additionally have a list of hashes of the pieces of data that are being stored (referred to as Chunks). In the canonical file system paradigm, Buckets are essentially files or folders that are owned by someone and can only be modified by whoever holds the corresponding private key. Like a BitTorrent torrent, a Bucket can contain either a group of files and folders, or just one standalone file. The Datacoin balance of each Bucket is used to pay the network for hosting the data in that Bucket (but we'll talk about that more later on).

Blocks
Datacoin blocks work mostly the same as those of any other cryptocurrency, but with the addition of a list of Commits. Commits are additions and deletions of the data contents of Buckets, and reference the hashes of the Chunks being modified along with the operation being performed. Blocks do not contain the stored data itself; otherwise all participants in the network would have to store all the data, and that would not be viable. To find the contents of a Bucket, clients would go through the blockchain and look at all the Commits to compile an ordered list of Chunks (of course, they would have to connect to miners to actually download the Chunks). Comparing to BitTorrent, the blockchain would essentially act as a server of .torrent files (e.g. The Pirate Bay), minus search functionality.
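
To make the "go through the blockchain and compile an ordered list of Chunks" step a bit more concrete, here is a rough Python sketch. Nothing here is implemented or specified: the Commit fields, the "add"/"delete" operation names and the bucket_contents function are all names I made up for illustration, and each block is reduced to just its list of Commits.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Commit:
    bucket_id: str   # public hash identifying the Bucket (hypothetical field)
    op: str          # "add" or "delete" (hypothetical operation names)
    chunk_hash: str  # hash of the Chunk being added or removed


def bucket_contents(blocks: Iterable[List[Commit]], bucket_id: str) -> List[str]:
    """Replay Commits in chain order and return the Bucket's ordered Chunk hashes."""
    chunks: List[str] = []
    for commits in blocks:              # blocks are visited oldest first
        for commit in commits:
            if commit.bucket_id != bucket_id:
                continue                # Commit belongs to another Bucket
            if commit.op == "add":
                chunks.append(commit.chunk_hash)
            elif commit.op == "delete" and commit.chunk_hash in chunks:
                chunks.remove(commit.chunk_hash)
    return chunks


# Toy chain: three blocks, each just a list of Commits.
chain = [
    [Commit("b1", "add", "c1"), Commit("b2", "add", "x")],
    [Commit("b1", "add", "c2")],
    [Commit("b1", "delete", "c1"), Commit("b1", "add", "c3")],
]
print(bucket_contents(chain, "b1"))
```

The client only learns Chunk hashes this way; it still has to fetch the Chunk data from miners.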

Mining
Miners on the Datacoin network find blocks by proving they are storing Chunks of data (proof-of-stock).

In its simplest form, this would just be computing someHashFunction(chunkData + nonce) and checking if the output is less than the target (like HashCash, but including the stored data). However, this system would favor miners with high compute power, and would give little incentive to store more than one chunk. To combat this, miners must also compute a Work Hash, which is a standard proof-of-work function that should take on the order of 10 - 60 seconds to solve. This Work Hash should include data to prove it is fresh, e.g. it should be computed as someHashFunction(timestamp + lastBlockHash + nonce). From there, miners compute proof-of-stock by performing someHashFunction(chunkData + workHash), iterating through all Chunks they are storing to try to find a hash under the target. Chunk sizes would need to stay small since all nodes will need a copy of the chunk data to verify the proof-of-stock.

It might sound counter-intuitive to use proof-of-work to incentivize more storage over more compute power, but as long as the Work Hash takes significantly more time to solve than a proof-of-stock hash on one Chunk (even on a very fast machine), it should be in miners' best interests to have as many Chunks to test as possible. To put this all more simply, the mining hashrate scales faster with the number of Chunks being stored than it does with the speed at which you execute the hash function.
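
Here is a minimal sketch of the two-stage mining loop described above, assuming SHA-256 stands in for someHashFunction and that values are simply concatenated as bytes. The function names, the byte encodings and both target parameters are my own placeholders, not part of any spec; a real Work Hash target would be tuned so that the first stage takes on the order of 10 to 60 seconds.

```python
import hashlib
from typing import Iterable, Optional


def find_work_hash(timestamp: int, last_block_hash: bytes, work_target: int) -> bytes:
    """Stage 1: ordinary proof-of-work over fresh data (timestamp + last block hash)."""
    nonce = 0
    while True:
        payload = (timestamp.to_bytes(8, "big")
                   + last_block_hash
                   + nonce.to_bytes(8, "big"))
        digest = hashlib.sha256(payload).digest()
        if int.from_bytes(digest, "big") < work_target:
            return digest
        nonce += 1


def find_proof_of_stock(chunks: Iterable[bytes], work_hash: bytes,
                        stock_target: int) -> Optional[bytes]:
    """Stage 2: one cheap hash per stored Chunk, all reusing the same Work Hash."""
    for chunk in chunks:
        digest = hashlib.sha256(chunk + work_hash).digest()
        if int.from_bytes(digest, "big") < stock_target:
            return chunk   # this Chunk solves the block
    return None            # no luck; a new Work Hash is needed before retrying


# Toy targets so the example finishes instantly (not realistic difficulties).
work_hash = find_work_hash(0, b"\x00" * 32, 1 << 252)
winner = find_proof_of_stock([b"chunk-a", b"chunk-b"], work_hash, 1 << 256)
```

With N stored Chunks and a Work Hash that takes T seconds, a miner gets roughly N block attempts per T seconds, so doubling the number of Chunks doubles the attempts, while a faster hash function only speeds up the cheap second stage.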

Distributed Hash Table
When a client knows it wants to download a Chunk, it will first have to find a node on the network that is storing it. If miners all just stored random Chunks, finding Chunks would be pretty inefficient, as clients might have to traverse many nodes before a match is found. To solve this, miners will also act as nodes of a distributed hash table (DHT). Simply put, each node will need to generate an ID hash that indicates "where" in the hash space the node is, store Chunks that have hashes close to that ID, and peer with certain nodes based on their IDs. Using a DHT system like Koorde, clients can look up any Chunk in very few hops without requiring a large number of peer connections.
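
As a rough illustration of "store Chunks that have hashes close to that ID", here is a small Python sketch. Note that this is not Koorde: it only shows a simple ring-distance rule for deciding which nodes are responsible for a Chunk, and it ignores routing entirely. The 256-bit ID space, the function names and the replicas parameter are all assumptions made for the example.

```python
import hashlib
from typing import Iterable, List

SPACE = 2 ** 256  # size of the ID space, assuming 256-bit hashes


def hash_id(data: bytes) -> int:
    """Map arbitrary bytes (a Chunk, or a node's seed) to a point in the ID space."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two IDs when the ID space wraps around."""
    d = abs(a - b)
    return min(d, SPACE - d)


def closest_nodes(chunk_hash: int, node_ids: Iterable[int], replicas: int = 3) -> List[int]:
    """Return the `replicas` node IDs nearest to the Chunk's hash."""
    return sorted(node_ids, key=lambda n: ring_distance(n, chunk_hash))[:replicas]


nodes = [hash_id(f"node-{i}".encode()) for i in range(10)]
owners = closest_nodes(hash_id(b"some chunk data"), nodes)
```

A client that knows a Chunk's hash can apply the same rule to work out which node IDs to look for. Raising replicas is one way a Bucket could get more redundancy.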

Fees
A fee is deducted from the balance of each Bucket every block as a storage fee, and the amount is based on the total file size of all the Chunks in that Bucket. The deducted fees simply disappear into the void. Since it would be costly for the network to include transactions for all of these fees in every block, miners should instead work out whether a Bucket has a positive balance by looking at that Bucket's activity on the blockchain. If a Bucket's balance is <= 0, miners would simply stop storing its data. If a Bucket does not have any Chunks in it, no fees would be deducted and it would effectively act like any other cryptocurrency address. I'm not 100% happy with this fee model; see the Issues and Questions section for more thoughts.
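
Here is one way the "calculate the balance from blockchain activity" idea could look, as a sketch only. I am assuming a flat fee per byte per block and that a Bucket's on-chain activity can be summarized as (block height, balance change, size change) events; the event format, the fee_per_byte parameter and the function names are invented for this example.

```python
from typing import List, Tuple

# (block_height, balance_delta, size_delta_bytes), a hypothetical summary of
# one Bucket's on-chain activity: deposits/spends and Commits that change its size.
Event = Tuple[int, int, int]


def bucket_balance(events: List[Event], height: int, fee_per_byte: int) -> int:
    """Balance at `height`, charging fee_per_byte * stored_bytes for every block."""
    balance = 0
    size = 0
    last = 0
    for h, d_balance, d_size in sorted(events):
        if h > height:
            break
        # Charge storage for the blocks elapsed since the previous event.
        balance -= size * fee_per_byte * (h - last)
        balance += d_balance
        size += d_size
        last = h
    # Charge storage from the last event up to the requested height.
    balance -= size * fee_per_byte * (height - last)
    return balance


def is_stored(events: List[Event], height: int, fee_per_byte: int) -> bool:
    """Miners keep a Bucket's data only while its balance is positive."""
    return bucket_balance(events, height, fee_per_byte) > 0


# Deposit 1000 at block 10, then add 50 bytes of Chunks at block 12.
events = [(10, 1000, 0), (12, 0, 50)]
print(bucket_balance(events, 20, 1))
```

No fee transactions are needed: any miner can derive the same number from the chain. The sketch lets the balance go negative instead of modelling the moment miners drop the data.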


Issues and Questions

I'd like to get some feedback on the following things:

  • Miners have no incentive to serve Chunks to users, only to store them. Is this a problem? (Bitcoin nodes have no incentive to propagate transactions, maybe this will work out similarly?)
  • Should we allow users to pay fees when downloading? I'm not sure how to do this in a trustless way
  • Should Chunk sizes be dynamic, or should we find an optimal size for all Chunks?
  • The fee system seems weird since fees aren't being paid to miners. Is there a way for fees to be paid to miners without requiring everyone to have all the data to verify proof-of-stock?
  • The block reward might need to dynamically scale, since so much Datacoin disappears from storage fees.
  • Datacoin is very different from other cryptocurrencies, so maybe it should be written completely fresh rather than forked from another.
  • Maybe the DHT algorithm should be modified to allow Buckets to be configured to be stored by more miners (increasing redundancy and availability), at a higher fee cost