
Topic: [ANNOUNCE] Whitepaper for Bitstorage - a peer to peer, cloud storage network (Read 9030 times)

legendary
Activity: 1050
Merit: 1000
Seems like a new way for users to share pirated data :) Better be a big network!
sr. member
Activity: 490
Merit: 250
Is there some mechanism for preventing a single node storing a single copy of the data, and then spoofing multiple identities and claiming payment as if the data were stored across multiple nodes?


There is a single, hash-verifiable solution to this through an arrangement of proof of storage, where storage of a file is proven over a period of work (a number of randomly generated strings and hashes).  I theorized this solution with some help from wolf0.
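
A minimal sketch of that kind of challenge-response proof of storage (the names and parameters here are illustrative, not taken from the whitepaper): the verifier sends a fresh random string each round, and the prover must hash it together with the file; repeating over many rounds is the "period of work".

Code:
import hashlib
import os

def make_challenge() -> bytes:
    # Verifier picks a fresh random string for each round of work.
    return os.urandom(32)

def prove(challenge: bytes, file_bytes: bytes) -> bytes:
    # Prover hashes the random string together with the whole file,
    # so the answer can't be precomputed without holding the data.
    return hashlib.sha256(challenge + file_bytes).digest()

def verify(challenge: bytes, response: bytes, file_bytes: bytes) -> bool:
    # Verifier recomputes the same hash and compares.
    return response == prove(challenge, file_bytes)

data = b"some stored file contents"
c = make_challenge()
assert verify(c, prove(c, data), data)

Note that in this naive form the verifier must hold the data too, which is exactly the objection discussed elsewhere in the thread.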

These were me.  I got back into my account.  Check out my ideas; they are well formed. 
member
Activity: 88
Merit: 10
Just chiming in to say what's up: I've been completely distracted with some of my other projects.  While I believe in a project like this, it's just too complicated for me to do in my limited time.  I'd be very happy to leave this up to the rest of you who seem to be doing similar (and greater!) things. 
newbie
Activity: 4
Merit: 0
Is there some mechanism for preventing a single node storing a single copy of the data, and then spoofing multiple identities and claiming payment as if the data were stored across multiple nodes?


There is a single, hash-verifiable solution to this through an arrangement of proof of storage, where storage of a file is proven over a period of work (a number of randomly generated strings and hashes).  I theorized this solution with some help from wolf0.
newbie
Activity: 4
Merit: 0
Hi, I have been working on inventions for a project similar to this since June 2013.  My skills are as an inventor, not a programmer.  I worked for ExxonMobil Upstream Research Company two years ago, when I was 20, and they stole inventions from me, so I decided to get them back by inventing a decentralized IP system with decentralized cloud storage that could poach IP from large companies.  

I have made considerable headway on how to structure incentives that encourage users to submit valuable information to the ledger/blockchain.  It has been very exciting to see people validating the ideas over the last month.  At least I'm not getting trolled like I was at times, haha.  

I think it is important to establish a goal for the cloud: is it to amass the most valuable information possible?  I would suggest that.  I would like to work with you guys/girls and hopefully help the project go in the right direction.  

I'm currently locked out of my main forum account because I left my password at home and am traveling for the holidays.  

https://bitcointalksearch.org/topic/nemesis-the-open-source-intellectual-property-system-252564

https://109.201.133.195/index.php?topic=333991.20

I've had a github up for a short while to get my ideas and specs down:
https://github.com/vintagetrex/Nemesis-Project/blob/master/README.md

thanks,

vintagetrex
newbie
Activity: 14
Merit: 0
Is there some mechanism for preventing a single node storing a single copy of the data, and then spoofing multiple identities and claiming payment as if the data were stored across multiple nodes?

That's impossible unfortunately - no amount of math can prevent a node from outsourcing the actual storage to a central location that is a single-point-of-failure.

The best you'll ever be able to do is pay enough that there is incentive to do the job right, and/or use social/legal mechanisms to audit what node operators do and then arrange these transactions with specific operators. Not terribly exciting solutions unfortunately...

The obvious solutions are proof-of-work or proof-of-stake, but I can't see how proof-of-work alone would suffice.

Yeah, proof-of-work can only prove IO bandwidth, not where the data is located.

However... interactive latency measurements can prove the data is within some sphere of radius t·c. With multiple trusted challenge servers around the world, located in co-location centers, one could easily prove that the data must be physically present in multiple locations.


You can even do this in a fully decentralized way with some extensions to the Bitcoin scripting language, by creating a txout that can only be spent by proving you have some data fragment, where the fragment is chosen randomly based on the previous block hash.

Because the fragment is based on the previous block hash, there is a time limit on how quickly the fragment must be retrieved, thereby proving (after sufficient trials) that the data is physically located within a sphere of radius 10 minutes × the speed of light. Currently this would prove the data may be physically located on Earth, the Moon and Venus, but no other planet. With a second proof-of-work blockchain established on, say, Pluto, we could then easily prove a similar result for data located on or near Pluto. (Proving the Pluto proof-of-work blockchain is in fact located on Pluto is left as an exercise for the reader.)

I think it's much slower than the speed of light unless you are sending only one or a few bits. The higher the bandwidth desired, the shorter the possible distance. And the costs increase exponentially with distance, surpassing the mining reward. So your method may be good enough, with random fragments calculated based on a previous hash. The miners may have to store all the data in nearby storage in order to compete with other miners.

I've been looking for something like this.  What I recently read about was PAST, a p2p storage system developed at Microsoft Research a long time ago: http://research.microsoft.com/en-us/um/people/antr/past/ It doesn't have a currency implemented as an incentive. Each file has K copies stored on the p2p network. The problem with this approach is that POW can only be done on the K nodes which contain a fragment. We seem to need every miner to have full access to all the data in order to calculate the hash of a random fragment, but that would bloat the blockchain. How do we solve this problem?
legendary
Activity: 1094
Merit: 1006
Oh what a fun discussion. I'm working on something similar as well. Calling mine StorJ after the Bitcoin agents concept. Really want to get Bitcoin agents running on whichever one of these concepts turns out to be the best.

The concept seems similar to what Datacoin is doing, although I haven't actually read through your whitepaper.
Negative. Essentially Datacoin just took Primecoin and made the data portion bigger. It's a cool concept, but at the end of the day it just leaves you with a huge blockchain, and you can't reasonably store anything more than a few documents. After that the price just becomes unreasonable.
full member
Activity: 182
Merit: 100
The concept seems similar to what Datacoin is doing, although I haven't actually read through your whitepaper.
member
Activity: 74
Merit: 10
Developer of BitWrk
Bitstorage - A distributed, peer to peer, cloud data storage network based on blockchain technology.

Hi Sarchar!

I must say that I really like the proposal. I am currently developing something similar: BitWrk - like Bitstorage, but for computing power, not storage. See this post for reference: https://bitcointalksearch.org/topic/announce-bitwrk-better-ways-to-earn-bitcoins-than-mining-179948

At first, I had the same reservations as jspilman:
Is there some mechanism for preventing a single node storing a single copy of the data, and then spoofing multiple identities and claiming payment as if the data were stored across multiple nodes?

We should probably assume that no direct solution exists for this problem.

There is, however, an indirect solution: a reputation system, just like eBay's. You basically want to create an incentive against creating enormous numbers of fake accounts by offering advantages to users with a long, successful trade history.

Then there is the problem regarding proof of existence. How can I, as the original data owner, verify a proof of existence without being in possession of the original data? When I don't have the original data anymore, that's when I need to verify it the most (otherwise, I would have to accept any data).

My solution is that I wouldn't verify the proof of existence in this case. Instead, the data would contain a signature issued by me, the original owner. So, by downloading the data, I could verify its authenticity (d'oh!).

But then, assume I have downloaded my original data back from one of the storage peers. If I was malicious, I could claim that the downloaded data wasn't my original data. If I did that, the peer would simply publish the data, containing my signature, making it obvious to everyone that I wasn't telling the truth.
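
A sketch of that signing scheme, assuming Ed25519 via the Python cryptography package (the signature-followed-by-data layout is my assumption, not an actual BitWrk or Bitstorage format):

Code:
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Owner signs the payload before uploading it to the storage peer.
owner_key = Ed25519PrivateKey.generate()
payload = b"original file contents"
signed_blob = owner_key.sign(payload) + payload   # signature || data

def verify_blob(blob: bytes, public_key) -> bytes:
    # Anyone holding the owner's public key can check authenticity --
    # including third parties, if the peer publishes the blob.
    signature, data = blob[:64], blob[64:]   # Ed25519 signatures are 64 bytes
    public_key.verify(signature, data)       # raises InvalidSignature on forgery
    return data

assert verify_blob(signed_blob, owner_key.public_key()) == payload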

All in all I think it can be done. Great work!

As a user, I would prefer to be compensated in BTC, instead of yet another virtual currency.
member
Activity: 88
Merit: 10
It all looks pretty solid, I've only come up with a few small issues.

First of all, I don't think there is a need to limit downloads to the owner and nodes that are close. This prevents use as a CDN, and also, it is trivial to keep generating new private keys until one is found that is close to the desired file.

Right; ultimately, however, each storage node should get to decide for itself which nodes to allow the transfer to.  I don't think it is trivial, actually. If I have a key that's only a few bits away, it's going to be very hard for you to search for a closer key.  With large amounts of data and keys, it's extremely likely that someone else will randomly fall between you and the data.
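
For illustration, here is why grinding a closer key is hard, assuming Kademlia-style XOR distance over 160-bit IDs (my assumption; the whitepaper may define closeness differently):

Code:
import hashlib
import os

def node_id(pubkey: bytes) -> int:
    # Illustrative: derive a 160-bit ID by hashing a public key.
    return int.from_bytes(hashlib.sha1(pubkey).digest(), "big")

def distance(a: int, b: int) -> int:
    return a ^ b   # XOR metric

file_id = node_id(b"some file hash")
incumbent = file_id ^ 0b111   # an ID only 3 bits away from the file

# Each freshly generated key lands uniformly in the 2**160 ID space, so
# beating an incumbent 3 bits away succeeds with probability ~2**-157.
attacker = node_id(os.urandom(33))
print(distance(attacker, file_id) < distance(incumbent, file_id))  # almost surely False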

Quote
Also (this may or may not be a problem), as mentioned in IRC, attackers could also keep changing their data to change its hash, to choose which nodes store it.
I see this as a benefit, actually: store the same data twice but with two different hashes. You could change the encryption key and/or a nonce in the header.  In fact, this might help distribute your data more effectively.
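
A toy demonstration of that re-keying trick, assuming the data is encrypted before storage (the AES-GCM cipher choice and header layout here are assumptions):

Code:
import hashlib
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

plaintext = b"the same underlying file"
cipher = AESGCM(AESGCM.generate_key(bit_length=128))

# Same data, two different nonces -> two unrelated ciphertexts, hence
# two different content hashes landing at different storage nodes.
blob_a = cipher.encrypt(os.urandom(12), plaintext, None)
blob_b = cipher.encrypt(os.urandom(12), plaintext, None)
print(hashlib.sha256(blob_a).hexdigest())
print(hashlib.sha256(blob_b).hexdigest())   # differs from the first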

Quote
A bigger problem is that an unlimited number of nodes can host the same data while the owner will only pay out to n nodes. The nodes that get paid are simply the ones who announce it first, and any attacker can just keep announcing the data they have (unless all nodes are recording all announcements). This could add up to a lot of data. Even with no attackers, it means potential income isn't guaranteed, and you will lose a lot of it from the bad luck of not being first.

It's not supposed to be first-come, first-served; it's supposed to be more like closest-comers-first. It means that if you join later and you're closer than someone else, you can claim the payment if miners decide to throw you in.
member
Activity: 88
Merit: 10
The E-PDP paper is interesting, but my reading of it was different to yours - their sample had to store 128MB of tags for a 4GB file. It doesn't have the small overhead you think: it reduces client overhead by increasing server overhead. It's also more intensive - randomized hash challenges boil down to one disk seek on the server (in the best case of an unfragmented file) and some hashing, which is fast, whereas their scheme involves lots of hopping around.

Still, I think it's not a huge difference. You could implement v1 using a simpler proof of storage and then upgrade to more complex proofs in a v2.

Kinda, yeah.  I think in a distributed network, using blocks larger than 4KB (say 32KB) would be much better.  I had envisioned each store request being less than 1MB in size, so with a 1024-bit modulus that would require 4096 bytes of extra storage.  Increasing the block size to 128KB means only 1KB of extra storage.   Unfortunately, that's data that has to go into the blockchain.  It'd be nice if there were a better way that allowed for smaller storage + infinite challenges.  

Are my numbers right, here?  
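
A back-of-the-envelope check, assuming one 1024-bit (128-byte) tag per block:

Code:
def tag_overhead(request_bytes: int, block_bytes: int, modulus_bits: int = 1024) -> int:
    # One modulus-sized tag per block in the store request.
    return (request_bytes // block_bytes) * (modulus_bits // 8)

MB, KB = 1 << 20, 1 << 10
print(tag_overhead(1 * MB, 32 * KB))    # 4096 bytes with 32KB blocks
print(tag_overhead(1 * MB, 128 * KB))   # 1024 bytes with 128KB blocks

Under that assumption, yes, the numbers check out.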

Definitely, the best way would be to develop this in steps, and the first step would not implement the E-PDP algorithm, only a simple hash-and-verify method.

member
Activity: 82
Merit: 13
It all looks pretty solid, I've only come up with a few small issues.

First of all, I don't think there is a need to limit downloads to the owner and nodes that are close. This prevents use as a CDN, and also, it is trivial to keep generating new private keys until one is found that is close to the desired file.

Also (this may or may not be a problem), as mentioned in IRC, attackers could also keep changing their data to change its hash, to choose which nodes store it.

A bigger problem is that an unlimited number of nodes can host the same data while the owner will only pay out to n nodes. The nodes that get paid are simply the ones who announce it first, and any attacker can just keep announcing the data they have (unless all nodes are recording all announcements). This could add up to a lot of data. Even with no attackers, it means potential income isn't guaranteed, and you will lose a lot of it from the bad luck of not being first.
legendary
Activity: 1526
Merit: 1134
The E-PDP paper is interesting, but my reading of it was different to yours - their sample had to store 128MB of tags for a 4GB file. It doesn't have the small overhead you think: it reduces client overhead by increasing server overhead. It's also more intensive - randomized hash challenges boil down to one disk seek on the server (in the best case of an unfragmented file) and some hashing, which is fast, whereas their scheme involves lots of hopping around.

Still, I think it's not a huge difference. You could implement v1 using a simpler proof of storage and then upgrade to more complex proofs in a v2.
newbie
Activity: 19
Merit: 0
Because the fragment is based on the previous block hash, there is a time limit on how quickly the fragment must be retrieved, thereby proving (after sufficient trials) that the data is physically located within a sphere of radius 10 minutes × the speed of light. Currently this would prove the data may be physically located on Earth, the Moon and Venus, but no other planet. With a second proof-of-work blockchain established on, say, Pluto, we could then easily prove a similar result for data located on or near Pluto. (Proving the Pluto proof-of-work blockchain is in fact located on Pluto is left as an exercise for the reader.)

I suppose the proof of work could take the previous block hash along with some nonce and then iterate on the data in a 'memory-hard' fashion to add latency. Each nonce+result would be redeemable for a single payment coupon within some fixed time period, e.g. when the next block is found.
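
One way to sketch that, purely illustratively (round count and fragment size are arbitrary): seed a walk over the stored data with the block hash and nonce, and make each read depend on the previous digest, so fragments can't be fetched ahead of time.

Code:
import hashlib

def storage_bound_pow(block_hash: bytes, nonce: bytes, data: bytes,
                      rounds: int = 1000, fragment: int = 4096) -> bytes:
    # Chained, data-dependent reads: round i's fragment index comes from
    # round i-1's digest, so the prover needs fast access to all the data.
    digest = hashlib.sha256(block_hash + nonce).digest()
    n_fragments = max(1, len(data) // fragment)
    for _ in range(rounds):
        i = int.from_bytes(digest[:8], "big") % n_fragments
        digest = hashlib.sha256(digest + data[i * fragment:(i + 1) * fragment]).digest()
    return digest   # the nonce+result pair is the redeemable coupon

proof = storage_bound_pow(b"\x00" * 32, b"nonce-1", b"x" * (1 << 20))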

I think a good solution to this problem is useful in all sorts of interesting ways... for example, when the blockchain itself is the target data, and the storage payments are transaction fees.
legendary
Activity: 1120
Merit: 1152
Is there some mechanism for preventing a single node storing a single copy of the data, and then spoofing multiple identities and claiming payment as if the data were stored across multiple nodes?

That's impossible unfortunately - no amount of math can prevent a node from outsourcing the actual storage to a central location that is a single-point-of-failure.

The best you'll ever be able to do is pay enough that there is incentive to do the job right, and/or use social/legal mechanisms to audit what node operators do and then arrange these transactions with specific operators. Not terribly exciting solutions unfortunately...

The obvious solutions are proof-of-work or proof-of-stake, but I can't see how proof-of-work alone would suffice.

Yeah, proof-of-work can only prove IO bandwidth, not where the data is located.

However... interactive latency measurements can prove the data is within some sphere of radius t·c. With multiple trusted challenge servers around the world, located in co-location centers, one could easily prove that the data must be physically present in multiple locations.


You can even do this in a fully decentralized way with some extensions to the Bitcoin scripting language, by creating a txout that can only be spent by proving you have some data fragment, where the fragment is chosen randomly based on the previous block hash.

Because the fragment is based on the previous block hash, there is a time limit on how quickly the fragment must be retrieved, thereby proving (after sufficient trials) that the data is physically located within a sphere of radius 10 minutes × the speed of light. Currently this would prove the data may be physically located on Earth, the Moon and Venus, but no other planet. With a second proof-of-work blockchain established on, say, Pluto, we could then easily prove a similar result for data located on or near Pluto. (Proving the Pluto proof-of-work blockchain is in fact located on Pluto is left as an exercise for the reader.)
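
A sketch of the fragment selection and the radius arithmetic (the fragment layout is an assumption):

Code:
import hashlib

SPEED_OF_LIGHT_M_S = 299_792_458
BLOCK_INTERVAL_S = 600   # Bitcoin's ~10 minute block target

def challenged_fragment(prev_block_hash: bytes, num_fragments: int) -> int:
    # Deterministic, but unpredictable before the block appears.
    h = hashlib.sha256(prev_block_hash).digest()
    return int.from_bytes(h, "big") % num_fragments

radius_m = BLOCK_INTERVAL_S * SPEED_OF_LIGHT_M_S
print(f"{radius_m / 1.496e11:.2f} AU")   # ~1.20 AU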
newbie
Activity: 27
Merit: 0
Sarchar: That can be added easily. You could simply rent out storage space.

To whom? Who's responsible for fulfilling the payments? How do you fairly get paid for that storage?

Your node offers storage space for a price. Clients pay. You store the files. You can integrate proof of storage and more, and use Bitcoin's scripts to make sure neither side is at significant risk of getting screwed over. It's not like it can't mimic most of the features of Bitstorage.
member
Activity: 88
Merit: 10
That's probably the best way to do it, building piece by piece, I mean.  I checked out your video and it's cool, but its uses are limited (peer-to-peer micropayments are a freaking cool thing).  Does it work for, or can it be adapted to, things like public WiFi usage?

It's a general framework. Yes, paying for wifi access is possible (more possible than you might imagine).
Cool. I'm going to look more into this project.

Quote
Quote
How would you accomplish the proof of storage challenge if the original author of the data isn't available online to challenge (or the user loses his metadata required to produce those challenges)?

If the challenge metadata is stored in the same place as the micropayment channels, you either lose both or neither. I think that's solvable. The challenges don't have to be big. If you want 1000 days' worth of storage, store 1000 80-bit hashes and you're done, right? 10KB of data is trivial.

I think that would be a problem in a distributed environment with terabytes of data meant to be stored indefinitely.  The paper linked in my document describes a crypto method that allows for infinite challenges with O(1) metadata storage requirements, less than a few KB in size.  That metadata can easily be stored in a blockchain.

Quote
Quote
Is your TradeNet talk available somewhere?

Yep, see the top video and slide deck beneath it here:

http://plan99.net/~mike/

(unfortunately, the slides in the video are hard to see and washed out)

Thanks.
legendary
Activity: 1526
Merit: 1134
That's probably the best way to do it, building piece by piece, I mean.  I checked out your video and it's cool, but its uses are limited (peer-to-peer micropayments are a freaking cool thing).  Does it work for, or can it be adapted to, things like public WiFi usage?

It's a general framework. Yes, paying for wifi access is possible (more possible than you might imagine).

Quote
How would you accomplish the proof of storage challenge if the original author of the data isn't available online to challenge (or the user loses his metadata required to produce those challenges)?

If the challenge metadata is stored in the same place as the micropayment channels, you either lose both or neither. I think that's solvable. The challenges don't have to be big. If you want 1000 days' worth of storage, store 1000 80-bit hashes and you're done, right? 10KB of data is trivial.
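
A sketch of those precomputed challenges; deriving the per-day salts from one secret seed is my assumption, so only the 10-byte answers need storing:

Code:
import hashlib

def salt(seed: bytes, day: int) -> bytes:
    # The client keeps one secret seed and derives a fresh challenge per day.
    return hashlib.sha256(seed + day.to_bytes(8, "big")).digest()

def precompute_answers(seed: bytes, data: bytes, days: int = 1000) -> list:
    # 80-bit (10-byte) truncated answers: ~10KB for 1000 days of storage.
    return [hashlib.sha256(salt(seed, d) + data).digest()[:10] for d in range(days)]

answers = precompute_answers(b"\x01" * 32, b"file contents")
print(len(answers) * 10)   # 10000 bytes of stored metadata, as the post says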

Quote
Is your TradeNet talk available somewhere?

Yep, see the top video and slide deck beneath it here:

http://plan99.net/~mike/

(unfortunately, the slides in the video are hard to see and washed out)
member
Activity: 88
Merit: 10
Sarchar: That can be added easily. You could simply rent out storage space.

To whom? Who's responsible for fulfilling the payments? How do you fairly get paid for that storage?
newbie
Activity: 27
Merit: 0
Sarchar: That can be added easily. You could simply rent out storage space.