
Topic: [IDEA] Dirt cheap online storage (Read 14466 times)

legendary
Activity: 1400
Merit: 1005
August 30, 2012, 06:03:31 PM
Did you read the links? I'm about 95% sure that's exactly what he has Wink

Quote from: Kim Dotcom
We are building a massive global network. All non-US hosters will be able to connect servers & bandwidth. Get ready.
Developers get ready. The Mega API will provide incredible powers. Our API and your Mega tools will change the world.
The new Mega will offer one-click-encryption of ALL your data transfers, on the fly, easy to use, free of charge, TOTAL PRIVACY!
Developers of up/download & file managers, email & fax tools, VOIP & video apps please email twitter at kim.com for early API access.

Derp.  Not yet, lol!  Ok so, that is awesome.  Smiley
full member
Activity: 182
Merit: 100
August 30, 2012, 05:53:30 PM
Did you read the links? I'm about 95% sure that's exactly what he has Wink

Quote from: Kim Dotcom
We are building a massive global network. All non-US hosters will be able to connect servers & bandwidth. Get ready.
Developers get ready. The Mega API will provide incredible powers. Our API and your Mega tools will change the world.
The new Mega will offer one-click-encryption of ALL your data transfers, on the fly, easy to use, free of charge, TOTAL PRIVACY!
Developers of up/download & file managers, email & fax tools, VOIP & video apps please email twitter at kim.com for early API access.
legendary
Activity: 1400
Merit: 1005
August 30, 2012, 05:43:11 PM
Someone has to program it first!   Sad
full member
Activity: 182
Merit: 100
August 30, 2012, 05:39:51 PM

Looks like we will all be adding hard drives to our btc/ltc miners to make some extra on the side. Distributed file systems ftw. The "Mega API" in combination with Tor .onion, Namecoin .bit, ... ugh, wet dreams about the good old days.

http://www.youtube.com/watch?feature=player_embedded&v=MokNvbiRqCM

legendary
Activity: 1400
Merit: 1005
July 03, 2012, 07:25:59 PM
Ok Sukrim, everything you said makes sense.  I agree with you.  Wink

Ideally, a special torrent client would be used that automatically reports to service.com how much it downloaded from whom.  And like you said, that communication would happen nearly instantly.
legendary
Activity: 2618
Merit: 1007
July 03, 2012, 05:04:57 PM
I meant the client puts up money up-front to service.com (= the centralized service instance responsible for handing out money). If a client then downloads something, they will very soon report to service.com how much they downloaded from whom. A seeder can then see whether the people downloading from him are really reporting their traffic or not. This way he can make sure people are actually reporting correct amounts of traffic (he knows how much traffic went out per IP/user).

I don't see how a "regular web login system" would help at all against abuse or copyright claims. I personally really like the idea of at least limited anonymity (service.com would know: IP addresses, SSL IDs, Bitcoin flows, and amounts of traffic (maybe even per file/piece, though that's not necessary), as well as some metadata like bandwidth estimates). Service.com could even ban certain pieces, just like BitTorrent trackers can blacklist infohashes.

There's no way, though, for the service provider to make sure people don't use the system for illegal stuff, other than hand-checking everything and requiring cleartext transfers. I'd rather go the BitTorrent way, where service.com simply has no chance whatsoever of knowing what's behind a piece hash and can only blacklist pieces upon a court order.
legendary
Activity: 1400
Merit: 1005
July 02, 2012, 02:58:57 PM
The client pays, and it'd be up to a programmer to figure out how to make it work.  Maybe a special torrent client has to be used for hosting the files?  I don't know.

Quote
service.com publishes payments downloaders committed to paying to storage nodes, so storage nodes can check that they are trustworthy
I don't understand what you mean by this.  Client payments shouldn't be on a per-download basis, and all the client fees are paid up front to service.com, who then distributes them to the nodes.  No trust necessary.

It sounds like our goals aren't quite aligned Sukrim.  Why not just have a regular web login system for the clients?  My goal with the project isn't to serve anonymous file storage.  That'd just allow people to upload all sorts of illegal nonsense and put service.com in a world of hurt for organizing it all.
legendary
Activity: 2618
Merit: 1007
July 02, 2012, 01:54:28 PM
Also, whenever the file is downloaded (whether it be by the client or by another node wishing to host the file), the hosts are paid for that week by percentage according to how much data they provided during the download.

Who pays, and how should/could this be enforced?

I'm still thinking mainly about issues with my original plan earlier on:
service.com checks hashlists provided by uploaders and pays some standard fees (quite low) for that
service.com publishes payments downloaders committed to paying to storage nodes, so storage nodes can check that they are trustworthy

If someone wants to check whether storage nodes really offer the files they claim, they can always just try to download them.

One of the few questions that remain for me is which protocol should be used (I'd lean towards WebDAV), how downloaders authenticate themselves (so nobody else claims to be "Alice" and spends her money deposited on service.com) while still remaining anonymous (I'd lean towards PGP public/private key pairs), and some other minor things...
legendary
Activity: 1400
Merit: 1005
July 02, 2012, 11:52:21 AM
I haven't found a programmer capable of it so far.  Wink
sr. member
Activity: 252
Merit: 250
Inactive
July 02, 2012, 11:51:34 AM
I am sooo waiting for this to happen.

Do it.
hero member
Activity: 784
Merit: 1000
0xFB0D8D1534241423
July 02, 2012, 11:46:36 AM
Exactly. You don't need to check the whole file from every node. Just check random chunks of the file.
legendary
Activity: 1400
Merit: 1005
July 02, 2012, 11:35:14 AM
I still say that the client should create an index file (of sorts) with random salted checksums prior to uploading the file - perhaps several thousand of them.  Then each day, the client sends out a request to complete a salted checksum to every node hosting the file.  Those who successfully complete the salted checksum and match the index file get paid; those who don't, do not.

Alternatively, the client could upload this salted index to the central entity, who would then send out the checks.  Clients may not be online 24/7 to verify file integrity, but it'd be a simple job for the central entity to do.

The index file wouldn't take up much storage space, but would cover thousands of days of file-integrity verification.

Also, whenever the file is downloaded (whether it be by the client or by another node wishing to host the file), the hosts are paid for that week by percentage according to how much data they provided during the download.

So node payments:
- Would be weekly.
- The node would have to meet the checksum requirement 100% every day for the week to be paid for it.
- If the file is downloaded at any point in the week, the nodes would split the payment according to how much of the file they provided.  Higher-bandwidth nodes would be paid more for this reason.
- If the file is not downloaded, the nodes would split the payment equally.
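The salted-checksum scheme described above could be sketched roughly as follows (a minimal Python illustration; the function names and the choice of SHA-256 are my assumptions, not part of any actual implementation):

```python
import hashlib
import os

def make_challenges(file_bytes, n):
    """Client-side: precompute n (salt, digest) pairs before uploading.
    Only the salts and digests need to be kept locally - each pair can
    be used for exactly one day's check."""
    challenges = []
    for _ in range(n):
        salt = os.urandom(16)
        digest = hashlib.sha256(salt + file_bytes).hexdigest()
        challenges.append((salt, digest))
    return challenges

def answer_challenge(stored_bytes, salt):
    """Node-side: prove possession by hashing the salt with the stored file."""
    return hashlib.sha256(salt + stored_bytes).hexdigest()

# Daily check: send one unused salt to every hosting node, compare answers.
data = b"example chunk contents"
index = make_challenges(data, 3)
salt, expected = index[0]
assert answer_challenge(data, salt) == expected          # honest node passes
assert answer_challenge(b"tampered", salt) != expected   # node without the data fails
```

Because the salt is only revealed at challenge time, a node cannot precompute answers and discard the file; it must hold the actual data to respond correctly.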
legendary
Activity: 2618
Merit: 1007
July 02, 2012, 09:24:33 AM
You can NOT check whether a client will *provide* a chunk by salted/random checksumming, only whether it *stores* the chunk. To check actual availability, you have to transfer and measure accordingly. If I still get money/other benefits without uploading anything, I won't upload (legal hassle, and upload bandwidth is scarce).

1 chunk --> 5 nodes is even worse, as I'd have to constantly download and measure speed from 5 nodes just for 1 piece.

IP addresses are dynamic (with IPv4) - if I run that on Amazon EC2 (and now Google's service too), good luck banning all my IPs! IDs are no problem, as they are randomly generated anyway and can't even be guaranteed to be unique. Currently it would make little sense to steal node IDs in the BitTorrent DHT, for example, but it's easily possible to do so.

The strain on the network should come from actual transactions, not just "show me that you still store this!" traffic, imho. Also, I would have inbound traffic...
Again, imagine me storing 5 TB of data in this cloud: 1 MB chunks, stored at 5 nodes at the same time to make sure they are still available all the time (in reality I'd use Reed-Solomon coding, but just for this example...). I only download a random 4 kB part of each piece on average and then cancel the transfer (if I always load a fixed 4 kB, nodes might choke anyone trying to transfer more than that, since it clearly isn't a test).

25,000,000 parts * 4 kB = 100,000,000 kB of transfer per week (100 GB) = ~165 kB/s constant download. I didn't even do the 1 kB = 1024 bytes etc. thingie; including other overheads you can expect about 200 kB/s of constant download just to check that your part files are still available once a week, which is very little (on average just ~4 tests per chunk per month).
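The arithmetic above can be checked with a quick back-of-envelope script (the figures are taken directly from the post; nothing else is assumed):

```python
# 5 TB in 1 MB chunks, each chunk replicated on 5 nodes,
# and a 4 kB sample of every replica downloaded once per week.
chunks = 5_000_000            # 5 TB / 1 MB
replicas = 5
sample_kb = 4
week_seconds = 7 * 24 * 60 * 60

total_kb = chunks * replicas * sample_kb   # 100,000,000 kB = ~100 GB per week
rate_kb_s = total_kb / week_seconds        # sustained download rate

print(total_kb, round(rate_kb_s))          # 100000000, 165
```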

Sorry, but monitoring the download speeds can ONLY be done by the actual downloaders themselves. Everything else is a HUGE waste of time and resources.
full member
Activity: 222
Merit: 100
www.btcbuy.info
July 02, 2012, 08:44:56 AM
So you would have potentially huge bandwidth overhead (99 clients download my files just to test to /dev/null and the 100th actually needs the data) - good for anonymity ("I didn't need the part, my client just loaded it randomly as a check!") but really bad if I want to host 5 TB of data in 1 MB chunks:
1 download of 1 MB per week(!) would mean:
(5*1000*1000)/(7*24*60*60) = >8 MB per second of constant upload just for making sure the files I claim to store are actually available. If more than one client actually wants to check a piece (I'd want estimates from at least a handful of clients of how trustworthy a random storage node is in their experience) you'd have to multiply that, so a 100 MBit line isn't even enough just to make the data available.

This will definitely NOT be "dirt cheap"!

Also you (or NCs) have to trust other clients - then you can trust the storage node itself too, because for all you know the operator of a storage node could also operate lots of sock puppet client nodes that praise him for his great reliability.

You don't check every chunk by downloading, only a small portion. Most chunks would be checked using the salted-checksum method.

Also, this was never intended to be a 1 Client : 1 Node setup. What you want to do is put your data onto as many nodes as possible. This way, if needed, you could download all of it fast, even when individual nodes have suboptimal uplinks.

Regarding sock puppet nodes - you can probably do that. I imagine the nodes that do that would run into the costs associated with running that network. I would also imagine the client would be able to blacklist a node by ID or IP address, so a known scammer would soon run out of paying clients.

And would it put a strain on a node's network and other resources? Hell yeah. But so do a Bitcoin client, Skype, a Tor node, a Freenet node, a BitTorrent client. If it cost nothing, why would anybody pay for it?
legendary
Activity: 2618
Merit: 1007
July 02, 2012, 06:19:09 AM
So you would have potentially huge bandwidth overhead (99 clients download my files just to test to /dev/null and the 100th actually needs the data) - good for anonymity ("I didn't need the part, my client just loaded it randomly as a check!") but really bad if I want to host 5 TB of data in 1 MB chunks:
1 download of 1 MB per week(!) would mean:
(5*1000*1000)/(7*24*60*60) = >8 MB per second of constant upload just for making sure the files I claim to store are actually available. If more than one client actually wants to check a piece (I'd want estimates from at least a handful of clients of how trustworthy a random storage node is in their experience) you'd have to multiply that, so a 100 MBit line isn't even enough just to make the data available.

This will definitely NOT be "dirt cheap"!
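For comparison, the full-chunk version of that check works out as follows (figures taken from the post itself; this is only the stated scenario, not a general claim):

```python
# Every 1 MB chunk of a 5 TB store downloaded whole once per week.
total_mb = 5 * 1000 * 1000            # 5 TB expressed in MB
week_seconds = 7 * 24 * 60 * 60

rate_mb_s = total_mb / week_seconds   # sustained upload rate

print(round(rate_mb_s, 1))            # ~8.3 MB/s, i.e. more than a 100 MBit line
```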

Also you (or NCs) have to trust other clients - then you can trust the storage node itself too, because for all you know the operator of a storage node could also operate lots of sock puppet client nodes that praise him for his great reliability.
full member
Activity: 222
Merit: 100
www.btcbuy.info
July 02, 2012, 12:12:18 AM
Simple, just download directly from the nodes, and meter how much data comes in from that particular connection.
So all you'd need to do is choke all non-NC nodes and profit.

As said, the solution from coga is nice for ensuring somebody does indeed store a certain piece of data and keeps it on his/her HDD. It does NOT ensure that this data can be quickly accessed, or accessed at all, should the need arise. If I get paid only 50% (I won't get any tips) just for running a HDD at ~10 W but with no outbound bandwidth cost at all, I'd be fine with that...

Compared to non-monetized P2P systems (e.g. BitTorrent's tit-for-tat) there is a much higher possibility for fraud and a higher motivation and incentive to game the system (something that only pays out on bandwidth would mean that nodes only share very popular file pieces and delete them soon afterwards, when few or no leechers want them any more).

The problem of nodes not returning actual data would be resolved using the following method: the Client would be responsible for monitoring his data for availability, randomly downloading his chunks. NCs would be getting statistics from Clients and aggregating them, so each node would have a percent-availability rating. When a Client requests an allocation, it would specify not only the price, but also the minimum availability rating. So if you are available less than 10% of the time, you probably will not get much business, and if you are even able to sell ANY space, it would be for a very low price. Probably another statistical value to collect is how many times a node refused a block for which it correctly calculated the hash.
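The aggregation step described above might look something like this (a hypothetical sketch; the report format and node IDs are illustrative assumptions):

```python
from collections import defaultdict

def availability(reports):
    """Aggregate client probe reports into a per-node availability rating.
    reports: list of (node_id, succeeded) pairs, one per random-download probe."""
    tally = defaultdict(lambda: [0, 0])   # node_id -> [successes, total probes]
    for node, ok in reports:
        tally[node][0] += int(ok)
        tally[node][1] += 1
    return {n: ok / total for n, (ok, total) in tally.items()}

ratings = availability([("nodeA", True), ("nodeA", True),
                        ("nodeA", False), ("nodeB", True)])
# A client requesting an allocation could then filter on a minimum rating,
# e.g. only offer contracts to nodes with ratings[n] >= 0.9.
print(ratings)
```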
legendary
Activity: 2618
Merit: 1007
July 01, 2012, 08:03:45 AM
Simple, just download directly from the nodes, and meter how much data comes in from that particular connection.
So all you'd need to do is choke all non-NC nodes and profit.

As said, the solution from coga is nice for ensuring somebody does indeed store a certain piece of data and keeps it on his/her HDD. It does NOT ensure that this data can be quickly accessed, or accessed at all, should the need arise. If I get paid only 50% (I won't get any tips) just for running a HDD at ~10 W but with no outbound bandwidth cost at all, I'd be fine with that...

Compared to non-monetized P2P systems (e.g. BitTorrent's tit-for-tat) there is a much higher possibility for fraud and a higher motivation and incentive to game the system (something that only pays out on bandwidth would mean that nodes only share very popular file pieces and delete them soon afterwards, when few or no leechers want them any more).
legendary
Activity: 1400
Merit: 1005
June 27, 2012, 05:46:25 PM
Sarge: Sorry, nope, unfortunately it doesn't make it any cheaper. Basically we do what you stated, we take a 4U space and put a shelf adapter in it and it sits there, so it's more or less using up the same space as a 4U Smiley

Sorry, but it would be the same price.  But, PM me, let me know around what price range you were thinking, maybe we can work something out close to that.
Got it, thanks.  Wink
sr. member
Activity: 298
Merit: 252
June 27, 2012, 05:30:38 PM
Sarge: Sorry, nope, unfortunately it doesn't make it any cheaper. Basically we do what you stated, we take a 4U space and put a shelf adapter in it and it sits there, so it's more or less using up the same space as a 4U Smiley

Sorry, but it would be the same price.  But, PM me, let me know around what price range you were thinking, maybe we can work something out close to that.