Didn't notice that - just learned of Storj.
That article though - it jumps straight from storage pricing into comparing storage pricing against transfer pricing? What?!?
That doesn't make sense - since when did transit capacity become storage capacity?
Apples to Oranges.
Now, the going rate per TiB per month at the low end of the current dedicated server market is about 7.5€ for low-end servers, and around 6.2€ for very large nodes.
Hence, 100 GiB costs somewhere around 0.6-0.75€ a month when using the most cost-effective offers currently out there.
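Just to make the unit conversion explicit, a quick back-of-the-envelope in Python (the per-TiB rates are the figures quoted above, not something I have re-verified):

```python
# Convert the quoted dedicated-market rates (EUR per TiB per month)
# into the cost of hosting 100 GiB for a month.
rates_per_tib = {"low-end servers": 7.5, "very large nodes": 6.2}

for name, rate in rates_per_tib.items():
    per_100_gib = rate * 100 / 1024   # 100 GiB is 100/1024 of a TiB
    print(f"{name}: {per_100_gib:.2f} EUR / 100 GiB / month")
# -> roughly 0.73 and 0.61 EUR respectively
```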
And for someone like me who has his own DC... Well, it lands on the CAPEX rather than the OPEX side of the sheet, but converted to OPEX it's a fraction of that.
It depends on what kind of deals you get, what bulk pricing you can access, whether you buy outright or lease or rent, what level of hardware you are utilizing, etc. If you buy really big and seriously skimp on hardware quality, we are talking potentially under 1.5€/TiB/Mo, plus bandwidth fees.
So approximately 0.15€ per 100 GiB/Mo with current HDD pricing, at cost, for a highly efficient operation.
So, on a system like this, we could fairly easily see pricing fall to about quadruple that, and once storage providers start to outnumber the user base, we should see it drop to just a 25-30% markup.
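To put those markup levels in numbers - a tiny sketch, using the ~0.15€/100GiB/Mo at-cost figure above as the baseline (an estimate, not a measured cost):

```python
at_cost = 0.15  # EUR per 100 GiB per month, the at-cost estimate above

print(f"~4x at-cost:   {4 * at_cost:.2f} EUR / 100 GiB / month")
print(f"25-30% markup: {1.25 * at_cost:.2f}-{1.30 * at_cost:.2f} EUR / 100 GiB / month")
# -> about 0.60 EUR early on, falling towards 0.19-0.20 EUR once the
#    supply of storage providers outgrows demand
```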
If the average lifetime of data is 1 year and it gets read 4 times during that time, each stored byte effectively moves over the wire once every 3 months, hence with 1Gbps you can host about 790TiB. A 1Gbps-connected server never really achieves more than 85-95% of the nominal rate - in fact, no link does, since around 5% is lost to protocol overhead and error checking alone.
The true ratios of reads, cold-data timespans, etc. will only be revealed in production; they are impossible to predict accurately.
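Here's a rough sketch of where that ~790 TiB figure comes from; the 1-year lifetime, 4 reads per lifetime, and ~90% usable link efficiency are exactly the assumptions stated above, not measurements:

```python
# How much data a single 1 Gbps link can plausibly serve,
# given the assumed access pattern above.
link_bps = 1e9              # nominal link speed
efficiency = 0.90           # realistic fraction of nominal throughput
lifetime_days = 365         # average lifetime of stored data
reads_per_lifetime = 4      # times each byte is read during that lifetime

# Each stored byte crosses the wire roughly once per (lifetime / reads).
turnover_seconds = lifetime_days / reads_per_lifetime * 86400
usable_bytes_per_sec = link_bps / 8 * efficiency      # ~112.5 MB/s

hostable_tib = usable_bytes_per_sec * turnover_seconds / 2**40
print(f"Hostable per 1 Gbps link: {hostable_tib:.0f} TiB")
# -> roughly 800 TiB, the same ballpark as the ~790 TiB above
```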
What kind of erasure coding is in the plans?
Further, deduplication on a GRAND scale would be worth it on something like this - on blocks/files larger than 1000Mb or so, so that the hash tables stay somewhat sanely sized - but that can be worked in later on. Just saying, deduplication along with erasure coding in this kind of system is not only a GREAT idea, it's almost a necessity. The dedup reference count could also drive the redundancy factor for that data, dynamically making data that more people are interested in more resilient to failures, and the cost could be shared by all the users who want to store that data - driving the cost down dramatically for anyone putting in data others have already put in.
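A minimal sketch of that idea, just to make it concrete: content-addressed chunks stored once, with a redundancy factor that grows with the dedup reference count and a cost split between all the owners. Every name and threshold here is made up for illustration - nothing below reflects Storj's actual design:

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class ChunkRecord:
    size: int                                  # stored chunk size in bytes
    owners: set = field(default_factory=set)   # users referencing this chunk

class DedupStore:
    """Toy content-addressed index: identical chunks are stored once,
    popular chunks earn extra redundancy, and cost is shared by owners."""

    def __init__(self):
        self.chunks = {}                       # sha256 hex digest -> ChunkRecord

    def put(self, owner: str, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        rec = self.chunks.setdefault(key, ChunkRecord(size=len(data)))
        rec.owners.add(owner)
        return key

    def redundancy_factor(self, key: str) -> int:
        """Made-up policy: base redundancy of 3, plus one extra copy/parity
        group per 10 distinct owners, capped at 12."""
        refs = len(self.chunks[key].owners)
        return min(12, 3 + refs // 10)

    def cost_share_bytes(self, key: str) -> float:
        """Bytes billed to each owner: raw size x redundancy, split evenly."""
        rec = self.chunks[key]
        return rec.size * self.redundancy_factor(key) / len(rec.owners)
```

The second user to upload an identical chunk pays half of the (now slightly more redundant) storage bill, the hundredth pays a hundredth of it - exactly the "cost shared by all those users" effect described above.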
Those dedup tables will still consume an INSANE amount of storage - but hey, that's exactly what is abundantly available here.
It would still work rather fast with nicely optimized lookup trees - split those hashes up!
A single lookup would then consume only a couple of I/Os; it needs some careful thought to figure out the actual I/O count and to minimize it.
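A rough sketch of the "split those hashes up" idea: shard the index by the first bytes of the digest, so one lookup means reading one small shard file plus (at most) one chunk-metadata record - a couple of I/Os. The directory layout, prefix length, and record format here are arbitrary choices for illustration:

```python
import hashlib
import os

INDEX_DIR = "/var/dedup-index"   # hypothetical location of the on-disk index
PREFIX_HEX_CHARS = 4             # 4 hex chars -> 65,536 shard files
RECORD_SIZE = 32                 # one raw sha256 digest per record in this toy format

def shard_path(digest: bytes) -> str:
    """Pick the shard file from the hash prefix (I/O target #1)."""
    prefix = digest.hex()[:PREFIX_HEX_CHARS]
    return os.path.join(INDEX_DIR, prefix[:2], prefix + ".idx")

def chunk_exists(data: bytes) -> bool:
    """Look up a chunk: read its shard file and scan for the digest.
    A second I/O would fetch the chunk's metadata record if it is found."""
    digest = hashlib.sha256(data).digest()
    path = shard_path(digest)
    if not os.path.exists(path):
        return False
    with open(path, "rb") as f:
        while True:
            rec = f.read(RECORD_SIZE)
            if len(rec) < RECORD_SIZE:
                return False
            if rec == digest:
                return True
```

Keeping each shard sorted (or swapping the linear scan for an in-shard B-tree) would bound the lookup to a handful of reads even when the shards grow large.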
Just because it's cloud doesn't mean resources ought to be wasted.
EDIT: Oh yeah, and I'm sorry to say, but reading that article gave the impression that the writer doesn't know sh** about industrial-scale computing - at the very least explain why you are comparing apples to oranges - and the assumption that 1Gbps can actually deliver 1Gbps shows the writer has no idea about network technology. In my ~15 years in the hosting industry I've probably never seen above 120MB/s on a 1Gig link, and over the internet never above 116MB/s, and even that is a freakishly rare occurrence; an average node on an average network commonly stalls at 95-105MB/s.
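For anyone wondering where that ceiling comes from, the framing overhead alone explains most of it - a quick calculation with standard Ethernet/TCP numbers, assuming a 1500-byte MTU and TCP timestamps:

```python
# Why a "1 Gbps" link never delivers 125 MB/s of useful payload.
LINE_RATE_BYTES = 1_000_000_000 / 8      # 125,000,000 bytes/s on the wire

MTU = 1500                               # typical Ethernet MTU
ETH_OVERHEAD = 14 + 4 + 8 + 12           # header + FCS + preamble + inter-frame gap
IP_TCP_HEADERS = 20 + 20 + 12            # IPv4 + TCP + timestamp option

wire_bytes_per_frame = MTU + ETH_OVERHEAD        # 1538 bytes on the wire
payload_per_frame = MTU - IP_TCP_HEADERS         # 1448 bytes of actual data

goodput = LINE_RATE_BYTES * payload_per_frame / wire_bytes_per_frame
print(f"Theoretical TCP goodput: {goodput / 1e6:.1f} MB/s")
# -> ~117.7 MB/s before any retransmissions, congestion, or disk limits,
#    which lines up with the 116-120 MB/s ceiling seen in practice.
```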