
Topic: Solution to Sia/Storj/etc DDOS issues and Sybil Vulnerability (Read 3240 times)

sr. member
Activity: 336
Merit: 265
I am very sleepy, so please excuse the poor quality prose that follows...

I realized today that there is an important distinction to make between the reason I originally dismissed my proof-of-diskspace concept in 2013 and its applicability to personal storage, which was not the application I dismissed.

I dismissed it as a means of proving a decentralized resource for participating in the consensus algorithm, because there was no way to produce the encrypted data variants needed to prevent the Sybil attack: encrypting variants of the same data would have to be done in public, i.e. no effective encryption was possible.

Whereas with personal storage, the owner of the file can indeed provide multiple encrypted variants so as to ensure those copies are stored redundantly.

So although I dismissed its applicability to blockchain consensus, it could still remain a valid concept for storage.

And thus there is no real innovation here on the consensus problems of blockchains, nor are the tokens of Sia, MaidSafe, and Storj going to have any value. To the extent that proof-of-storage/retrievability helps make a better enterprise cloud hosting system, it will not sustain a token + blockchain by itself. The payment system and blockchain for such cloud hosting would still be whichever payment system and blockchain wins overall (and that is going to require innovation in blockchain technology).

Also, as I pointed out before, P2P cloud storage would be dominated by the highest economy-of-scale vendors, not by a system running on the storage of home computers connected over consumer Internet connections.

Edit: more on that here:

In addition to my prior critique of "proof-of-storage", I see some additional flaws in the idea expressed as quoted below:

Coins are issued by the network based on the following formula:

- 1 coin = 1 GB hosted for 1 month.
- Any downtime (detected by pinging) reduces profit 10x (i.e., if your mining machine is down for 1 day, you lose 10 days' worth of profit for that uptime month).
- 100% of your "storage" has to be downloadable by the network within 1 hour, tested by the network randomly 4 times per month (uptime month of 30 days, not calendar). If you fail this test, your profits over this period are reduced by double the amount of the failed download; e.g., if you are hosting (mining with) 4 GB of space and a random download attempt retrieves only 90% of the 4 GB, then your profits are reduced by 20% until the next random download test.
- When you start mining, you do not receive profit for the first uptime week of 7 days (this is to stop people that had some downtime from simply creating a new miner on a new wallet straight away).
- Ping checks are performed every 15 minutes; you need to fail 2 to be considered "down". Thus you can install an update and restart without "downtime".
- Miners are also rewarded the transaction fees of the network, spread evenly among miners based on earnings.
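
For concreteness, the arithmetic of the quoted schedule can be sketched in a few lines of Python (a minimal illustration only; the function and parameter names are mine, not part of the proposal):

    def monthly_reward(gb_hosted, days_down, fraction_retrieved):
        # 1 coin per GB hosted per 30-day uptime month.
        base = gb_hosted * 1.0
        # Each day of downtime forfeits 10 days' worth of profit.
        downtime_penalty = days_down * 10 * (base / 30.0)
        # A failed download test cuts profit by double the missing fraction,
        # e.g. retrieving only 90% of the data reduces profit by 20%.
        retrieval_penalty = base * 2.0 * (1.0 - fraction_retrieved)
        return max(0.0, base - downtime_penalty - retrieval_penalty)

    print(monthly_reward(4, 0, 0.90))  # the quoted 4 GB example -> 3.2 coins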

None of the quoted conditions is objectively provable to the public-at-large, i.e. on a blockchain. For example, proof-of-storage can work from the perspective of the owner of the data to be stored, but not from a public perspective.

Network performance can't be proven. This is one of the fundamental reasons we have to deal with Byzantine fault tolerance on networks. How do you prove to a blockchain that the ping time you measured was accurate? You can't. How do you prove downtime? You can't. If you propose voting, then you have Sybil attacks on the voting. Byzantine agreement can't remain unstuck without a hardfork or whales. Etc.

Sorry, this is entirely impossible. It violates the basic research and the fundamentals. Much more detail is in my unpublished white paper, wherein I start from first principles and try to explain these fundamentals (but it ends up being far too much to summarize for laymen, so I don't know whether that version of the white paper will be the one I end up publishing).

So this is what I mean by my criticism that proof-of-storage can't really work well even for file storage in the Storj model, where each user encrypts the data to be stored (in multiple variants), because it is impossible to ensure fungible performance for data retrievability.
sr. member
Activity: 336
Merit: 265
...but proper erasure encoding can make it many orders of magnitude more resistant.

Even on disk failure, some sectors can often be recovered with forensics. But then you need the storage providers' reputation at risk, so they have the economic incentive to pay for that forensics. Again it seems the Sybil attack is the problem, because they can blame the failure on a disposable Sybil.
The issue is not a full solution to a Sybil attack, just Sybil resistance. When you get down to something like 16 failures per trillion, it's not really an issue. Even Amazon S3 has 15x more failures than that.

If an attacker has to store 10% of the entire network but only has a 1.67e-11 chance of affecting a file, I'd say that is good enough. Worst case, you can start adding economic incentives and disincentives. Look up the attacker's funds/earnings on the blockchain for 3 months: "Oops, we lost your one file out of 1 trillion. Here is $10k taken from the attacker."

My original criticism was: what have we accomplished that I couldn't just buy from Google's cloud?

Because a P2P network can outperform Google's cloud at half the cost. If someone offered you a new car that goes 4x faster at half the cost, would you still want to stick with your own car?

How do you calculate that? Google can locate its servers next to hydropower and pay 4 cents per kWh. They have the economies-of-scale to buy hardware more cheaply and to build the infrastructure for data centers. They can locate on the faster Tier 1 backbone Internet.

How can the average individual provide storage that competes?

Seems to me you will just build a system that Google can Sybil attack and provide all the storage for, increasing their profits and economies-of-scale.

The only possible way I can see to prevent this is to never pay for storage, but rather only to swap storage for storage. In other words, if I store 500 GB from the network, then I can also store 500 GB on the network. But then the problem is the economics of accessing the data. Isn't this similar to what MaidSafe is doing? But then how did they corrupt that with a token to raise an ICO (there is no use for a token if storage is swapped peer-to-peer)?

To deal with the economics of access, I think the data one stores for the network would need to have the same access rate pattern as the data one stores on the network. The network needs to institute this policy.
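
A minimal sketch of what such a reciprocity policy might check, in Python (all names, fields, and the 2x rate tolerance are hypothetical, purely to illustrate the idea):

    from dataclasses import dataclass

    @dataclass
    class Node:
        bytes_hosted: int        # storage this node provides to the network
        bytes_consumed: int      # storage this node uses on the network
        reads_served: float      # daily access rate on the data it hosts
        reads_requested: float   # daily access rate on its remotely stored data

    def may_consume_more(n: Node) -> bool:
        # Swap storage for storage: consume no more than you provide, and
        # serve roughly the access rate you impose on others.
        return (n.bytes_consumed <= n.bytes_hosted
                and n.reads_served >= 0.5 * n.reads_requested)

    print(may_consume_more(Node(500, 500, 10.0, 8.0)))  # True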
legendary
Activity: 1094
Merit: 1006
...but proper erasure encoding can make it many orders of magnitude more resistant.

Even on disk failure, some sectors can often be recovered with forensics. But then you need the storage providers' reputation at risk, so they have the economic incentive to pay for that forensics. Again it seems the Sybil attack is the problem, because they can blame the failure on a disposable Sybil.
The issue is not a full solution to a Sybil attack, just Sybil resistance. When you get down to something like 16 failures per trillion, it's not really an issue. Even Amazon S3 has 15x more failures than that.

If an attacker has to store 10% of the entire network but only has a 1.67e-11 chance of affecting a file, I'd say that is good enough. Worst case, you can start adding economic incentives and disincentives. Look up the attacker's funds/earnings on the blockchain for 3 months: "Oops, we lost your one file out of 1 trillion. Here is $10k taken from the attacker."

My original criticism was: what have we accomplished that I couldn't just buy from Google's cloud?
Because a P2P network can outperform Google's cloud at half the cost. If someone offered you a new car that goes 4x faster at half the cost, would you still want to stick with your own car?
sr. member
Activity: 336
Merit: 265
...but proper erasure encoding can make it many orders of magnitude more resistant.

Even on disk failure, some sectors can often be recovered with forensics. But then you need the storage providers' reputation at risk, so they have the economic incentive to pay for that forensics. Again it seems the Sybil attack is the problem, because they can blame the failure on a disposable Sybil.
The issue is not a full solution to a Sybil attack, just Sybil resistance. When you get down to something like 16 failures per trillion, it's not really an issue. Even Amazon S3 has 15x more failures than that.

If an attacker has to store 10% of the entire network but only has a 1.67e-11 chance of affecting a file, I'd say that is good enough. Worst case, you can start adding economic incentives and disincentives. Look up the attacker's funds/earnings on the blockchain for 3 months: "Oops, we lost your one file out of 1 trillion. Here is $10k taken from the attacker."

My original criticism was: what have we accomplished that I couldn't just buy from Google's cloud?
legendary
Activity: 1094
Merit: 1006
...but proper erasure encoding can make it many orders of magnitude more resistant.

Even on disk failure, some sectors can often be recovered with forensics. But then you need the storage providers' reputation at risk, so they have the economic incentive to pay for that forensics. Again it seems the Sybil attack is the problem, because they can blame the failure on a disposable Sybil.
The issue is not a full solution to a Sybil attack, just Sybil resistance. When you get down to something like 16 failures per trillion, it's not really an issue. Even Amazon S3 has 15x more failures than that.

If an attacker has to store 10% of the entire network but only has a 1.67e-11 chance of affecting a file, I'd say that is good enough. Worst case, you can start adding economic incentives and disincentives. Look up the attacker's funds/earnings on the blockchain for 3 months: "Oops, we lost your one file out of 1 trillion. Here is $10k taken from the attacker."
sr. member
Activity: 336
Merit: 265
...but proper erasure encoding can make it many orders of magnitude more resistant.

Even on disk failure, some sectors can often be recovered with forensics. But then you need the storage providers' reputation at risk, so they have the economic incentive to pay for that forensics. Again it seems the Sybil attack is the problem, because they can blame the failure on a disposable Sybil.
legendary
Activity: 1094
Merit: 1006
Wait a minute, how did you handle redundancy in your old solution? Did you do something like 3x redundancy in case some of the nodes went down?

I didn't develop (pursue) the idea beyond the conceptual investigation phase, because I determined that it wasn't a solid enough direction to pursue.

The idealism of it appeals to me of course. But I've also learned to be very skeptical of idealistic causes, because they can be intoxicating and cloud objectivity.

I am obviously going to be more circumspect about dubious project technologies, given my age. I don't have another decade to expend on something that does not pan out.

Everything starts as an idea. Do you believe in the idea of distribution and decentralization? It all falls apart if we can't get our data out of centralized data centers. What good is a decentralized application if it's just run on Amazon S3?

Of course I do.

I'll paradigm-shift you. We can decentralize our servers. Abstractly, I am thinking the fundamental error in decentralized file stores such as these is that we are modelling a monolith, i.e. a total order on redundancy. Paradigm-shift to a plurality of partial orders.

Btw, I like the name Storj.

Thanks, it's taken with permission from this post: https://bitcointalksearch.org/topic/m.642768

Kudos to Gmaxwell on the name then.

I have an idea.

What about a different approach to achieving redundancy?

Redundancy is fundamentally about making sure our data is stored on more than one hard disk.

If we could disperse the bits of the data across terabytes of data, then the host actually has no incentive to cheat, as the host can use RAID striping to maximize performance.

So then we probably need a blockchain to manage this coordination.
The best way to do this is through Reed-Solomon erasure encoding. It's kind of like a mini RAID per file. For example, we use 20-of-40: the file is broken into 40 pieces, of which you need any 20 to recover it. Like you said, you can't solve Sybils, but proper erasure encoding can make it many orders of magnitude more resistant.
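
A quick back-of-the-envelope check of why 20-of-40 erasure coding is so much more loss-resistant than plain replication (a sketch assuming an illustrative 10% independent per-node failure rate, not a measured figure):

    from scipy.stats import binom

    k, n = 20, 40      # any 20 of the 40 pieces recover the file
    p_fail = 0.10      # assumed independent per-node failure probability

    # The file is lost only if more than n - k = 20 pieces fail.
    p_loss_erasure = binom.sf(n - k, n, p_fail)   # ~2e-11
    # Compare plain 3x replication: lost only if all 3 copies fail.
    p_loss_3x = p_fail ** 3                       # 1e-3

Under these assumptions, the erasure-coded file is nearly eight orders of magnitude more durable than three full copies, while storing only 2x the data instead of 3x.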
sr. member
Activity: 336
Merit: 265
Wait a minute, how did you handle redundancy in your old solution? Did you do something like 3x redundancy in case some of the nodes went down?

I didn't develop (pursue) the idea beyond the conceptual investigation phase, because I determined that it wasn't a solid enough direction to pursue.

The idealism of it appeals to me of course. But I've also learned to be very skeptical of idealistic causes, because they can be intoxicating and cloud objectivity.

I am obviously going to be more circumspect about dubious project technologies, given my age. I don't have another decade to expend on something that does not pan out.

Everything starts as an idea. Do you believe in the idea of distribution and decentralization? It all falls apart if we can't get our data out of centralized data centers. What good is a decentralized application if it's just run on Amazon S3?

Of course I do.

I'll paradigm-shift you. We can decentralize our servers. Abstractly, I am thinking the fundamental error in decentralized file stores such as these is that we are modelling a monolith, i.e. a total order on redundancy. Paradigm-shift to a plurality of partial orders.

Btw, I like the name Storj.

Thanks, it's taken with permission from this post: https://bitcointalksearch.org/topic/m.642768

Kudos to Gmaxwell on the name then.

I have an idea.

What about a different approach to achieving redundancy?

Redundancy is fundamentally about making sure our data is stored on more than one hard disk.

If we could disperse the bits of the data across terabytes of data, then the host actually has no incentive to cheat, as the host can use RAID striping to maximize performance.

So then we probably need a blockchain to manage this coordination.
legendary
Activity: 1094
Merit: 1006
Wait a minute, how did you handle redundancy in your old solution? Did you do something like 3x redundancy in case some of the nodes went down?

I didn't develop (pursue) the idea beyond the conceptual investigation phase, because I determined that it wasn't a solid enough direction to pursue.

The idealism of it appeals to me of course. But I've also learned to be very skeptical of idealistic causes, because they can be intoxicating and cloud objectivity.

I am obviously going to be more circumspect about dubious project technologies, given my age. I don't have another decade to expend on something that does not pan out.

Everything starts as an idea. Do you believe in the idea of distribution and decentralization? It all falls apart if we can't get our data out of centralized data centers. What good is a decentralized application if it's just run on Amazon S3?

Of course I do.

I'll paradigm-shift you. We can decentralize our servers. Abstractly, I am thinking the fundamental error in decentralized file stores such as these is that we are modelling a monolith, i.e. a total order on redundancy. Paradigm-shift to a plurality of partial orders.

Btw, I like the name Storj.
Thanks, it's taken with permission from this post: https://bitcointalksearch.org/topic/m.642768
sr. member
Activity: 336
Merit: 265
Wait a minute, how did you handle redundancy in your old solution? Did you do something like 3x redundancy in case some of the nodes went down?

I didn't develop (pursue) the idea beyond the conceptual investigation phase, because I determined that it wasn't a solid enough direction to pursue.

The idealism of it appeals to me of course. But I've also learned to be very skeptical of idealistic causes, because they can be intoxicating and cloud objectivity.

I am obviously going to be more circumspect about dubious project technologies, given my age. I don't have another decade to expend on something that does not pan out.

Everything starts as an idea. Do you believe in the idea of distribution and decentralization? It all falls apart if we can't get our data out of centralized data centers. What good is a decentralized application if it's just run on Amazon S3?

Of course I do.

I'll paradigm-shift you. We can decentralize our servers. Abstractly, I am thinking the fundamental error in decentralized file stores such as these is that we are modelling a monolith, i.e. a total order on redundancy. Paradigm-shift to a plurality of partial orders.

Btw, I like the name Storj.
legendary
Activity: 1094
Merit: 1006
Wait a minute, how did you handle redundancy in your old solution? Did you do something like 3x redundancy in case some of the nodes went down?

I didn't develop (pursue) the idea beyond the conceptual investigation phase, because I determined that it wasn't a solid enough direction to pursue.

The idealism of it appeals to me of course. But I've also learned to be very skeptical of idealistic causes, because they can be intoxicating and cloud objectivity.

I am obviously going to be more circumspect about dubious project technologies, given my age. I don't have another decade to expend on something that does not pan out.
Everything starts as an idea. Do you believe in the idea of distribution and decentralization? It all falls apart if we can't get our data out of centralized data centers. What good is a decentralized application if it's just run on Amazon S3?
sr. member
Activity: 336
Merit: 265
Wait a minute, how did you handle redundancy in your old solution? Did you do something like 3x redundancy in case some of the nodes went down?

I didn't develop (pursue) the idea beyond the conceptual investigation phase, because I determined that it wasn't a solid enough direction to pursue.

The idealism of it appeals to me of course. But I've also learned to be very skeptical of idealistic causes, because they can be intoxicating and cloud objectivity.

I am obviously going to be more circumspect about dubious project technologies, given my age. I don't have another decade to expend on something that does not pan out.

The problem with Monero is that it is not used much as a currency (https://getmonero.org/getting-started/merchants); the same thing happened to Peercoin, which is hardly used for anything besides speculation.

It is hardly profitable for me to mine it despite having 2x R9 290s, and it is likely to get even worse as the block reward continues to decrease; I am afraid the coin will end up in the hands of botnets.

Not used much as a currency? Did you miss all the posts where people are talking about buying things with XMR.to? In short, in that sense every shop that accepts Bitcoin also accepts Monero.

That was the main marketing innovation I saw from Monero's ecosystem. Someone even used it once to fund me.

It is befitting that ShapeShift.io copied you, given that one of the threats Monero's people used to make, whenever I would explain that I wanted to work on my own experiments, was that, being open source, they could just copy anything that was valuable.

Btw, I was pitching the conceptual idea of XMR.to back in 2013 on BCT. It was one of the rebuttals I had for the Bitcoin maximalists. And yet again one of my ideas becomes a blockbuster success. You think I don't have a lot more of those ideas in my back pocket?
legendary
Activity: 1094
Merit: 1006
Wait a minute, how did you handle redundancy in your old solution? Did you do something like 3x redundancy in case some of the nodes went down?
sr. member
Activity: 336
Merit: 265
Edit: after writing the following, I took a minute to review super3's GitHub page, and I see he is a young man who recently graduated and is working on his M.S. I don't wish to discourage him from working on projects & ideas he is passionate about, and I feel empathetic if what I have explained to him is disappointing. Being 51 years old, I have a bit more experience than he does, so I am more apt to see where a technical idea is fundamentally flawed. I wish him all the best, and it is unfortunate that we got entangled on this issue.


"That a regular user has nearer to parity in terms of economies-of-scale cost efficiencies is irrelevant to the game theory economics that forces centralization in this case."
tldr; "I'm just going to ignore the economics because it doesn't help my case."

Liar. It is sleazy to use argumentation that tries to spin what I wrote. State your economic argument instead.

Your tldr; is incorrect; my point was that it is irrelevant because of the following...


Let me put it in simpler terms for you. Google will simply take all your business (hiding behind Sybils, with their vast economies-of-scale near hydropower, etc.) and say thank you very much. You will have accomplished nothing.

"More importantly, the pool can then lie and create numerous Sybils so that it can be paid multiple times for storing the same data only once. So Storj loses redundancy."
I missed the part where the pool found all the data, reassembled it in the right order, brute-force decrypted it, and de-duplicated it.

The pool just routes the requests. It is just a pass-through mechanism. You seem to keep forgetting that latency is not a viable proof of Sybils. I made that point very early in this thread.

"Selecting on performance locks out the little guy, thus driving centralization. IPs can be created that add no appreciable latency."
Nope, because the centralized farmer (3 states away) is not going to beat the little farmer that happens to be 3 blocks away.

Nonsense for several reasons (which I will let you figure out*). Latency is a non-starter. Have fun with your clusterfuck.

* One hint is that a multi-furcating network is always more efficient than a mesh network. That is why we won't run a water main directly to each household.


P.S. If you continue to resort to sleazy tactics of discussion, I will just ignore you and let you go on, Dunning-Kruger confident, to your foolish failure.
legendary
Activity: 1094
Merit: 1006
"That a regular user has nearer to parity in terms of economies-of-scale cost efficiencies is irrelevant to the game theory economics that forces centralization in this case."
tldr; "I'm just going to ignore the economics because it doesn't help my case."

"More importantly, the pool can then lie and create numerous Sybils so that it can be paid multiple times for storing the same data only once. So Storj loses redundancy."
I missed the part where the pool found all the data, reassembled it in the right order, brute-force decrypted it, and de-duplicated it.

"Selecting on performance locks out the little guy, thus driving centralization. IPs can be created that add no appreciable latency."
Nope, because the centralized farmer (3 states away) is not going to beat the little farmer that happens to be 3 blocks away.

sr. member
Activity: 336
Merit: 265
There is no proof; it comes down to statistics. It boils down to a hypergeometric distribution, which you can find the math for here. For an example case, you would have to store 80% of the entire network to get a 50% chance of the attack (for one file). Bitcoin has much lower tolerances and breaks down at around 51%.

So the attack you mentioned is valid, but ignores the math behind it. It goes from being a difficult attack to pull off to being statistically improbable. You can post as much red text as you like, but only a hypergeometric distribution showing me otherwise is acceptable.

Your system is paying the "attacker" for as many free Sybils as he/she can create, in order to fund as much storage as needed to store as much data as your system wants to pay for. One can rent all the storage one needs, e.g. from Google's cloud.

Since Sybils (IP addresses) are virtually free to create, competitors are in an arms race to see who can capture more of the revenue. This is a power vacuum, which means there can be only one winner (winner-take-all), because of economies-of-scale. The little guy gets squeezed and hardly ever gets paid, analogous to ASIC mining in Bitcoin, where the more hashrate share a miner has, the more profitable they are. So expect pools to be created in which nodes share their hard disks, so they can win more challenge-responses and earn more with less variance.

It is analogous to proof-of-work, in that it leads entirely to centralization.

I hope I don't have to write it in red text for it not to be ignored.

But you are forgetting that using hard drives doesn't scale like mining does. You don't have 14nm mining chips inside every single computer.

That a regular user has nearer to parity in terms of economies-of-scale cost efficiencies is irrelevant to the game theory economics that forces centralization in this case.

Nodes don't have an incentive to pool; they should make more or less the same whether they are "solo-mining" or in a pool.

Incorrect, due to variance, the same as for Bitcoin mining. That is, unless every piece of data stored on the network will be accessed every day, in which case you wouldn't even need a challenge-and-response proof, because reading the data would be the proof.
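
For concreteness, the variance point mirrors why Bitcoin miners pool. A minimal sketch (the challenge count and storage shares are made-up illustrative numbers):

    import math

    # Assume each challenge pays 1 unit and lands on a provider with
    # probability proportional to its share of network storage.
    challenges = 10_000  # per month, illustrative

    def payout(share):
        mean = challenges * share                        # expected payout
        sd = math.sqrt(challenges * share * (1 - share))
        return mean, sd / mean                           # coefficient of variation

    print(payout(0.0001))  # tiny solo node: CV ~ 1.0, i.e. feast or famine
    print(payout(0.05))    # pooled share:   CV ~ 0.04, steady income

The small provider's income swings as much as its mean; pooling smooths it, exactly the force that centralizes Bitcoin mining.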

More importantly, the pool can then lie and create numerous Sybils so that it can be paid multiple times for storing the same data only once. So Storj loses redundancy.

Sure, you can spin up as many IPs as you want, but that doesn't mean you will get better performance than other nodes, which is how farmers are actually selected to store data. Plus, if you add a deposit to that, spinning up those IPs is not free anymore.

Selecting on performance locks out the little guy, thus driving centralization. IPs can be created that add no appreciable latency.

Deposits are amortized over all the profits, so just like proof-of-stake they are not an improvement.

Sorry, these issues are fundamental. This is the reason I abandoned the idea in 2013, when I was the first person to invent it. Now you copy the broken idea that I invented. Sigh.
(And you make a lot of money selling a broken concept to n00bs, which I was unwilling to do because I have ethics.)
legendary
Activity: 1094
Merit: 1006
There is no proof; it comes down to statistics. It boils down to a hypergeometric distribution, which you can find the math for here. For an example case, you would have to store 80% of the entire network to get a 50% chance of the attack (for one file). Bitcoin has much lower tolerances and breaks down at around 51%.

So the attack you mentioned is valid, but ignores the math behind it. It goes from being a difficult attack to pull off to being statistically improbable. You can post as much red text as you like, but only a hypergeometric distribution showing me otherwise is acceptable.

Your system is paying the "attacker" for as many free Sybils as he/she can create, in order to fund as much storage as needed to store as much data as your system wants to pay for. One can rent all the storage one needs, e.g. from Google's cloud.

Since Sybils (IP addresses) are virtually free to create, competitors are in an arms race to see who can capture more of the revenue. This is a power vacuum, which means there can be only one winner (winner-take-all), because of economies-of-scale. The little guy gets squeezed and hardly ever gets paid, analogous to ASIC mining in Bitcoin, where the more hashrate share a miner has, the more profitable they are. So expect pools to be created in which nodes share their hard disks, so they can win more challenge-responses and earn more with less variance.

It is analogous to proof-of-work, in that it leads entirely to centralization.

I hope I don't have to write it in red text for it not to be ignored.
But you are forgetting that using hard drives doesn't scale like mining does. You don't have 14nm mining chips inside every single computer. Nodes don't have an incentive to pool; they should make more or less the same whether they are "solo-mining" or in a pool.

Sure, you can spin up as many IPs as you want, but that doesn't mean you will get better performance than other nodes, which is how farmers are actually selected to store data. Plus, if you add a deposit to that, spinning up those IPs is not free anymore.
sr. member
Activity: 336
Merit: 265
There is no proof; it comes down to statistics. It boils down to a hypergeometric distribution, which you can find the math for here. For an example case, you would have to store 80% of the entire network to get a 50% chance of the attack (for one file). Bitcoin has much lower tolerances and breaks down at around 51%.

So the attack you mentioned is valid, but ignores the math behind it. It goes from being a difficult attack to pull off to being statistically improbable. You can post as much red text as you like, but only a hypergeometric distribution showing me otherwise is acceptable.

Your system is paying the "attacker" for as many free Sybils as he/she can create, in order to fund as much storage as needed to store as much data as your system wants to pay for. One can rent all the storage one needs, e.g. from Google's cloud.

Since Sybils (IP addresses) are virtually free to create, competitors are in an arms race to see who can capture more of the revenue. This is a power vacuum, which means there can be only one winner (winner-take-all), because of economies-of-scale. The little guy gets squeezed and hardly ever gets paid, analogous to ASIC mining in Bitcoin, where the more hashrate share a miner has, the more profitable they are. So expect pools to be created in which nodes share their hard disks, so they can win more challenge-responses and earn more with less variance.

It is analogous to proof-of-work, in that it leads entirely to centralization.

I hope I don't have to write it in red text for it to not be ignored.
legendary
Activity: 1094
Merit: 1006
There is no proof; it comes down to statistics. It boils down to a hypergeometric distribution, which you can find the math for here. For an example case, you would have to store 80% of the entire network to get a 50% chance of the attack (for one file). Bitcoin has much lower tolerances and breaks down at around 51%.

So the attack you mentioned is valid, but ignores the math behind it. It goes from being a difficult attack to pull off to being statistically improbable. You can post as much red text as you like, but only a hypergeometric distribution showing me otherwise is acceptable.
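
The quoted hypergeometric argument is easy to sanity-check numerically. A minimal sketch using SciPy, assuming 20-of-40 erasure coding with pieces placed uniformly at random (the network size and all parameters here are illustrative assumptions, not Storj's published figures, so the exact probabilities will differ):

    from scipy.stats import hypergeom

    M = 10_000       # total storage slots in the network (illustrative)
    k, n = 20, 40    # 20-of-40 erasure coding, as discussed upthread

    for frac in (0.10, 0.50, 0.80):
        a = int(M * frac)                 # slots held by the attacker
        # The attacker censors a file only by holding more than
        # n - k = 20 of its 40 randomly placed pieces.
        p = hypergeom.sf(n - k, M, a, n)  # P(attacker holds > 20 pieces)
        print(f"{frac:.0%} of network -> {p:.2e} chance per file")

With these assumed numbers, an attacker holding 10% of the network gets roughly a 2e-11 chance per file, the same order of magnitude as the 1.67e-11 figure cited earlier in the thread.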
legendary
Activity: 2142
Merit: 1009
Newbie
The paper is nonsense.
