
Topic: Bitcoin addon: Distributed block chain storage (Read 3368 times)

sr. member
Activity: 280
Merit: 257
bluemeanie
I think things are going to move to an architecture where a limited set of nodes in the network manages the currency, and most account owners are light clients of some kind.
If you want this, then the Bitcoin block chain and protocol are the wrong design to achieve it. Services like Visa and PayPal are far better designed for serving many transactions from small clusters. They are more secure too: once you must trust a limited set of nodes not to cheat, protocols which cannot be compromised unless they do offer a better security model.

A fairly pointless comment.

Visa doesn't have any nodes. Can you participate in the authorization process of a Visa transaction?
staff
Activity: 4284
Merit: 8808
I think things are going to move to an architecture where a limited set of nodes in the network manages the currency, and most account owners are light clients of some kind.
If you want this, then the Bitcoin block chain and protocol are the wrong design to achieve it. Services like Visa and PayPal are far better designed for serving many transactions from small clusters. They are more secure too: once you must trust a limited set of nodes not to cheat, protocols which cannot be compromised unless they do offer a better security model.
donator
Activity: 2772
Merit: 1019
The problem is basically how to optimize the Distributed Hash Table algorithm for block storage.
Already solved by the Freenet project:

https://freenetproject.org

excuse me?

Maybe contact xelister and/or da2ce7. They were working on "bitcoin over freenet" two years ago.
sr. member
Activity: 280
Merit: 257
bluemeanie
The main point of a DHT implementation is to let a client with the full chain run even on cheap computers.

As long as you put the major load on a single node, the hardware requirements for that node grow along with the chain. This may lead to a situation where only the jet set can run a full chain. Everyone else will have to blindly trust their chain.

This is centralization IMO.


That's inevitable. The block chain is currently > 8 GB. I think things are going to move to an architecture where a limited set of nodes in the network manages the currency, and most account owners are light clients of some kind. It's simply not practical for everyone to have a full block chain. These distributed currencies are just getting started and Bitcoin is already unmanageable.

My vision is 1) no more proof of work 2) distributed rather than decentralized currencies.  This offers a lot of advantages.
hero member
Activity: 490
Merit: 500
The main point of a DHT implementation is to let a client with the full chain run even on cheap computers.

As long as you put the major load on a single node, the hardware requirements for that node grow along with the chain. This may lead to a situation where only the jet set can run a full chain. Everyone else will have to blindly trust their chain.

This is centralization IMO.
hero member
Activity: 490
Merit: 500
A DHT will probably let us go the Google way.

Google always planned never to put the major load on a single node (as bitcoind does now). The idea sounds like: "Stay away from single expensive mainframes. None of them will ever handle the load of our tasks. Use many cheap, distributed nodes." This is how BigTable was born. Otherwise, they would have been stuck in bottlenecks.
sr. member
Activity: 462
Merit: 250
Clown prophet
Cross post

2. Your 'ultimate storage' grows with more users, but so does the amount of spam produced. It would solve nothing. I, like many others, would still prefer to store the entire chain.
You answered this yourself in 4. Thousands of HDDs are better than one anyway.

4. Pruning would remove all spent transactions that are 2(?) transactions back, since they wouldn't be needed, dramatically reducing the size of the blockchain. At that point your solution is entirely moot, since anyone could store what's left of it without issue.
This is a good solution, but it breaks chain integrity. And who said spent outputs will not be needed by anyone?

5. Storage devices are getting cheaper and larger every day, and so is memory. I'm sure that if it were needed at some point in the future, someone could build a custom board with a crazy amount of memory on it to store the UTXO set. With the speed memory runs at, I'm sure someone could make slower, cheaper, larger RAM disks for this purpose.

Everybody blindly repeats this after satoshi. But satoshi said it about storage capacity. HDDs also have one more very important property that nobody takes into account: IO capacity. Soon bitcoind will run out of the IO capacity of spinning HDDs and, later, of solid-state drives.

I don't propose discarding the whole local chain. I propose not digging through it on the local side without need.

I know at least one use case where my solution will bring performance benefits.

I know that DHT storage just moves load from disk IO to network IO. But consider what happens when a new block arrives with thousands of transactions.

With the regular client, EACH NETWORK NODE has to dig into its own local chain and do a key lookup there for each transaction. Thousands or millions of nodes have to do the same hard IO work on each new block.

With the regular client plus DHT enabled, only a few will do this. They will cache the lookup results in the local DHT cache and answer others from there. So in this case only a few nodes perform a local chain lookup, and the results spread across the network mostly as cached answers.

As a bonus, there will be Google-scale, petabyte-class storage for the whole chain in all its glory.
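
To make the caching argument above concrete, here is a toy sketch. Everything in it is hypothetical: a plain Python dict stands in for the DHT cache, `Node` and `lookup` are made-up names, and all nodes live in one process. It only counts how many expensive local lookups happen when answers are shared. In a real untrusted network each cached answer would still have to be checked against the block headers.

```python
class Node:
    """A node with a full local chain index (hypothetical stand-in for bitcoind's database)."""

    def __init__(self, local_index):
        self.local_index = local_index
        self.disk_lookups = 0  # counts expensive local chain lookups

    def lookup_local(self, txid):
        self.disk_lookups += 1
        return self.local_index.get(txid)


def lookup(node, cache, txid):
    """Answer from the shared cache if possible, else do one local lookup and publish it."""
    if txid in cache:
        return cache[txid]
    result = node.lookup_local(txid)
    cache[txid] = result
    return result


if __name__ == "__main__":
    index = {"tx1": "block 10", "tx2": "block 11"}
    nodes = [Node(index) for _ in range(3)]
    cache = {}  # plain dict standing in for the DHT
    for n in nodes:
        for txid in index:
            lookup(n, cache, txid)
    # Only the first node touched its local chain; the others got cached answers.
    print([n.disk_lookups for n in nodes])
```

Without the shared cache every node would do every lookup itself, which is the "same hard IO work" described above.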
legendary
Activity: 1386
Merit: 1009
As far as I understand DIANNA, there's already a solution.
The DIANNA client must include a sort of lite Bitcoin client. It will store the block headers to verify the blockchain,
while a Bloom filter can facilitate the process of verifying DIANNA transactions.

As a DIANNA transaction is linked to a specific Bitcoin transaction, we can simply request this transaction from the Bitcoin network using a Bloom filter.
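
For readers who have not met Bloom filters, a minimal sketch follows. It is a generic filter, not the one Bitcoin peers exchange: the hash positions here are derived from SHA-256 purely for illustration, and the class name and parameters are assumptions. The property that matters for a lite client is that a match may be a false positive, but a transaction that was added is never missed.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter; size and hash count are illustrative defaults."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with a counter byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


if __name__ == "__main__":
    wanted = BloomFilter()
    wanted.add(b"txid-we-care-about")
    # A serving node would test each transaction against the filter it was given.
    print(b"txid-we-care-about" in wanted)
```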
legendary
Activity: 2618
Merit: 1007
You don't need all transactions to validate new transactions, only the UTXO set, which will be the work of SPV+ clients (see the "ultimate blockchain compression" topic).
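
A rough sketch of why the UTXO set is enough: a new transaction only has to be checked against the outputs that are still unspent. The data model below is a simplification made up for illustration (string txids, integer values, no scripts or signatures), so it shows the bookkeeping only, not real validation.

```python
from typing import Dict, List, Tuple

OutPoint = Tuple[str, int]  # (txid, output index)


def apply_tx(utxo: Dict[OutPoint, int], txid: str,
             inputs: List[OutPoint], outputs: List[int]) -> bool:
    """Check a transaction against the UTXO set and, if valid, update the set.

    Assumption: signature and script checks are omitted entirely.
    """
    if len(set(inputs)) != len(inputs):
        return False  # same output spent twice within one transaction
    if any(op not in utxo for op in inputs):
        return False  # spends something unknown or already spent
    if sum(outputs) > sum(utxo[op] for op in inputs):
        return False  # creates value out of nothing
    for op in inputs:
        del utxo[op]
    for i, value in enumerate(outputs):
        utxo[(txid, i)] = value
    return True


if __name__ == "__main__":
    utxo = {("a", 0): 50}
    print(apply_tx(utxo, "b", [("a", 0)], [30, 15]))  # spends a:0
    print(apply_tx(utxo, "c", [("a", 0)], [50]))      # a:0 is gone: double spend
```

Note that no historical transaction is consulted: once `("a", 0)` is removed from the set, its history is irrelevant to validating later spends.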
sr. member
Activity: 280
Merit: 257
bluemeanie
Bitcoin is a special case, because the data is highly redundant (practically every node has every block, or most of the blocks). You can optimize the DHT algo to take advantage of this.
There's no point implementing a new distributed data store unless you're also going to eliminate the need for every node to keep a complete copy of the blockchain.

The nodes still need the blocks to VALIDATE the transactions. If we're talking about SPV, then they don't really need the blocks at all, just the headers.
legendary
Activity: 1400
Merit: 1013
Bitcoin is a special case, because the data is highly redundant (practically every node has every block, or most of the blocks). You can optimize the DHT algo to take advantage of this.
There's no point implementing a new distributed data store unless you're also going to eliminate the need for every node to keep a complete copy of the blockchain.
sr. member
Activity: 462
Merit: 250
Clown prophet
I also don't really see the point in this...
If you're after a full block chain, you will probably be much better off, for a long time yet, just downloading a torrent or similar.
If you want to have a lite client instead, use ultraprune and/or the SPV+ mode that is in the works.
If you just want to have access to older transactions, use bloom filters.
Even if Bitcoin doesn't need this, alternative chains, or even contracts (trading via chains), will really need indexed access to both Bitcoin blocks and transactions.

So they will require an additional bitcoin client plus a database somewhere nearby. That is overkill.

Maybe it is possible to make an addon to bitcoin which optionally turns the bitcoin client into a DHT participant alongside the regular 100% local database.
sr. member
Activity: 280
Merit: 257
bluemeanie
While Freenet can do that, it's not necessarily PERFECT for the job.

Bitcoin is a special case, because the data is highly redundant (practically every node has every block, or most of the blocks). You can optimize the DHT algo to take advantage of this. Freenet is one of many technologies in this class. Some DHTs concentrate on security, and in Bitcoin there is no need for transport security, as data at that level is public (there's nothing in a block anyone wants to hide).

Transaction relays are a different story I think.
legendary
Activity: 1400
Merit: 1013
excuse me?
Freenet's data store implements a distributed, redundant, content-addressed file system where information remains persistent while storage nodes randomly enter and leave the network, without requiring users to explicitly configure replication because the nodes automatically handle that.

Get rid of the unnecessary (and slow) anonymity layer and Freenet is perfect for storing large content-addressed data sets like the Bitcoin blockchain.
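
As a toy illustration of "content-addressed and redundant", the sketch below keys each blob by its SHA-256 hash and places copies on several nodes. It is not Freenet's routing or API: the placement rule (ranking nodes by a hash of node id plus key), the function names, and the in-memory dicts are all assumptions made for the example.

```python
import hashlib


def content_key(content: bytes) -> str:
    """The key is the hash of the content, so any fetched copy can be verified."""
    return hashlib.sha256(content).hexdigest()


def ranked(node_ids, key):
    """Deterministic per-key node ordering (hypothetical placement rule)."""
    return sorted(node_ids, key=lambda n: hashlib.sha256((n + key).encode()).hexdigest())


def put(stores, content: bytes, replicas: int = 3) -> str:
    """Store the content on the first `replicas` nodes in the key's ranking."""
    key = content_key(content)
    for node_id in ranked(stores, key)[:replicas]:
        stores[node_id][key] = content
    return key


def get(stores, online, key: str):
    """Fetch from any online holder, accepting only content that hashes to the key."""
    for node_id in ranked(stores, key):
        if node_id in online and key in stores[node_id]:
            content = stores[node_id][key]
            if content_key(content) == key:
                return content
    return None


if __name__ == "__main__":
    stores = {f"node{i}": {} for i in range(5)}
    key = put(stores, b"raw block bytes")
    holders = [n for n in stores if key in stores[n]]
    online = set(stores) - set(holders[:2])  # two of the three holders leave
    print(get(stores, online, key) == b"raw block bytes")
```

Because the key is derived from the content, a node that returns altered data is detected by the requester without trusting that node.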
sr. member
Activity: 280
Merit: 257
bluemeanie
The problem is basically how to optimize the Distributed Hash Table algorithm for block storage.
Already solved by the Freenet project:

https://freenetproject.org

excuse me?
legendary
Activity: 1400
Merit: 1013
The problem is basically how to optimize the Distributed Hash Table algorithm for block storage.
Already solved by the Freenet project:

https://freenetproject.org
sr. member
Activity: 280
Merit: 257
bluemeanie
In the Confidence Chains system you have MANY block chains, and the users might download several of them for various purposes (if they want to trade an asset or participate in a market or auction). Thus there must be a useful and robust way to retrieve this data over a P2P network. Only the identities need to hear about transactions, and you can do this with most DHT implementations. There is an important question for this system: how do you bootstrap a block chain over a p2p network? Bitcoin solves this in a very specific way; I need a more generalized solution.

With Bitcoin, it's unclear how best to optimize the p2p network. To me, the lack of docs on this subject might indicate there are some hidden exploits.
legendary
Activity: 2618
Merit: 1007
I also don't really see the point in this...
If you're after a full block chain, you will probably be much better off, for a long time yet, just downloading a torrent or similar.
If you want to have a lite client instead, use ultraprune and/or the SPV+ mode that is in the works.
If you just want to have access to older transactions, use bloom filters.

Also it seems the OP does NOT want to distribute block storage, but rather individual transactions indexed by hash (even if they have been pruned away). To make this somehow secure I guess one needs: TX hash, remaining merkle branch + block hash. Then one can see that this transaction was actually in the block that it claims. Still I am not too sure if a local database or a trusted local key-value store (noSQL) would not be more suited to the task.
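
The check described above (TX hash plus the remaining merkle branch, verified against a known block) can be sketched as follows. It assumes the block header is already trusted and its merkle root has been extracted; hashes are handled in internal byte order, and the function names are made up for this example.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    """Hash pairs of nodes; an odd-length level repeats its last hash."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(txids):
    level = list(txids)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(txids, index):
    """Sibling hashes needed to recompute the root from txids[index]."""
    branch, level = [], list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        branch.append(level[index ^ 1])
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(tx_hash, index, branch, root):
    """Recompute the root from one tx hash and its branch; compare to the trusted root."""
    h = tx_hash
    for sibling in branch:
        h = dsha256(sibling + h) if index % 2 else dsha256(h + sibling)
        index //= 2
    return h == root


if __name__ == "__main__":
    txids = [dsha256(bytes([i])) for i in range(5)]
    root = merkle_root(txids)
    branch = merkle_branch(txids, 3)
    print(verify_branch(txids[3], 3, branch, root))
```

An untrusted store only has to hand over the transaction, its position, and a branch whose length grows with the logarithm of the block's transaction count. A forged transaction fails the comparison against the root in the locally held header.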
sr. member
Activity: 280
Merit: 257
bluemeanie
For many applications though, you need to keep a copy of the block in local storage so you can validate the chain
For initial validation, yes: the client has to download each block via the DHT and build the chain of headers. But it really doesn't need to store each block body, only the headers as a trusted chain. The block bodies themselves can be distributed in untrusted DHT storage, since each client has the local chain headers and a modified block cannot be accepted (its hash will change).

Can anyone start implementing this? I can be the first donor for this task.

I've spoken to a few DHT experts on this, most of them are optimistic about this idea.  The problem is basically how to optimize the Distributed Hash Table algorithm for block storage.  You can certainly optimize the algorithm in this case.  BTW- is there any good documentation on how the current relay mechanism works for Bitcoin?  I had several people ask me for this and I couldn't find anything.

btw- I notice many on this board seem to think that 'developers' are some kind of cheap resource that you could just conjure up by throwing a few dollars around. While there might be many people able and willing to write a few lines of code, these kinds of problems are very complex and there are really very few people who are capable of solving them effectively. I've already seen several projects releasing code that appears to perform a (much desired) function, but fails in very important ways. Of course these problems won't show until long after it's released and people have invested real money in the system.
sr. member
Activity: 462
Merit: 250
Clown prophet
For many applications though, you need to keep a copy of the block in local storage so you can validate the chain
For initial validation, yes: the client has to download each block via the DHT and build the chain of headers. But it really doesn't need to store each block body, only the headers as a trusted chain. The block bodies themselves can be distributed in untrusted DHT storage, since each client has the local chain headers and a modified block cannot be accepted (its hash will change).

Can anyone start implementing this? I can be the first donor for this task.
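
A minimal sketch of the headers-only idea, under stated assumptions: headers use the 80-byte Bitcoin layout (version, previous hash, merkle root, time, bits, nonce), proof-of-work checking is left out, and the helper names are hypothetical. The client keeps only headers; a body fetched from untrusted storage is accepted only if its transactions hash to the merkle root committed in the stored header.

```python
import hashlib
import struct


def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids):
    """Merkle root of a non-empty txid list; odd-length levels repeat the last hash."""
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def make_header(prev_hash, root, version=1, timestamp=0, bits=0, nonce=0):
    """Pack an 80-byte header (toy values for time, bits and nonce)."""
    return struct.pack("<I32s32sIII", version, prev_hash, root, timestamp, bits, nonce)


def headers_are_linked(headers):
    """Each header must commit to the hash of the one before it (bytes 4..36)."""
    return all(cur[4:36] == dsha256(prev) for prev, cur in zip(headers, headers[1:]))


def body_matches_header(header, raw_txs):
    """Accept an untrusted body only if it reproduces the header's merkle root (bytes 36..68)."""
    if not raw_txs:
        return False
    return merkle_root([dsha256(tx) for tx in raw_txs]) == header[36:68]


if __name__ == "__main__":
    txs0 = [b"coinbase-0"]
    txs1 = [b"coinbase-1", b"pay alice", b"pay bob"]
    h0 = make_header(b"\x00" * 32, merkle_root([dsha256(t) for t in txs0]))
    h1 = make_header(dsha256(h0), merkle_root([dsha256(t) for t in txs1]))
    print(headers_are_linked([h0, h1]), body_matches_header(h1, txs1))
    print(body_matches_header(h1, [b"coinbase-1", b"pay mallory", b"pay bob"]))
```

Local storage then costs 80 bytes per block, while block bodies can sit on any node, since a modified body no longer matches the header.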