Author

Topic: Distributed downloading of blocks using a merkle hash tree for the block chain. (Read 1272 times)

full member
Activity: 385
Merit: 110
Hello,

Here is an idea how distributed downloading of blocks for the blockchain could happen:

1. All blocks which currently exist on other peers are added as leaves to a merkle hash tree.

2. The leaves are then hashed, into intermediate nodes.

3. The intermediate nodes are then hashed upwards until a merkle root node exists which contains the "root hash".

The clients then download the root hash, and all intermediate nodes, this will increase the overhead two fold, however it will also allow distributed downloading.

Once the client has retrieved the lowest leaves, these leaves can be downloaded distributedly.

Their hashes can also be calculated, and compared to the nodes above it in the tree. So these blocks can immediately be verified as being correct.

(Clients who's hashes are not correct could then be disconnected).

One problem might remain: the merkle hash root could change/be sliding, some peers might not have all yet, perhaps the root could even move down and become a node on the left side of the tree, with a new tree on the right of it.. I think this will work just fine.

Diagram of tree:

   O  <- root
  /\
O  O
^ two blocks

New block is added (block chain grows):
           O  <- new root
         /     \
*-> O         O
     /  \      /
   O    O  O

^ three blocks.
* = old root (moved down).

What would be needed for a protocol is a "virtual numbering scheme" which numbers all possible nodes which could possibly exist in the tree and when the tree grows these numbers would stay the same as the nodes move down in depth.

Then the client can ask the other client what his maximum node number is, and simply request node numbers at random (shuffled see below) or simply round robin for as far as possible, and stuff their information/hashes into the tree at the correct position.

Something like:

RequestDownloadTreeNodeMaxNumber( output MaxNodeNumber );
RequestDownloadTreeNodeNumber( input NodeNumber, output NodeInformation );

The client can then connect to different nodes and ask all nodes what their maximum node number is.

(Once this has been established the client can initialize an array of node numbers and shuffle them, though this is not necessary but could provide extra security ?)

The client then walks this array as follows:

// request all nodes
for NodeIndex := 0 to MaxNodeNumber do
begin

     // this assures a client is found which has the node, in case the node number is real high and only few have it.
     while NodeIndex > OtherClient[OtherClientIndex].MaxNodeNumber do
     begin
         OtherClientIndex := OtherClientIndex + 1; // wrap this around as necessary, not gonna code this here to keep it clean.
     end;

     OtherClient[ OtherClientIndex  ].RequestDownloadTreeNodeNumber( NodeIndex, NodeInformation );  // *
     OtherClientIndex := OtherClientIndex + 1; // wrap this around as necessary, not gonna code this here to keep it clean.
end;

The shuffling idea isn't really necessary but could be added as well as follows:

Line * is modified as follows:

         OtherClient[ OtherClientIndex  ].RequestDownloadTreeNodeNumber( Shuffle[ NodeIndex ], NodeInformation );

Seems pretty nice and somewhat easy to implement protocol, could be nice ! Wink

Bye,
  Skybuck.
Jump to: