
Topic: Bitcoin overlay network (Read 3055 times)

member
Activity: 70
Merit: 10
GNU is not UNIX
July 08, 2011, 04:47:29 PM
#19
I've said before that saying "DHT" should trigger an automatic ban in IRC.  It's an almost perfect metric for identifying people who don't understand bitcoin, don't have any interest in understanding bitcoin, and who are going to just spout distracting buzzwords rather than contribute.

Depending on the interpretation, this is either a generalization representing your opinion or an instance of the genetic fallacy.

What problem exactly do you plan to solve with your DHTs? What measurements have you made to determine that the problem exists?  What analysis have you performed that supports that the DHTs will actually improve that problem?

Measurements are of limited utility. If we only act once measurements show a problem, we're already lagging. I don't claim that there currently exists a problem with the random connection model. You can find an analysis of the relevant characteristics of each DHT model in academic papers if you care. There is no single "the DHT" and hence no single reference.

Absent the existence of lite clients bitcoin is a flooding network: All participating nodes need to hear about everything eventually.   The randomly wired graphs appear to work perfectly well at this.  Because of the inv process any inefficiency in the topology doesn't actually increase bandwidth usage substantially, and global coverage of connected nodes seems to be no worse than a few seconds, which is well below what is strictly required.

What measurements have you made to determine that the global coverage is no worse than a few seconds? Are you saying lite clients should be avoided in the future?

Moreover, the 'random' wiring makes the network fairly resistant to various attacks short of isolating a node or adding tons of well distributed sybils.  This kind of resistance is hardware in more structured topologies.

I don't know the meaning of "hardware" as used in this context. Maybe "hardwired" is the word you're looking for? It has already been said in this thread that there are ways to avoid targeted attacks in DHTs: the general procedure is to take a hash of the public IP as the node ID. Other techniques can be applied depending on the specific model.
sr. member
Activity: 294
Merit: 250
July 08, 2011, 11:42:42 AM
#18
I'm concerned that any network structure where nodes are not all as equal as possible may allow an attacker to more easily gain control of the network.

A peer accepting inbound connections vs. one not accepting them (and the resulting hub/leaf-like structure) is not a big problem, as it doesn't limit where a node connects, just how many connections there are. But if a structure were introduced where you specifically choose to be a hub or a leaf, I think it would be easier for an attacker to just set up a lot of 'hubs' (he can choose to make his peers hubs anyway) and easily gain control over a lot of leaves (which would likely be the default setting for the average user).

Eh. I agree with your conclusions, disagree with your reasoning.
Listening / non-listening does limit who you connect to— you can't connect to someone who isn't listening.

If an attacker can bring up a superabundance of listening nodes in diverse /16s then there is a chance that some nodes will connect only to his nodes, and he can selectively partition the network.  For example, if you control 91.7% of the "nodes" (really just proxies) with the same /16 distribution as the regular network, any random non-listening node has a 50% chance of connecting to you exclusively. This would take perhaps 50,000 hosts— not out of the question for a botnet operator.

If a large miner or two and a large merchant end up connected exclusively to his evil nodes, he can cause that miner to mine on a fork which he uses to make transactions in a partition which will be later reversed by the main chain when he de-partitions the network.

The fact that many nodes are non-listening makes this attack much easier, but the real solution to that is getting more listening nodes.

Where I agree with you is that adding _more_ node type diversity (especially attacker controllable ones) can only make the situation worse. The attacker will choose the types/positions of his nodes in order to maximize his effectiveness.

Absent a reliable measure of node independence I don't believe any automatic topology can have superior security to the randomly wired one for a given node order (obviously increasing the order helps) assuming the attacker has many potential targets and can run many fake nodes.

There are things we can do to improve the situation...  Automatic connection rotation, better support for manually configured trusted adjacencies (all the major miners fully meshing with each other kills most zero-hashpower partitioning & respend attacks dead),  network health monitoring probes (and data feeds), and network-health-adjusted confirmation delays.





Fair point.
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
July 08, 2011, 10:45:41 AM
#17
A peer accepting inbound connections vs. one not accepting them (and the resulting hub/leaf-like structure) is not a big problem, as it doesn't limit where a node connects, just how many connections there are. But if a structure were introduced where you specifically choose to be a hub or a leaf, I think it would be easier for an attacker to just set up a lot of 'hubs' (he can choose to make his peers hubs anyway) and easily gain control over a lot of leaves (which would likely be the default setting for the average user).
He can do the same thing now just as easily by simply setting up a lot of nodes that accept inbound connections.

I'm not sure it's needed, but if people felt it was desired, it wouldn't be terribly difficult to set up partition resistance where nodes search for imbalances in the network and try to balance the tree and reduce its maximum path length.

Honestly, I think the hub and leaf code would improve the situation massively. The number of hubs could be large. The only thing stopping there from being lots of hubs now is the fear of consuming inbound connection slots. The more interconnections, the harder it is for an attacker to partition the network.

I'm envisioning hubs with thousands of connections. With just three of those, each under independent operation, partitioning the network would be nearly impossible.
kjj
legendary
Activity: 1302
Merit: 1026
July 08, 2011, 10:40:15 AM
#16
If you want to speed it up, I think the right place would be the services field in the version and addr messages.  I can't tell from the wiki if that field is intended to be a bit field, or an integer.  Hopefully a bit field.
I don't think that's currently propagated indirectly though. I believe it would have to be in the address structure that nodes can request from each other.

Quote
Also, you might want to make the default behavior to avoid having more than 25% of all connections be to hubs.  Otherwise, the forums will fill up with posts about how that one pool hub is getting close to having 51% of all connections.
For a hub, you would likely only want to make outbound connections to other hubs. There's no reason you'd want to actively seek a leaf. (I wouldn't shut it off 100%, though; it would just be rare.) Most leaves can't accept inbound connections anyway. Hubs would acquire lots of connections to leaves because the leaves would connect in to them.

According to the wiki, the services field is included in addr messages.  And it seems like exactly the sort of thing that the field was intended for.

And I meant the hub limit to be for regular nodes, not hubs.  For security, you want most of your connections to be to regular folks, even though you want at least a couple of well connected nodes for network efficiency.  But it wouldn't have any ill effect on a hub if the same rule was followed there too.
staff
Activity: 4284
Merit: 8808
July 08, 2011, 10:37:18 AM
#15
I'm concerned that any network structure where nodes are not all as equal as possible may allow an attacker to more easily gain control of the network.

A peer accepting inbound connections vs. one not accepting them (and the resulting hub/leaf-like structure) is not a big problem, as it doesn't limit where a node connects, just how many connections there are. But if a structure were introduced where you specifically choose to be a hub or a leaf, I think it would be easier for an attacker to just set up a lot of 'hubs' (he can choose to make his peers hubs anyway) and easily gain control over a lot of leaves (which would likely be the default setting for the average user).

Eh. I agree with your conclusions, disagree with your reasoning.
Listening / non-listening does limit who you connect to— you can't connect to someone who isn't listening.

If an attacker can bring up a superabundance of listening nodes in diverse /16s then there is a chance that some nodes will connect only to his nodes, and he can selectively partition the network.  For example, if you control 91.7% of the "nodes" (really just proxies) with the same /16 distribution as the regular network, any random non-listening node has a 50% chance of connecting to you exclusively. This would take perhaps 50,000 hosts— not out of the question for a botnet operator.
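The 91.7% figure follows from the client's default of eight outbound connections, since 0.917^8 ≈ 0.5. A quick check (the eight-connection default is assumed here, not stated in the post):

```python
# Probability that a node picking k outbound peers at random lands on
# attacker-controlled listening nodes every time, when the attacker
# controls fraction p of the listening set. k=8 is the reference
# client's default outbound connection count (assumed, not from the post).

def capture_probability(p, k=8):
    """Chance that all k outbound connections go to attacker nodes."""
    return p ** k

def fraction_needed(target, k=8):
    """Attacker fraction p needed so that p**k equals the target chance."""
    return target ** (1.0 / k)

print(round(fraction_needed(0.5), 3))        # 0.917, matching the post
print(round(capture_probability(0.917), 2))  # 0.5
```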

If a large miner or two and a large merchant end up connected exclusively to his evil nodes, he can cause that miner to mine on a fork which he uses to make transactions in a partition which will be later reversed by the main chain when he de-partitions the network.

The fact that many nodes are non-listening makes this attack much easier, but the real solution to that is getting more listening nodes.

Where I agree with you is that adding _more_ node type diversity (especially attacker controllable ones) can only make the situation worse. The attacker will choose the types/positions of his nodes in order to maximize his effectiveness.

Absent a reliable measure of node independence I don't believe any automatic topology can have superior security to the randomly wired one for a given node order (obviously increasing the order helps) assuming the attacker has many potential targets and can run many fake nodes.

There are things we can do to improve the situation...  Automatic connection rotation, better support for manually configured trusted adjacencies (all the major miners fully meshing with each other kills most zero-hashpower partitioning & respend attacks dead),  network health monitoring probes (and data feeds), and network-health-adjusted confirmation delays.




sr. member
Activity: 294
Merit: 250
July 08, 2011, 09:36:57 AM
#14
I'm concerned that any network structure where nodes are not all as equal as possible may allow an attacker to more easily gain control of the network.

A peer accepting inbound connections vs. one not accepting them (and the resulting hub/leaf-like structure) is not a big problem, as it doesn't limit where a node connects, just how many connections there are. But if a structure were introduced where you specifically choose to be a hub or a leaf, I think it would be easier for an attacker to just set up a lot of 'hubs' (he can choose to make his peers hubs anyway) and easily gain control over a lot of leaves (which would likely be the default setting for the average user).
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
July 08, 2011, 09:28:12 AM
#13
If you want to speed it up, I think the right place would be the services field in the version and addr messages.  I can't tell from the wiki if that field is intended to be a bit field, or an integer.  Hopefully a bit field.
I don't think that's currently propagated indirectly though. I believe it would have to be in the address structure that nodes can request from each other.

Quote
Also, you might want to make the default behavior to avoid having more than 25% of all connections be to hubs.  Otherwise, the forums will fill up with posts about how that one pool hub is getting close to having 51% of all connections.
For a hub, you would likely only want to make outbound connections to other hubs. There's no reason you'd want to actively seek a leaf. (I wouldn't shut it off 100%, though; it would just be rare.) Most leaves can't accept inbound connections anyway. Hubs would acquire lots of connections to leaves because the leaves would connect in to them.
full member
Activity: 372
Merit: 114
July 08, 2011, 09:15:28 AM
#12
I proposed structuring it as a skip-list or hypercube, and instead of broadcasting to all I'd love to have broadcast to a prefix only: a hypercube of dynamic degree, with multiple nodes at each edge taking care of a given prefix. Miners would regularly fetch all transactions and include them in blocks. It would reduce broadcasts and scale better.

The idea was shot down pretty early with the usual arguments: too hard, breaks too much,...
But I still think it should be considered :-)

Chord is basically a giant circular skip-list.  The structure of the hypercube and the Chord graph are quite similar.  In general you could easily use either.

The hypercube is a bit nicer when you're assigning addresses randomly/by hash, because the distance metric is much more relevant: the distance is the actual number of hops in the network.  There's really no reason for high-order bits to be more significant than low-order bits as in Chord/Kademlia.  Google "oblivious routing in the hypercube" for an idea of one of the benefits of this.  In short, if you always route in a way that "fixes high-order bits first" (or even "fixes low-order bits first"), there are bad examples that can massively congest single links (e.g., if every node x00…0 wants to send a message to 00…0x, where x is an n/2-bit string, and they all route by fixing bits left-to-right, all the messages end up passing through the all-zeros node!).
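The congested-link example can be checked with a tiny simulation (a sketch; the 12-bit hypercube size and the exact left-to-right bit-fixing rule are illustrative assumptions):

```python
# Bit-fixing routing on an n-dimensional hypercube: each message corrects
# one differing address bit at a time, highest-order bit first. With
# sources of the form x||00..0 and destinations 00..0||x, every route
# collapses through the all-zeros node.

n = 12          # hypercube dimension (illustrative choice)
half = n // 2

def route(src, dst):
    """Return the nodes visited when fixing bits from the high end down."""
    path, cur = [src], src
    for bit in range(n - 1, -1, -1):            # high-order bits first
        if (cur >> bit) & 1 != (dst >> bit) & 1:
            cur ^= 1 << bit                     # fix this bit
            path.append(cur)
    return path

through_zero = 0
for x in range(1, 2 ** half):
    src = x << half        # bit string x followed by n/2 zeros
    dst = x                # n/2 zeros followed by x
    if 0 in route(src, dst):
        through_zero += 1

print(through_zero, "of", 2 ** half - 1, "routes pass through node 0")
```

Every one of the 63 routes funnels through node 0, which is exactly the single-link congestion the post describes.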


Note also that it would be unwise to do "perfect broadcast" with an embedded multicast tree in settings like bitcoin that require extreme fault-tolerance, as cutting a single multicast link would partition the network.  But you can use the topology to do something in between a tree and broadcasting to everyone.

kjj
legendary
Activity: 1302
Merit: 1026
July 08, 2011, 08:52:30 AM
#11
I made a set of patches to interconnect more aggressively, but it was pointed out that this consumes the network's limited number of available incoming connection slots. There's no good way to get other clients to connect to you more aggressively. I'd love to see a more 'hub and spoke' system where clients can identify themselves as hubs that desire high connectivity and fast responses and 'spokes' that desire low bandwidth demands.

To be a hub, you'd have to be able to accept incoming connections. Hubs would preferentially connect to other hubs when needed, and wait for spokes to connect to them. Spokes would connect primarily to a small number of hubs.

I think this would benefit everyone.

Would people be interested in nodes being able to set flags that are broadcast to other nodes and tracked along with their IP addresses? One flag could be "I am a hub".

This pretty much happens automatically with the current setup.  But it happens slowly.  If you set your maxconnections high, your box will eventually make a couple of long lived connections to other nodes that are acting more or less as hubs.

If you want to speed it up, I think the right place would be the services field in the version and addr messages.  I can't tell from the wiki if that field is intended to be a bit field, or an integer.  Hopefully a bit field.

Also, you might want to make the default behavior to avoid having more than 25% of all connections be to hubs.  Otherwise, the forums will fill up with posts about how that one pool hub is getting close to having 51% of all connections.
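The proposed 25% cap could look something like this during peer selection (a hypothetical sketch; the `is_hub` flag and these function names don't exist in any client):

```python
import random

MAX_HUB_FRACTION = 0.25   # cap proposed in the post

def pick_peers(candidates, n_outbound=8):
    """Pick outbound peers at random, keeping hubs to at most 25%.

    candidates: list of (address, is_hub) pairs, where is_hub would come
    from a hypothetical service flag in addr messages.
    """
    max_hubs = int(n_outbound * MAX_HUB_FRACTION)   # 2 of 8
    chosen, hubs = [], 0
    for addr, is_hub in random.sample(candidates, len(candidates)):
        if is_hub:
            if hubs >= max_hubs:
                continue          # already at the hub quota
            hubs += 1
        chosen.append((addr, is_hub))
        if len(chosen) == n_outbound:
            break
    return chosen

# Even if 90% of known addresses are hubs, at most 2 of 8 slots go to them.
pool = [(f"10.0.0.{i}", i % 10 != 0) for i in range(100)]
peers = pick_peers(pool)
```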
hero member
Activity: 489
Merit: 505
July 08, 2011, 07:37:57 AM
#10
I proposed structuring as a Skip-list or hypercube, and instead of broadcasting to all I'd love to have a broadcast to prefix only. A hypercube of dynamic degree, multiple nodes at each edge, taking care of a given prefix. Miners would regularly fetch all transactions and include them into blocks. It would reduce broadcasts and scale better.

The idea was shot down pretty early with the usual arguments: too hard, breaks too much,...
But I still think it should be considered :-)
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
July 08, 2011, 07:28:27 AM
#9
I made a set of patches to interconnect more aggressively, but it was pointed out that this consumes the network's limited number of available incoming connection slots. There's no good way to get other clients to connect to you more aggressively. I'd love to see a more 'hub and spoke' system where clients can identify themselves as hubs that desire high connectivity and fast responses and 'spokes' that desire low bandwidth demands.

To be a hub, you'd have to be able to accept incoming connections. Hubs would preferentially connect to other hubs when needed, and wait for spokes to connect to them. Spokes would connect primarily to a small number of hubs.

I think this would benefit everyone.

Would people be interested in nodes being able to set flags that are broadcast to other nodes and tracked along with their IP addresses? One flag could be "I am a hub".
full member
Activity: 372
Merit: 114
July 07, 2011, 08:19:26 PM
#8
Could you please elaborate on the aforesaid upcoming project? Have you considered other DHT models like Kademlia?

I had not heard of Kademlia; I will read the paper tonight.   Many of these schemes (CAN, Chord, Pastry) are quite similar -- what I planned to do was pick whatever I like best from each, rather than arbitrarily pick one.  I will add Kademlia to that list.

As for my project, search through my posts to get an idea.  Basically a new client/new cointype to address some technical and philosophical issues I see in bitcoin.  Technically: network topology, storage requirements, instant transactions, built-in pool.

Philosophically: all parameters tweakable and set by miners, such as the number of coins generated per block.   A generated block includes extra "notes" space, intended for miners to write their parameters, like "generated coins/block I'd like to see, maximum generated coins/block I will accept as valid".  There could potentially be many other parameters.  The idea is that one could then maintain a "scoreboard" of current mining shares, scored by how low the hash is.  The scoreboard plus the notes gives you a verifiable, trustworthy "current state of mind of coin miners".  This allows "rapid dissent", where a group of people who are unhappy with something can very quickly signal to one another via their notes that they are ready/willing to make changes to their parameters if others agree with them, effectively solving the problem of "I want change but I don't want to waste my vote" (where "vote" here means mining a block that others won't accept as valid).

Anyway I don't want to derail your thread further so let that be the end of that.  It will be 6 weeks until I can begin work on this project due to real-work deadlines; when I'm ready I will make a thread for it.

I've said before that saying "DHT" should trigger an automatic ban in IRC.  It's an almost perfect metric for identifying people who don't understand bitcoin, don't have any interest in understanding bitcoin, and who are going to just spout distracting buzzwords rather than contribute.

What problem exactly do you plan to solve with your DHTs? What measurements have you made to determine that the problem exists?  What analysis have you performed that supports that the DHTs will actually improve that problem?

Absent the existence of lite clients bitcoin is a flooding network: All participating nodes need to hear about everything eventually.   The randomly wired graphs appear to work perfectly well at this.  Because of the inv process any inefficiency in the topology doesn't actually increase bandwidth usage substantially, and global coverage of connected nodes seems to be no worse than a few seconds, which is well below what is strictly required.

Moreover, the 'random' wiring makes the network fairly resistant to various attacks short of isolating a node or adding tons of well distributed sybils.  This kind of resistance is hardware in more structured topologies.

I agree that if you are going to change nothing else about the architecture, a structured overlay adds little (it is still helpful, though, as there are broadcast protocols for structured overlays that are perfect in the sense that they avoid repeat messages).

But it is not at all clear to me that having every client's tx pool be arbitrary and inconsistent is a wise architecture.  There seems to be room for improvement here.
staff
Activity: 4284
Merit: 8808
July 07, 2011, 02:30:07 PM
#7
Could you please elaborate on the aforesaid upcoming project? Have you considered other DHT models like Kademlia?

I've said before that saying "DHT" should trigger an automatic ban in IRC.  It's an almost perfect metric for identifying people who don't understand bitcoin, don't have any interest in understanding bitcoin, and who are going to just spout distracting buzzwords rather than contribute.

What problem exactly do you plan to solve with your DHTs? What measurements have you made to determine that the problem exists?  What analysis have you performed that supports that the DHTs will actually improve that problem?

Absent the existence of lite clients bitcoin is a flooding network: All participating nodes need to hear about everything eventually.   The randomly wired graphs appear to work perfectly well at this.  Because of the inv process any inefficiency in the topology doesn't actually increase bandwidth usage substantially, and global coverage of connected nodes seems to be no worse than a few seconds, which is well below what is strictly required.

Moreover, the 'random' wiring makes the network fairly resistant to various attacks short of isolating a node or adding tons of well distributed sybils.  This kind of resistance is hardware in more structured topologies.



member
Activity: 70
Merit: 10
GNU is not UNIX
July 07, 2011, 10:58:48 AM
#6
Yes, this is (one of) the bad architecture decisions in bitcoin IMO.  I plan to use the Chord topology for my upcoming project.  There have also been papers on dealing with malicious parties in Chord/Pastry (a google search should get the refs).  The main conclusion is to make node IDs pseudorandom (e.g., a hash of the IP address) so that it is infeasible for adversaries to insert themselves into the network strategically.
Could you please elaborate on the aforesaid upcoming project? Have you considered other DHT models like Kademlia?
legendary
Activity: 1708
Merit: 1010
July 07, 2011, 01:44:49 AM
#5
You could release a modified bitcoin client that structures peer connections in any way that you like.  If I think that it is safe, I'd use it.
full member
Activity: 372
Merit: 114
July 07, 2011, 01:38:42 AM
#4
Yes, this is (one of) the bad architecture decisions in bitcoin IMO.  I plan to use the Chord topology for my upcoming project.  There have also been papers on dealing with malicious parties in Chord/Pastry (a google search should get the refs).  The main conclusion is to make node IDs pseudorandom (e.g., a hash of the IP address) so that it is infeasible for adversaries to insert themselves into the network strategically.
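The hash-of-IP scheme is simple to sketch (SHA-256 and the 160-bit ID width are illustrative choices here, not taken from the papers):

```python
import hashlib

def node_id(ip: str, bits: int = 160) -> int:
    """Derive a Chord-style ring position from a node's public IP.

    Because the ID is a hash of the address, an attacker cannot choose
    where on the ring a node lands without controlling that IP.
    """
    digest = hashlib.sha256(ip.encode()).digest()       # 256-bit hash
    return int.from_bytes(digest, "big") >> (256 - bits)

# Numerically adjacent IPs get unrelated ring positions.
a = node_id("203.0.113.7")
b = node_id("203.0.113.8")
assert a != b
assert 0 <= a < 2 ** 160
```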
legendary
Activity: 1708
Merit: 1010
July 06, 2011, 10:48:27 PM
#3

Has anyone considered implementing a structured overlay network instead?

It would be difficult to implement this in code at this point, and it would potentially be subject to manipulation.  The apparent randomness of the clients' peer discovery process is a defense against a particular kind of attack that involves an attacker 'owning' all of the peers of a target node.  Any kind of structured network method would permit the attacker to predict the nodes that the target is connected to, and thus make such an attack easier to implement.  As it is, the random nature of the network means that an attacker can never know with certainty that he actually has all of the peers of the target, and it only takes one honest node to muck up that kind of attack vector.

That said, you can structure your own client however you see fit via forced peer selection.  Eventually, clients will have a peers page in the GUI like most bittorrent clients do, that will give the user more direct control of peers.  Until then, you can do the same thing from the command line.
kjj
legendary
Activity: 1302
Merit: 1026
July 06, 2011, 10:33:38 PM
#2
There is really no point, at least not now or in the near future.  Network traffic is very low.  Even a pathological organization would be good enough.
member
Activity: 70
Merit: 10
GNU is not UNIX
July 06, 2011, 09:42:19 PM
#1
Currently, as far as I know, the Bitcoin reference implementation connects to other active nodes nearly at random, excluding numerically close IP addresses. This leads to a suboptimal overlay network regarding graph degree and hence propagation delay to all nodes, although it seems to be working just fine.

Has anyone considered implementing a structured overlay network instead? This may also overlap with the proposal of storing the blocks in a distributed hash table. Most DHT models can also be used for efficient message broadcasting.
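The "nearly random, excluding numerically close addresses" behavior can be sketched as picking at most one peer per /16 netblock (a simplification; the reference client's actual address-selection rules are more involved than this):

```python
import random

def pick_addresses(known, n=8):
    """Pick up to n peers at random, at most one per /16 netblock,
    so numerically close addresses are never chosen together."""
    by_block = {}
    for ip in known:
        block = ".".join(ip.split(".")[:2])   # first two octets = /16
        by_block.setdefault(block, []).append(ip)
    blocks = random.sample(list(by_block), min(n, len(by_block)))
    return [random.choice(by_block[b]) for b in blocks]

# "1.2.3.4" and "1.2.9.9" share a /16, so at most one of them is picked.
peers = pick_addresses(["1.2.3.4", "1.2.9.9", "5.6.7.8", "9.9.1.1"])
```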