Topic: Loads of fake peers advertised on bitcoin network (Read 656 times)

legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
We do not know who the spammers are but this might suggest that one of their objectives is learning information about the Bitcoin P2P network, esp. about its topology.
It smells (read: stinks) to me like it could be some government agency, and this is happening at the same time as they are pushing hard for some crazy regulation bills for Bitcoin and all crypto.
I can't prove anything but it could be they are collecting information for some tracking and surveillance of all nodes and transactions.

I doubt it. Connections between nodes aren't encrypted (unless Tor or a good VPN is used), so a government agency could simply intercept the connections and gain more information that way.

So is this number of nodes from Bitnodes more correct than Luke Dashjr's version, which shows many more nodes?
https://luke.dashjr.org/programs/bitcoin/files/charts/software.html
They are both correct (ignoring the errors coming from their methodology). Bitnodes shows nodes that are listening for incoming connections (spy nodes among them), while the other shows all nodes in existence, including nodes that don't listen for incoming connections.

@piotr_n mentioned that Luke's DNS seed has been down for a while, so I wonder how accurate it is.
newbie
Activity: 2
Merit: 36
Sorry for the late reply.

The conclusion interests me. Bitcoin has 12775 nodes (according to https://bitnodes.io/, which excludes nodes that don't accept incoming connections), so a DoS attack would be quite expensive and would probably only reduce propagation speed across the whole network. However, it's a major concern for altcoins that have very few full nodes.

I agree that a DoS attack against the whole Bitcoin P2P network would probably be very expensive. However, attacking the most connected nodes might be a more cost-effective attack strategy than attacking random nodes. Therefore it is desirable to protect the identity of well-connected nodes.

They are both correct (ignoring the errors coming from their methodology). Bitnodes shows nodes that are listening for incoming connections (spy nodes among them), while the other shows all nodes in existence, including nodes that don't listen for incoming connections.

See https://luke.dashjr.org/programs/bitcoin/files/charts/historical.html which differentiates between listening and non-listening nodes.
legendary
Activity: 3472
Merit: 10611
@piotr_n mentioned that Luke's DNS seed has been down for a while, so I wonder how accurate it is.
The two don't necessarily have to be related. The DNS server may be hosted somewhere else or use an entirely different setup, especially since DNS seeds report listening nodes, not all nodes. Also, dnsseed.bitcoin.dashjr.org is currently up and running. Here is an online tool to quickly dig DNS: https://toolbox.googleapps.com/apps/dig/#A/
legendary
Activity: 3472
Merit: 10611
So is this number of nodes from Bitnodes more correct than Luke Dashjr's version, which shows many more nodes?
https://luke.dashjr.org/programs/bitcoin/files/charts/software.html
They are both correct (ignoring the errors coming from their methodology). Bitnodes shows nodes that are listening for incoming connections (spy nodes among them), while the other shows all nodes in existence, including nodes that don't listen for incoming connections.
legendary
Activity: 2212
Merit: 7064
We do not know who the spammers are but this might suggest that one of their objectives is learning information about the Bitcoin P2P network, esp. about its topology.
It smells (read: stinks) to me like it could be some government agency, and this is happening at the same time as they are pushing hard for some crazy regulation bills for Bitcoin and all crypto.
I can't prove anything but it could be they are collecting information for some tracking and surveillance of all nodes and transactions.

The conclusion interests me. Bitcoin has 12775 nodes
So is this number of nodes from Bitnodes more correct than Luke Dashjr's version, which shows many more nodes?
https://luke.dashjr.org/programs/bitcoin/files/charts/software.html
newbie
Activity: 2
Merit: 36
We run a monitoring (https://www.dsn.kastel.kit.edu/bitcoin/) of the Bitcoin P2P network at Karlsruhe Institute of Technology (KIT) and noticed the invalid addresses, too. Based on the findings posted above, we analyzed what the spamming could be useful for. We found that the propagation of addr messages can be used to estimate the number of neighbors of public peers running Bitcoin Core and to match multiple addresses to the same public peer.
You can find our report at https://arxiv.org/abs/2108.00815.
We do not know who the spammers are but this might suggest that one of their objectives is learning information about the Bitcoin P2P network, esp. about its topology.
legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
Even worse is that sometimes a big percentage of the addresses returned by DNS seeds are not valid.
For example, the last time I tested, about 20% of the addresses returned by seed.bitcoin.sipa.be were dead ends.
legendary
Activity: 2058
Merit: 1416
aka tonikt
They seem to have stopped now - about a week ago, actually.

However, talking about the addresses, three of Bitcoin's DNS seeds seem to have been down for a while already:
Code:
dnsseed.bitcoin.dashjr.org
seed.bitcoinstats.com
seed.bitcoin.jonasschnelli.ch
newbie
Activity: 3
Merit: 27
the thousand 'new buckets' approach and each node being able to access only 64 of them, does not seem to be helping much, considering that all the nodes advertise incoming addresses without checking them.

It helps against a personal attack - without it one could connect to a victim node and immediately fill all of its "new" buckets with junk. It only slows down a network-wide junk spread.

Now imagine scenario that you're starting a new node, with a brand new IP.
It is going to have a hard time getting incoming connections anytime soon, considering that it competes with hundreds of thousands of fake IPs.

Competes in which way? If a node has a set of 20 addresses, 19 of which are junk and just 1 real and it wants to connect to somebody, then after some failed attempts it is going to connect to the 1 real one. There is no hurry. Trying to connect to a non-listening node (junk) takes a few seconds.
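
To put a rough number on the 20-address example above, here is a minimal sketch. It assumes the node tries its addresses in uniformly random order without repeating any, which is a simplification for illustration and not how Bitcoin Core actually selects addresses. The 5 seconds per failed attempt is a hypothetical figure; the post only says "a few seconds".

```python
from fractions import Fraction

def expected_failed_attempts(total: int, real: int) -> Fraction:
    """Expected number of junk addresses tried before the first real one,
    when addresses are tried in uniformly random order without repeats."""
    if real < 1 or real > total:
        raise ValueError("need 1 <= real <= total")
    junk = total - real
    # Each junk address comes before all `real` good ones with probability 1/(real+1).
    return Fraction(junk, real + 1)

SECONDS_PER_FAILURE = 5  # hypothetical; the post only says "a few seconds"

failures = expected_failed_attempts(total=20, real=1)
print(float(failures))                        # 9.5
print(float(failures * SECONDS_PER_FAILURE))  # 47.5
```

Under these assumptions the node wastes under a minute on average before reaching the one real address, which fits the "there is no hurry" argument.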

Plus every node loses time trying to connect to these fake addresses.
Not sure what Core's algorithm for choosing a new IP to connect to is, but whatever it is, it will surely also have to deal with a lot of dead tries.

Yes, the failed attempts will waste some time.

Opening new connections in Bitcoin Core happens in CConnman::ThreadOpenConnections():
https://github.com/bitcoin/bitcoin/blob/1488f55fa57a1400a57be837b574183f019c7855/src/net.cpp#L1832
the address to connect to is chosen by calling CAddrMan::Select():
https://github.com/bitcoin/bitcoin/blob/1488f55fa57a1400a57be837b574183f019c7855/src/net.cpp#L2047
with some further filtering afterwards. CAddrMan::Select() is defined here:
https://github.com/bitcoin/bitcoin/blob/1488f55fa57a1400a57be837b574183f019c7855/src/addrman.cpp#L413
it can choose from the "new" and "tried" tables.
legendary
Activity: 2058
Merit: 1416
aka tonikt
What I've learned about this is that there are bots out there that feed this attack.
Here are some of their IPs:
Code:
103.107.198.132
104.149.35.146
104.200.131.120
139.5.177.224
141.98.103.172
144.48.37.76
152.89.163.172
165.231.253.44
172.104.10.187
173.244.211.90
174.127.84.12
176.107.184.136
176.113.74.253
176.222.34.111
178.175.133.100
180.149.231.156
185.104.185.164
185.106.102.204
185.122.168.248
185.130.184.115
185.156.175.109
185.169.233.205
185.189.114.27
185.191.204.131
185.203.122.18
185.216.34.99
185.225.28.44
185.236.201.133
185.236.201.230
185.240.244.5
185.93.2.199
185.99.3.105
189.1.168.147
192.145.125.36
193.148.18.28
193.27.12.46
193.32.210.165
194.150.167.78
194.36.110.182
195.158.248.4
195.206.104.157
195.206.105.93
196.245.151.4
2.58.46.236
208.78.41.68
217.138.197.76
217.146.92.233
31.13.191.132
37.221.112.62
43.249.36.137
45.141.153.237
45.144.113.44
45.249.222.252
45.34.7.4
45.83.91.196
45.89.174.116
46.102.153.68
5.101.145.47
5.101.145.50
64.120.88.150
68.232.180.194
77.81.191.3
86.105.9.92
87.239.255.38
89.164.99.107
89.249.64.171
91.132.136.238
91.205.230.194
92.119.18.253
93.190.143.97
94.46.223.22
95.174.66.28
But you can't connect to them - you have to wait for them to connect to you.

Such a fake node seems to connect to (all?) the known bitcoin nodes, seemingly at random.
Upon connecting, it does the version handshake, pretending to be Bitcoin Core (I've seen /Satoshi:0.21.0/ and /Satoshi:0.21.1/).
Then, without any delay, it starts sending addr messages, each containing 10 records.
After sending 500 such messages (so 5000 addresses total), it just disconnects, literally a few seconds after connecting.
Later it will come back, minutes or hours later, to do the same...
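
The pattern described above (500 addr messages of 10 records each, all within seconds of connecting) could be flagged with a simple per-connection counter. The following is only a sketch: the class name, the 1000-record threshold and the 60-second window are hypothetical choices for illustration, not taken from any existing client.

```python
from dataclasses import dataclass

@dataclass
class PeerAddrStats:
    """Per-connection counters for received addr messages (hypothetical helper)."""
    connected_at: float
    addr_messages: int = 0
    addr_records: int = 0

    def on_addr(self, record_count: int) -> None:
        """Record one received addr message carrying record_count addresses."""
        self.addr_messages += 1
        self.addr_records += record_count

    def looks_like_spammer(self, now: float,
                           max_records: int = 1000,
                           window_seconds: float = 60.0) -> bool:
        """True if the peer pushed more than max_records addresses
        within window_seconds of connecting."""
        return (now - self.connected_at <= window_seconds
                and self.addr_records > max_records)

# Replaying the observed pattern: 500 messages with 10 records each.
peer = PeerAddrStats(connected_at=0.0)
for _ in range(500):
    peer.on_addr(10)
print(peer.addr_records)                 # 5000
print(peer.looks_like_spammer(now=5.0))  # True
```

A node using something like this could drop the addresses from such a connection instead of storing or relaying them; the right thresholds would need tuning against real traffic.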

legendary
Activity: 2058
Merit: 1416
aka tonikt
Thanks @vasild.
Yes, I'm running my own software and was asking people running core to tell me what they see at their nodes.
And I asked if the core had a limit, just to know what to expect or how it would handle the problem.
And with the 700k number, I just wanted to give you an idea of how big the problem was.
BTW, I cleared the DB exactly two days ago and now it has over 800k records. So it's definitely still ongoing...

Anyway.
So, please correct me if I'm wrong, but the thousand 'new buckets' approach and each node being able to access only 64 of them, does not seem to be helping much, considering that all the nodes advertise incoming addresses without checking them.

That's basically what I'm seeing.


Now imagine scenario that you're starting a new node, with a brand new IP.
It is going to have a hard time getting incoming connections anytime soon, considering that it competes with hundreds of thousands of fake IPs.

Plus every node loses time trying to connect to these fake addresses.
Not sure what Core's algorithm for choosing a new IP to connect to is, but whatever it is, it will surely also have to deal with a lot of dead tries.

Any solutions?
newbie
Activity: 3
Merit: 27
What is the core's algorithm for selecting addresses to return after receiving getaddr request?

It starts from here:
https://github.com/bitcoin/bitcoin/blob/531c2b7c04898f5a2097f44e8c12bfb2f53aaf9b/src/net_processing.cpp#L3681

Does it only pick those from the "tried" buckets?

No. But it deliberately avoids addresses it has tried to connect without success and some other "terrible" ones:
https://github.com/bitcoin/bitcoin/blob/531c2b7c04898f5a2097f44e8c12bfb2f53aaf9b/src/addrman.cpp#L46
called from here:
https://github.com/bitcoin/bitcoin/blob/531c2b7c04898f5a2097f44e8c12bfb2f53aaf9b/src/addrman.cpp#L571
`vRandom` contains all addresses, both "new" and "tried".
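
A rough sketch of the selection described above: take all known addresses (both "new" and "tried"), shuffle them, skip the "terrible" ones, and stop at a cap. The cap values (2500 or 23%) are the ones quoted elsewhere in this thread. The `is_terrible` rule here is a simplified stand-in, not the real criteria in addrman.cpp, and the `AddrInfo` fields are assumptions made for the example.

```python
import random
from dataclasses import dataclass

MAX_ADDRESSES = 2500   # cap quoted elsewhere in this thread
MAX_PERCENT = 23       # cap quoted elsewhere in this thread

@dataclass
class AddrInfo:
    address: str
    failed_attempts: int = 0     # connection attempts without success
    ever_connected: bool = False

def is_terrible(info: AddrInfo, max_failures: int = 3) -> bool:
    """Simplified stand-in: never connected and failed several times."""
    return not info.ever_connected and info.failed_attempts >= max_failures

def get_addr(all_addrs: list[AddrInfo], rng: random.Random) -> list[AddrInfo]:
    """Return a random sample of known addresses, skipping terrible ones."""
    limit = min(MAX_ADDRESSES, len(all_addrs) * MAX_PERCENT // 100)
    candidates = list(all_addrs)   # both "new" and "tried" entries
    rng.shuffle(candidates)
    result = []
    for info in candidates:
        if len(result) >= limit:
            break
        if not is_terrible(info):
            result.append(info)
    return result

# 100 made-up addresses; those with 3 or 4 failed attempts count as terrible.
addrs = [AddrInfo(f"10.0.0.{i}", failed_attempts=i % 5) for i in range(100)]
sample = get_addr(addrs, random.Random(1))
print(len(sample))  # 23
```

Note that an address nobody has ever tried passes this filter, which is why untested junk can still be handed out in replies.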

Same for sending spontaneous addr messages: does it have to "try" it first, before it can route a new addr to its peers?
I am not completely sure, but it seems like I'm getting (most of) those fake addresses from legit Bitcoin Core peers.
I have a suspicion that because of the algorithm Bitcoin Core uses for routing new addresses, it's somehow facilitating this problem.

Here is what happens when an `addr` message is received:
https://github.com/bitcoin/bitcoin/blob/531c2b7c04898f5a2097f44e8c12bfb2f53aaf9b/src/net_processing.cpp#L2753
under some conditions, every address from the `addr` message is relayed to ~2 other, random peers:
https://github.com/bitcoin/bitcoin/blob/531c2b7c04898f5a2097f44e8c12bfb2f53aaf9b/src/net_processing.cpp#L2801
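
As a simplified illustration of that relay step: each address from an incoming `addr` message is forwarded to about 2 other peers, with no connection attempt to check it first. This sketch uses a plain random choice and ignores the "some conditions" mentioned above; the function names are hypothetical and the actual peer selection in net_processing.cpp differs.

```python
import random

def relay_targets(peers: list[str], source_peer: str,
                  rng: random.Random, fanout: int = 2) -> list[str]:
    """Pick up to `fanout` random peers, other than the sender, to forward to."""
    candidates = [p for p in peers if p != source_peer]
    return rng.sample(candidates, min(fanout, len(candidates)))

def on_addr_message(addresses: list[str], peers: list[str],
                    source_peer: str, rng: random.Random) -> dict[str, list[str]]:
    """Map each received address to the peers it gets forwarded to.
    The address itself is never contacted or validated here."""
    return {addr: relay_targets(peers, source_peer, rng) for addr in addresses}

# Made-up peers and addresses for illustration.
peers = ["peer-A", "peer-B", "peer-C", "peer-D"]
received = ["203.0.113.5:8333", "198.51.100.9:12345"]
plan = on_addr_message(received, peers, "peer-A", random.Random(0))
for addr, targets in plan.items():
    print(addr, "->", targets)
```

With a fanout of 2 per hop, a junk address injected at one node keeps spreading as long as the receiving nodes keep relaying it, which matches the observation that fake addresses arrive from legitimate peers.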

No attempt is made to verify that a bitcoin client is listening on that address by connecting to it. That would be too slow and create another problem - DoS by sending the victim's address to e.g. 10k bitcoin nodes, all of them rushing immediately to verify if somebody is listening there. Also, it could be that the address is of a legit node, which is just shut down temporarily.

You shouldn't assume everyone uses Bitcoin Core. OP is the developer of an alternative full node client, so it's likely he's talking about his gocoin node.

I assumed it is Bitcoin Core because the post reads like "I have 700k addresses ... does Bitcoin Core have limit...". Thanks for the clarification!
legendary
Activity: 2058
Merit: 1416
aka tonikt
Here's the reply I got from Pieter Wuille about this subject
Quote
Each group of source IPs (/16s etc) selects a subset of just 64 buckets (salted using a host-specific secret key), and inserts the newly received IPs in a position in a bucket in one of those, if certain criteria are met (the position was empty, or it held an IP address that also occurs elsewhere in the table already). This limits the impact an attacker can have, because they cannot under any circumstances affect IPs in buckets outside of the 64 their group maps to.
Thanks. That's very helpful.
Will probably look into implementing something like this.


What is the core's algorithm for selecting addresses to return after receiving getaddr request?
Does it only pick those from the "tried" buckets?

Same for sending spontaneous addr messages: does it have to "try" it first, before it can route a new addr to its peers?
I am not completely sure, but it seems like I'm getting (most of) those fake addresses from legit Bitcoin Core peers.
I have a suspicion that because of the algorithm Bitcoin Core uses for routing new addresses, it's somehow facilitating this problem.

And I don't send any getaddr, FWIW.
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
I think it's the same as the result of the getnodeaddresses 0 RPC.
Note: it will take a while (GUI will freeze while loading) if you have hundreds of thousands of entries in your peers database.
The limit is 2500 or 23%, as indicated in the docs. Unfortunately that's not very useful, because you need the entire peers list to be displayed, or at least the relevant parts, and not just a sampling. Thanks though!
newbie
Activity: 3
Merit: 27
All my nodes' peers databases are now over 700k records
...
Does Bitcoin Core have a limit on the number of peers, above which it won't accept new addresses into the database?

Yes, the limit is (1024+256)*64 = 81920 addresses (see https://github.com/bitcoin/bitcoin/blob/7e1ba37b5daceda222b138cbf61bbdeda87d21fd/src/addrman.h#L159-L162). How come you have 700k+ records?

How can you tell whether an address is "completely random" or an address of a legit node that is down?
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
Here's the reply I got from Pieter Wuille about this subject:

> This is an interesting read: https://bitcointalksearch.org/topic/loads-of-fake-peers-advertised-on-bitcoin-network-5348856
>
> So according to this, somebody is spamming the bitcoin network with addr messages pointing to invalid addresses and ports, which bloats the peers.dat and the corresponding structure in memory.

The peers.dat file and the structure in memory have a fixed size, so those are not a problem.

> Since peers.dat uses a custom record type which I don't know how to parse, I wasn't able to check specifics of IP addresses listed in there, but I believe I have a workaround to prevent this kind of thing from happening. Exactly how easy or difficult it will be to implement this change I don't know.

The "addrman" database is organized into 1024 buckets with "new" addresses (which we haven't tried to connect to), and 256 buckets with "tried" addresses (which we have connected to ourselves). Each bucket consists of 64 positions, and each of those can hold 1 address. Along with the addresses we remember where we originally heard about them (which IP).

Each group of source IPs (/16s etc) selects a subset of just 64 buckets (salted using a host-specific secret key), and inserts the newly received IPs in a position in a bucket in one of those, if certain criteria are met (the position was empty, or it held an IP address that also occurs elsewhere in the table already). This limits the impact an attacker can have, because they cannot under any circumstances affect IPs in buckets outside of the 64 their group maps to.
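
The bucket scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Bitcoin Core's code: it handles IPv4 only, treats the /16 as the network group, and uses a plain SHA-256 helper in place of Core's own hashing and serialization. All function and constant names are made up for the example.

```python
import hashlib
import ipaddress

NEW_BUCKET_COUNT = 1024          # "new" buckets, per the description above
BUCKETS_PER_SOURCE_GROUP = 64    # buckets reachable by one source group
BUCKET_SIZE = 64                 # positions per bucket

def group_of(ip: str) -> str:
    """Simplified network group: the /16 of an IPv4 address."""
    return str(ipaddress.ip_network(f"{ip}/16", strict=False))

def _h(*parts: bytes) -> int:
    """Hash helper: SHA-256 over length-prefixed parts, as an integer."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(2, "big") + part)
    return int.from_bytes(digest.digest(), "big")

def new_bucket(secret_key: bytes, addr_ip: str, source_ip: str) -> int:
    """Bucket for an address, restricted to the 64 buckets of the source's group."""
    src_group = group_of(source_ip).encode()
    addr_group = group_of(addr_ip).encode()
    # Which of the source group's 64 slots this address group lands in.
    slot = _h(secret_key, addr_group, src_group) % BUCKETS_PER_SOURCE_GROUP
    # Map (source group, slot) to one of the 1024 buckets.
    return _h(secret_key, src_group, slot.to_bytes(1, "big")) % NEW_BUCKET_COUNT

# One spamming source announces 5000 addresses from 5000 different /16 groups.
key = b"host-specific secret"
buckets = {new_bucket(key, f"{a}.{b}.0.1", "203.0.113.7")
           for a in range(1, 101) for b in range(0, 50)}
print(len(buckets) <= BUCKETS_PER_SOURCE_GROUP)  # True
```

However many addresses the single source sends, they can only land in at most 64 of the 1024 buckets, so at most 64 * 64 = 4096 positions are exposed to that source group in this sketch.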

This database structure is a design from 2012, which was significantly improved following recommendations in the Eclipse Attacks paper (https://cs-people.bu.edu/heilman/eclipse/).

> - Change the AddrDb updating functionality so that it does not add nodes that are unreachable. Not unreachable by timeout, but "connection refused" kind of errors.

In a way we have that; there are separate tables in peers.dat for new and tried addresses. I don't think it's feasible to not add untried addresses at all, as our ability to create connections is far too low to try everything we receive. But I think the existing structure should reasonably protect against spam (in terms of database poisoning; there is certainly a processing cost to it).

Cheers,

--
Pieter

And this is my message on the mailing list: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-July/019265.html

TL;DR the addrman database and peers.dat are capped to a specific size so at least excessive memory usage, and in the worst case a memory leak, can't happen.
legendary
Activity: 2646
Merit: 6681
Self-proclaimed Genius
How would I be able to check something like this myself with my own node?
Are you periodically retrieving the data from peers.dat or are you using a better method? If it's the former, then I'll parse the peers.dat later on at fixed intervals and see what I can find.
I think it's the same as the result of the getnodeaddresses 0 RPC.
Note: it will take a while (GUI will freeze while loading) if you have hundreds of thousands of entries in your peers database.
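
For checking this from a script rather than the GUI, a minimal sketch along these lines might help. It assumes `bitcoin-cli` is on the PATH and can reach a running node, and it relies only on the `address` and `port` fields of the `getnodeaddresses` output. Counting entries per port is just one possible way to look for oddities, and the sample data below is made up.

```python
import json
import subprocess
from collections import Counter

def fetch_node_addresses() -> str:
    """Run `bitcoin-cli getnodeaddresses 0` and return its raw JSON output.
    Assumes bitcoin-cli is on PATH and can reach a running node."""
    return subprocess.run(
        ["bitcoin-cli", "getnodeaddresses", "0"],
        check=True, capture_output=True, text=True,
    ).stdout

def ports_histogram(raw_json: str) -> Counter:
    """Count known addresses per port from getnodeaddresses output."""
    return Counter(entry["port"] for entry in json.loads(raw_json))

# Made-up sample standing in for real RPC output.
sample = json.dumps([
    {"address": "203.0.113.5", "port": 8333},
    {"address": "198.51.100.9", "port": 8333},
    {"address": "192.0.2.1", "port": 12345},
])
print(ports_histogram(sample).most_common())  # [(8333, 2), (12345, 1)]
```

On a live node, `ports_histogram(fetch_node_addresses())` would give the same kind of summary for the actual peers database; running it at fixed intervals shows how the contents change over time.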
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
Yes, they are still coming.

You will only see them in your node's peers database.
I took a quick look at the node on my server. I'm seeing a few addr messages with fewer than 10 IPs being sorted into the buckets every few seconds.

Are you periodically retrieving the data from peers.dat or are you using a better method? If it's the former, then I'll parse the peers.dat later on at fixed intervals and see what I can find.
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
I am still not seeing anything out of the ordinary. So either they are hitting specific IPs / nodes or my SonicWall is blocking them for some reason.
I do have the SonicWall configured to block botnets, so if the connections are coming from known bad IPs they might never make it in. But other than that I have no idea.

This could actually be a good feature to implement in Bitcoin Core, no? Botnet detection and blocking. Without hooking up to any third-party software or API, you could make Core read a file that has one blocked subnet per line. Nodes discovered in those subnets wouldn't even be added to a bucket or queried for additional peers, which can thwart an attack like this.
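
A minimal sketch of what such a file-based check could look like. The file format (one subnet per line, with `#` comments) and the function names are assumptions made for illustration; nothing like this exists in Bitcoin Core as described here. The example subnets are /24s around two of the bot IPs listed earlier in the thread, chosen arbitrarily.

```python
import ipaddress

def parse_blocklist(text: str) -> list:
    """Parse one subnet per line; blank lines and '#' comments are skipped."""
    networks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            networks.append(ipaddress.ip_network(line, strict=False))
    return networks

def is_blocked(ip: str, networks: list) -> bool:
    """True if the address falls inside any blocked subnet."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)

# Hypothetical blocklist contents.
blocklist_text = """
# subnets to ignore
185.236.201.0/24
5.101.145.0/24   # seen sending addr spam
"""
blocked = parse_blocklist(blocklist_text)
print(is_blocked("185.236.201.133", blocked))  # True
print(is_blocked("8.8.8.8", blocked))          # False
```

A node would run `is_blocked` on every advertised address before storing it. Blocking by subnet is coarse, so legitimate nodes sharing a range with a bot would be ignored too.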

To supply the actual IPs themselves you could change Core to record invalid IP address/port combos in the file and advertise a ZMQ message for peers to retrieve your "peer ignore list" so they can update their files too. In this way, the entire network becomes resilient to this kind of attack (the nodes that upgrade, at least).

edit: I just sent this idea to the bitcoin-dev mailing list, let's see what they say.
legendary
Activity: 2058
Merit: 1416
aka tonikt
I am still not seeing anything out of the ordinary. So either they are hitting specific IPs / nodes or my SonicWall is blocking them for some reason.
I do have the SonicWall configured to block botnets, so if the connections are coming from known bad IPs they might never make it in. But other than that I have no idea.

@piotr_n are you still seeing the attack?

-Dave
Yes, they are still coming.

You will only see them in your node's peers database.