Loads of fake peers advertised on bitcoin network

ABCbits

legendary

Activity: 2870

Merit: 7490

Quote from: dkbit98 on August 08, 2021, 02:11:37 PM

Quote from: matthias.kit on August 08, 2021, 05:54:53 AM

We do not know who the spammers are but this might suggest that one of their objectives is learning information about the Bitcoin P2P network, esp. about its topology.

It smells (read: stinks) to me like it could be some government agency, and this is happening at the same time as they are pushing hard for some crazy bill regulations for Bitcoin and all crypto.
I can't prove anything but it could be they are collecting information for some tracking and surveillance of all nodes and transactions.

I doubt it. Connections between nodes aren't encrypted (unless Tor or a good VPN is used), so a government agency could intercept the connections and gain more information that way.

Quote from: pooya87 on August 08, 2021, 11:49:59 PM

Quote from: dkbit98 on August 08, 2021, 02:11:37 PM

So is the number of nodes from Bitnodes more correct than Luke Dashjr's version, which shows many more nodes?
https://luke.dashjr.org/programs/bitcoin/files/charts/software.html

They are both correct (ignoring the errors coming from their methodology). Bitnodes shows nodes that are listening for incoming connections (among spy nodes), but the other shows all nodes in existence, including nodes that don't listen for incoming connections.

@piotr_n mentioned that Luke's DNS seed has been down for a while, so I wonder how accurate it is.

matthias.kit

newbie

Activity: 2

Merit: 36

Sorry for the late reply.

Quote

The conclusion interests me. Bitcoin has 12775 nodes (according to https://bitnodes.io/, which excludes nodes that don't accept incoming connections), so a DoS attack would be quite expensive and would probably only reduce propagation speed across the whole network. However, it's a major concern for altcoins that have very few full nodes.

I agree that a DoS attack against the whole Bitcoin P2P network would probably be very expensive. However, attacking the most connected nodes might be a more cost-effective attack strategy than attacking random nodes. Therefore it is desirable to protect the identity of well-connected nodes.

Quote from: pooya87 on August 08, 2021, 11:49:59 PM

They are both correct (ignoring the errors coming from their methodology). Bitnodes shows nodes that are listening for incoming connections (among spy nodes), but the other shows all nodes in existence, including nodes that don't listen for incoming connections.

See https://luke.dashjr.org/programs/bitcoin/files/charts/historical.html which differentiates between listening and non-listening nodes.

pooya87

legendary

Activity: 3472

Merit: 10611

Quote from: ABCbits on August 09, 2021, 04:51:34 AM

@piotr_n mentioned that Luke's DNS seed has been down for a while, so I wonder how accurate it is.

The two don't necessarily have to be related. The DNS server may be hosted somewhere else or use an entirely different setup, especially since DNS seeds report listening nodes, not all nodes. Also, dnsseed.bitcoin.dashjr.org is up and running right now. Here is an online tool to quickly dig DNS: https://toolbox.googleapps.com/apps/dig/#A/

pooya87

legendary

Activity: 3472

Merit: 10611

Quote from: dkbit98 on August 08, 2021, 02:11:37 PM

So is the number of nodes from Bitnodes more correct than Luke Dashjr's version, which shows many more nodes?
https://luke.dashjr.org/programs/bitcoin/files/charts/software.html

They are both correct (ignoring the errors coming from their methodology). Bitnodes shows nodes that are listening for incoming connections (among spy nodes), but the other shows all nodes in existence, including nodes that don't listen for incoming connections.

dkbit98

legendary

Activity: 2212

Merit: 7064

Quote from: matthias.kit on August 08, 2021, 05:54:53 AM

We do not know who the spammers are but this might suggest that one of their objectives is learning information about the Bitcoin P2P network, esp. about its topology.

It smells (read: stinks) to me like it could be some government agency, and this is happening at the same time as they are pushing hard for some crazy bill regulations for Bitcoin and all crypto.
I can't prove anything but it could be they are collecting information for some tracking and surveillance of all nodes and transactions.

Quote

The conclusion interests me. Bitcoin has 12775 nodes

So is the number of nodes from Bitnodes more correct than Luke Dashjr's version, which shows many more nodes?
https://luke.dashjr.org/programs/bitcoin/files/charts/software.html

matthias.kit

newbie

Activity: 2

Merit: 36

We run a monitor of the Bitcoin P2P network (https://www.dsn.kastel.kit.edu/bitcoin/) at Karlsruhe Institute of Technology (KIT) and noticed the invalid addresses, too. Based on the findings posted above, we analyzed what the spamming could be useful for. We found that the propagation of addr messages can be used to estimate the number of neighbors of public peers running Bitcoin Core and to match multiple addresses to the same public peer.
You can find our report at https://arxiv.org/abs/2108.00815.
We do not know who the spammers are but this might suggest that one of their objectives is learning information about the Bitcoin P2P network, esp. about its topology.

Coding Enthusiast

legendary

Activity: 1042

Merit: 2805

Bitcoin and C♯ Enthusiast

Even worse, sometimes a big percentage of the addresses returned by DNS seeds are not valid.
For example, the last time I tested, about 20% of the addresses returned by seed.bitcoin.sipa.be were dead ends.

piotr_n

legendary

Activity: 2058

Merit: 1416

aka tonikt

~~They seem to have stopped now - about a week ago, actually.~~

However, talking about the addresses, three of Bitcoin's DNS seeds seem to have been down for a while already:

Code:

dnsseed.bitcoin.dashjr.org
seed.bitcoinstats.com
seed.bitcoin.jonasschnelli.ch
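A minimal Python sketch for checking whether these seeds currently answer DNS queries. It uses only the standard library resolver; the helper name `seed_addresses` and its injectable `resolver` parameter are made up for illustration, and an empty result only means the lookup failed from wherever the script runs.

```python
import socket

# The three seeds listed above.
SEEDS = [
    "dnsseed.bitcoin.dashjr.org",
    "seed.bitcoinstats.com",
    "seed.bitcoin.jonasschnelli.ch",
]

def seed_addresses(host, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPs a DNS seed resolves to.

    An empty list means the lookup failed (seed down, or no DNS from here).
    `resolver` is injectable so the function can be tried without a network.
    """
    try:
        infos = resolver(host, 8333, proto=socket.IPPROTO_TCP)
    except OSError:  # socket.gaierror is a subclass of OSError
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `seed_addresses(seed)` for each entry in `SEEDS` and printing the length of the result gives a rough up/down picture.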

vasild

newbie

Activity: 3

Merit: 27

Quote from: piotr_n on July 14, 2021, 09:37:39 AM

the thousand 'new buckets' approach and each node being able to access only 64 of them, does not seem to be helping much, considering that all the nodes advertise incoming addresses without checking them.

It helps against a personal attack - without it one could connect to a victim node and immediately fill all of its "new" buckets with junk. It only slows down a network-wide junk spread.

Quote from: piotr_n on July 14, 2021, 09:37:39 AM

Now imagine a scenario where you're starting a new node, with a brand new IP.
It is going to have a hard time getting incoming connections anytime soon, considering that it competes with hundreds of thousands of fake IPs.

Competes in which way? If a node has a set of 20 addresses, 19 of which are junk and just 1 real and it wants to connect to somebody, then after some failed attempts it is going to connect to the 1 real one. There is no hurry. Trying to connect to a non-listening node (junk) takes a few seconds.
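To put a rough number on the 20-address example: a back-of-the-envelope sketch, assuming the node picks addresses uniformly at random and never retries one. That is a simplification and not how CAddrMan::Select() actually chooses, and the 5 seconds per failed attempt is an assumed stand-in for "a few seconds".

```python
from fractions import Fraction

def expected_failed_attempts(junk: int, real: int) -> Fraction:
    """Expected number of junk addresses tried before the first real one.

    Assumes uniform sampling without replacement, for which the expectation
    is junk / (real + 1).
    """
    if real < 1:
        raise ValueError("need at least one real address")
    return Fraction(junk, real + 1)

failures = expected_failed_attempts(junk=19, real=1)
seconds_per_failure = 5  # assumption: "a few seconds" per dead address

print(float(failures))                        # 9.5
print(float(failures * seconds_per_failure))  # 47.5
```

Under these assumptions the node wastes well under a minute on average before it reaches the one real peer.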

Quote from: piotr_n on July 14, 2021, 09:37:39 AM

Plus every node loses time trying to connect to these fake addresses.
Not sure what Core's algorithm for choosing a new IP to connect to is, but whatever it is, it will surely also have to deal with a lot of dead tries.

Yes, the failed attempts will waste some time.

Opening new connections in Bitcoin Core happens in CConnman::ThreadOpenConnections():
https://github.com/bitcoin/bitcoin/blob/1488f55fa57a1400a57be837b574183f019c7855/src/net.cpp#L1832
the address to connect to is chosen by calling CAddrMan::Select():
https://github.com/bitcoin/bitcoin/blob/1488f55fa57a1400a57be837b574183f019c7855/src/net.cpp#L2047
with some further filtering afterwards. CAddrMan::Select() is defined here:
https://github.com/bitcoin/bitcoin/blob/1488f55fa57a1400a57be837b574183f019c7855/src/addrman.cpp#L413
It can choose from the "new" and "tried" tables.

piotr_n

legendary

Activity: 2058

Merit: 1416

aka tonikt

What I've learned about this: there are bots out there that feed this attack.
Here are some of their IPs:

Code:

103.107.198.132
104.149.35.146
104.200.131.120
139.5.177.224
141.98.103.172
144.48.37.76
152.89.163.172
165.231.253.44
172.104.10.187
173.244.211.90
174.127.84.12
176.107.184.136
176.113.74.253
176.222.34.111
178.175.133.100
180.149.231.156
185.104.185.164
185.106.102.204
185.122.168.248
185.130.184.115
185.156.175.109
185.169.233.205
185.189.114.27
185.191.204.131
185.203.122.18
185.216.34.99
185.225.28.44
185.236.201.133
185.236.201.230
185.240.244.5
185.93.2.199
185.99.3.105
189.1.168.147
192.145.125.36
193.148.18.28
193.27.12.46
193.32.210.165
194.150.167.78
194.36.110.182
195.158.248.4
195.206.104.157
195.206.105.93
196.245.151.4
2.58.46.236
208.78.41.68
217.138.197.76
217.146.92.233
31.13.191.132
37.221.112.62
43.249.36.137
45.141.153.237
45.144.113.44
45.249.222.252
45.34.7.4
45.83.91.196
45.89.174.116
46.102.153.68
5.101.145.47
5.101.145.50
64.120.88.150
68.232.180.194
77.81.191.3
86.105.9.92
87.239.255.38
89.164.99.107
89.249.64.171
91.132.136.238
91.205.230.194
92.119.18.253
93.190.143.97
94.46.223.22
95.174.66.28

But you can't connect to them - you have to wait for them to connect to you.

Such a fake node seems to be connecting to (all?) the known bitcoin nodes, more or less randomly.
Upon connecting, it does the version handshake, pretending to be Bitcoin Core (I've seen /Satoshi:0.21.0/ and /Satoshi:0.21.1/).
Then, without any delay, it starts sending addr messages, each containing 10 records.
After sending 500 such messages (so 5000 addresses total), it just disconnects, literally a few seconds after connecting.
Later it will come back, minutes or hours later, to do the same...
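The pattern above (thousands of addr records within seconds of the handshake) is easy to flag per peer. A minimal Python sketch follows; the class name, the thresholds and the 60-second window are all hypothetical choices for illustration, not anything taken from gocoin or Bitcoin Core.

```python
from dataclasses import dataclass

@dataclass
class PeerAddrStats:
    """Per-peer counters for unsolicited addr traffic (illustrative only)."""
    connected_at: float      # timestamp of the version handshake, in seconds
    addr_messages: int = 0
    addr_records: int = 0

    def on_addr(self, record_count: int) -> None:
        """Account for one received addr message."""
        self.addr_messages += 1
        self.addr_records += record_count

    def looks_like_flood(self, now: float,
                         max_records: int = 1000,
                         window_seconds: float = 60.0) -> bool:
        """True if the peer exceeded max_records within window_seconds of connecting."""
        recently_connected = (now - self.connected_at) <= window_seconds
        return recently_connected and self.addr_records > max_records

# Replay the observed behaviour: 500 messages with 10 records each.
bot = PeerAddrStats(connected_at=0.0)
for _ in range(500):
    bot.on_addr(10)

print(bot.addr_records)                # 5000
print(bot.looks_like_flood(now=3.0))   # True
```

A node could disconnect or simply ignore further addr messages from a flagged peer; which reaction is appropriate is a policy choice this sketch does not make.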

piotr_n

legendary

Activity: 2058

Merit: 1416

aka tonikt

Thanks @vasild.
Yes, I'm running my own software and was asking people running core to tell me what they see at their nodes.
And I asked if the core had a limit, just to know what to expect or how it would handle the problem.
And with the 700k number, I just wanted to give you an idea of how big the problem was.
BTW, I cleared the DB exactly two days ago and now it has over 800k records. So it's definitely still ongoing...

Anyway.
So, please correct me if I'm wrong, but the thousand 'new buckets' approach and each node being able to access only 64 of them, does not seem to be helping much, considering that all the nodes advertise incoming addresses without checking them.

That's basically what I'm seeing.

Now imagine a scenario where you're starting a new node, with a brand new IP.
It is going to have a hard time getting incoming connections anytime soon, considering that it competes with hundreds of thousands of fake IPs.

Plus every node loses time trying to connect to these fake addresses.
Not sure what Core's algorithm for choosing a new IP to connect to is, but whatever it is, it will surely also have to deal with a lot of dead tries.

Any solutions?

vasild

newbie

Activity: 3

Merit: 27

Quote from: piotr_n on July 13, 2021, 10:13:17 AM

What is the core's algorithm for selecting addresses to return after receiving getaddr request?

It starts from here:
https://github.com/bitcoin/bitcoin/blob/531c2b7c04898f5a2097f44e8c12bfb2f53aaf9b/src/net_processing.cpp#L3681

Quote from: piotr_n on July 13, 2021, 10:13:17 AM

Does it only pick those from the "tried" buckets?

No. But it deliberately avoids addresses it has tried to connect to without success and some other "terrible" ones:
https://github.com/bitcoin/bitcoin/blob/531c2b7c04898f5a2097f44e8c12bfb2f53aaf9b/src/addrman.cpp#L46
called from here:
https://github.com/bitcoin/bitcoin/blob/531c2b7c04898f5a2097f44e8c12bfb2f53aaf9b/src/addrman.cpp#L571
`vRandom` contains all addresses, both "new" and "tried".

Quote from: piotr_n on July 13, 2021, 10:13:17 AM

Same for sending spontaneous addr messages: does it have to "try" an address first, before it can route a new addr to its peers?
I am not completely sure, but it seems like I'm getting (most of) those fake addresses from legit Bitcoin Core peers.
I have a suspicion that because of the algorithm bitcoin core uses for routing new addresses, it's somehow facilitating this problem.

Here is what happens when an `addr` message is received:
https://github.com/bitcoin/bitcoin/blob/531c2b7c04898f5a2097f44e8c12bfb2f53aaf9b/src/net_processing.cpp#L2753
under some conditions, every address from the `addr` message is relayed to ~2 other, random peers:
https://github.com/bitcoin/bitcoin/blob/531c2b7c04898f5a2097f44e8c12bfb2f53aaf9b/src/net_processing.cpp#L2801

No attempt is made to verify that a bitcoin client is listening on that address by connecting to it. That would be too slow and create another problem - DoS by sending the victim's address to e.g. 10k bitcoin nodes, all of them rushing immediately to verify if somebody is listening there. Also, it could be that the address is of a legit node, which is just shut down temporarily.
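The relay step can be pictured with a few lines of Python. This is only a sketch of the idea "forward each received address to about 2 other peers without checking it": it uses plain random sampling as a stand-in, whereas the linked net_processing.cpp code applies its own conditions and peer selection. The function and parameter names are hypothetical.

```python
import random

def relay_targets(peers, source, fanout=2, rng=random):
    """Pick up to `fanout` peers, other than the sender, to forward an address to.

    No reachability check is done on the address itself, mirroring the
    behaviour described above.
    """
    candidates = [p for p in peers if p != source]
    return rng.sample(candidates, min(fanout, len(candidates)))

peers = ["peer-a", "peer-b", "peer-c", "peer-d"]
targets = relay_targets(peers, source="peer-a")
print(targets)  # two of peer-b, peer-c, peer-d
```

Because every receiving node repeats this step, a single junk address can keep spreading through the network even though nobody ever connected to it.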

Quote

You shouldn't assume everyone uses Bitcoin Core. OP is the developer of an alternative full node client, so it's likely he's talking about his gocoin node.

I assumed it is Bitcoin Core because the post reads like "I have 700k addresses ... does Bitcoin Core have limit...". Thanks for the clarification!

piotr_n

legendary

Activity: 2058

Merit: 1416

aka tonikt

Quote from: NotATether on July 13, 2021, 02:42:55 AM

Here's the reply I got from Pieter Wuille about this subject

Quote

Each group of source IPs (/16s etc) selects a subset of just 64 buckets (salted using a host-specific secret key), and inserts the newly received IPs in a position in a bucket in one of those, if certain criteria are met (the position was empty, or it held an IP address that also occurs elsewhere in the table already). This limits the impact an attacker can have, because they cannot under any circumstances affect IPs in buckets outside of the 64 their group maps to.

Thanks. That's very helpful.
Will probably look into implementing something like this.

What is the core's algorithm for selecting addresses to return after receiving getaddr request?
Does it only pick those from the "tried" buckets?

Same for sending spontaneous addr messages: does it have to "try" an address first, before it can route a new addr to its peers?
I am not completely sure, but it seems like I'm getting (most of) those fake addresses from legit Bitcoin Core peers.
I have a suspicion that because of the algorithm bitcoin core uses for routing new addresses, it's somehow facilitating this problem.

And I don't send any getaddr, FWIW.

ranochigo

legendary

Activity: 3038

Merit: 4418

Quote from: nc50lc on July 13, 2021, 12:30:09 AM

I think it's the same as the result of the getnodeaddresses 0 RPC.
Note: it will take a while (GUI will freeze while loading) if you have hundreds of thousands of entries in your peers database.

Its limit is 2500 or 23%, as indicated in the docs. Unfortunately that's not very useful, because you need the entire peers list to be displayed, or at least the relevant parts, and not just a sampling. Thanks though!

vasild

newbie

Activity: 3

Merit: 27

Quote from: piotr_n on July 12, 2021, 05:32:28 AM

All my nodes' peers databases are now over 700k records
...
Does bitcoin core have a limit on the number of peers beyond which it won't accept new addresses into the database?

Yes, the limit is (1024+256)*64 = 81920 addresses (see https://github.com/bitcoin/bitcoin/blob/7e1ba37b5daceda222b138cbf61bbdeda87d21fd/src/addrman.h#L159-L162). How come you have 700k+ records?
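The arithmetic as a quick Python check. The constant names here are illustrative rather than copied from addrman.h; the values are the ones quoted in this post.

```python
# Illustrative names; values as quoted above.
NEW_BUCKETS = 1024    # buckets for "new" (untried) addresses
TRIED_BUCKETS = 256   # buckets for "tried" addresses
BUCKET_SIZE = 64      # positions per bucket, one address each

def addrman_capacity(new_buckets=NEW_BUCKETS,
                     tried_buckets=TRIED_BUCKETS,
                     bucket_size=BUCKET_SIZE):
    """Upper bound on the number of addresses the tables can hold."""
    return (new_buckets + tried_buckets) * bucket_size

capacity = addrman_capacity()
print(capacity)                       # 81920
print(round(700_000 / capacity, 1))   # 8.5
```

So a database with 700k records holds roughly 8.5 times more than these tables could.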

How can you tell whether an address is "completely random" or an address of a legit node that is down?

NotATether

legendary

Activity: 1568

Merit: 6660

bitcoincleanup.com / bitmixlist.org

Here's the reply I got from Pieter Wuille about this subject:

Quote from: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-July/019266.html

> This is an interesting read: https://bitcointalksearch.org/topic/loads-of-fake-peers-advertised-on-bitcoin-network-5348856
>
> So according to this, somebody is spamming the bitcoin network with addr message pointing to invalid addresses and ports, which bloats the peers.dat and corresponding structure in memory.

The peers.dat file and the structure in memory have a fixed size, so those are not a problem.

> Since peers.dat uses a custom record type which I don't know how to parse, I wasn't able to check specifics of IP addresses listed in there, but I believe I have a workaround to prevent this kind of thing from happening. Exactly how easy or difficult it will be to implement this change I don't know.

The "addrman" database is organized into 1024 buckets with "new" addresses (which we haven't tried to connect to), and 256 buckets with "tried" addresses (which we have connected to ourselves). Each bucket consists of 64 positions, and each of those can hold 1 address. Along with the addresses we remember where we originally heard about them (which IP).

Each group of source IPs (/16s etc) selects a subset of just 64 buckets (salted using a host-specific secret key), and inserts the newly received IPs in a position in a bucket in one of those, if certain criteria are met (the position was empty, or it held an IP address that also occurs elsewhere in the table already). This limits the impact an attacker can have, because they cannot under any circumstances affect IPs in buckets outside of the 64 their group maps to.

This database structure is a design from 2012, which was significantly improved following recommendations in the Eclipse Attacks paper (https://cs-people.bu.edu/heilman/eclipse/).

> - Change the AddrDb updating functionality so that it does not add nodes that are unreachable. Not unreachable by timeout, but "connection refused" kind of errors.

In a way we have that; there are separate tables in peers.dat for new and tried addresses. I don't think it's feasible to not add untried addresses at all, as our ability to create connections is far too low to try everything we receive. But I think the existing structure should reasonably protect against spam (in terms of database poisoning; there is certainly a processing cost to it).

Cheers,

--
Pieter

And this is my message on the mailing list: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-July/019265.html

TL;DR the addrman database and peers.dat are capped to a specific size so at least excessive memory usage, and in the worst case a memory leak, can't happen.
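The "64 buckets per source group" idea from Pieter's reply can be sketched in Python. This is an illustration under assumptions, not Bitcoin Core's code: SHA-256 stands in for whatever keyed hash Core uses, groups are simplified to IPv4 /16 prefixes, and all names are hypothetical.

```python
import hashlib

NEW_BUCKET_COUNT = 1024         # "new" buckets, as described above
BUCKETS_PER_SOURCE_GROUP = 64   # buckets one source group can reach

def _hash(*parts: bytes) -> int:
    """Stand-in keyed hash: first 8 bytes of SHA-256 over the joined parts."""
    digest = hashlib.sha256(b"|".join(parts)).digest()
    return int.from_bytes(digest[:8], "little")

def group16(ip: str) -> bytes:
    """Simplified network group: the /16 prefix of an IPv4 address."""
    return ".".join(ip.split(".")[:2]).encode()

def new_bucket(secret_key: bytes, addr_ip: str, source_ip: str) -> int:
    """Bucket for an address learned from source_ip."""
    source_group = group16(source_ip)
    # Step 1: the address picks one of 64 slots belonging to the source group.
    slot = _hash(secret_key, group16(addr_ip), source_group) % BUCKETS_PER_SOURCE_GROUP
    # Step 2: (source group, slot) is mapped onto the full range of buckets.
    return _hash(secret_key, source_group, str(slot).encode()) % NEW_BUCKET_COUNT

# One spamming source advertising 5000 different addresses.
key = b"node-secret"
spam = [f"{i % 250 + 1}.{i // 250 % 250}.0.1" for i in range(5000)]
touched = {new_bucket(key, ip, "203.0.113.7") for ip in spam}
print(len(touched))  # never more than 64
```

However many addresses the single source sends, it can only land in its own 64 buckets, and because the key is secret the attacker cannot predict which ones.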

nc50lc

legendary

Activity: 2646

Merit: 6681

Self-proclaimed Genius

Quote from: titular on July 12, 2021, 06:27:59 AM

How would I be able to check something like this myself with my own node?

Quote from: ranochigo on July 12, 2021, 11:50:26 PM

Are you periodically retrieving the data from peers.dat or are you using a better method? If it's the former, then I'll parse the peers.dat later on at fixed intervals and see what I can find.

I think it's the same as the result of the getnodeaddresses 0 RPC.
Note: it will take a while (GUI will freeze while loading) if you have hundreds of thousands of entries in your peers database.
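For anyone who wants to script this instead of using the GUI console, here is a small Python sketch. It assumes `bitcoin-cli` is on the PATH and that your version returns the full list for a count of 0 (older versions cap the result). The helper names are made up, and the sample entries are invented to show the shape of the output (`address` and `port` fields).

```python
import json
import subprocess

DEFAULT_PORT = 8333  # mainnet default P2P port

def fetch_node_addresses(cli="bitcoin-cli"):
    """Run `bitcoin-cli getnodeaddresses 0` and return the parsed JSON list."""
    result = subprocess.run([cli, "getnodeaddresses", "0"],
                            check=True, capture_output=True, text=True)
    return json.loads(result.stdout)

def non_default_port_share(entries):
    """Fraction of entries whose port differs from the default P2P port."""
    if not entries:
        return 0.0
    odd = sum(1 for entry in entries if entry["port"] != DEFAULT_PORT)
    return odd / len(entries)

# Invented sample in the shape of the RPC output, so this runs without a node.
sample = json.loads("""
[
  {"time": 1626000000, "services": 1033, "address": "203.0.113.5",  "port": 8333},
  {"time": 1626000100, "services": 1033, "address": "198.51.100.9", "port": 8333},
  {"time": 1626000200, "services": 1033, "address": "192.0.2.44",   "port": 31337},
  {"time": 1626000300, "services": 1033, "address": "192.0.2.45",   "port": 4711}
]
""")
print(non_default_port_share(sample))  # 0.5
```

With a running node, `non_default_port_share(fetch_node_addresses())` gives the same figure for the real peers database. A non-default port alone does not prove an address is fake, so treat the share as a rough indicator only.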

ranochigo

legendary

Activity: 3038

Merit: 4418

Quote from: piotr_n on July 12, 2021, 02:00:10 PM

Yes, they are still coming.

You will only see them in your node's peers database.

I took a quick look at the node on my server. I'm seeing a few addr messages with fewer than 10 IPs being sorted into the buckets every few seconds.

Are you periodically retrieving the data from peers.dat or are you using a better method? If it's the former, then I'll parse the peers.dat later on at fixed intervals and see what I can find.

NotATether

legendary

Activity: 1568

Merit: 6660

bitcoincleanup.com / bitmixlist.org

Quote from: DaveF on July 12, 2021, 01:42:44 PM

I am still not seeing anything out of the ordinary. So either they are hitting specific IPs / Nodes or my SonicWall is blocking them for some reason.
I do have the sonic configured to block botnets, so if the connections are coming from known bad IPs they might never make it in. But other than that I have no idea.

This could actually be a good feature to implement in Bitcoin Core, no? Botnet detection and blocking. Without hooking up to any third-party software or API, you could make Core read a certain file that has a blocked subnet on each line. Nodes discovered in those subnets wouldn't even be added to the buckets or queried for additional peers, which could thwart an attack like this.

To supply the actual IPs themselves you could change Core to record invalid IP address/port combos in the file and advertise a ZMQ message for peers to retrieve your "peer ignore list" so they can update their files too. In this way, the entire network becomes resilient to this kind of attack (the nodes that upgrade, at least).
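A rough Python sketch of the "one blocked subnet per line" idea, using only the standard `ipaddress` module. The file format, the function names and the sample /24 entry are hypothetical; nothing like this exists in Bitcoin Core as described here.

```python
import ipaddress

def load_blocklist(lines):
    """Parse one subnet per line; blank lines and '#' comments are skipped."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # strict=False tolerates host bits being set, e.g. 185.236.201.133/24
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks

def is_blocked(ip, networks):
    """True if the address falls inside any blocked subnet."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)

blocklist = load_blocklist([
    "# hypothetical entry covering two of the bot IPs listed in this thread",
    "185.236.201.0/24",
    "",
])
print(is_blocked("185.236.201.133", blocklist))  # True
print(is_blocked("8.8.8.8", blocklist))          # False
```

A node would call `is_blocked()` before storing an advertised address or accepting a connection from it.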

edit: I just sent this idea to the bitcoin-dev mailing list, let's see what they say.

piotr_n

legendary

Activity: 2058

Merit: 1416

aka tonikt

Quote from: DaveF on July 12, 2021, 01:42:44 PM

I am still not seeing anything out of the ordinary. So either they are hitting specific IPs / Nodes or my SonicWall is blocking them for some reason.
I do have the sonic configured to block botnets, so if the connections are coming from known bad IPs they might never make it in. But other than that I have no idea.

@piotr_n are you still seeing the attack?

-Dave

Yes, they are still coming.

You will only see them in your node's peers database.
