Author

Topic: Added ping time measurement to bitcoind, some questions (Read 1784 times)

member
Activity: 106
Merit: 10
OK, I cleaned up what I originally had, and got it working.

It now works correctly when peer sends pong with nonce of zero (overlapping pongs will confuse the timing, as noted in the rather nice comment in the existing ping receiver).

I added a ratelimit to RPC-requested ping, so that it will limit to 1/second per peer.  The regular automated keepalive ping is immune to this.

The "pingtime" field in getpeerinfo is newly added, and has the pingtime in decimal seconds.  I assume that Bitcoin users are well used to small decimal numbers with many digits by now Smiley

It's now clean enough that I feel it's a pull request candidate:

https://github.com/bitcoin/bitcoin/pull/2937

Some more thoughts:

1) If ping is still outstanding, and peer is taking a very long time to respond, this isn't indicated anywhere.  The pingtime field contains the previously completed ping's time, and updates atomically (it doesn't change until the next ping is fully completed).  So, a naive consumer of the data could be fooled into thinking a badly lagged node is still responsive.  Thought about indicating the current waiting time in the pingtime field instead, if the current waiting time is greater than the previously completed ping's time.  Good idea, or not?

2) If peer is flooding me with pings, I do not ratelimit those.  Reason is, if network is lagged and it suddenly gets better, a lot of previous requests will come through all at once, having been bunched up in buffers.  So, peer could be obeying ratelimit on their end, but by the time it got to me, it would wrongly look to me as if peer is flooding.  Also, if I ratelimit and drop excessive pings, it would mean the first ping gets responded to, but the later pings don't.  The nonce system expects the opposite: it throws away previous nonces when it rolls up a new nonce for a later ping.  That would mean that the ping would essentially be lost in the shuffle.  This is pedantic, anyway, and some kind of (generous) rate limit is probably a good idea as a sanity check.  Since ping commands are limited to 1/second, perhaps limit incoming ping responses (pongs) to 2/second or 3/second?

3) Similarly, if peer is flooding me with pongs, I just swallow them, since they have no reply.  The nonce checking code does not allow a previous nonce to be reused, so duplicated pongs will merely be dropped on the floor.  I wonder if should be more aggressive against peers who are flooding, perhaps Misbehaving() them?

Josh
legendary
Activity: 1526
Merit: 1134
Yeah, manual pings are fine, no need to rate limit them. I was thinking of automatic pinging.

By the way, getpeerinfo should tell you what software isn't implementing the protocol correctly. There is a subver field that we use like a user agent field.
member
Activity: 106
Merit: 10
Good point.

The current thinking is to ratelimit outbound ping requests to maximum of one ping per second.  If faster than that, outbound pings would not be sent.  Similarly, inbound pings would be dropped (no pong would be sent) if they come in too fast.  User action is necessary to generate these additional pings, anyway, they would never be done automatically at such a high rate.  The existing keepalive formula, despite its hairiness/shortcomings, remains unchanged by my patch.

As for not pinging inbound connections, I think it would still be best to ping all connections, that way, ping times can be measured/compared across all peers.  It makes the data more useful, to collect it from everybody, not just a subset.

Good to know about bitcoinj already having a good implementation of ping nonces.
staff
Activity: 4284
Merit: 8808
I think the system already sends a ping every 180 seconds, anyway?
It does not, only if the channel is otherwise inactive.
legendary
Activity: 1232
Merit: 1094
Adding the ping info to peerinfo is a good idea, but doesn't require adding a manual ping system.  I think the system already sends a ping every 180 seconds, anyway?

Peers could disconnect if there is ping spam.  I don't see how it is different from other bandwidth spam though.
staff
Activity: 4284
Merit: 8808
I added the ping/pong system for bitcoinj users. That's the code you're seeing send pings with nonces, I suspect.
Please only ping nodes that you connect outbound to. This allows nodes on mobile connections to better manage bandwidth and battery life.
I don't see any problem with ping on user demand, or sending rare pings (e.g. at timescale of 1/minute) at the same time as other messages, do you?
legendary
Activity: 1526
Merit: 1134
I added the ping/pong system for bitcoinj users. That's the code you're seeing send pings with nonces, I suspect.

Please only ping nodes that you connect outbound to. This allows nodes on mobile connections to better manage bandwidth and battery life.
member
Activity: 106
Merit: 10
I added ping time measurement to bitcoind, and it works, using the "ping" and "pong" commands (ping sends a unique nonce, pong echoes that nonce).

I added the ping time as an additional output to getpeerinfo.  The ping time is in seconds, as a decimal number, so it often appears as a very small decimal number with many places, looking rather appropriate for Bitcoin Smiley

Now, I can see, in the table of connected peers, the ping time for each peer.  The ping command is handled in the normal command handling function, it isn't given additional priority or anything like that, so the ping time also includes all command processing backlog time, revealing which nodes are heavily congested with commands to be processed (such as nodes trying to catch up on blockchain synchronization).

I also added a new RPC command, "ping", to request that a ping command be sent out to all peers.  This is in additional to the usual case of sending out the ping command as a keepalive.

It's not ready for pull request yet, but to take a look at it, it's here:

https://github.com/Krellan/bitcoin/commit/fbc91c10d818432070ef8bac56b07707f43ef0b1

Interesting observations:

1) The reference bitcoind sends ping commands with 0 (zero) as a nonce, however, during my testing I encountered other peers that sent me valid nonces (nonzero, looks random enough to me).  So, there must be other patches floating around that also add valid nonces to the ping command?

2) I noticed a lot of nonce mismatches.  Upon further inspection, there are a lot of peers that send a pong reply with 0 (zero) as the nonce, regardless of what I had sent to it in the original ping command.  I wonder what causes this?  The reference bitcoind doesn't do this, it is correct, and it looks like a bug in some other client.

3) In my implementation I only keep track of one outstanding nonce/timestamp at a time.  There's nothing stopping the user from sending another ping while still waiting for the first to complete, though.  I wonder if it is worth it to complicate the implementation by using a vector so I can remember more than one nonce/timestamp at a time per node, or only consider the most recent ping attempt to be the only valid ping?

4) I'm using a unique random nonce for each peer, which is rather wasteful of valuable random number entropy.  Perhaps use the same nonce for all peers in a single ping command, or would that open up other problems?

5) Adding the new RPC command "ping" opens up an opportunity for a user with RPC access to attempt to DoS the rest of the Bitcoin network, by sending out ping commands as fast as possible.  I need to ratelimit this.  Any suggestions on what a reasonable ratelimit would be?

6) Similarly to the above, if a peer is sending me ping commands too quickly, I should also ratelimit my pong replies.  Any suggestions for that?  There are other things as well, such as sending me unsolicited pong commands (without a corresponding ping).  If peer is doing these, should I punish them with Misbehaving() or just silently ignore them?

Thanks!

This is a rather cool feature, I'm happy about it, and it was inspired by the "pings" webpage of p2pool.

Josh
Jump to: