Topic: Bad bitcoin neighbor permanently prevents progress of good bitcoin node (Read 1515 times)

sr. member
Activity: 278
Merit: 251
Quote
Thanks for following up, ... yep, known and expected behaviour, a side effect of the initial sync process.  The next major version (which will have headers first sync) should greatly improve it.

Thanks for reassuring me that this minor performance problem is known and is being addressed.

FWIW, I unexpectedly experienced a variant today when my Internet service was out for 30 minutes.  My outside bitcoin node was forced to drop all of its external connections, and when Internet service was restored it didn't begin the catch-up process until a new block crossed the network.
staff
Activity: 4172
Merit: 8419
Thanks for following up, ... yep, known and expected behaviour, a side effect of the initial sync process.  The next major version (which will have headers first sync) should greatly improve it.
sr. member
Activity: 278
Merit: 251
Quote
It doesn't, your front node would have begun pulling the chain after a new block was observed on the network and relayed to it by one of its external peers.


I was able to reproduce this "hang" and watch what happened by looking at my node and at blockchain.info.  My node remained hung for more than 24 minutes between blocks, but eventually a new block appeared on blockchain.info and within a few seconds appeared at my node. So, as you suggested, the node did begin pulling the chain at that time.

In summary, I would call this situation a "start-up performance non-feature" rather than a "bug": a few extra minutes when occasionally starting a node isn't really important, because nodes shouldn't be going up or down frequently.



sr. member
Activity: 278
Merit: 251
Each time I booted the front node, as soon as it exited the "checking blocks" screen and went to the main window, the green progress bar at the bottom said "2 hours behind", and the busy icon appeared where the green check mark icon was supposed to be.  The debug/information window came up with block number 327221. The debug log shows this block as well.  On the back node, bitcoind getinfo showed it was at the same block.  The debug log shows the unsuccessful restarts, followed by the successful restart and the rapid catch-up.
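
For anyone who wants to repeat the height comparison described above, here is a minimal Python sketch. It assumes you have saved the JSON text printed by bitcoind getinfo on each node; the helper names and the sample strings are made up for illustration, and only the "blocks" field is actually read.

```python
import json


def block_height(getinfo_output: str) -> int:
    """Return the "blocks" field from the JSON text printed by getinfo."""
    return int(json.loads(getinfo_output)["blocks"])


def blocks_apart(front_output: str, back_output: str) -> int:
    """How many blocks the front node is ahead of the back node (0 = same block)."""
    return block_height(front_output) - block_height(back_output)


# Hypothetical, trimmed getinfo outputs; real output has many more fields.
front_sample = '{"blocks": 327221, "connections": 9}'
back_sample = '{"blocks": 327221, "connections": 1}'

difference = blocks_apart(front_sample, back_sample)
print("front node is", difference, "blocks ahead of the back node")
```

A difference of 0 while the UI reports "hours behind" matches the stuck state described in this post: both nodes sit on the same old block.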

I'd be happy to send you a copy of the debug log from today if you send me a PM.




staff
Activity: 4172
Merit: 8419
It doesn't, your front node would have begun pulling the chain after a new block was observed on the network and relayed to it by one of its external peers.
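
The behaviour described here can be illustrated with a toy model. This is a deliberately simplified Python sketch, not Bitcoin Core's code: it assumes the node asks only its first-connected peer for blocks during initial sync and otherwise waits for a peer to announce a new block. The class names, method names, and the external peer's heights (327233, 327234) are hypothetical; 327221 is the stuck height reported in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    height: int  # best block height this peer can serve


@dataclass
class Node:
    height: int
    peers: list = field(default_factory=list)

    def connect(self, peer: Peer) -> None:
        self.peers.append(peer)

    def initial_sync(self) -> None:
        """Toy assumption: request blocks only from the first-connected peer."""
        if not self.peers:
            return
        sync_peer = self.peers[0]
        # A sync peer that is not ahead of us has nothing to send, so we stall.
        self.height = max(self.height, sync_peer.height)

    def on_block_announced(self, peer: Peer, new_height: int) -> None:
        """A peer relays a new block; pull the missing chain from that peer."""
        peer.height = max(peer.height, new_height)
        self.height = max(self.height, peer.height)


front = Node(height=327221)
front.connect(Peer("back node", 327221))  # stale LAN peer happens to connect first
external = Peer("external peer", 327233)  # hypothetical up-to-date peer
front.connect(external)

front.initial_sync()
stalled_height = front.height  # still at the old block

front.on_block_announced(external, 327234)  # next block crosses the network
caught_up_height = front.height
```

In this model the stall ends as soon as any external peer relays a new block, which is why the delay is bounded by the time until the next block rather than being permanent.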

Quote
knew perfectly well that it was out of date, displaying it in the UI.
You don't say what it was displaying, precisely.
sr. member
Activity: 278
Merit: 251
This is related to how nodes "catch up" when started. Apologies if this thread is redundant.

I have been running bitcoin-core 0.9.3 for some time on one of my computers ("front node").  Until today this machine has always started up and resynchronized reasonably quickly after a restart.  A few weeks ago I configured a separate computer on my LAN to run a copy of bitcoind. To ensure that this node synchronized quickly and to avoid placing any load on the bitcoin network, I configured this new node ("back node") to connect only to the front node.  (This enabled resynchronizing the entire block chain in under a day.)  My policy has been to shut down the back node, then shut down the front node, then bring up the front node, then wait for the front node to sync, and finally bring up the back node.  This has worked fine until today.

Today I made a mistake and brought the back node up first.  After this node was up and running I started the front node.  At this point both nodes were hung, 2 hours behind. The front node showed how many hours behind it was, and an inspection of its debug log shows that its first connection was to the back node.  After many minutes, both nodes were still hung at the same block number.  The situation persisted after repeated shutdowns and restarts of the front node.  Each time it connected first to the back node and each time it learned that it was behind, with the gap eventually growing to 3 hours. To fix this situation after I deduced what was happening, I just suspended the back node and restarted the front node.  This time the front node resynchronized in a few seconds.

The front node knew perfectly well that it was out of date, displaying it in the UI.  Despite this knowledge and despite connections to 8 up-to-date nodes, it persisted in its behavior, even after repeated restarts.  I consider this behavior a bug.  A single misconfigured neighbor should not permanently block the progress of a correctly configured node.  (Arguably, this is also a denial of service vulnerability.)
