I don't think I/O is the bottleneck here: my SSD is tuned as far as I can take it, and iowait is negligible when the problem occurs. It looks more like bitcoind enters some kind of spin loop while executing getmempoolinfo or getrawtransaction, and the RPC client then runs into connectivity issues.
Edit: I attached gdb to the process while it was stuck on getrawtransaction. Here is the backtrace of the thread that is stuck at 100% CPU (this is version v23.0):
Thread 7 (Thread 0x7fb37fdf40 (LWP 850614) "b-httpworker.0"):
#0 0x0000007fbda6a92c in mcount_internal (selfpc=8709628, frompc=1142896) at ../gmon/mcount.c:153
#1 __mcount (frompc=0x53c880 , std::allocator >, std::__cxx11::basic_string, std::allocator >, std::vector >)+240>) at ../gmon/mcount.c:180
#2 0x000000000084e5fc in RPCResult::CheckInnerDoc (this=this@entry=0x7fb37fbe70) at rpc/util.cpp:903
#3 0x000000000053c880 in RPCResult::RPCResult (inner=..., description=..., optional=false, m_key_name=..., type=, this=0x7fb37fbe70) at ./rpc/util.h:301
#4 RPCResult::RPCResult (this=0x7fb37fbe70, type=, m_key_name=..., description=..., inner=...) at ./rpc/util.h:309
#5 0x0000000000589d4c in getrawtransaction () at rpc/rawtransaction.cpp:194
#6 0x000000000053b1b0 in CRPCCommand::CRPCCommand(std::__cxx11::basic_string, std::allocator >, RPCHelpMan (*)())::{lambda(JSONRPCRequest const&, UniValue&, bool)#1}::operator()(JSONRPCRequest const&, UniValue&, bool) const (request=..., result=..., __closure=, __closure=) at ./rpc/server.h:109
#7 0x00000000005b71cc in std::function::operator()(JSONRPCRequest const&, UniValue&, bool) const (__args#2=, __args#1=..., __args#0=..., this=0xd1c650 ) at /usr/include/c++/10/bits/std_function.h:622
#8 ExecuteCommand (command=..., request=..., result=..., last_handler=true) at rpc/server.cpp:480
#9 0x00000000005b840c in ExecuteCommands (result=..., request=..., commands=...) at rpc/server.cpp:444
#10 CRPCTable::execute (this=, request=...) at rpc/server.cpp:464
#11 0x000000000066e218 in HTTPReq_JSONRPC (context=..., req=0x7f9c0035a0) at httprpc.cpp:202
#12 0x000000000067aa84 in std::function, std::allocator > const&)>::operator()(HTTPRequest*, std::__cxx11::basic_string, std::allocator > const&) const (__args#1=..., __args#0=, this=) at /usr/include/c++/10/bits/std_function.h:622
#13 HTTPWorkItem::operator() (this=) at httpserver.cpp:54
#14 WorkQueue::Run (this=this@entry=0x16e99150) at httpserver.cpp:112
#15 0x0000000000676480 in HTTPWorkQueueRun (queue=0x16e99150, worker_num=) at httpserver.cpp:342
#16 0x0000007fbdca4cac in ?? () from target:/lib/aarch64-linux-gnu/libstdc++.so.6
#17 0x0000007fbe048648 in start_thread (arg=0x7fb37fd840) at pthread_create.c:477
#18 0x0000007fbda67c1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
Here is a second snapshot of the same thread:
Thread 7 (Thread 0x7fb37fdf40 (LWP 850614) "b-httpworker.0"):
#0 0x0000007fbda6a91c in mcount_internal (selfpc=8709628, frompc=1142896) at ../gmon/mcount.c:152
#1 __mcount (frompc=0x53c880 , std::allocator >, std::__cxx11::basic_string, std::allocator >, std::vector >)+240>) at ../gmon/mcount.c:180
#2 0x000000000084e5fc in RPCResult::CheckInnerDoc (this=this@entry=0x7fb37fbe70) at rpc/util.cpp:903
#3 0x000000000053c880 in RPCResult::RPCResult (inner=..., description=..., optional=false, m_key_name=..., type=, this=0x7fb37fbe70) at ./rpc/util.h:301
#4 RPCResult::RPCResult (this=0x7fb37fbe70, type=, m_key_name=..., description=..., inner=...) at ./rpc/util.h:309
#5 0x0000000000589d4c in getrawtransaction () at rpc/rawtransaction.cpp:194
#6 0x000000000053b1b0 in CRPCCommand::CRPCCommand(std::__cxx11::basic_string, std::allocator >, RPCHelpMan (*)())::{lambda(JSONRPCRequest const&, UniValue&, bool)#1}::operator()(JSONRPCRequest const&, UniValue&, bool) const (request=..., result=..., __closure=, __closure=) at ./rpc/server.h:109
#7 0x00000000005b71cc in std::function::operator()(JSONRPCRequest const&, UniValue&, bool) const (__args#2=, __args#1=..., __args#0=..., this=0xd1c650 ) at /usr/include/c++/10/bits/std_function.h:622
#8 ExecuteCommand (command=..., request=..., result=..., last_handler=true) at rpc/server.cpp:480
#9 0x00000000005b840c in ExecuteCommands (result=..., request=..., commands=...) at rpc/server.cpp:444
#10 CRPCTable::execute (this=, request=...) at rpc/server.cpp:464
#11 0x000000000066e218 in HTTPReq_JSONRPC (context=..., req=0x7f9c0035a0) at httprpc.cpp:202
#12 0x000000000067aa84 in std::function, std::allocator > const&)>::operator()(HTTPRequest*, std::__cxx11::basic_string, std::allocator > const&) const (__args#1=..., __args#0=, this=) at /usr/include/c++/10/bits/std_function.h:622
#13 HTTPWorkItem::operator() (this=) at httpserver.cpp:54
#14 WorkQueue::Run (this=this@entry=0x16e99150) at httpserver.cpp:112
#15 0x0000000000676480 in HTTPWorkQueueRun (queue=0x16e99150, worker_num=) at httpserver.cpp:342
#16 0x0000007fbdca4cac in ?? () from target:/lib/aarch64-linux-gnu/libstdc++.so.6
#17 0x0000007fbe048648 in start_thread (arg=0x7fb37fd840) at pthread_create.c:477
#18 0x0000007fbda67c1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
And a third one:
Thread 7 (Thread 0x7fb37fdf40 (LWP 850614) "b-httpworker.0"):
#0 0x0000007fbda6a92c in mcount_internal (selfpc=8709628, frompc=1142896) at ../gmon/mcount.c:153
#1 __mcount (frompc=0x53c880 , std::allocator >, std::__cxx11::basic_string, std::allocator >, std::vector >)+240>) at ../gmon/mcount.c:180
#2 0x000000000084e5fc in RPCResult::CheckInnerDoc (this=this@entry=0x7fb37fbe70) at rpc/util.cpp:903
#3 0x000000000053c880 in RPCResult::RPCResult (inner=..., description=..., optional=false, m_key_name=..., type=, this=0x7fb37fbe70) at ./rpc/util.h:301
#4 RPCResult::RPCResult (this=0x7fb37fbe70, type=, m_key_name=..., description=..., inner=...) at ./rpc/util.h:309
#5 0x0000000000589d4c in getrawtransaction () at rpc/rawtransaction.cpp:194
#6 0x000000000053b1b0 in CRPCCommand::CRPCCommand(std::__cxx11::basic_string, std::allocator >, RPCHelpMan (*)())::{lambda(JSONRPCRequest const&, UniValue&, bool)#1}::operator()(JSONRPCRequest const&, UniValue&, bool) const (request=..., result=..., __closure=, __closure=) at ./rpc/server.h:109
#7 0x00000000005b71cc in std::function::operator()(JSONRPCRequest const&, UniValue&, bool) const (__args#2=, __args#1=..., __args#0=..., this=0xd1c650 ) at /usr/include/c++/10/bits/std_function.h:622
#8 ExecuteCommand (command=..., request=..., result=..., last_handler=true) at rpc/server.cpp:480
#9 0x00000000005b840c in ExecuteCommands (result=..., request=..., commands=...) at rpc/server.cpp:444
#10 CRPCTable::execute (this=, request=...) at rpc/server.cpp:464
#11 0x000000000066e218 in HTTPReq_JSONRPC (context=..., req=0x7f9c0035a0) at httprpc.cpp:202
#12 0x000000000067aa84 in std::function, std::allocator > const&)>::operator()(HTTPRequest*, std::__cxx11::basic_string, std::allocator > const&) const (__args#1=..., __args#0=, this=) at /usr/include/c++/10/bits/std_function.h:622
#13 HTTPWorkItem::operator() (this=) at httpserver.cpp:54
#14 WorkQueue::Run (this=this@entry=0x16e99150) at httpserver.cpp:112
#15 0x0000000000676480 in HTTPWorkQueueRun (queue=0x16e99150, worker_num=) at httpserver.cpp:342
#16 0x0000007fbdca4cac in ?? () from target:/lib/aarch64-linux-gnu/libstdc++.so.6
#17 0x0000007fbe048648 in start_thread (arg=0x7fb37fd840) at pthread_create.c:477
#18 0x0000007fbda67c1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
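For anyone who wants to compare snapshots like the ones above without reading them line by line, here is a minimal Python sketch. It only looks at the frame number and the `file:line` location that gdb prints after " at "; the helper names `frame_locations` and `stable_frames` are made up for this illustration, and the two sample strings are trimmed copies of frames #0, #2 and #5 from the first two snapshots.

```python
import re
from typing import Dict, List

# Matches gdb frame lines that end in "at <file>:<line>", e.g.
# "#2 0x000000000084e5fc in RPCResult::CheckInnerDoc (...) at rpc/util.cpp:903".
# Frames without source info (such as "?? () from ...") are ignored.
LOCATION_RE = re.compile(r"^#(\d+)\s.*\sat\s(\S+):(\d+)$")


def frame_locations(backtrace: str) -> Dict[int, str]:
    """Map frame number -> "file:line" for every frame that has a source location."""
    locations = {}
    for line in backtrace.splitlines():
        match = LOCATION_RE.match(line.strip())
        if match:
            locations[int(match.group(1))] = f"{match.group(2)}:{match.group(3)}"
    return locations


def stable_frames(snapshots: List[str]) -> List[int]:
    """Return the frame numbers whose source location is identical in all snapshots."""
    per_snapshot = [frame_locations(s) for s in snapshots]
    if not per_snapshot:
        return []
    common = set(per_snapshot[0]).intersection(*per_snapshot[1:])
    return sorted(n for n in common if len({p[n] for p in per_snapshot}) == 1)


# Trimmed frames from the first and second snapshot in this post.
snapshot_1 = """\
#0 0x0000007fbda6a92c in mcount_internal (selfpc=8709628, frompc=1142896) at ../gmon/mcount.c:153
#2 0x000000000084e5fc in RPCResult::CheckInnerDoc (this=this@entry=0x7fb37fbe70) at rpc/util.cpp:903
#5 0x0000000000589d4c in getrawtransaction () at rpc/rawtransaction.cpp:194
"""
snapshot_2 = """\
#0 0x0000007fbda6a91c in mcount_internal (selfpc=8709628, frompc=1142896) at ../gmon/mcount.c:152
#2 0x000000000084e5fc in RPCResult::CheckInnerDoc (this=this@entry=0x7fb37fbe70) at rpc/util.cpp:903
#5 0x0000000000589d4c in getrawtransaction () at rpc/rawtransaction.cpp:194
"""

print(stable_frames([snapshot_1, snapshot_2]))
```

On these samples, frames #2 and #5 sit at the same location in both snapshots, while frame #0 moves between mcount.c:152 and mcount.c:153. That pattern is at least consistent with a thread that keeps executing inside mcount_internal rather than one that is blocked, although three snapshots are not proof.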
What is going on with this?
Thanks!
Edit: The node is stuck again this morning, this time on getmempoolentry. Here is the stuck thread:
Thread 7 (Thread 0x7f8ca26f40 (LWP 1037331) "b-httpworker.0"):
#0 mcount_internal (selfpc=5489172, frompc=1141804) at ../gmon/mcount.c:153
#1 __mcount (frompc=0x53c43c , std::allocator >, bool, std::__cxx11::basic_string, std::allocator >, std::vector >)+108>) at ../gmon/mcount.c:180
#2 0x000000000053c214 in std::vector >::vector (this=0x7f8ca247e0, __x=...) at /usr/include/c++/10/bits/stl_vector.h:553
#3 0x000000000053c43c in RPCResult::RPCResult (this=0x7f8ca247b8, type=RPCResult::Type::STR_AMOUNT, m_key_name=..., optional=true, description=..., inner=...) at /usr/include/c++/10/bits/move.h:101
#4 0x0000000000512434 in MempoolEntryDescription () at rpc/blockchain.cpp:463
#5 0x000000000052c90c in getmempoolentry () at rpc/blockchain.cpp:764
#6 0x000000000053b1b0 in CRPCCommand::CRPCCommand(std::__cxx11::basic_string, std::allocator >, RPCHelpMan (*)())::{lambda(JSONRPCRequest const&, UniValue&, bool)#1}::operator()(JSONRPCRequest const&, UniValue&, bool) const (request=..., result=..., __closure=, __closure=) at ./rpc/server.h:109
#7 0x00000000005b71cc in std::function::operator()(JSONRPCRequest const&, UniValue&, bool) const (__args#2=, __args#1=..., __args#0=..., this=0xd1a738 ) at /usr/include/c++/10/bits/std_function.h:622
#8 ExecuteCommand (command=..., request=..., result=..., last_handler=true) at rpc/server.cpp:480
#9 0x00000000005b840c in ExecuteCommands (result=..., request=..., commands=...) at rpc/server.cpp:444
#10 CRPCTable::execute (this=, request=...) at rpc/server.cpp:464
#11 0x000000000066e218 in HTTPReq_JSONRPC (context=..., req=0x7f7c00f0d0) at httprpc.cpp:202
#12 0x000000000067aa84 in std::function, std::allocator > const&)>::operator()(HTTPRequest*, std::__cxx11::basic_string, std::allocator > const&) const (__args#1=..., __args#0=, this=) at /usr/include/c++/10/bits/std_function.h:622
#13 HTTPWorkItem::operator() (this=) at httpserver.cpp:54
#14 WorkQueue::Run (this=this@entry=0x20a6e150) at httpserver.cpp:112
#15 0x0000000000676480 in HTTPWorkQueueRun (queue=0x20a6e150, worker_num=) at httpserver.cpp:342
#16 0x0000007f926fdcac in ?? () from target:/lib/aarch64-linux-gnu/libstdc++.so.6
#17 0x0000007f92aa1648 in start_thread (arg=0x7f8ca26840) at pthread_create.c:477
#18 0x0000007f924c0c1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
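Snapshots like the ones above can also be collected non-interactively. The sketch below is one way to do it from Python, assuming gdb is installed and the calling user is allowed to attach to the process (typically the same user or root). It relies on gdb's standard `-p`, `-batch` and `-ex` options; the function names and the 60-second timeout are arbitrary choices for this illustration.

```python
import subprocess
from typing import List


def gdb_backtrace_command(pid: int) -> List[str]:
    """Build a gdb command line that attaches to `pid`, dumps all thread backtraces, and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid: int, timeout_s: float = 60.0) -> str:
    """Run gdb once against `pid` and return its standard output as text.

    Raises subprocess.CalledProcessError if gdb exits with a non-zero status
    and subprocess.TimeoutExpired if gdb does not finish within `timeout_s`.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return result.stdout
```

Calling `capture_backtrace` a few times, a second or two apart, while the RPC call hangs gives a series of snapshots comparable to the ones in this post. The process is paused while gdb is attached, so the node will not respond during each capture.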
Edit: The release notes for v23.1 mention a fix for a race condition. Since that might be the issue I am facing, I upgraded to v24.0.1. I will see if the issue is gone...
Edit: So far the problem has not occurred with v24.0.1, so it appears I was hitting the race condition from v23.0.