Topic: Need help monkey-patching Core: /src/zmq/zmqnotificationinterface.cpp (Read 140 times)

newbie
Activity: 9
Merit: 186
See this PR https://github.com/bitcoin/bitcoin/pull/23624 for a new rawmempooltx publisher publishing only mempool transactions.
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
Unless you are keeping track of all addresses, there is no reason to keep track of all transactions. You can compare the output of a received tx to a list of addresses you need to keep track of. If the tx has one or more outputs going to an address in the list of relevant addresses, you can add the tx to your DS, and otherwise ignore it.

The only negative consequence of implementing the above is that your DS might be missing a tx that depends on another unconfirmed tx.

Again, I can't choose to only watch some addresses - the ZMQ topic Core provides only spits out all transactions, and I have to filter them to only process the transactions with addresses I'm interested in.



Anyway, I figured out how to stop the confirmed transactions from being broadcast again by commenting out this part of BlockDisconnected:

Code:
void CZMQNotificationInterface::BlockDisconnected(const std::shared_ptr<const CBlock>& pblock, const CBlockIndex* pindexDisconnected)
{
/* ----- this part is deleted
    for (const CTransactionRef& ptx : pblock->vtx) {
        const CTransaction& tx = *ptx;
        TryForEachAndRemoveFailed(notifiers, [&tx](CZMQAbstractNotifier* notifier) {
            return notifier->NotifyTransaction(tx);
        });
    }
--------- */
    // Next we notify BlockDisconnect listeners for *all* blocks
    TryForEachAndRemoveFailed(notifiers, [pindexDisconnected](CZMQAbstractNotifier* notifier) {
        return notifier->NotifyBlockDisconnect(pindexDisconnected);
    });
}
copper member
Activity: 1666
Merit: 1901
Amazon Prime Member #7
This messes with my software, causing it to add txids, addresses, etc. a second time inside arrays (this means that the same transaction is received twice in total).

CMIIW, but isn't it possible to get around the problem by using a different data structure that doesn't allow duplicate items (for example, Python has set() and dict())? I don't know the time complexity of the data structures in the programming language you use, but it's likely cheaper than resizing the array.

It is, but I don't want to store all the raw transactions inside a unique array. There's a reason why the default publishrawtx watermark was set to 1000 - it only takes an hour or so to reach that many transactions inside the queue. As you can guess, the memory costs become increasingly enormous. And that's just on testnet - it's even more frequent on mainnet.
Unless you are keeping track of all addresses, there is no reason to keep track of all transactions. You can compare the output of a received tx to a list of addresses you need to keep track of. If the tx has one or more outputs going to an address in the list of relevant addresses, you can add the tx to your DS, and otherwise ignore it.

The only negative consequence of implementing the above is that your DS might be missing a tx that depends on another unconfirmed tx.
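
To illustrate the filtering described above, here is a minimal C++ sketch. It assumes the raw transaction has already been decoded into a list of output addresses. The DecodedTx struct and the IsRelevant/HandleTx helpers are hypothetical names made up for this example; they are not part of Bitcoin Core or any library.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified view of a decoded transaction: only the txid and
// the destination address of each output (decoding itself is not shown).
struct DecodedTx {
    std::string txid;
    std::vector<std::string> output_addresses;
};

// Returns true if at least one output pays an address on the watch list.
bool IsRelevant(const DecodedTx& tx,
                const std::unordered_set<std::string>& watched)
{
    for (const std::string& addr : tx.output_addresses) {
        if (watched.count(addr) > 0) return true;
    }
    return false;
}

// Stores only relevant transactions; everything else is dropped right away,
// so memory use follows the watch list rather than total network traffic.
void HandleTx(const DecodedTx& tx,
              const std::unordered_set<std::string>& watched,
              std::vector<DecodedTx>& store)
{
    // Assumption of this sketch: a decoded transaction always carries its txid.
    assert(!tx.txid.empty());
    if (IsRelevant(tx, watched)) store.push_back(tx);
}
```

The lookup against an unordered_set is constant time on average, so the cost per received tx depends on its number of outputs, not on how many addresses are being watched.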
sr. member
Activity: 310
Merit: 727
---------> 1231006505
It is, but I don't want to store all the raw transactions inside a unique array. There's a reason why the default publishrawtx watermark was set to 1000 - it only takes an hour or so to reach that many transactions inside the queue. As you can guess, the memory costs become increasingly enormous. And that's just on testnet - it's even more frequent on mainnet.
Why not store a hash of the rawtx in a list/array as well? That way you could check if that value is in there instead of the entire rawtx.

Something like this:
Code:
calculate sha-256(zmq->rawtx) to create a hash like 6b0599145092f00e21a2135c5197ff023a65e23939b310277e445efda1f50c53
look up this hash value in a list of processed transactions
if not found:
   process the raw transaction delivered by zmq, and don't store the complete rawtx after processing
   add the hash to the list of processed transactions
That way you do not have to alter the bitcoind source code.

Please note: the hashing suggested above could be anything you want. It is just meant as a unique identifier for a longer raw transaction and should not be mistaken for an actual txid!
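
The steps above could be sketched in C++ roughly as follows. This is only an illustration under a few assumptions: SeenTxCache is a made-up helper, std::hash stands in for SHA-256 (any fingerprint works, as noted), and the fixed capacity with oldest-first eviction is an addition of this sketch to keep memory bounded. It is not part of the suggestion itself.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical helper: remembers fingerprints of the most recent raw
// transactions so that a repeated ZMQ message can be skipped.
class SeenTxCache {
public:
    explicit SeenTxCache(std::size_t capacity) : capacity_(capacity)
    {
        assert(capacity_ > 0);
    }

    // Returns true the first time a raw transaction is seen, and false for a
    // repeat whose fingerprint is still inside the window.
    bool MarkIfNew(const std::string& rawtx)
    {
        // std::hash is a stand-in for SHA-256: just a fingerprint, not a
        // txid, and collisions are possible.
        const std::size_t fp = std::hash<std::string>{}(rawtx);
        if (!seen_.insert(fp).second) return false;  // already processed
        order_.push_back(fp);
        if (order_.size() > capacity_) {
            // Forget the oldest fingerprint so memory stays bounded.
            seen_.erase(order_.front());
            order_.pop_front();
        }
        return true;
    }

    std::size_t Size() const { return seen_.size(); }

private:
    std::size_t capacity_;
    std::unordered_set<std::size_t> seen_;  // fast membership test
    std::deque<std::size_t> order_;         // insertion order for eviction
};
```

A subscriber would call MarkIfNew(rawtx) on every message and process the transaction only when it returns true. Only the fingerprints are kept, never the raw transactions, and the capacity has to be large enough to cover the time between a tx entering the mempool and being confirmed.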
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
This messes with my software, causing it to add txids, addresses, etc. a second time inside arrays (this means that the same transaction is received twice in total).

CMIIW, but isn't it possible to get around the problem by using a different data structure that doesn't allow duplicate items (for example, Python has set() and dict())? I don't know the time complexity of the data structures in the programming language you use, but it's likely cheaper than resizing the array.

It is, but I don't want to store all the raw transactions inside a unique array. There's a reason why the default publishrawtx watermark was set to 1000 - it only takes an hour or so to reach that many transactions inside the queue. As you can guess, the memory costs become increasingly enormous. And that's just on testnet - it's even more frequent on mainnet.



EDIT: I just realized that after patching the source code cloned from GitHub (the "master" branch at the v22.0 tag), I still get an "experimental"-labeled build with no ZeroMQ notification options (all the ZMQ arguments I passed to the new bitcoind binary were ignored).

Maybe I forgot to install the libzmq dev dependencies. I'll try that first before resorting to more draconian options.
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
It appears that the ZeroMQ topic I'm listening to, "rawtx", emits a raw transaction not only when it enters the mempool, but also once it has been confirmed.

This messes with my software, causing it to add txids, addresses, etc. a second time inside arrays (this means that the same transaction is received twice in total).

Array de-duping is not a viable long-term solution (the array quickly grows large, and the de-duplication has to run every time a new element is added), so I'm trying to fix the problem at the source by instructing Core to publish only unconfirmed bitcoin transactions.

According to https://bitcoin.stackexchange.com/questions/52848/is-it-possible-to-configure-the-bitcoin-daemon-to-only-broadcast-unconfirmed-tra , this cannot be set through a configuration file or command-line option; the source code must be edited directly. But since the codebase has changed greatly, the proposed solution no longer works.

The following function inside src/zmq/zmqnotificationinterface.cpp needs to be patched, but how?

Code:
void CZMQNotificationInterface::TransactionRemovedFromMempool(const CTransactionRef& ptx, MemPoolRemovalReason reason, uint64_t mempool_sequence)
{
    // Called for all non-block inclusion reasons
    const CTransaction& tx = *ptx;

    TryForEachAndRemoveFailed(notifiers, [&tx, mempool_sequence](CZMQAbstractNotifier* notifier) {
        return notifier->NotifyTransactionRemoval(tx, mempool_sequence);
    });
}

//... or maybe this function needs to be patched:

void CZMQNotificationInterface::BlockDisconnected(const std::shared_ptr<const CBlock>& pblock, const CBlockIndex* pindexDisconnected)
{
    for (const CTransactionRef& ptx : pblock->vtx) {
        const CTransaction& tx = *ptx;
        TryForEachAndRemoveFailed(notifiers, [&tx](CZMQAbstractNotifier* notifier) {
            return notifier->NotifyTransaction(tx);
        });
    }

    // Next we notify BlockDisconnect listeners for *all* blocks
    TryForEachAndRemoveFailed(notifiers, [pindexDisconnected](CZMQAbstractNotifier* notifier) {
        return notifier->NotifyBlockDisconnect(pindexDisconnected);
    });
}
