
Topic: ANN: Announcing code availability of the bitsofproof supernode - page 11. (Read 35182 times)

hero member
Activity: 836
Merit: 1030
bits of proof
It is critical to keep "block" message relaying (propagation) times as low as possible, to avoid creating incentives for miners to skip transactions.

I think offering a sufficient fee in the transaction is the incentive that will keep working as the network grows, since CPU time spent on validation does not come for free.
hero member
Activity: 836
Merit: 1030
bits of proof
Seeing a signature twice is the normal case:
  • once as an unconfirmed transaction via "tx" message
  • once as a confirmed transaction, via "block" message
I see, I was thinking only in the context of the chain, not chain and unconfirmed. Thank you.
legendary
Activity: 1106
Merit: 1004
It is critical to keep "block" message relaying (propagation) times as low as possible, to avoid creating incentives for miners to skip transactions.  Thus, a signature cache ensures that "block" messages are largely already verified, by the time they are received.

If miners are already shrinking their blocks due to propagation time, maybe it's time to consider that protocol improvement suggestion in the scalability page, about not sending the entire block every time, just the portion your peer doesn't have.

I don't know what exactly could be sent instead of the body, though. Even if my peer has all transactions in his cache, doesn't he need to know the exact order I put them in my block in order to rebuild the Merkle tree? So just sending the header is not enough. Is there a shorter way to identify a transaction than its hash? Perhaps just an ordinal of the hash... you assume your peer has the same transaction pool you had, then as the body of your block you send a series of numbers which represent the index a tx hash would have in a sorted array with all of them? Not sure that would work frequently... plus, the hash is probably small enough, it shouldn't be a big deal to send all of them.
legendary
Activity: 1596
Merit: 1100
For long running nodes, the signature cache is also very helpful.  Over time, transactions are accepted into the memory pool and signature cache.

I do not yet see how the signature cache actually helps, would you please elaborate?

A signature is a function of the hash of a modified version of the input transaction. The modification depends on which output of that transaction is to be spent. Aren't chances of seeing the same signature twice negligible?

Seeing a signature twice is the normal case:
  • once as an unconfirmed transaction via "tx" message
  • once as a confirmed transaction, via "block" message

It is critical to keep "block" message relaying (propagation) times as low as possible, to avoid creating incentives for miners to skip transactions.  Thus, a signature cache ensures that "block" messages are largely already verified, by the time they are received.
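A minimal Java sketch of the idea, not the actual bitcoind, pynode or bitsofproof code: the `SignatureCache` class and its `Verifier` interface are hypothetical names, and the expensive ECDSA check is passed in as a callback. A signature that verified when the "tx" message arrived is answered from the cache when the same transaction shows up again in a "block" message.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: remembers (digest, signature, public key) triples that already verified.
class SignatureCache {
    /** Stand-in for the expensive ECDSA check; an assumed interface, not a real library API. */
    interface Verifier {
        boolean verify(byte[] digest, byte[] signature, byte[] publicKey);
    }

    private final Set<String> verified = ConcurrentHashMap.newKeySet();
    private final Verifier verifier;

    SignatureCache(Verifier verifier) {
        this.verifier = verifier;
    }

    /** Returns true if the signature is valid; the verifier only runs on a cache miss. */
    boolean check(byte[] digest, byte[] signature, byte[] publicKey) {
        String key = key(digest, signature, publicKey);
        if (verified.contains(key)) {
            return true; // seen before, e.g. via a "tx" message
        }
        boolean ok = verifier.verify(digest, signature, publicKey);
        if (ok) {
            verified.add(key); // only successful checks are remembered
        }
        return ok;
    }

    // Builds a cache key by hashing the length-prefixed parts, so different splits cannot collide.
    private static String key(byte[]... parts) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            for (byte[] part : parts) {
                sha.update(ByteBuffer.allocate(4).putInt(part.length).array());
                sha.update(part);
            }
            return Base64.getEncoder().encodeToString(sha.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }
}
```

The set in this sketch grows without bound; a long-running node would need some eviction policy.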

hero member
Activity: 836
Merit: 1030
bits of proof
For long running nodes, the signature cache is also very helpful.  Over time, transactions are accepted into the memory pool and signature cache.

I do not yet see how the signature cache actually helps, would you please elaborate?

A signature is a function of the hash of a modified version of the input transaction. The modification depends on which output of that transaction is to be spent. Aren't chances of seeing the same signature twice negligible?
legendary
Activity: 1596
Merit: 1100
Introducing an opportunistic cache of the last 100.000 transactions gave such a boost to the server that I thought it is worth writing about.
Apparently outputs die generally rather young and saving the db roundtrip for them is the single biggest boost I found until now...

Yep, pynode figured many of these things out, long ago.  A block (including TXs) cache is very useful.

For long running nodes, the signature cache is also very helpful.  Over time, transactions are accepted into the memory pool and signature cache.  When a new block arrives, the majority of transactions found in that block will have had their signatures pre-checked and cached, making acceptance of that block go more rapidly.



are you honestly adding more info.. or just trying to point out that your "pynode" is *so* ahead ??

pynode is quite incomplete, as the TODO notes.  bitsofproof is farther along in many ways.

The point is that these are well known techniques, and it is disappointing that these are being "discovered" when the knowledge is readily available for anyone who looks.

newbie
Activity: 32
Merit: 0
Introducing an opportunistic cache of the last 100.000 transactions gave such a boost to the server that I thought it is worth writing about.
Apparently outputs die generally rather young and saving the db roundtrip for them is the single biggest boost I found until now...

Yep, pynode figured many of these things out, long ago.  A block (including TXs) cache is very useful.

For long running nodes, the signature cache is also very helpful.  Over time, transactions are accepted into the memory pool and signature cache.  When a new block arrives, the majority of transactions found in that block will have had their signatures pre-checked and cached, making acceptance of that block go more rapidly.



are you honestly adding more info.. or just trying to point out that your "pynode" is *so* ahead ??
newbie
Activity: 56
Merit: 0
This looks very promising. Backend bitcoin engines need a nice modular format to allow for the many creative uses people will be coming up with.
legendary
Activity: 1596
Merit: 1100
Introducing an opportunistic cache of the last 100.000 transactions gave such a boost to the server that I thought it is worth writing about.
Apparently outputs die generally rather young and saving the db roundtrip for them is the single biggest boost I found until now...

Yep, pynode figured many of these things out, long ago.  A block (including TXs) cache is very useful.

For long running nodes, the signature cache is also very helpful.  Over time, transactions are accepted into the memory pool and signature cache.  When a new block arrives, the majority of transactions found in that block will have had their signatures pre-checked and cached, making acceptance of that block go more rapidly.

hero member
Activity: 836
Merit: 1030
bits of proof
Introducing an opportunistic cache of the last 100.000 transactions gave such a boost to the server that I thought it is worth writing about.
Apparently outputs die generally rather young and saving the db roundtrip for them is the single biggest boost I found until now...
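A minimal sketch of such a cache in Java, under the assumption that "the last 100.000 transactions" means a bounded map that drops its oldest entry when full. `RecentTxCache` and the `loader` callback standing in for the database lookup are hypothetical names, not the bitsofproof implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: keeps the most recently inserted transactions, keyed by hash.
class RecentTxCache<V> {
    private final Map<String, V> map;

    RecentTxCache(final int capacity) {
        // Insertion-ordered LinkedHashMap; the eldest entry is dropped once capacity is exceeded.
        this.map = new LinkedHashMap<String, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                return size() > capacity;
            }
        };
    }

    synchronized void put(String txHash, V tx) {
        map.put(txHash, tx);
    }

    /** Tries the cache first and falls back to the loader (e.g. a database read) on a miss. */
    synchronized V getOrLoad(String txHash, Function<String, V> loader) {
        V tx = map.get(txHash);
        if (tx == null) {
            tx = loader.apply(txHash);
            if (tx != null) {
                map.put(txHash, tx);
            }
        }
        return tx;
    }

    synchronized int size() {
        return map.size();
    }
}
```

With a capacity of 100000, an output spent shortly after it was created is resolved from memory and the database roundtrip is skipped.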
hero member
Activity: 836
Merit: 1030
bits of proof
It is time for a project update.

[HYPE]
I enumerate some challenges and the bitsofproof answers to them.

1. Responsiveness to a large number of peers.
This is addressed using non-blocking IO and a thread pool that forwards fully read messages to their listeners, which are otherwise isolated from the complexity of the communication.

2. CPU hunger for signature validation.
This is not a current concern for the bitsofproof node, since validation of the transactions in a block and of the inputs of the same transaction is executed in parallel. CPUs are utilized to their limit, as many as you provide.

3. Resolving the inputs of a transaction against the database of previous transactions.
Using a faster database (some NoSQL store) or SSD drives only gives you a short break until the bottleneck reappears as the network grows.
A sustainable approach to address the problem is to constantly prune the data set searched. The bitsofproof node uses separate archive transaction table(s) into which a parallel thread moves fully spent transactions, much like a garbage collector reclaims storage. The live transaction table that is searched to satisfy new transaction inputs is kept close to the UTXO set.

4. The client API is implemented as a clone of BCCAPI, that provides services to lightweight clients trusting the server.

5. A server API is in draft that will provide high-level access to persistent and transient data structures and to notifications. The aim is to provide all features provided in bitcoind via JSON-RPC, but in bitsofproof's case using the invoker of your choice on an API defined as a proper Java interface.

6. The next layer of scale will be implemented through distributing both storage and computation. Considerations for this already influence the design.
[/HYPE]

This is code in development. I do not care about backward compatibility of interfaces or database for now.
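Point 2 above can be illustrated with a small Java sketch that runs one check per input on a thread pool. `ParallelValidator`, `allValid` and the `check` predicate are hypothetical names chosen for illustration; this is a sketch of the general technique, not the node's actual validation code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical sketch: validates every item (e.g. each transaction input) concurrently.
class ParallelValidator {
    /** Returns true only if the check passes for every item; a throwing check counts as a failure. */
    static <T> boolean allValid(List<T> items, Predicate<T> check, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (T item : items) {
                tasks.add(() -> check.test(item));
            }
            // invokeAll blocks until all tasks have completed.
            for (Future<Boolean> result : pool.invokeAll(tasks)) {
                if (!result.get()) {
                    return false;
                }
            }
            return true;
        } catch (ExecutionException e) {
            return false; // the check itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false; // interrupted before validation finished
        } finally {
            pool.shutdown();
        }
    }
}
```

A real node would reuse one pool sized to the available cores rather than creating a pool per call.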
hero member
Activity: 668
Merit: 501
The project just got mavenized
Sounds great! I will take a closer look at it and try to get it running later this week.
hero member
Activity: 836
Merit: 1030
bits of proof
.... my suggestion about ivy vs maven means that the code would get more accessible to other developers, plus i am lazy and do not want to spend a lot of time setting up my ide for this project :)

The project just got mavenized by pulling from the first contributor: https://github.com/bartfaitamas/supernode/tree/mavenizing
hero member
Activity: 836
Merit: 1030
bits of proof
I would be interested to know what extra data would support forensic accounting, and would attempt to derive it from the server's interaction with peers.
The additional data that is not available from the blockchain:

1) true timestamps for transactions first (and last) seen over p2p
2) true timestamps for blocks first seen over p2p
3) non-volatile database of transactions seen over p2p (a.k.a. mempool, but non-volatile) pruned of the entries actually recorded in the blockchain.
4) a way to quickly locate only orphaned blocks without having to scan the whole table of blocks, really just an additional index to the blockchain storage, but explicitly indexing all the losing chains. This in theory is available from the blockchain, but under current architecture it is difficult to obtain this information. In effect the whitewash of the bitcoin transaction history is ongoing when the tools keep only the winning chain.

Thanks.
None of these sounds difficult to add.
In fact 4 is already there, since the HEAD table contains pointers to all leaf nodes, not only the trunk.
legendary
Activity: 2128
Merit: 1074
I would be interested to know what extra data would support forensic accounting, and would attempt to derive it from the server's interaction with peers.
The additional data that is not available from the blockchain:

1) true timestamps for transactions first (and last) seen over p2p
2) true timestamps for blocks first seen over p2p
3) non-volatile database of transactions seen over p2p (a.k.a. mempool, but non-volatile) pruned of the entries actually recorded in the blockchain.
4) a way to quickly locate only orphaned blocks without having to scan the whole table of blocks, really just an additional index to the blockchain storage, but explicitly indexing all the losing chains. This in theory is available from the blockchain, but under current architecture it is difficult to obtain this information. In effect the whitewash of the bitcoin transaction history is ongoing when the tools keep only the winning chain.

Thanks.
hero member
Activity: 836
Merit: 1030
bits of proof
By audit I meant a proof of consistency of the stored chain that is independent of the process that maintains it.

Since the bitsofproof node stores all data in a normalized relational database, you should be able to use any data mining package you prefer.

The data currently collected is what is needed to work.

I would be interested to know what extra data would support forensic accounting, and would attempt to derive it from the server's interaction with peers.
legendary
Activity: 2128
Merit: 1074
This is about independent global fiscal checks.
Are you talking about pure binary yes/no consistency check audit?

Or are you talking about open-ended auditing for the purpose of discovering normal vs. abnormal transaction patterns?

I've spoken a while back with an experienced forensic accounting professional and we discussed various things that would require keeping true time vs. block time stamp. It would also require keeping a separate database of the in-flight transactions, ones that aren't yet recorded in the blockchain.

4) provide a way of obtaining sincere ledger / audit log for the transactions with multiple timestamps:
4a) true time when first seen on the p2p net
4b) true time when first seen in a block
4c) block time when seen in a block
4d) true time when some block caused a reorg and un-confirmed the transaction
4e) true time when other block reconfirmed the transaction
4f) block time when reconfirmed
4g) etc.. for each subsequent chain reorganization

The observation he had made is that the blockchain is a perfect whitewash: it leaves no trace of the attempted double-spends or other malarkey that would be of great interest to a forensic auditor, e.g.

a) list of addresses involved in the attempted double-spends
b) list of addresses involved in transactions recorded in the orphaned sub-chains but not recorded on the winning chain.

I'm not suggesting that you actually implement those auditing reports. But it would be nice if your architecture made generating such reports possible.
hero member
Activity: 836
Merit: 1030
bits of proof
That is already implemented, in the server code.

This is about independent global fiscal checks.
hero member
Activity: 836
Merit: 1030
bits of proof
Auditability is a prerequisite for serious business use, therefore I created a utility that reads the database filled by the bitsofproof node and performs an off-line audit.

This tool shares only basic data access and serialization primitives with the server code, so it validates the database content fairly independently.

Below is an example output:
Code:
[INFO] Audit main Check 1. All paths lead to genesis...
[INFO] Audit main Check 1. Passed.
[INFO] Audit main Check 2. There are no orphan blocks...
[INFO] Audit main Check 2. Passed.
[INFO] Audit main Check 3. Sufficient proof of work on all blocks and correct cumulative work on all paths...
[INFO] Audit main Check 3. Passed.
[INFO] Audit main Check 4. Check transaction hashes and Merkle roots on all blocks...
[INFO] Audit main Check 4. Passed.
[INFO] Audit main Check 5. Block reward amount correct for all blocks...
[INFO] Audit main Check 5. Passed.
[INFO] Audit main Check 6. Total coinbase matches sum of unspent output...
[INFO] Audit main Check 6. Passed.
[INFO] Audit main All requested checks PASSED.

Let me know if you have further ideas for health checks of the stored chain.
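For readers unfamiliar with what Check 4 involves, here is a minimal Java sketch of a Bitcoin Merkle root computation: hashes are combined pairwise with double SHA-256, and the last hash is paired with itself when a level has an odd count. The class and method names are hypothetical and this is not the audit tool's code; inputs are assumed to be transaction hashes in internal byte order.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the Merkle root calculation behind a check like "Check 4".
class MerkleCheck {
    static byte[] doubleSha256(byte[] data) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            return sha.digest(sha.digest(data)); // digest() resets the instance between calls
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }

    /** Computes the Merkle root of the given transaction hashes, in block order. */
    static byte[] merkleRoot(List<byte[]> txHashes) {
        if (txHashes.isEmpty()) {
            throw new IllegalArgumentException("a block has at least one transaction");
        }
        List<byte[]> level = new ArrayList<>(txHashes);
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                // An odd node at the end of a level is paired with itself.
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left;
                byte[] pair = new byte[left.length + right.length];
                System.arraycopy(left, 0, pair, 0, left.length);
                System.arraycopy(right, 0, pair, left.length, right.length);
                next.add(doubleSha256(pair));
            }
            level = next;
        }
        return level.get(0);
    }
}
```

An audit along these lines would recompute the root from the stored transaction hashes of each block and compare it with the Merkle root recorded in that block's header.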