
Topic: ANN: Announcing code availability of the bitsofproof supernode - page 11. (Read 35156 times)

hero member
Activity: 836
Merit: 1030
bits of proof
It is critical to keep "block" message relaying (propagation) times as low as possible, to avoid creating incentives for miners to skip transactions.

I think offering a sufficient fee in the transaction is the incentive that will keep working as the network grows, since the CPU time spent on validation does not come for free.
hero member
Activity: 836
Merit: 1030
bits of proof
Seeing a signature twice is the normal case:
  • once as an unconfirmed transaction via "tx" message
  • once as a confirmed transaction, via "block" message
I see, I was thinking only in the context of the chain, not chain and unconfirmed. Thank you.
legendary
Activity: 1106
Merit: 1004
It is critical to keep "block" message relaying (propagation) times as low as possible, to avoid creating incentives for miners to skip transactions.  Thus, a signature cache ensures that "block" messages are largely already verified by the time they are received.

If miners are already shrinking their blocks due to propagation time, maybe it's time to consider that protocol improvement suggestion in the scalability page, about not sending the entire block every time, just the portion your peer doesn't have.

I don't know what exactly could be sent instead of the body, though. Even if my peer has all the transactions in his cache, doesn't he need to know the exact order I put them in my block in order to rebuild the Merkle tree? So just sending the header is not enough. Is there a shorter way to identify a transaction than its hash? Perhaps just an ordinal of the hash... you assume your peer has the same transaction pool you had, then as the body of your block you send a series of numbers representing the index each tx hash would have in a sorted array of all of them? Not sure that would work frequently... plus, the hash is probably small enough that it shouldn't be a big deal to send all of them.
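That ordinal idea can be sketched as follows (purely illustrative; `OrdinalBlockCodec` and its methods are hypothetical names, not an existing protocol message): both sides sort their pool's tx hashes, and the block body becomes a list of indices into that sorted array. It reconstructs the block, in the sender's exact order, only when both pools are identical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch of the "ordinal" idea: encode a block's transactions
// as indices into the receiver's sorted pool of known tx hashes.
// This only reconstructs the block if both pools match exactly.
class OrdinalBlockCodec {
    // Sender: map each block tx hash to its index in the sorted pool.
    static List<Integer> encode(List<String> pool, List<String> blockTxs) {
        List<String> sorted = new ArrayList<>(pool);
        Collections.sort(sorted);
        List<Integer> ordinals = new ArrayList<>();
        for (String h : blockTxs) {
            int i = Collections.binarySearch(sorted, h);
            if (i < 0)
                throw new IllegalStateException("tx not in pool: " + h);
            ordinals.add(i);
        }
        return ordinals;
    }

    // Receiver: recover the exact tx order, so the Merkle tree can be rebuilt.
    static List<String> decode(List<String> pool, List<Integer> ordinals) {
        List<String> sorted = new ArrayList<>(pool);
        Collections.sort(sorted);
        List<String> txs = new ArrayList<>();
        for (int i : ordinals)
            txs.add(sorted.get(i));
        return txs;
    }
}
```

The indices preserve ordering, which is what the header alone cannot convey; the failure mode is exactly the one raised above, a pool mismatch.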
legendary
Activity: 1596
Merit: 1100
For long running nodes, the signature cache is also very helpful.  Over time, transactions are accepted into the memory pool and signature cache.

I do not yet see how the signature cache actually helps, would you please elaborate?

A signature is a function of the hash of a modified version of the input transaction. The modification depends on which output of that transaction is to be spent. Aren't the chances of seeing the same signature twice negligible?

Seeing a signature twice is the normal case:
  • once as an unconfirmed transaction via "tx" message
  • once as a confirmed transaction, via "block" message

It is critical to keep "block" message relaying (propagation) times as low as possible, to avoid creating incentives for miners to skip transactions.  Thus, a signature cache ensures that "block" messages are largely already verified by the time they are received.
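Such a cache can be sketched roughly as follows (hypothetical `SigCache`; not the bitsofproof or pynode implementation): a bounded LRU map keyed by (sighash, public key, signature), so a signature verified when the "tx" message arrived is not re-verified when the same transaction shows up in a "block" message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a bounded signature-verification cache.
// A hit means the expensive ECDSA check was already performed when the
// transaction first arrived via a "tx" message.
class SigCache {
    private final Map<String, Boolean> cache;

    SigCache(final int maxEntries) {
        // LinkedHashMap in access order evicts the least recently used entry.
        this.cache = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries;
            }
        };
    }

    private static String key(String sigHash, String pubKey, String signature) {
        return sigHash + '|' + pubKey + '|' + signature;
    }

    // Returns the cached result, or null if this signature was never checked.
    synchronized Boolean get(String sigHash, String pubKey, String signature) {
        return cache.get(key(sigHash, pubKey, signature));
    }

    synchronized void put(String sigHash, String pubKey, String signature, boolean valid) {
        cache.put(key(sigHash, pubKey, signature), valid);
    }
}
```

On block arrival, each input's signature is looked up first and only verified on a miss, which is why most of a block's work is already done for a long-running node.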

hero member
Activity: 836
Merit: 1030
bits of proof
For long running nodes, the signature cache is also very helpful.  Over time, transactions are accepted into the memory pool and signature cache.

I do not yet see how the signature cache actually helps, would you please elaborate?

A signature is a function of the hash of a modified version of the input transaction. The modification depends on which output of that transaction is to be spent. Aren't the chances of seeing the same signature twice negligible?
legendary
Activity: 1596
Merit: 1100
Introducing an opportunistic cache of the last 100,000 transactions gave such a boost to the server that I thought it was worth writing about.
Apparently outputs generally die rather young, and saving the db roundtrip for them is the single biggest boost I have found so far...

Yep, pynode figured many of these things out, long ago.  A block (including TXs) cache is very useful.

For long running nodes, the signature cache is also very helpful.  Over time, transactions are accepted into the memory pool and signature cache.  When a new block arrives, the majority of transactions found in that block will have had their signatures pre-checked and cached, making acceptance of that block go more rapidly.



are you honestly adding more info.. or just trying to point out that your "pynode" is *so* ahead ??

pynode is quite incomplete, as the TODO notes.  bitsofproof is farther along in many ways.

The point is that these are well known techniques, and it is disappointing that these are being "discovered" when the knowledge is readily available for anyone who looks.

newbie
Activity: 32
Merit: 0
Introducing an opportunistic cache of the last 100,000 transactions gave such a boost to the server that I thought it was worth writing about.
Apparently outputs generally die rather young, and saving the db roundtrip for them is the single biggest boost I have found so far...

Yep, pynode figured many of these things out, long ago.  A block (including TXs) cache is very useful.

For long running nodes, the signature cache is also very helpful.  Over time, transactions are accepted into the memory pool and signature cache.  When a new block arrives, the majority of transactions found in that block will have had their signatures pre-checked and cached, making acceptance of that block go more rapidly.



are you honestly adding more info.. or just trying to point out that your "pynode" is *so* ahead ??
newbie
Activity: 56
Merit: 0
This looks very promising. Backend bitcoin engines need a nice modular format to allow for the many creative uses people will be coming up with.
legendary
Activity: 1596
Merit: 1100
Introducing an opportunistic cache of the last 100,000 transactions gave such a boost to the server that I thought it was worth writing about.
Apparently outputs generally die rather young, and saving the db roundtrip for them is the single biggest boost I have found so far...

Yep, pynode figured many of these things out, long ago.  A block (including TXs) cache is very useful.

For long running nodes, the signature cache is also very helpful.  Over time, transactions are accepted into the memory pool and signature cache.  When a new block arrives, the majority of transactions found in that block will have had their signatures pre-checked and cached, making acceptance of that block go more rapidly.

hero member
Activity: 836
Merit: 1030
bits of proof
Introducing an opportunistic cache of the last 100,000 transactions gave such a boost to the server that I thought it was worth writing about.
Apparently outputs generally die rather young, and saving the db roundtrip for them is the single biggest boost I have found so far...
hero member
Activity: 836
Merit: 1030
bits of proof
It is time for a project update.

[HYPE]
I enumerate some challenges and the bitsofproof answers to them.

1. Responsiveness to a large number of peers.
This is addressed using non-blocking IO and a thread pool that forwards fully read messages to their listeners, which are otherwise isolated from the complexity of the communication layer.

2. CPU hunger for signature validation.
This is not a current concern for the bitsofproof node, since validation of the transactions in a block, and of the inputs within the same transaction, is executed in parallel. CPUs are utilized to their limit, as many as you provide.

3. Resolving the inputs of a transaction against the database of previous transactions.
Using a faster database (some NoSQL) or SSD drives only gives you a short break until the bottleneck reappears as the network grows.
A sustainable approach is to constantly prune the data set searched. The bitsofproof node uses separate archive transaction table(s) into which a parallel thread moves fully spent transactions, much like a garbage collector reclaims storage. The live transaction table that is searched to satisfy new transaction inputs is thus kept close to the UTXO set.

4. The client API is implemented as a clone of BCCAPI, which provides services to lightweight clients that trust the server.

5. A server API is in draft that will provide high-level access to persistent and transient data structures, as well as notifications. The aim is to provide all features bitcoind offers via JSON-RPC, but in bitsofproof's case using the invoker of your choice on an API defined as a proper Java interface.

6. The next layer of scale will be implemented through distributing both storage and computation, considerations for this already influence the design.
[/HYPE]
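Point 3 above can be sketched with an in-memory stand-in (hypothetical `TxPruner`; the real node uses database tables): a background sweep moves fully spent transactions out of the searched set, keeping the live table close to the UTXO set.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Illustrative in-memory stand-in for the live/archive split: a sweep
// moves fully spent transactions out of the set searched for inputs,
// much like a garbage collector reclaims storage.
class TxPruner {
    // txHash -> number of still-unspent outputs
    final Map<String, Integer> live = new HashMap<>();
    final Map<String, Integer> archive = new HashMap<>();

    void add(String txHash, int outputs) {
        live.put(txHash, outputs);
    }

    // Called when one output of txHash is spent.
    void spend(String txHash) {
        live.computeIfPresent(txHash, (h, n) -> n - 1);
    }

    // The "garbage collector": relocate fully spent transactions.
    void sweep() {
        Iterator<Map.Entry<String, Integer>> it = live.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> e = it.next();
            if (e.getValue() == 0) {
                archive.put(e.getKey(), 0);
                it.remove();
            }
        }
    }
}
```

The point of the split is that input lookups only ever scan `live`, so lookup cost tracks the UTXO set rather than the full history.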

This is code in development. I do not care about backward compatibility of interfaces or the database for now.
hero member
Activity: 668
Merit: 501
The project just got mavenized
sounds great! Will take a closer look at it and try to get it running later this week.
hero member
Activity: 836
Merit: 1030
bits of proof
.... my suggestion about ivy vs maven means that the code would become more accessible to other developers, plus I am lazy and do not want to spend a lot of time setting up my ide for this project Smiley

The project just got mavenized by pulling from the first contributor: https://github.com/bartfaitamas/supernode/tree/mavenizing
hero member
Activity: 836
Merit: 1030
bits of proof
I would be interested to know what extra data would support forensic accounting, and would attempt to derive it from the server's interaction with peers.
The additional data that is not available from the blockchain:

1) true timestamps for transactions first (and last) seen over p2p
2) true timestamps for blocks first seen over p2p
3) non-volatile database of transactions seen over p2p (a.k.a. mempool, but non-volatile) pruned of the entries actually recorded in the blockchain.
4) a way to quickly locate only orphaned blocks without having to scan the whole table of blocks; really just an additional index into the blockchain storage, explicitly indexing all the losing chains. This is in theory available from the blockchain, but under the current architecture it is difficult to obtain. In effect, the whitewashing of Bitcoin's transaction history is ongoing as long as the tools keep only the winning chain.

Thanks.
None of these sounds difficult to add.
In fact, 4 is already there, since the HEAD table contains pointers to all leaf nodes, not only the trunk.
legendary
Activity: 2128
Merit: 1073
I would be interested to know what extra data would support forensic accounting, and would attempt to derive it from the server's interaction with peers.
The additional data that is not available from the blockchain:

1) true timestamps for transactions first (and last) seen over p2p
2) true timestamps for blocks first seen over p2p
3) non-volatile database of transactions seen over p2p (a.k.a. mempool, but non-volatile) pruned of the entries actually recorded in the blockchain.
4) a way to quickly locate only orphaned blocks without having to scan the whole table of blocks; really just an additional index into the blockchain storage, explicitly indexing all the losing chains. This is in theory available from the blockchain, but under the current architecture it is difficult to obtain. In effect, the whitewashing of Bitcoin's transaction history is ongoing as long as the tools keep only the winning chain.

Thanks.
hero member
Activity: 836
Merit: 1030
bits of proof
By audit I meant proof of the consistency of the stored chain, independent of the process that maintains it.

Since the bitsofproof node stores all data in a normalized relational database, you should be able to use any data mining package you prefer.

The data currently collected is what is needed to work.

I would be interested to know what extra data would support forensic accounting, and would attempt to derive it from the server's interaction with peers.
legendary
Activity: 2128
Merit: 1073
This is about independent global fiscal checks.
Are you talking about pure binary yes/no consistency check audit?

Or are you talking about open-ended auditing for the purpose of discovering normal vs. abnormal transaction patterns?

I've spoken a while back with an experienced forensic accounting professional, and we discussed various things that would require keeping the true time as well as the block timestamp. It would also require keeping a separate database of in-flight transactions, ones that aren't yet recorded in the blockchain.

4) provide a way of obtaining sincere ledger / audit log for the transactions with multiple timestamps:
4a) true time when first seen on the p2p net
4b) true time when first seen in a block
4c) block time when seen in a block
4d) true time when some block caused a reorg and un-confirmed the transaction
4e) true time when other block reconfirmed the transaction
4f) block time when reconfirmed
4g) etc.. for each subsequent chain reorganization

The observation he had made is that the blockchain is a perfect whitewash: it leaves no trace of the attempted double-spends or other malarkey that would be of great interest to a forensic auditor, e.g.

a) list of addresses involved in the attempted double-spends
b) list of addresses involved in transactions recorded in the orphaned sub-chains but not recorded on the winning chain.

I'm not actually suggesting that you implement those auditing reports. But it would be nice if your architecture made it possible to generate such reports.
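The timestamp trail in point 4 could be captured with an append-only record along these lines (hypothetical `TxAuditEntry` and `TxAuditLog`; purely a schema sketch, not a proposed implementation):

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Schema sketch for the audit trail of point 4: each event appends a row
// carrying both the observer's true time and, where applicable, the block time.
class TxAuditEntry {
    enum Event { SEEN_P2P, SEEN_IN_BLOCK, UNCONFIRMED_BY_REORG, RECONFIRMED }

    final String txHash;
    final Event event;
    final Instant trueTime; // wall clock of the observing node
    final Long blockTime;   // block header time; null for pure p2p events

    TxAuditEntry(String txHash, Event event, Instant trueTime, Long blockTime) {
        this.txHash = txHash;
        this.event = event;
        this.trueTime = trueTime;
        this.blockTime = blockTime;
    }
}

// Append-only log; nothing is ever overwritten, so reorgs leave a trace
// instead of being whitewashed away.
class TxAuditLog {
    private final List<TxAuditEntry> entries = new ArrayList<>();

    void record(TxAuditEntry e) {
        entries.add(e);
    }

    List<TxAuditEntry> history(String txHash) {
        List<TxAuditEntry> out = new ArrayList<>();
        for (TxAuditEntry e : entries)
            if (e.txHash.equals(txHash))
                out.add(e);
        return out;
    }
}
```

Because the log is append-only, an un-confirm followed by a re-confirm shows up as two extra rows rather than an overwrite, which is exactly the trace a forensic auditor would want.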
hero member
Activity: 836
Merit: 1030
bits of proof
That is already implemented, in the server code.

This is about independent global fiscal checks.
legendary
Activity: 1596
Merit: 1100
hero member
Activity: 836
Merit: 1030
bits of proof
Auditability is a prerequisite for serious business use, so I created a utility that reads the database filled by the bitsofproof node and performs an off-line audit.

This tool shares only basic data access and serialization primitives with the server code, so it validates the database content fairly independently.

Example output is shown below:
Code:
[INFO] Audit main Check 1. All paths lead to genesis...
[INFO] Audit main Check 1. Passed.
[INFO] Audit main Check 2. There are no orphan blocks...
[INFO] Audit main Check 2. Passed.
[INFO] Audit main Check 3. Sufficient proof of work on all blocks and correct cumulative work on all paths...
[INFO] Audit main Check 3. Passed.
[INFO] Audit main Check 4. Check transaction hashes and Merkle roots on all blocks...
[INFO] Audit main Check 4. Passed.
[INFO] Audit main Check 5. Block reward amount correct for all blocks...
[INFO] Audit main Check 5. Passed.
[INFO] Audit main Check 6. Total coinbase matches sum of unspent output...
[INFO] Audit main Check 6. Passed.
[INFO] Audit main All requested checks PASSED.

Let me know if you have further ideas for health checks of the stored chain.
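For illustration, the core of a check like Check 4 can be sketched as follows (hypothetical `MerkleCheck` helper, not the audit tool's code): recompute each block's Merkle root from its transaction hashes using Bitcoin's pairwise double SHA-256, duplicating the last hash when a level has an odd count, and compare the result against the header.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Illustrative recomputation of a block's Merkle root from its tx hashes.
// Bitcoin pairs hashes with double SHA-256 and duplicates the last hash
// when a level has an odd number of entries.
class MerkleCheck {
    static byte[] doubleSha256(byte[] left, byte[] right) {
        try {
            MessageDigest d = MessageDigest.getInstance("SHA-256");
            d.update(left);
            d.update(right);
            byte[] first = d.digest();
            return MessageDigest.getInstance("SHA-256").digest(first);
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
    }

    static byte[] merkleRoot(List<byte[]> txHashes) {
        List<byte[]> level = new ArrayList<>(txHashes);
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                // An odd level pairs the last hash with itself.
                byte[] right = i + 1 < level.size() ? level.get(i + 1) : left;
                next.add(doubleSha256(left, right));
            }
            level = next;
        }
        return level.get(0); // a single-tx block's root is that tx's hash
    }
}
```

An independent auditor only needs the serialized transactions and the block header for this check, which is why it can run without any of the server's validation code.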