
Topic: We need to split up the Satoshi client - page 2. (Read 3158 times)

legendary
Activity: 2128
Merit: 1073
October 25, 2012, 04:45:03 PM
#12
Can you point me to some resources that demonstrate this (e.g. forum threads)?
For one example of a difficult-to-solve problem, search the board for the phrase "inversion of control". You can also see for yourself by experimentally comparing how "bitcoin-qt -server" and "bitcoind" behave when sending coins via RPC with a fee required.

I don't see how a transaction can accidentally be performed twice.
Consider the following (incorrect) attempt to implement a two-phase commit (http://en.wikipedia.org/wiki/Two-phase_commit_protocol) between an SQL server and a bitcoind daemon:

1) START TRANSACTION
2) SELECT amount and destination address
3) call "sendtoaddress" on bitcoind via RPC
4) if OK then UPDATE account balances and COMMIT TRANSACTION
5) if not OK then ROLLBACK TRANSACTION

Now consider that the bitcoind had a really complex wallet containing multiple thousands of unspent coins, and that a backup was also running on the disk containing the .bitcoin directory. The RPC call timed out. What do you do now? How do you fix this problem?
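
One common way to avoid paying twice in this situation is to record the payment intent durably before calling the daemon, and to treat a timeout as "unknown" rather than "failed". The following is only a minimal sketch under that assumption: `FakeWallet`, `pay`, `RpcTimeout` and the `payments` table are hypothetical names, and the in-memory wallet merely stands in for a real bitcoind RPC connection.

```python
import sqlite3

class RpcTimeout(Exception):
    """Raised when the (hypothetical) wallet call gives no answer in time."""

class FakeWallet:
    """In-memory stand-in for a wallet daemon; not a real bitcoind client."""
    def __init__(self, time_out_once=False):
        self.sent = []                          # (address, amount) actually sent
        self.time_out_once = time_out_once

    def send_to_address(self, address, amount):
        self.sent.append((address, amount))     # the coins leave regardless...
        if self.time_out_once:
            self.time_out_once = False
            raise RpcTimeout()                  # ...but the caller never hears back
        return "txid-%d" % len(self.sent)

def pay(db, wallet, payment_id, address, amount):
    """Send at most once per payment_id; a timeout leaves the state 'unknown'."""
    row = db.execute("SELECT state FROM payments WHERE id = ?",
                     (payment_id,)).fetchone()
    if row is not None:
        return row[0]                           # already attempted: never resend blindly
    # Record the intent durably BEFORE touching the wallet.
    db.execute("INSERT INTO payments (id, address, amount, state) "
               "VALUES (?, ?, ?, 'pending')", (payment_id, address, amount))
    db.commit()
    try:
        wallet.send_to_address(address, amount)
        state = "sent"
    except RpcTimeout:
        state = "unknown"                       # needs reconciliation, not a retry
    db.execute("UPDATE payments SET state = ? WHERE id = ?", (state, payment_id))
    db.commit()
    return state

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE payments "
           "(id TEXT PRIMARY KEY, address TEXT, amount INTEGER, state TEXT)")
wallet = FakeWallet(time_out_once=True)
first = pay(db, wallet, "invoice-1", "addr-A", 5)
second = pay(db, wallet, "invoice-1", "addr-A", 5)   # retry after the timeout
```

Here the retry returns the recorded "unknown" state instead of sending again. Resolving an "unknown" payment still requires checking what the wallet actually did, which this sketch does not attempt.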

While concepts like ACID are familiar to me, I might not understand everything immediately, especially in the field of accounting-the-official-way.
If you are not allergic to Windows, the easiest way to learn how to correctly perform distributed transactions is with Microsoft Office. Just create an Excel macro that causes an update in a separate database running under Access, while Access is running some repeated updates of its own. This can be made to work correctly as far back as Windows 95 OSR2 and Office 97 with the help of MS DTC. Obviously, newer versions of Microsoft products will work too.
legendary
Activity: 2128
Merit: 1073
October 25, 2012, 04:13:58 PM
#11
When properly isolated, a wallet module shouldn't need to deal with asynchronous anything, or anything to do with peers.  A wallet module (which I define as something that manages a collection of private keys and produces/signs transactions desired by its owner), should interact only with a "knowledge center" and its user.

It should be able to get by simply by consuming a stream containing notifications of the following:

* A new transaction you (the wallet) may be interested in has arrived.  Here it is.
* A transaction you're interested in has had a change in confirmation status (where confirmation statuses include "unconfirmed", "X confirmations", and "invalidated") and/or spend status ("confirmed spent", "unconfirmed spent", "believed unspent").

A wallet doesn't need to listen for notifications if it can simply query for an update on all that information when it's needed.


Correct handling of asynchronicity is a necessary requirement for a transactionally correct and efficient implementation of Bitcoin. This has been rehashed in detail in Slush's "Stratum" thread. Stratum is essentially an attempt to implement an RPC-like protocol separating the wallet client from the p2p network+blockchain storage server.

https://bitcointalksearch.org/topic/stratum-overlay-network-protocol-over-bitcoin-55842

You can attempt to fake asynchronicity by frequent polling (you called it "query for an update") or hacks like long-poll. But this isn't a scalable solution and ultimately it will also fail, just in a different way than the typical damn-the-ACID hacks.
vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
October 25, 2012, 01:44:44 PM
#10
But based on more than a year of participation on this forum I'm very sceptical. I haven't seen anyone proposing a new architecture for the "wallet" class/module/client that would correctly handle the asynchronous nature of the P2P interface while submitting outgoing transactions. When the blockchain changes underneath the wallet, the various implementations I've seen either deadlock, miscompute fees or fail in really complex ways requiring manual modification afterwards.

The worst failure mode is lack of idempotency, which causes duplicate payments to be made. And in Bitcoin the payments are not reversible by design.

When properly isolated, a wallet module shouldn't need to deal with asynchronous anything, or anything to do with peers.  A wallet module (which I define as something that manages a collection of private keys and produces/signs transactions desired by its owner), should interact only with a "knowledge center" and its user.

It should be able to get by simply by consuming a stream containing notifications of the following:

* A new transaction you (the wallet) may be interested in has arrived.  Here it is.
* A transaction you're interested in has had a change in confirmation status (where confirmation statuses include "unconfirmed", "X confirmations", and "invalidated") and/or spend status ("confirmed spent", "unconfirmed spent", "believed unspent").

A wallet doesn't need to listen for notifications if it can simply query for an update on all that information when it's needed.
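
As a rough illustration of the notification stream described above, here is a minimal Python sketch. All names in it (`NewTransaction`, `StatusChanged`, `Wallet`, `consume`) are hypothetical and do not exist in bitcoind, and amounts are in arbitrary units.

```python
from dataclasses import dataclass, field

@dataclass
class NewTransaction:
    txid: str
    amount: int            # value relevant to this wallet, in arbitrary units

@dataclass
class StatusChanged:
    txid: str
    confirmations: int     # 0 means unconfirmed
    invalidated: bool = False

@dataclass
class Wallet:
    """Tracks only what the notifications tell it; knows nothing about peers."""
    txs: dict = field(default_factory=dict)   # txid -> [amount, confirmations]

    def consume(self, event):
        if isinstance(event, NewTransaction):
            self.txs[event.txid] = [event.amount, 0]
        elif isinstance(event, StatusChanged) and event.txid in self.txs:
            if event.invalidated:
                del self.txs[event.txid]       # e.g. dropped after a chain reorganisation
            else:
                self.txs[event.txid][1] = event.confirmations

    def confirmed_balance(self, min_conf=1):
        return sum(a for a, c in self.txs.values() if c >= min_conf)

wallet = Wallet()
for ev in [NewTransaction("a", 10), NewTransaction("b", 4),
           StatusChanged("a", 3), StatusChanged("b", 0, invalidated=True)]:
    wallet.consume(ev)
```

The wallet never talks to the network here; it only applies events in the order they arrive. Spend status is left out to keep the sketch short.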

cjp
full member
Activity: 210
Merit: 124
October 25, 2012, 01:07:47 PM
#9
But based on more than a year of participation on this forum I'm very sceptical. I haven't seen anyone proposing a new architecture for the "wallet" class/module/client that would correctly handle the asynchronous nature of the P2P interface while submitting outgoing transactions. When the blockchain changes underneath the wallet, the various implementations I've seen either deadlock, miscompute fees or fail in really complex ways requiring manual modification afterwards.

The worst failure mode is lack of idempotency, which causes duplicate payments to be made. And in Bitcoin the payments are not reversible by design.

I'll venture to guess that Nefario is the most recent victim of that last problem.

Can you point me to some resources that demonstrate this (e.g. forum threads)? Since I might be working on a wallet module, this might be relevant to me. Please note that, while I have developed (simple, small-scale) financial software in the past, and I am currently a professional software developer, I have never professionally developed financial software. While concepts like ACID are familiar to me, I might not understand everything immediately, especially in the field of accounting-the-official-way.

The way I see it, the wallet module only acts as a storage of private keys, and signs transactions on behalf of its user (typically a UI module). It is TBD whether the wallet is also the module that creates the to-be-signed transactions.

The balance of a user is determined by requesting the public keys from the wallet module, and then requesting transaction information about these keys from the knowledge center module. I think a user has two balances:
  • Based only on confirmed transactions
  • Based on both confirmed and unconfirmed transactions (this gets complicated if there are both incoming and outgoing unconfirmed transactions)
It might be best to do this calculation in the knowledge center module: so far my concept of the wallet module does not yet have an interface to retrieve data from the  knowledge center module, and putting it in the UI module would mean duplicating a (possibly tricky) calculation into different UI module implementations.
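
To make the two balances concrete, a minimal sketch, assuming each transaction is reduced to a hypothetical (amount, confirmations) pair with negative amounts for outgoing payments. It deliberately ignores the complication mentioned above, where unconfirmed outgoing transactions depend on unconfirmed incoming ones.

```python
def balances(transactions):
    """Return (confirmed, including_unconfirmed) for a list of
    (amount, confirmations) pairs; outgoing amounts are negative."""
    confirmed = sum(a for a, conf in transactions if conf > 0)
    pending = sum(a for a, conf in transactions if conf == 0)
    return confirmed, confirmed + pending

# Two confirmed transactions (+50, -20) and two unconfirmed ones (+15, -5).
txs = [(50, 6), (-20, 2), (15, 0), (-5, 0)]
confirmed, total = balances(txs)
```

Keeping this in one place (such as the knowledge center module) would mean every UI module gets the same numbers.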

I don't see how a transaction can accidentally be performed twice. The only potential problem I see is that between publishing a transaction and having it confirmed, the commit state of the transaction is "unknown", and due to the decentralized nature of Bitcoin, there is no authority who is responsible for deciding the final commit state. In theory a transaction can stay "unknown" indefinitely, and if miners start requiring high fees in the future, this is actually a plausible scenario for low-fee transactions. You can increase the probability of a commit by resending the transaction with a higher fee (making sure they spend the same outputs, so they will never be both committed); you can even try to unroll an unconfirmed transaction that way, by resending it to yourself.
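
The "spend the same outputs" condition can be sketched as follows. The dictionary layout is a hypothetical simplification of a transaction, and the sketch says nothing about whether nodes or miners would actually relay or accept the second transaction.

```python
def conflicts(tx_a, tx_b):
    """Two transactions conflict if they spend at least one common output,
    so at most one of them can ever be confirmed."""
    return bool(set(tx_a["inputs"]) & set(tx_b["inputs"]))

# Inputs are (previous txid, output index) pairs; fees are in arbitrary units.
original = {"inputs": [("prev-tx-1", 0)], "fee": 0}
bumped = {"inputs": [("prev-tx-1", 0)], "fee": 50}      # same output, higher fee
unrelated = {"inputs": [("prev-tx-2", 1)], "fee": 0}
```

A resend is only safe against double payment if `conflicts(original, resend)` holds; resending with fresh inputs is what makes two payments possible.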
vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
October 25, 2012, 10:23:47 AM
#8
I would agree with the essence of this proposal.

Yes, that would make things more complex on the whole.  But it would also make the entire thing more powerful.

Here are the immediate obvious benefits I would see from this:

1. The ability for people or companies to provide whole replacements for subsystems of bitcoind.  For example, right now the block database is in a flat file and indexed in a way that makes it impractical to query.  But somebody else might come along and provide a novel replacement that uses a full-blown DBMS as a back-end, allowing other services (like websites, or corporate workstations) to interact with bitcoind by inserting database records into a work queue, and could query information about transactions from it.  It could be rigged so multiple instances of bitcoind can function from that same database, and use appropriate record locking so they don't duplicate work or step on each other.  This would add resilience, because DBMS software is more mature and offers more options for scalability and high availability, and if nodes running bitcoind become inoperative, everything still runs as long as at least one node is still good.  It would add compatibility, because far more development platforms come ready to talk to a running database than a running instance of bitcoind.  It would add flexibility, because people could build their own indexes on whatever they needed without waiting for someone to code it into bitcoind.  Most importantly, it would be an OPTION - so those who simply want to run a desktop version of the client can just run the regular flat-file database so they aren't burdened with huge dependencies.

2. The ability to reuse components as a developer without worrying that you are biting off more than you can chew.  If you want to build a bitcoin client but have no aspirations to replace the P2P communication code, you can drop the existing implementation in and be done.  Likewise, since that code is hopefully a monolithic unit rather than something you surgically ripped from bitcoin's source code, you will have a much easier time consuming and implementing updates to that code.

3. The ability to deprecate the "wallet.dat", transaction history, and "accounts" function from bitcoind, as well as to free the definition of a "wallet" file as being a file of any certain type.  To me, the wallet being in bitcoind is as out of place as support for "sending rich text e-mail with emoticons" in an SMTP server daemon.  It belongs as an optional feature for a GUI / wallet management subsystem (one that ideally can open and close wallets at will, sort of how Microsoft Word can open and close .docx files at will).

I see several other benefits, but they have already been listed to some extent.
legendary
Activity: 2128
Merit: 1073
October 25, 2012, 09:49:27 AM
#7
What can't bitcoind do right now that requires re-doing four years of programming?
The Satoshi Bitcoin client can't participate in transactions as required by any serious financial software, e.g. CICS, Tuxedo, Microsoft Distributed Transaction Coordinator, etc. Also, bitcoind requires extensive modification to integrate with any GAAP-compliant accounting software and ACID-compliant database.

I wouldn't call it "re-doing", more like extensive "re-factoring" or "re-architecting".

But based on more than a year of participation on this forum I'm very sceptical. I haven't seen anyone proposing a new architecture for the "wallet" class/module/client that would correctly handle the asynchronous nature of the P2P interface while submitting outgoing transactions. When the blockchain changes underneath the wallet, the various implementations I've seen either deadlock, miscompute fees or fail in really complex ways requiring manual modification afterwards.

The worst failure mode is lack of idempotency, which causes duplicate payments to be made. And in Bitcoin the payments are not reversible by design.

I'll venture to guess that Nefario is the most recent victim of that last problem.
b!z
legendary
Activity: 1582
Merit: 1010
October 25, 2012, 04:28:31 AM
#6
I think this is absolutely pointless. The Satoshi client is good enough as it is.
legendary
Activity: 1512
Merit: 1036
October 25, 2012, 03:20:20 AM
#5
You should define the problem first. Then look for the solution. If only virus writers had a bitcoin.dll to use?

Who does this benefit? End users? Web merchants? Mining pools? Developers? What can't bitcoind do right now that requires re-doing four years of programming?

I would think a better project would be to pull out some of the libraries such as openssl, leveldb, or qt so they can be independently built or use project-distributed binaries to keep up with version changes or security fixes more easily, or use libraries that are already included in some OS distributions instead of side-by-side shared object installs.
cjp
full member
Activity: 210
Merit: 124
October 25, 2012, 01:44:55 AM
#4
Pull requests welcome at https://github.com/bitcoin/bitcoin/

I might just do that. So this is on my TODO list:
  • Learn how to use Git and Github
  • Learn how the current RPC source code works and how bitcoind and bitcoin-qt are compiled from a single source tree
  • Make a generic framework for modules that have extensible interfaces, and can either be linked as libraries or use RPC (compile-time choice)
  • Convert bitcoind to become a module in this framework
  • Convert the QT GUI to use this framework, and build bitcoin-qt by linking the bitcoind and QT GUI modules (not using RPC). From this point on, changes don't break functionality, and can be pulled into the main branch.
  • Make a separate wallet module (the most urgent split-up IMO), and let the QT GUI module use it (sending raw transactions to the bitcoind module)
  • Make a "compatibility layer" RPC interface that can act as a drop-in replacement of the old bitcoind executable, but uses the separated wallet module internally. This can be used e.g. in web services that are built on the old bitcoind RPC.
  • Remove wallet functionality from the "bitcoind" module
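
A rough sketch of the "library or RPC" idea from the list above. Everything here is hypothetical (`WalletModule`, `LinkedProxy`, `RpcProxy`), the choice is made at run time rather than compile time, and the RPC side is only simulated by a JSON round trip inside one process.

```python
import json

class WalletModule:
    """Hypothetical module exposing a single method."""
    def __init__(self):
        self.keys = ["key-1", "key-2"]

    def list_keys(self):
        return list(self.keys)

class LinkedProxy:
    """Library mode: a plain in-process function call."""
    def __init__(self, module):
        self.module = module

    def call(self, method, *args):
        return getattr(self.module, method)(*args)

class RpcProxy:
    """RPC mode, simulated: the call is serialised to JSON and decoded again,
    as it would be if the module ran in another process."""
    def __init__(self, module):
        self.module = module

    def call(self, method, *args):
        request = json.dumps({"method": method, "params": list(args)})
        decoded = json.loads(request)                      # "server" side
        result = getattr(self.module, decoded["method"])(*decoded["params"])
        return json.loads(json.dumps(result))              # reply over the "wire"

module = WalletModule()
linked = LinkedProxy(module).call("list_keys")
remote = RpcProxy(module).call("list_keys")
```

The caller uses the same `call` interface in both modes, which is the property that would let bitcoin-qt link the modules directly while other setups use RPC.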

I think that, as long as you just use library function calls instead of RPC, the split-up will not significantly reduce performance.

For now, I just started the thread to get an idea of the (developer) community's opinions on this before I put a lot of work into it. Now that opinions seem to be moderately positive, I'll just get started, and see how far I get. I'll keep you informed.

With such a new framework, I think I'll need to offer a lot of changes at once, in order to make it work. Will such a big pull request be accepted from a newbie in the group, like me?
legendary
Activity: 980
Merit: 1008
October 24, 2012, 03:53:10 PM
#3
I definitely think this is the future for Bitcoin. The difficulty, however, isn't coming up with this, but actually implementing it. It's a huge task. Don't get me wrong, suggestions are appreciated, but the scarcest resource in these forums isn't ideas, it's labor (coding).

As gweedo mentions, splitting things up like this is probably not going to make anything simpler. Getting the interfaces right is the hard part, and for a long period the interfaces can't even be relied upon because we can't settle on a certain interface, since we might need to add more functions to that interface, or otherwise change its form as we learn more about how well this interface works for what we're going to use it for.

This goes back to the old monolithic kernel vs. microkernel argument, which basically boils down to: yes, separating each functionality into a separate module does have many advantages, but the main disadvantage is an increased amount of work, which detracts severely from the advantages. So much, actually, that you don't really see any successful pure microkernel operating systems (designed as a microkernel from the ground up) out there today. OSes are slowly moving in that direction, with their various functions slowly becoming more separated. Bitcoin is, of course, a lot less code than an OS, but the same principles apply.

I think this change will happen naturally as more developers start working on the code, not the other way around.
legendary
Activity: 1596
Merit: 1100
October 24, 2012, 03:02:58 PM
#2
Pull requests welcome at https://github.com/bitcoin/bitcoin/

cjp
full member
Activity: 210
Merit: 124
October 24, 2012, 01:30:09 PM
#1
My proposal is to split up the Satoshi client into several smaller software projects. It should be possible to run each component as a separate executable (and let the components communicate e.g. through RPC), but it should also be possible to compile them into static or shared libraries, which can be combined into a single executable.

I was thinking of the following subdivisions, but core Bitcoin developers might have better ideas:
  • Knowledge center: this keeps track of known transactions and blocks, and their status
  • P2P protocol handler: exchanges information between knowledge center and other nodes on the Internet
  • Block chain storage: performs loading/storing of transaction information to/from knowledge center
  • Verifier: checks validity of transactions and blocks, and notifies knowledge center of the result
  • Miner: creates new blocks (gets transactions from knowledge center and submits blocks into it)
  • UI: shows information to user and allows user to perform actions.
  • Wallet: stores private keys and creates/signs transactions on request by UI
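
To give an idea of how some of these components could fit together, here is a minimal sketch with three of them. The class and function names are hypothetical, the "signature" is just a string, and the verification rule is a placeholder for real validation.

```python
class KnowledgeCenter:
    """Keeps track of known transactions and their status."""
    def __init__(self, verifier):
        self.verifier = verifier        # any callable taking a transaction
        self.transactions = {}          # txid -> status

    def submit(self, txid, tx):
        status = "valid" if self.verifier(tx) else "rejected"
        self.transactions[txid] = status
        return status

def toy_verifier(tx):
    # Placeholder rule only; real validation is far more involved.
    return tx.get("amount", 0) > 0

class Wallet:
    """Stores a key and creates/signs transactions on request."""
    def __init__(self, key):
        self.key = key

    def create_signed(self, amount):
        return {"amount": amount, "signature": "signed-by-" + self.key}

center = KnowledgeCenter(toy_verifier)
wallet = Wallet("key-1")
ok = center.submit("tx-1", wallet.create_signed(5))
bad = center.submit("tx-2", wallet.create_signed(0))
```

The knowledge center only depends on "something that verifies", so the verifier could be swapped without touching the wallet or the storage.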

This will have several advantages:
  • Each individual component is smaller, and hence its code is easier to understand than its equivalent in a monolithic client. This is mostly because the split-up architecture itself creates a good overview, and because the interfaces between the components are (assumed) well documented.
  • Derived from this: this allows more developers to get involved in development of the software. It can also act as a guideline for organizational subdivision, where development of each component can have a different "lead developer".
  • Also derived from having better understandable code: security will improve.
  • Security will also improve because each module will have only a subset of all threats to worry about. The more paranoid / high-volume traders can improve security by running each component as a separate process in a specialized minimum-privilege security context.
  • Innovations in different components can be developed more or less independently from each other. People can use a drop-in replacement for one component, while keeping the other components unchanged. For instance, people can make UI components with added functionality, or use a brain wallet component instead of encrypted storage on disk. Or one can replace the knowledge center with something based on services like blockchain.info (or use a less-radical idea for making a lightweight client). Some innovations require interface changes between components; to allow this, I think the interfaces should be extensible (similar to the OpenGL API)

For efficiency, I think there can be some "shared code" between these components, e.g. class definitions with serialization functionality, and code for making the (RPC?) interfaces. In order not to undo the advantages of splitting up the code, the "shared code" should contain as little functionality as possible.