
Topic: PyBtcEngine: BTC backend in Python (with C++/SWIG) (Read 3010 times)

legendary
Activity: 1428
Merit: 1093
Core Armory Developer
Finally!  I am about a week away from releasing Armory.  I have started a thread in the Alternate clients forum:  https://bitcointalksearch.org/topic/armory-discussion-thread-56424

Please let me know if you want to help test the pre-alpha version, but it might be kind of bumpy while I work out the platform/configuration variations, etc.
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
Watching this, thank you for your contribution, I like Python. :D

Ditto.  Development has been remarkably fast and efficient since I "finished" the C++ blockchain code and have been able to work in Python 100%.  It is delightful :)

Oh, I forgot: I also added light networking to Armory using Twisted -- it connects to a localhost bitcoin/bitcoind only, but it can broadcast tx and retrieve tx that are not yet in the blockchain :)
hero member
Activity: 714
Merit: 500
Watching this, thank you for your contribution, I like Python. :D
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
Big Updates!

(1)  I just found a huge oversight in the way I was scanning the blockchain for wallet-relevant transactions.  I just optimized the hell out of it and got a full wallet-scan down to 0.75 seconds once the blockchain is loaded!  Sure, my computer is decent (i5-2500K with 8GB RAM), but this is down from the 5-15s it took before.  So, I can now get all the transactions for a set of wallets from a cold start in less than 20 seconds -- that includes reading blk0001.dat, indexing all the data, organizing the blockchain, and finding all transactions for my wallet.  That is FAST.

(2)  I have finished all sorts of crazy new features, but I have forked my own project to Armory and have been continuing development there, where I should be finishing a new client (alpha version) before the end of the year.  I forked the project because I needed to merge the Python code and the C++/SWIG code into a single, hybrid library, but I didn't want to disrupt the pure-pythonness of the original pybtcengine module.  So, if you are looking for pure Python, keep using PyBtcEngine, but otherwise you should start following Armory instead.

Teaser:  Armory will include multiple encrypted wallets, watching-only wallets, easy address-import, and will support multi-signature transactions (experimental)!  And that's not even all of it, but I have to keep some things a surprise :)

I'll be posting more in a new thread about this soon!  GUI development is slow, though...
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
Thanks fivebells, I'm so glad that someone else is getting some use out of this library.  It will eventually be used for a client (I'm getting really close to having all the tools I need!), but if it helps others understand Bitcoin in the meantime, then all the better!  Python is always easier to understand than C++ :)

And thanks for reminding me to update the status on this page!   I have all sorts of new stuff in there, though most of it isn't for education -- it's mostly stuff that will be needed to implement a client:  secure binary data handling, encryption, key-derivation functions, wallet formats, and a spiffy new SelectCoins algorithm for tx construction.  I have had this stuff floating around in the dev branch, but forgot to merge it... until just now. 
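For readers wondering what a memory-bound key-derivation function looks like in practice, here is a minimal sketch. The thread does not say which construction the library uses, so scrypt from Python's standard `hashlib` is swapped in purely as a well-known memory-hard function. The function name `derive_wallet_key` and the parameter choices are assumptions for illustration; with `n=2**13` and `r=8` scrypt needs roughly 8 MiB, which mirrors the 8 MB figure quoted in the first post. No timing calibration is attempted here.

```python
import hashlib
import os


def derive_wallet_key(passphrase: str, salt: bytes) -> bytes:
    """Stretch a passphrase into a 32-byte key using a memory-hard KDF.

    scrypt is a stand-in, NOT the library's actual KDF.
    Memory cost is about 128 * n * r bytes = 128 * 8192 * 8 = 8 MiB.
    """
    return hashlib.scrypt(
        passphrase.encode("utf-8"),
        salt=salt,
        n=2**13,                    # CPU/memory cost (must be a power of two)
        r=8,                        # block size
        p=1,                        # parallelism
        maxmem=32 * 1024 * 1024,    # upper bound OpenSSL is allowed to use
        dklen=32,                   # 32 bytes, e.g. for an AES-256 key
    )


# Usage: the salt is random per wallet and stored alongside the encrypted data.
salt = os.urandom(16)
key = derive_wallet_key("correct horse battery staple", salt)
```

Raising `n` increases both the time and the memory an attacker must spend per guess, which is the property the post describes as making GPU brute-forcing unattractive.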

sr. member
Activity: 462
Merit: 250
Thank you for writing this.  There are aspects of the bitcoin protocol which I have been struggling to understand, and this is making it so much easier.
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
optimal :)
So you are saying that you wrote an NP-hard Bellman solver in Python? My interest is certainly piqued.

Hah.  No NP-hard problems for me -- I'll stick to the NP-complete problem of trying to find a decent SelectCoins solution...  (I believe an "optimal" SelectCoins solution would be NP-hard)

When I say "optimal" in reference to an offline wallet, I mean optimal usability/convenience.  I think I can make a very clean GUI/interface that allows people to do offline transactions in only a couple of steps, and won't require any JSON or CLI magic.  However, I still have a couple of things to work out before I get there, so perhaps it was a little premature to bring it up just yet :)


legendary
Activity: 2128
Merit: 1073
optimal :)
So you are saying that you wrote an NP-hard Bellman solver in Python? My interest is certainly piqued.
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
I love python.  Keep up the good work.

Thanks!  I hope others who like Python will appreciate having a solid Python interface to the blockchain.

Just as an update, I have been working steadily on more functionality.  I just recently got a SelectCoins algorithm implemented, and my first real product is going to be an optimal, offline wallet tool :)  I have all the pieces, I just gotta put them together and battle PyQt4 a bit more.  I'll also be adding an "examples" directory to demonstrate how to best use the library.  There's already some good example code in testswig.py and the PBE blockexplorer demo, but those are designed for testing, not for clarity of code usage.
hero member
Activity: 742
Merit: 500
I love python.  Keep up the good work.
legendary
Activity: 1652
Merit: 2301
Chief Scientist
If you're looking for a good library/framework, then Twisted is a really good choice. It's a piece of battle-hardened software and is used by several big vendors (like Facebook, Rackspace, etc.).

Mmmm.... Twisted....

I started defining a BitcoinProtocol class derived from twisted.internet.protocol.Protocol for my cross-implementation at-the-network-level testing project.  I plan on using Twisted and Trial (the Twisted unit testing framework) to feed canned block-chains to "empty" nodes and make sure they Do The Right Thing (reject blocks that violate the blockchain rules or contain invalid transactions, accept blocks that contain weird-but-valid transactions, etc).
  https://github.com/gavinandresen/Bitcoin-protocol-test-harness/blob/master/BitcoinClient.py

Anyway, the BitcoinProtocol class might be a good place to start for anybody who wants to do some Python-based bitcoin network programming.  A good example of its use is a little dump-blocks tool I wrote to spit out a blockchain in JSON format:
  https://github.com/gavinandresen/Bitcoin-protocol-test-harness/blob/master/dumpblocks.py


(I'm a Twisted newbie, so improvements, suggestions, etc are very much appreciated)
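As a companion to the links above, here is a minimal sketch of the message framing that any such protocol class has to handle on the wire. It uses only the standard library and is not taken from the linked repository; the names `pack_message` and `unpack_message` are hypothetical. It assumes the main-network framing with a checksum in every header: 4 magic bytes, a 12-byte null-padded command, a 4-byte little-endian payload length, and the first 4 bytes of the double SHA-256 of the payload.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")
HEADER_FORMAT = "<4s12sI4s"      # magic, command, payload length, checksum
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)   # 24 bytes


def checksum(payload: bytes) -> bytes:
    """First four bytes of SHA-256(SHA-256(payload))."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def pack_message(command: str, payload: bytes = b"") -> bytes:
    """Frame a payload; struct null-pads the command to 12 bytes."""
    header = struct.pack(HEADER_FORMAT, MAINNET_MAGIC,
                         command.encode("ascii"), len(payload),
                         checksum(payload))
    return header + payload


def unpack_message(buf: bytes):
    """Return (command, payload, leftover) or None if buf is incomplete."""
    if len(buf) < HEADER_SIZE:
        return None
    magic, raw_cmd, length, chk = struct.unpack(HEADER_FORMAT,
                                                buf[:HEADER_SIZE])
    if magic != MAINNET_MAGIC:
        raise ValueError("bad magic bytes")
    end = HEADER_SIZE + length
    if len(buf) < end:
        return None                      # wait for more bytes
    payload = buf[HEADER_SIZE:end]
    if checksum(payload) != chk:
        raise ValueError("bad checksum")
    return raw_cmd.rstrip(b"\x00").decode("ascii"), payload, buf[end:]


# Usage: round-trip a payload-less message.
print(unpack_message(pack_message("verack")))
```

In a Twisted `Protocol` subclass, something like `unpack_message` would typically be called in a loop from `dataReceived` on an accumulated buffer, since TCP may deliver a message in several pieces.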
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
This is great :) We need more clients. Since you're working on the network part feel free to ask me any questions. I pretty much know the source code inside out. I'm genjix on Freenode IRC in #bitcoinconsultancy or Skype: zgenjix (although IRC is better :)

If you're looking for a good library/framework, then Twisted is a really good choice. It's a piece of battle-hardened software and is used by several big vendors (like Facebook, Rackspace, etc.).

Also AGPL is fine. That poster is wrong when he says it cannot be used in commercial settings- it sure can. Think of other AGPL'ed software like SQL databases. You're communicating over the wire with a daemon. Same with bitcoin. The only time you have to release the source code is when you a) extend the program or b) link against the program/make function calls. Everything else is kosher.

I'm going to continue "battle-hardening" my existing code through a series of unit-tests and GUI features, making sure it gracefully handles reorgs, double-spends, tx construction, etc, correctly.  I also want to make the features as accessible as possible so that others can use the library as a base without worrying too much about the details.  I'd rather have a rock-solid half of the puzzle, instead of a mediocre implementation of the full puzzle.  Then, when I/others actually get to the other half of the puzzle, the problems are easier to find.

I'm a n00b when it comes to OSS licensing, so I don't know for sure how I want to deal with it.  But, so far I'm a big fan of the free-for-OSS-not-for-commercial philosophy.  I'm happy to receive recommendations, but please PM them to me, as I don't want this thread to become a debate on OSS licensing.
legendary
Activity: 1232
Merit: 1076
This is great :) We need more clients. Since you're working on the network part feel free to ask me any questions. I pretty much know the source code inside out. I'm genjix on Freenode IRC in #bitcoinconsultancy or Skype: zgenjix (although IRC is better :)

If you're looking for a good library/framework, then Twisted is a really good choice. It's a piece of battle-hardened software and is used by several big vendors (like Facebook, Rackspace, etc.).

Also AGPL is fine. That poster is wrong when he says it cannot be used in commercial settings- it sure can. Think of other AGPL'ed software like SQL databases. You're communicating over the wire with a daemon. Same with bitcoin. The only time you have to release the source code is when you a) extend the program or b) link against the program/make function calls. Everything else is kosher.
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
Thanks!   So far I haven't seen another library that enables one to efficiently access the entire blockchain from Python.  I'm hoping that this will enable some people to get involved that didn't want to battle the Satoshi C++ code (myself included).  I'm hoping that I can shield the developer from the under-the-hood details, and they can just focus on using the Python/SWIG interface to create their software.

I actually picked the AGPL for the same reasons you denounce it -- I want derivative works to be open-source, to enable others to have access to the code of those who might improve my library.  But I'm also interested in seeing Bitcoin expand, and providing code that companies can use to jumpstart their software development would certainly promote Bitcoin as a whole. 

I'm talking with a friend of mine who's a lawyer, and he's suggested I add a statement that a dual-license can be negotiated upon request.  I believe this option makes the code available for free to those who will develop more OSS, but requires companies that plan to use it in closed-source software to "share" some of their profits with me.  I'll look into this a little bit more.



full member
Activity: 218
Merit: 100
This is great.  As someone who has also been playing around with Python hybrids (C libraries using ctypes), I'm very impressed with what you've done.  I look forward to building it and checking it out in detail.

As glad as I am that you've released this into the public domain, I have to disagree with your licensing choice.  Of course, it's your code, so it's your choice, but in my opinion, ALL open source code relating to Bitcoin should be released under permissive licenses like MIT or Apache, not copyleft licenses.  Firstly, because permissive licenses are more in line with the freedom-based ideology undergirding Bitcoin, but most importantly, because we should be targeting businesses when we create Bitcoin software.  And no business will touch Affero GPL, unless it's part of a dual commercial/GPL agreement.  Affero is the strictest GPL of them all, and under AGPL, not even an exchange will be able to use your code without activating GPL's provisions over their entire system (which is why they'll never use it).

Can you imagine if the canonical Bitcoin client had been released under a copyleft license like AGPL?  We'd almost certainly have no exchanges, no bitcoin-notify, no bitpay, no instawallet, and people would still be ordering a pizza for 10,000 BTC.

May I ask why you chose AGPL?
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
NOTE:  Please see: Armory Bitcoin Client which was forked from this project to be the most advanced Bitcoin client to date! This project remains as a useful set of pure-python tools.  But Armory is a full Bitcoin client!

I am releasing my Python/C++/SWIG code into the wild under the GNU Affero General Public License:

      PyBtcEngine on Github by etotheipi

If you are on Linux/Ubuntu, you can easily compile and run it (Windows works too, but it takes a lot of work to get it compiling):

      The PBE block-explorer demo.

In a nutshell:
PyBtcEngine is a computational backend that could be used as a starting point for Python-based BTC tools & software.  It does not include any networking code at all, but has a fairly complete set of everything else unrelated to networking.  Most library components are heavily unit-tested, so there should be a fair degree of robustness built-in.  

This library enables you to read, scan and organize the entire blockchain, perform all ECDSA operations, evaluate most scripts, detect non-std scripts, collect balances and Tx lists for wallets/addresses, detect/handle blockchain reorganizations gracefully, and can even be used to test your blk0001.dat for bit-errors.  And it can do all this ridiculously fast! (see timings below).  In its current state it is perfect for an offline BlockExplorer (in progress), but could easily be expanded into other tools, or used for the backend of an alternative client.  See below for a more-complete list of implemented features.
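To give a flavor of the "Address Verify" capability mentioned above, here is a minimal pure-Python sketch of Base58Check validation. It is not the library's own code, and the names `base58_decode` and `check_address` are hypothetical. It assumes the classic address layout: 1 version byte, a 20-byte hash160, and a 4-byte checksum taken from the double SHA-256 of the first 21 bytes.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_decode(text: str) -> bytes:
    """Decode a Base58 string; raises ValueError on an illegal character."""
    num = 0
    for ch in text:
        num = num * 58 + B58_ALPHABET.index(ch)
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    # Each leading '1' character encodes one leading zero byte.
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def check_address(addr: str):
    """Return (version, hash160) if the checksum is valid, else None."""
    try:
        raw = base58_decode(addr)
    except ValueError:
        return None
    if len(raw) != 25:
        return None
    payload, checksum = raw[:-4], raw[-4:]
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    if digest[:4] != checksum:
        return None
    return payload[0], payload[1:]


# Usage: the address that received the genesis block reward.
print(check_address("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"))
```

A single mistyped character changes the decoded bytes, so the checksum comparison catches it and the function returns `None`.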

Below, I have copied the "STATUS" section of the README which shows the current capabilities in each language (which are combined in SWIG):
Code:
********************************************************************************
*  STATUS:   Last Updated - 28 Oct, 2011
*            Legend:  
*                      _    not implemented
*                      .    implemented but not tested
*                      +    implemented and partially tested
*                      X    implemented and tested
*  
*                                          C++       Python     SWIG
*     ---------------------------------------------------------------
*      (01)  Ser/Unser Block Objects        X          X         X
*      (02)  Hash160/Hash256                X          X         X
*      (03)  Difficulty calcs               X          X         X
*      (04)  Address Generation                        X         X
*      (05)  Address Verify/Manip                      X         X
*     ---------------------------------------------------------------
*      (06)  BlkHeaders read/scan/org       X          X         X
*      (07)  BlkHeaders reorgs              X                    X
*      (08)  Blockchain read/scan/org       X                    X
*      (09)  Blockchain reorgs              X                    X
*      (10)  Blockchain verify integrity    X                    X
*     ---------------------------------------------------------------
*      (11)  NonStd Tx Detection            +          +         +
*      (12)  Script pprint                  X          X         X
*      (13)  Script OP_CHECKSIG                        X         X
*      (14*) Arbitrary script eval                     X         X
*      (15)  ECDSA Sign/Verify              X          X         X
*     ---------------------------------------------------------------
*      (16)  Address/Wallet tracking        X          X         X
*      (17)  Scan blkchain for Tx           X                    X
*      (18)  Scan blkchain for NonStd       X                    X
*      (19)  Reorg w/ double-spend          X                    X
*      (20)  Add new blockdata real-time    X                    X
*     ---------------------------------------------------------------
*      (21)  SelectCoins for tx                        X         X
*      (22)  Tx construct given inputs                 X         X
*      (23+) Distr Proposals for multi-sig             +         +
*      (24)  Tx broadcast
*      (25)  Tx fee detect/calc/handle                 X         X
*      (26)  Blockchain download
  
     + please see https://gist.github.com/1321518
     * all scripts have been implemented, most of them are tested.  
       OP_IF/NOTIF/ELSE/ENDIF are the only codes not implemented yet.
********************************************************************************
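For readers new to the first few rows of the table, here is a small pure-Python sketch of what "Hash256" and "Difficulty calcs" amount to. It is not the library's code; the function names are hypothetical. It relies on two well-known facts: block hashes are the double SHA-256 of the 80-byte header (displayed byte-reversed), and difficulty is the ratio of the difficulty-1 target (compact form 0x1d00ffff) to the target encoded in the header's compact "bits" field.

```python
import hashlib
import struct


def hash256(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def block_hash(header: bytes) -> str:
    """Block hash as usually displayed: byte-reversed hex of hash256."""
    return hash256(header)[::-1].hex()


def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' encoding into the full integer target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def difficulty(bits: int) -> float:
    """Difficulty relative to the difficulty-1 target."""
    return bits_to_target(0x1D00FFFF) / bits_to_target(bits)


# The genesis block header, assembled field by field (all little-endian).
GENESIS_MERKLE_ROOT = bytes.fromhex(
    "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b")
GENESIS_HEADER = struct.pack(
    "<I32s32sIII",
    1,                          # version
    bytes(32),                  # previous block hash (none)
    GENESIS_MERKLE_ROOT[::-1],  # merkle root, stored byte-reversed
    1231006505,                 # timestamp
    0x1D00FFFF,                 # compact target ("bits")
    2083236893,                 # nonce
)

print(block_hash(GENESIS_HEADER))
print(difficulty(0x1D00FFFF))
```

A header is valid proof-of-work when its hash, read as a little-endian integer, does not exceed the target from `bits_to_target`.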


Timings:
The current implementation holds everything in memory, so it takes up about 1.2 GB of RAM right now.  I plan to improve this in the future, but my computer has 8GB so I'm not in any hurry to make it more lightweight.  On the other hand, because of this, and my painstakingly careful memory management, the library is extremely fast.  Here are the timings, measured on a single thread of an AMD Phenom X4 840 CPU with 8GB of 1333 MHz DDR3:
  • Read entire blockchain into RAM:  5s
  • Scan entire blockchain, collect headers/txs:  10s
  • Organize and find longest chain:  0.5s
  • Verify blkfile integrity:  2.5s
  • Get balances/ledger for a set of addresses/wallets, from scratch:  ~0.75s/wallet

Yes, you can load, organize and scan all 600 MB of blockchain, and find transactions for a given wallet in less than 20s.  My careful memory management guarantees that there are virtually no extraneous copy operations at any step.  As such, some of the code is a bit complicated, but no one can say it isn't fast!  Most of the C++ code is documented in the file Using_PyBtcEngine.README in the base directory.  There is also a ton of example/unit-testing code that will be critical for anyone wanting to use it.  In particular, three files contain examples of nearly every available method:
  • (C++)  BlockUtilsTest.cpp
  • (Python) unittest.py
  • (Together) testswig.py
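To make the "read and scan" step above more concrete, here is a minimal pure-Python sketch of walking a blk0001.dat-style file and pulling out the block headers. It is far slower than the C++ code and is not taken from the library; the name `iter_block_headers` is hypothetical. It assumes the usual on-disk layout: each record is 4 magic bytes, a 4-byte little-endian block size, and then the raw block, whose first 80 bytes are the header.

```python
import hashlib
import io
import struct

MAGIC = bytes.fromhex("f9beb4d9")   # main-network magic bytes


def iter_block_headers(stream):
    """Yield (block_hash_hex, raw_header) for each record in the stream."""
    while True:
        prefix = stream.read(8)
        if len(prefix) < 8:
            return                       # clean end of file
        magic = prefix[:4]
        (size,) = struct.unpack("<I", prefix[4:])
        if magic != MAGIC or size < 80:
            return                       # zero padding or corruption
        block = stream.read(size)
        if len(block) < size:
            return                       # truncated final record
        header = block[:80]
        digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
        yield digest[::-1].hex(), header


# Usage with a synthetic two-record "file" (fake headers, not real blocks).
fake = b"".join(
    MAGIC + struct.pack("<I", len(blk)) + blk
    for blk in (bytes(range(80)) + b"\x00", bytes(80))
)
for block_hash, header in iter_block_headers(io.BytesIO(fake)):
    print(block_hash, len(header))
```

For a real file you would pass `open("blk0001.dat", "rb")` instead of the `BytesIO` object. Organizing the chain then means linking each header to its parent through the previous-hash field at bytes 4 to 36 of the header.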

Recent Updates (08 Dec, 2011):
Development has been forked to Armory, which will be used for a client with all sorts of new, innovative features.  If you are looking for pure-python tools/code for Bitcoin, keep following PyBtcEngine, but otherwise switch to Armory.  I should have finished GUI development and have an alpha client released by the end of the year!  (multiple encrypted wallets, address import, watching-only wallets, multi-sig tx, and even more!)

Recent Updates (16 Nov, 2011):  
  • Added lots of new C++ features:  secure binary data handling, AES encryption, ECDSA signing, and a time-and-memory-bound key-derivation function!  A wallet using this library can now set the target compute time and memory for the key-derivation function that decrypts the private keys (the Satoshi client only has a time-bound KDF)
  • Added an elaborate SelectCoins algorithm, which actually works extremely well!  I created a SelectCoins-solution evaluator, and then threw in a few simple coin-selection algorithms, a few elaborate ones, and some random ones.  The SelectCoins evaluator gives each candidate solution a sequence of scores, weights the scores by user preferences (with a default), and then chooses the best solution.  The idea is to throw in a ton of candidate solutions, each of which may be the better choice for a particular starting condition.  Whichever one scores best at the moment will be used
  • Started a secure wallet format and serialization.  Will be using a simple binary format for the wallets, with all data encrypted via AES-256, and using a key-derivation function that nominally takes 0.5s of computation and 8 MB of RAM on the user's computer.  The timing calibration makes it difficult to brute-force a solution, and the memory requirement completely disarms GPUs from being able to help with such a search.
  • Created BIP 0010 to proactively figure out how clients can deal with multi-signature transactions.  The core concept is "Tx Distribution Proposals."  As such, I have implemented TxDPs and actually made them the basis for all transaction operations, even for single-signer transactions.  If the private key is on your computer, the transaction will be signed and broadcast immediately.  If not, you get the TxDP, which can be signed by the offline computer without needing access to the blockchain.  If multiple signatures are required, the TxDPs are easily copied ASCII blocks that can be included inline in emails or as attachments, and easily combined when multiple signatures are received.
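The generate-score-choose idea behind the SelectCoins evaluator described above can be sketched in a few lines of Python. This is only an illustration of the approach, not the library's algorithm: the three candidate strategies, the two scoring criteria (`few_inputs`, `small_change`), the default weights, and every function name here are assumptions made up for the example. Coin values are plain integers (think satoshis), and fees are ignored.

```python
import random

# Hypothetical default preferences: equal weight on both criteria.
DEFAULT_WEIGHTS = {"few_inputs": 0.5, "small_change": 0.5}


def take_until_covered(ordered_coins, target):
    """Take coins in the given order until the target is covered."""
    chosen, total = [], 0
    for value in ordered_coins:
        if total >= target:
            break
        chosen.append(value)
        total += value
    return chosen if total >= target else None


def candidate_selections(coins, target, rng, random_tries=10):
    """Produce candidates from two simple strategies plus random ones."""
    yield take_until_covered(sorted(coins, reverse=True), target)
    yield take_until_covered(sorted(coins), target)
    for _ in range(random_tries):
        shuffled = list(coins)
        rng.shuffle(shuffled)
        yield take_until_covered(shuffled, target)


def score(selection, target, weights):
    """Weighted sum of per-criterion scores, each in the range (0, 1]."""
    change = sum(selection) - target
    scores = {
        "few_inputs": 1.0 / len(selection),
        "small_change": target / (target + change),
    }
    return sum(weights[name] * value for name, value in scores.items())


def select_coins(coins, target, weights=DEFAULT_WEIGHTS, seed=0):
    """Return the best-scoring candidate, or None if funds are short."""
    rng = random.Random(seed)
    candidates = [c for c in candidate_selections(coins, target, rng) if c]
    if not candidates:
        return None
    return max(candidates, key=lambda c: score(c, target, weights))


# Usage: pay 30 from a pool of five unspent outputs.
print(select_coins([50, 20, 10, 5, 5], target=30))
```

The appeal of this design is that adding a new strategy only means yielding one more candidate; the evaluator decides whether it ever wins.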


I have done my best to make this code "usable," meaning well-formatted code and lots of comments.  Unfortunately, Bitcoin is complicated, and so there's only so much one can do to make the code easier to comprehend.   Feel free to offer recommendations for improving it -- but it is a lot of code, so any major refactorings will probably not happen unless you're volunteering.

License:
GNU Affero General Public License v3 (AGPL) for this project.  The license was picked to allow users to use it for free if they plan to create more OSS, but require a dual-licensing negotiation if someone wants to use it in closed-source software.  Please contact me if you're interested.