
Topic: Armory - Discussion Thread - page 228. (Read 521952 times)

legendary
Activity: 1400
Merit: 1005
January 04, 2012, 11:10:13 AM
#63
It's an all-or-nothing deal.  Either load the whole blockchain into RAM, or don't load any of it.  It's too inefficient and pointless to try and use virtual memory.

Very few transactions need to read the entire blockchain.  A reasonable blockchain index could be kept in RAM and the entire blockchain mmap'd to virtual memory.  As best I can tell, Bitcoin transactions tend to access recent blocks far more often than ancient blocks.  The OS virtual memory system should handle that access pattern efficiently.

One possible implementation would be an in-memory index mapping a Bitcoin address to the first block in which it appeared.  Private key sweep code must only traverse, and possibly fetch from disk, subsequent blocks.  With this kind of access pattern, the OS is likely to keep the last several thousand blocks in RAM and rarely fetch extra data from disk.
I can see what you mean if an index was created for addresses, but for transactions?  What good does a transaction index do when you are looking for transactions to do with specific addresses?

Regardless, I can see how an index would make it work.
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
January 04, 2012, 10:45:10 AM
#62
It's an all-or-nothing deal.  Either load the whole blockchain into RAM, or don't load any of it.  It's too inefficient and pointless to try and use virtual memory.

Very few transactions need to read the entire blockchain.  A reasonable blockchain index could be kept in RAM and the entire blockchain mmap'd to virtual memory.  As best I can tell, Bitcoin transactions tend to access recent blocks far more often than ancient blocks.  The OS virtual memory system should handle that access pattern efficiently.

One possible implementation would be an in-memory index mapping a Bitcoin address to the first block in which it appeared.  Private key sweep code must only traverse, and possibly fetch from disk, subsequent blocks.  With this kind of access pattern, the OS is likely to keep the last several thousand blocks in RAM and rarely fetch extra data from disk.

I need to look more into mmap and the Windows equivalent.  I seem to remember concluding that you still needed the address space for the file in RAM, but that it instead behaved as a sort of RAM-based cache for the file.  Also, I didn't like the platform-dependence of it.  But for such a big change it might be worth fighting that battle... if it truly does save the RAM.

Right now Armory doesn't maintain any disk-index at all.  It completely rescans the blockchain on every load, and reaccumulates the balance and outputs of each wallet.  This is possible because of how extraordinarily fast my blockchain scanning code is... even on my slow computer, it takes less than 20s to cold-boot Armory on the main network and that's only single-threaded!  Sure, this is not a good long-term design, but it wasn't intended to be -- 10s-20s load time is perfectly acceptable to me for the next couple months until I get something more sane in there.  And there are no issues with synchronizing index files to the blk0001.dat file... there are no index files!

Data structures are my specialty, and I already know how to handle all the maps/indexes for a more-efficient, non-scanning-every-load client (even easier if mmap does what I need).  It's easy enough to maintain a master index of addresses and blockchain locations; I've even implemented it and saw that it takes something like 150 MB.  It's just not a priority before my first release, since its runtime is already acceptable.
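
To make the "rescan on every load" approach concrete, here is a minimal sketch of a single pass that reaccumulates a wallet's unspent outputs and balance.  It is not Armory's actual code: the Tx type, scan_wallet and the toy chain are all made up for illustration, and real transactions carry scripts rather than plain address strings.

```python
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple

OutPoint = Tuple[str, int]  # (txid, output index)


class Tx(NamedTuple):
    """Hypothetical, heavily simplified transaction."""
    txid: str
    inputs: List[OutPoint]          # outputs this transaction spends
    outputs: List[Tuple[str, int]]  # (address, value in satoshis)


def scan_wallet(txs: Iterable[Tx], wallet: Set[str]) -> Dict[OutPoint, int]:
    """One pass over transactions in chain order; returns the wallet's unspent outputs."""
    unspent: Dict[OutPoint, int] = {}
    for tx in txs:
        for outpoint in tx.inputs:
            unspent.pop(outpoint, None)  # spent: drop it if it was ours
        for i, (address, value) in enumerate(tx.outputs):
            if address in wallet:
                unspent[(tx.txid, i)] = value
    return unspent


# Toy chain: t1 pays 50 to "mine", then t2 spends that output and returns 30.
chain = [
    Tx("t1", [], [("mine", 50), ("other", 10)]),
    Tx("t2", [("t1", 0)], [("other", 20), ("mine", 30)]),
]
utxos = scan_wallet(chain, {"mine"})
balance = sum(utxos.values())
```

Because nothing is persisted, there is no index file that can fall out of sync with blk0001.dat; the cost is repeating this pass at every start.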

vip
Activity: 447
Merit: 258
January 04, 2012, 10:11:46 AM
#61
It's an all-or-nothing deal.  Either load the whole blockchain into RAM, or don't load any of it.  It's too inefficient and pointless to try and use virtual memory.

Very few transactions need to read the entire blockchain.  A reasonable blockchain index could be kept in RAM and the entire blockchain mmap'd to virtual memory.  As best I can tell, Bitcoin transactions tend to access recent blocks far more often than ancient blocks.  The OS virtual memory system should handle that access pattern efficiently.

One possible implementation would be an in-memory index mapping a Bitcoin address to the first block in which it appeared.  Private key sweep code must only traverse, and possibly fetch from disk, subsequent blocks.  With this kind of access pattern, the OS is likely to keep the last several thousand blocks in RAM and rarely fetch extra data from disk.
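
A rough sketch of the first-seen index idea, using a toy list of blocks in place of the real chain.  Everything here (the Block alias, build_first_seen_index, blocks_to_scan, the address strings) is hypothetical and only illustrates how such an index narrows the range a sweep has to visit.

```python
from typing import Dict, List, Set

# Toy stand-in for the blockchain: blocks[h] is the set of addresses
# that appear in the block at height h. Real blocks hold transactions.
Block = Set[str]


def build_first_seen_index(blocks: List[Block]) -> Dict[str, int]:
    """Map each address to the height of the first block mentioning it."""
    first_seen: Dict[str, int] = {}
    for height, block in enumerate(blocks):
        for address in block:
            first_seen.setdefault(address, height)
    return first_seen


def blocks_to_scan(first_seen: Dict[str, int], address: str, tip: int) -> range:
    """Heights a sweep must visit; empty if the address never appeared."""
    start = first_seen.get(address)
    if start is None:
        return range(0)
    return range(start, tip + 1)


blocks = [{"addrA"}, {"addrB"}, {"addrA", "addrC"}, {"addrC"}]
index = build_first_seen_index(blocks)
to_visit = list(blocks_to_scan(index, "addrC", tip=len(blocks) - 1))
```

Here a sweep of addrC only touches heights 2 and 3, which are the recent blocks the OS is most likely to have cached already.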
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
January 03, 2012, 10:40:39 PM
#60
Btw, if anyone has any experience with PyQt/PySide, I could really use some help with tableView column sizing.  Everything I tried only gets me 80% of the way there.  I haven't found a robust way to do this:  I tried tableview.width() to get the table size and then setting columns to percentages of it, but it seems the size is always off, and the column sizes end up being butchered.

If anyone has a good paradigm for getting this right, I'd love to be educated and integrate it.  Right now there seems to be cross-platform SNAFU issues with the tableViews.
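
One possible cause, offered as a guess: in Qt, width() on the view includes the frame, the vertical header and any vertical scrollbar, while viewport().width() is the area the columns are actually drawn in, and neither is final until the widget has been shown.  A pattern that may work is to recompute the widths inside resizeEvent from viewport().width() and apply them with setColumnWidth().  The helper below is a hypothetical, Qt-free sketch of the arithmetic only; it makes the integer widths sum exactly to the available width so rounding never leaves a gap or forces a scrollbar.

```python
from typing import List, Sequence


def column_widths(total_width: int, weights: Sequence[float]) -> List[int]:
    """Split total_width pixels across columns in proportion to weights.

    Widths are integers that sum exactly to total_width; the last column
    absorbs the rounding remainder.
    """
    scale = float(sum(weights))
    widths = [int(total_width * w / scale) for w in weights[:-1]]
    widths.append(total_width - sum(widths))
    return widths


# Three columns at 50% / 30% / 20% of a 403-pixel viewport.
widths = column_widths(403, [5, 3, 2])
```

In a view subclass this would be called from resizeEvent, with each result passed to setColumnWidth(column, width).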
sr. member
Activity: 266
Merit: 250
The king and the pawn go in the same box @ endgame
January 03, 2012, 08:29:47 PM
#59
5 btc donated to the cause bro Cool
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
January 03, 2012, 07:46:12 PM
#58
Building in Ubuntu is ridiculously easy (Windows instructions are considerably more involved... I'll get to that soon!).  I'm not sure what state the code is in (remember, I'm planning to release next week).  Nonetheless, I know people want to try it, so here it is:

Code:
sudo apt-get install git-core build-essential libcrypto++-dev swig libqtcore4 libqt4-dev python-qt4 python-dev python-twisted
git clone git://github.com/etotheipi/BitcoinArmory.git
cd BitcoinArmory
git checkout qtdev
cd cppForSwig
make swig
cd ..
python ArmoryQt.py


The big issue I'm facing right now is that something went awry with address importing in encrypted wallets.  You can try that feature, but you might temporarily bork your wallet.  Although it might work to remove the address through the dialog....  

For now I really need to buckle down on development, and so I'm going to try to cut myself off from email/web for a while.  

P.S. -- I hardcoded testnet into the client, because I don't want anyone using real coins with it yet.  You can get some testnet coins from the Testnet Faucet

EDIT: REMEMBER:  make sure you run the Satoshi client with the -testnet option, and let it download the blockchain.  Then you can start Armory, it will connect automatically.  And since Testnet is so small, you could technically do this on any modern computer... testnet blockchain is only 30 MB (compared to 830 MB for main-network blockchain).
legendary
Activity: 1304
Merit: 1015
January 03, 2012, 07:27:32 PM
#57
PLEASE DONATE.  This was a TON of work, and I'm not sure I can continue neglecting my girlfriend without getting some kind of compensation for the development time.  Please use the "Donate" button on the send-bitcoins dialog in Armory.  Here's the address if you are feeling generous without using the program Smiley 1Gffm7LKXcNFPrtxy6yF4JBoe5rVka4sn1

What can we do for your girlfriend, with your permission, so she will let you continue working on this?  Smiley
legendary
Activity: 1400
Merit: 1005
January 03, 2012, 07:13:14 PM
#56
Uh, using mmap() or similar on Windows would let you access the blockchain through virtual memory, loading it from disk as needed.

Could this resolve your issue ?
That would defeat the purpose of loading the blockchain into RAM in the first place.  You'd have to read the entire blockchain from disk every time you did a scan, which wouldn't be any faster than the traditional client.

Besides, ram is incredibly cheap these days.  You can get an 8GB kit for $32.  Well, at least if your system accepts DDR3.  If you're still stuck on DDR2, it'll be a bit more expensive.

I wouldn't say it defeats the purpose... I do want a disk-based blockchain implementation--making the user wait 10-20s to import/sweep an address isn't the end of the world--some of them won't be able to even use the software if full-RAM is the only option.  But the full-RAM implementation does add some kick for the users that don't mind.  This is actually why I'm going to make sure both options are there (at least for the adv/dev modes).
Well, I wasn't saying it would defeat the purpose of the client, just that it would defeat the purpose of having any amount of the blockchain in RAM to start with.

Say the blockchain was 4GB in size, and you loaded it on a system with 2GB of RAM free, and the rest in virtual memory.  You'd read 2GB of the blockchain and stick it into RAM.  Then, while reading the last 2 GB, you'd be storing the first 2GB back on the hard drive, effectively making a copy of the information that is already there.  So you've already spent 1.5 times the amount of time it would take to scan one address, and that's just from the initial software loadup!  After that, it would still be just as slow importing/sweeping addresses as it would if you didn't put any of the blockchain into RAM, because as soon as you scan half of the blockchain, that half has to be written to the virtual memory on the HDD, and the other half has to be read from virtual memory on the HDD back into RAM.

It's an all-or-nothing deal.  Either load the whole blockchain into RAM, or don't load any of it.  It's too inefficient and pointless to try and use virtual memory.
sr. member
Activity: 266
Merit: 250
The king and the pawn go in the same box @ endgame
January 03, 2012, 07:11:25 PM
#55
Awesome! Keep up the good work, and don't neglect your gf!
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
January 03, 2012, 06:56:29 PM
#54
Uh, using mmap() or similar on Windows would let you access the blockchain through virtual memory, loading it from disk as needed.

Could this resolve your issue ?
That would defeat the purpose of loading the blockchain into RAM in the first place.  You'd have to read the entire blockchain from disk every time you did a scan, which wouldn't be any faster than the traditional client.

Besides, ram is incredibly cheap these days.  You can get an 8GB kit for $32.  Well, at least if your system accepts DDR3.  If you're still stuck on DDR2, it'll be a bit more expensive.

I wouldn't say it defeats the purpose... I do want a disk-based blockchain implementation--making the user wait 10-20s to import/sweep an address isn't the end of the world--some of them won't be able to even use the software if full-RAM is the only option.  But the full-RAM implementation does add some kick for the users that don't mind.  This is actually why I'm going to make sure both options are there (at least for the adv/dev modes).
legendary
Activity: 1428
Merit: 1093
Core Armory Developer
January 03, 2012, 06:52:10 PM
#53
Uh, using mmap() or similar on Windows would let you access the blockchain through virtual memory, loading it from disk as needed.
Could this resolve your issue ?
Is this really a solution?  I thought you still needed the full address space of the file in order to use mmap (i.e. 1 GB of RAM for a 1 GB file) but that mmap prevents you from needing to read it all from disk until you actually access it.  If I have it wrong, please let me know.  The other issue was with portability.  Since mmap is Linux-specific, I would have to do something completely different for Windows, and I was trying to avoid platform-specific code.  But if I have this wrong, maybe it will be an easy upgrade after all Smiley
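
On both points, as far as I know: a mapping reserves virtual address space rather than physical RAM, with the OS paging data in on access (a 32-bit process is still capped by its address space), and Python's standard mmap module wraps both the POSIX call and the Windows file-mapping API.  A minimal sketch of the Python side only, with a temporary file standing in for blk0001.dat; the file name blk_demo.dat and its contents are made up for illustration, and the C++ layer would need its own wrapper.

```python
import mmap
import os
import tempfile

# Stand-in for blk0001.dat: 1 MiB of zeros with a 4-byte marker at the end.
path = os.path.join(tempfile.mkdtemp(), "blk_demo.dat")
with open(path, "wb") as f:
    f.write(b"\x00" * (1024 * 1024 - 4) + b"TAIL")

with open(path, "rb") as f:
    # Length 0 maps the whole file; ACCESS_READ works on Windows and Unix.
    view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        size = len(view)
        # Only the pages backing this slice need to be read from disk.
        tail = view[size - 4:size]
    finally:
        view.close()
```

The mapped object slices like bytes, so existing offset-based parsing code would need little change.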

very interesting, one note would be if you're allowing paper backups to integrate a QR reader into the app to recognize a scanned image of your backup when you want to recover it!
My primary goal for putting the QR code there was so that your computer could easily transfer a wallet to your phone, if someone were to make a smartphone app.  While it is annoying to type in the root/chain codes... it should be a rare event, and I selected a character set that is easy to type.  

On the other hand, I have a two-factor authentication scheme nearly ready-to-go, requiring only 2-of-2 transactions without a third-party, but I need a phone app to do it.  The computer would generate both wallets and the phone would scan the paper QR code and then be used for signing only (no blockchain).  Armory will delete the private keys on the computer but keep a watching-only version, then move all your money into a 2-of-2 transaction requiring both computer and phone. More details in this thread...

However, while the low-level code is well-commented, I haven't really done any high-level documentation of how to do high-level stuff -- for now you'll have to settle for looking at unittest.py until I get around to making a walk-thru of the high-level interface.
Donated a few btc for having unit tests and comments.  Sounds like great work, will check it out when it's in alpha.
Thanks!  I don't know if there's a way to do a project of this order of magnitude without testing and comments.  I'm constantly having to go back to code I haven't touched in months, and tweak it, or remind myself how the interface works.  On that note, the entire wallet file format is described in the PyBtcWallet class comments.  If anyone wants to see how the wallet file is constructed, it should all be described there.  The address format is described in the PyBtcAddress serialize function.  

I'm going to work on build instructions right now.  But for those of you jumping ahead please checkout the qtdev branch of the repo.  Virtually no GUI development has been merged into master yet.

This is great. I'm also interested in why you did it. After all there are several open source bitcoin clients that could need patches for these features. Hope you get and accept the support to push this client to a bright future.
Bitcoin is my calling.  It combines every one of my strongest skills (cryptography, math, programming, data structures, algorithm optimization, GUI design).  I started this project 6 months ago, and was disappointed with the utter lack of python support.  I quickly figured out that anything blockchain related is devastatingly slow in python, but I was determined to do it anyway (and hence the C++/SWIG layer).  After participating in the forums, I realized just how many features were missing from it that I knew exactly how to implement.  It would've been a tremendous amount of work to gut someone else's project (there were very few), and it might not even be quicker!  Maybe I'm just stubborn and like doing things my own way...

On the other hand, if you want C++ blockchain-only tools, the implementation I ended up with is the absolute fastest thing possible for reading and scanning the blockchain.  It's not verifying the blockchain, but in terms of collecting unspent outputs and computing balances, I don't think any single-threaded app could be any faster (0.65s to get the balance of a wallet with the blockchain in RAM, 10-15s if it's a cold-start from disk).  And of course, multi-threading this process is in my long-term plans Smiley
legendary
Activity: 1400
Merit: 1005
January 03, 2012, 06:50:43 PM
#52

  • You need a system with 4GB+ of RAM

Uh, using mmap() or similar on Windows would let you access the blockchain through virtual memory, loading it from disk as needed.

Could this resolve your issue ?
That would defeat the purpose of loading the blockchain into RAM in the first place.  You'd have to read the entire blockchain from disk every time you did a scan, which wouldn't be any faster than the traditional client.

Besides, ram is incredibly cheap these days.  You can get an 8GB kit for $32.  Well, at least if your system accepts DDR3.  If you're still stuck on DDR2, it'll be a bit more expensive.
legendary
Activity: 1862
Merit: 1114
WalletScrutiny.com
January 03, 2012, 06:38:43 PM
#51
This is great. I'm also interested in why you did it. After all there are several open source bitcoin clients that could need patches for these features. Hope you get and accept the support to push this client to a bright future.
sr. member
Activity: 387
Merit: 250
January 03, 2012, 06:17:41 PM
#50

However, while the low-level code is well-commented, I haven't really done any high-level documentation of how to do high-level stuff -- for now you'll have to settle for looking at unittest.py until I get around to making a walk-thru of the high-level interface.
Donated a few btc for having unit tests and comments.  Sounds like great work, will check it out when it's in alpha.
full member
Activity: 189
Merit: 100
January 03, 2012, 06:15:04 PM
#49
Wow, this is absolutely amazing.

May I ask what drove you to do this?

I will definitely be donating.
sr. member
Activity: 462
Merit: 250
It's all about the game, and how you play it
January 03, 2012, 06:08:19 PM
#48
very interesting, one note would be if you're allowing paper backups to integrate a QR reader into the app to recognize a scanned image of your backup when you want to recover it!
full member
Activity: 184
Merit: 100
Feel the coffee, be the coffee.
January 03, 2012, 06:00:38 PM
#47

  • You need a system with 4GB+ of RAM

Uh, using mmap() or similar on Windows would let you access the blockchain through virtual memory, loading it from disk as needed.

Could this resolve your issue ?
legendary
Activity: 1400
Merit: 1005
January 03, 2012, 05:34:54 PM
#46
This is EXACTLY what I needed to be able to move forward with one of my projects.

Two questions:
1)  How well does it handle wallets with many thousands of generated addresses?  Say, 50-100k addresses in a single wallet file?
2)  When do we get a Windows exe?


The blockchain-wallet scanning should be O(log(n)) in the number of addresses in your wallet.  If you have the RAM to try it, I'd be interested to see how long it takes to do the scan and collect the balances, but I would guess it's less than 5s.  As for the wallet format:  there are no arbitrary limits on how big the wallets can be.  I have had 500 addresses in one of my wallets before, without problem -- there are no artificial limits.  Perhaps you will be able to help me find the breaking point for wallet sizes.  Smiley

I'm having a problem with PyQt and py2exe in Windows.  I have an MSVC++ runtime error when trying to install the pywin32 module, which is linked but not actually needed by my software.  So I can comment out the imports and the software runs fine, but py2exe complains that it needs it in order to make the executables.  It's going to take a little bit more work (and maybe switching systems) before I get an executable made.  I spontaneously got an exe built once (so I know it works), but for some reason I couldn't repeat it...

For now, I still have some dev left before officially making exe's... maybe I shouldn't have posted so soon  (remember, releasing next week)  Smiley  But I'll put up build instructions tonight or tomorrow, and plan to have a Windows exe built by next week.  For now, I still have some pretty important bugs to quash before I would trust this software with anything but testnet coins...
I'm certainly willing to help experiment.  I have some wallets that are nearly inaccessible via the Satoshi client because it starts dragging badly after 150k or so addresses.  Am curious if this wallet is any better/worse when dealing with those wallet files.

Will look forward to the release next week then!  Smiley
If you have to keep all 150k of them in the wallet, you're doing it wrong.
I definitely was!  Had much better methods of achieving what I wanted with later... uh... methods, but I still need to keep the private keys for these larger wallets and access them in the future.  I hope to find a way to convert all of the wallet addresses to an Excel doc or something in the future, as that would make it easier to manage.
hero member
Activity: 560
Merit: 501
January 03, 2012, 05:00:45 PM
#45
This is EXACTLY what I needed to be able to move forward with one of my projects.

Two questions:
1)  How well does it handle wallets with many thousands of generated addresses?  Say, 50-100k addresses in a single wallet file?
2)  When do we get a Windows exe?


The blockchain-wallet scanning should be O(log(n)) in the number of addresses in your wallet.  If you have the RAM to try it, I'd be interested to see how long it takes to do the scan and collect the balances, but I would guess it's less than 5s.  As for the wallet format:  there are no arbitrary limits on how big the wallets can be.  I have had 500 addresses in one of my wallets before, without problem -- there are no artificial limits.  Perhaps you will be able to help me find the breaking point for wallet sizes.  Smiley

I'm having a problem with PyQt and py2exe in Windows.  I have an MSVC++ runtime error when trying to install the pywin32 module, which is linked but not actually needed by my software.  So I can comment out the imports and the software runs fine, but py2exe complains that it needs it in order to make the executables.  It's going to take a little bit more work (and maybe switching systems) before I get an executable made.  I spontaneously got an exe built once (so I know it works), but for some reason I couldn't repeat it...

For now, I still have some dev left before officially making exe's... maybe I shouldn't have posted so soon  (remember, releasing next week)  Smiley  But I'll put up build instructions tonight or tomorrow, and plan to have a Windows exe built by next week.  For now, I still have some pretty important bugs to quash before I would trust this software with anything but testnet coins...
I'm certainly willing to help experiment.  I have some wallets that are nearly inaccessible via the Satoshi client because it starts dragging badly after 150k or so addresses.  Am curious if this wallet is any better/worse when dealing with those wallet files.

Will look forward to the release next week then!  Smiley
If you have to keep all 150k of them in the wallet, you're doing it wrong.
legendary
Activity: 1400
Merit: 1005
January 03, 2012, 04:06:26 PM
#44
This is EXACTLY what I needed to be able to move forward with one of my projects.

Two questions:
1)  How well does it handle wallets with many thousands of generated addresses?  Say, 50-100k addresses in a single wallet file?
2)  When do we get a Windows exe?


The blockchain-wallet scanning should be O(log(n)) in the number of addresses in your wallet.  If you have the RAM to try it, I'd be interested to see how long it takes to do the scan and collect the balances, but I would guess it's less than 5s.  As for the wallet format:  there are no arbitrary limits on how big the wallets can be.  I have had 500 addresses in one of my wallets before, without problem -- there are no artificial limits.  Perhaps you will be able to help me find the breaking point for wallet sizes.  Smiley

I'm having a problem with PyQt and py2exe in Windows.  I have an MSVC++ runtime error when trying to install the pywin32 module, which is linked but not actually needed by my software.  So I can comment out the imports and the software runs fine, but py2exe complains that it needs it in order to make the executables.  It's going to take a little bit more work (and maybe switching systems) before I get an executable made.  I spontaneously got an exe built once (so I know it works), but for some reason I couldn't repeat it...

For now, I still have some dev left before officially making exe's... maybe I shouldn't have posted so soon  (remember, releasing next week)  Smiley  But I'll put up build instructions tonight or tomorrow, and plan to have a Windows exe built by next week.  For now, I still have some pretty important bugs to quash before I would trust this software with anything but testnet coins...
I'm certainly willing to help experiment.  I have some wallets that are nearly inaccessible via the Satoshi client because it starts dragging badly after 150k or so addresses.  Am curious if this wallet is any better/worse when dealing with those wallet files.

Will look forward to the release next week then!  Smiley
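
For anyone wondering where the O(log(n)) in the quoted reply could come from: the thread does not say which container Armory uses, but any sorted structure (a sorted array, or an ordered map such as C++ std::map) gives a logarithmic membership test for each address seen while scanning.  A small sketch with made-up address strings; in_wallet is a hypothetical helper, not an Armory function.

```python
from bisect import bisect_left
from typing import Sequence


def in_wallet(sorted_addresses: Sequence[str], candidate: str) -> bool:
    """Binary search: O(log n) comparisons in the number of wallet addresses."""
    i = bisect_left(sorted_addresses, candidate)
    return i < len(sorted_addresses) and sorted_addresses[i] == candidate


# A 100k-address wallet, sorted once up front.
wallet = sorted(f"addr{n:06d}" for n in range(100_000))
seen_in_block = ["addr000042", "addr999999", "addr099999"]
hits = [a for a in seen_in_block if in_wallet(wallet, a)]
```

With 100k addresses each lookup is about 17 comparisons, so wallet size should matter far less than blockchain size.  A hash set would make each lookup O(1) on average.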