Topic: Synchronizing wallet balance from a pruned node (Read 2425 times)

staff
Activity: 4284
Merit: 8808
November 01, 2015, 01:46:42 AM
#15
Doing that invalidates the security provided by running it in the first place. (Also, I make no promises that someone can't escalate giving you database files into full remote code execution.)

If you're willing to trust a source of the database why not just dump out the utxo database and just give people a feed of it?
newbie
Activity: 12
Merit: 0
Mostly as a precaution, as it hadn't been adequately tested.  It's enabled in master now and has been for a long time... running master should be a no-brainer.

I'm really interested in adding a pruned node to my Tails + JoinMarket scripts for users. I think it is very important, especially now that Tails comes bundled with Electrum 1.9.8, which is obsolete. End users are having a horrible time trying to install newer versions of Electrum, mostly because of the amnesiac properties of Tails.

Anyway, it would be very powerful for Tails end users to have a node at their disposal using only ~2GB of space. Right now most Tails + JoinMarket users are using the blockr.io API for address balances/confirmations, because they tend not to have a USB stick large enough to store a traditional unindexed node, and this is terrible for security even though it's over Tor (address correlation, etc.). If I could offer a small Bitcoin Core node for Tails users... well, you get it.

So I guess my questions are:

1. Do you think it's too early to offer this to users as an option? Even with a strict warning?

2. Would it be a horrible idea to offer weekly updated downloads of the pruned chain? I could host them on GitHub, and users could curl them over HTTPS. I could also sign them so users can verify. My reasoning is that bootstrapping the client is going to be my next hurdle in getting this adopted.
staff
Activity: 4284
Merit: 8808
if I didn't import enough watch-only addresses] and then use listunspent RPC.
So import a million; performance is acceptable with very large numbers of keys.

This sounds like you're asking about something for yourself to run.

Quote
Another issue is that currently pruning disables the Bitcoin Core wallet so
Mostly as a precaution, as it hadn't been adequately tested.  It's enabled in master now and has been for a long time; given the crazy things you were considering (concurrent access to the wallet database?!?), running master should be a no-brainer.

If it would be useful to you, supporting a rescan that goes back as far as the non-pruned data shouldn't be a big deal -- mostly an interface question of changing the rescan argument to take a depth and permitting the import with rescan when the depth is compatible with the current level of pruning.  For what I think you're trying to do, you really only need to rescan back a block or two, no?
newbie
Activity: 12
Merit: 0
It only really works, though, if you have another node with full chain ready for importing addresses as necessary.

What do you mean by this?

I just pruned a node two days ago, and I've been running a yieldgen bot from it for some hours now. I just imported my addresses to the pruned node's wallet like normal with a new JoinMarket wallet, but with no rescan because these are fresh addresses. I've already had two successful joins, and walletnotify works without a problem.
legendary
Activity: 1135
Merit: 1166
Good job on the pruned JoinMarket node(!) I assume you compiled your own version of Core with the wallet enabled?

Thanks!  I did compile it myself, but I didn't have to change anything.  The wallet was already re-enabled in the development sources, I think.

I'm more worried about people who desire privacy (those doing repeated coinjoins again and again) and how they would manage. Bitcoin Core with a fully downloaded blockchain is the most private way of using things, but it requires 40GB+ of free hard disk space.

Yes, I fully agree.  The required disk space (although everyone usually just says "disk is cheap, bandwidth is the problem") would have required me to upgrade to a much more expensive VPS plan.  It only really works, though, if you have another node with full chain ready for importing addresses as necessary.
sr. member
Activity: 261
Merit: 523
@domob
Yes, importing fresh addresses without rescanning will be done. The difficult thing I'm thinking about is synchronizing a wallet for the first time, for example when it's being recovered from backup.

Good job on the pruned JoinMarket node(!) I assume you compiled your own version of Core with the wallet enabled?
I'm more worried about people who desire privacy (those doing repeated coinjoins again and again) and how they would manage. Bitcoin Core with a fully downloaded blockchain is the most private way of using things, but it requires 40GB+ of free hard disk space.

@goatpig
I may be wrong, but since pruned nodes switch off the NODE_NETWORK flag, they are not able to upload information about transactions or blocks to the P2P network.
legendary
Activity: 3766
Merit: 1364
Armory Developer
Yeah, the P2P layer could be used, with those bloom filters. The BitcoinJ implementation is buggy and doesn't provide privacy, and reading through the documents it's not clear to me how that could be fixed.

I'm suggesting to use the P2P layer with your local node (instead of RPC only) for the added functionality. Unless pruned nodes won't accept bloom filters (I have no idea whether they do or not), this is the easiest path to achieve your functionality. And since the node is local, the privacy issue goes away.

I don't think there is a technical limitation preventing pruned nodes from fulfilling a bloom filter request. After all, they store the UTXO set, and the payment address can be extracted from each entry in it. There is no real difference in that regard compared to a full node, besides the set of TxOuts being smaller.

Quote
Presumably I should read through the bitcoin dev mailing list to figure out how the developers imagined using a pruned node would work.
I don't see a way around requiring a pruned node to redownload the entire blockchain whenever a wallet is imported. This might end up as the accepted way of doing things; users should recover or import new wallets rarely.

Depends on the operating mode. If they do away with the wallet history feature and only stick to balances, it should be pretty straightforward. Or they could bootstrap a wallet à la SPV, once they replace these useless bloom filters with committed maps (I think that's the name).
legendary
Activity: 1135
Merit: 1166
Presumably I should read through the bitcoin dev mailing list to figure out how the developers imagined using a pruned node would work.
I don't see a way around requiring a pruned node to redownload the entire blockchain whenever a wallet is imported. This might end up as the accepted way of doing things; users should recover or import new wallets rarely.

One thing I've been thinking about myself recently is this:  As you say, it is actually enough to work through the UTXO set when looking for the balance of some address you try to import.  So this should, in theory, be possible with a pruned node without redownloading blocks (your suggestion 2 above).  I do not know what the developers think about making use of this; maybe it is already in their plans.  The only "issue" is that you won't get a wallet with full tx history, "only" the full balance, but that seems to be fine for your use case.  One could add a new RPC call that lists the full UTXO set (as you suggest), or even just add a "light" address-importing mode that only checks the UTXO set instead of the full blockchain.  This is not too hard to do, and I think I could actually do it myself if there is any chance of it being merged upstream.
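If such a UTXO-set dump or listing existed, consuming it would be simple. Here is a minimal sketch, assuming a hypothetical dump format of one entry per unspent output carrying an `address` and an `amount` field (this is not an actual Core RPC, just an illustration of the "light" import idea):

```python
from decimal import Decimal

def balances_from_utxo_set(utxo_entries, watched_addresses):
    """Sum the balance of each watched address by scanning a dumped
    UTXO set -- no block rescan, no transaction history."""
    balances = {addr: Decimal(0) for addr in watched_addresses}
    for entry in utxo_entries:
        addr = entry.get("address")
        if addr in balances:
            # str() avoids binary-float artifacts leaking into the total
            balances[addr] += Decimal(str(entry["amount"]))
    return balances
```

The result is exactly what the post describes: full balances, no history.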

Another suggestion:  As far as I can tell, the addresses you are going to import are most probably fresh ones.  Why not import them without any rescanning at all?  At least as a temporary and optional measure.
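For fresh addresses, that could be as small as this sketch; `rpc` here is a hypothetical JSON-RPC helper (passed in so it can be stubbed), not a real library object:

```python
def import_fresh_addresses(rpc, addresses):
    """Import never-used watch-only addresses with rescan=False, then
    ask the node for their unspent outputs.  Skipping the rescan is
    safe only because fresh addresses have no history to find."""
    for addr in addresses:
        # importaddress(address, label, rescan) -- rescan=False avoids
        # the full-chain rescan a pruned node cannot perform
        rpc("importaddress", addr, "", False)
    # listunspent(minconf, maxconf, [addresses])
    return rpc("listunspent", 0, 9999999, addresses)
```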

(By the way, I'm actually using a pruned node with JoinMarket on a VPS right now.  It works fine in normal operation, but I do have nodes with full history ready that I can use for importing more addresses when necessary.)
sr. member
Activity: 261
Merit: 523
As long as you have a Tx size, you can guesstimate an upper bound on the TxOut count per Tx. Short of that, block size could give you a broader range, but then you would have to resolve tx hashes to block heights.

Not sure how much of that data is available through the RPC in pruned mode. I'm very familiar with block chain analysis but I work directly with raw block data, never through the RPC. Maybe you are better off using the P2P layer in an attempt to query more relevant data.

Unfortunately, it looks like you can't get the tx size while in pruned mode (or even in normal mode; in general you need -txindex=1).
Also, I'm pretty sure that in pruned mode you actually can't use getblock and similar calls.

Yeah, the P2P layer could be used, with those bloom filters. The BitcoinJ implementation is buggy and doesn't provide privacy, and reading through the documents it's not clear to me how that could be fixed.

It is my understanding that removing this restriction is a high-priority item for the Bitcoin Core developers. I don't exactly follow their progress, but many people have expressed interest in having both pruning and the wallet working at the same time.

Presumably I should read through the bitcoin dev mailing list to figure out how the developers imagined using a pruned node would work.
I don't see a way around requiring a pruned node to redownload the entire blockchain whenever a wallet is imported. This might end up as the accepted way of doing things; users should recover or import new wallets rarely.
legendary
Activity: 2128
Merit: 1073
Right now the wallet in Bitcoin Core is disabled if pruning is enabled, so I don't think that will work
It is my understanding that removing this restriction is a high-priority item for the Bitcoin Core developers. I don't exactly follow their progress, but many people have expressed interest in having both pruning and the wallet working at the same time.

I should've quoted this portion of your original message in my reply.

3) A third way would be to open the UTXO BerkeleyDB database file (can this be done while Bitcoin Core is running?) and then check if our addresses match.

Yes, this can be done for BerkeleyDB but not for LevelDB. Apparently there exists code that is LevelDB-compatible and fully multiuser/multitasking, but it is only available from Google for money and under NDA.
legendary
Activity: 3766
Merit: 1364
Armory Developer
One issue I've just realised is the gettxout call also requires a numeric vout value. That's the number that goes with the txid.
There's no way to tell how many outputs a transaction has, so the best you could do is try all numbers from zero to about 30 or 40 (?). And then you're wasting a lot of time and might still miss outputs.

As long as you have a Tx size, you can guesstimate an upper bound on the TxOut count per Tx. Short of that, block size could give you a broader range, but then you would have to resolve tx hashes to block heights.
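To make that size-based bound concrete: a serialized output is at least 9 bytes (an 8-byte value plus a 1-byte script-length varint for an empty script), and version, locktime and the two count varints take at least 10 bytes, so a rough sketch is:

```python
def max_txout_count(tx_size_bytes):
    """Conservative upper bound on how many outputs a transaction of
    the given serialized size can contain.  Real transactions also
    carry at least one input (41+ bytes), so the true count is lower."""
    MIN_OUTPUT_SIZE = 9   # 8-byte value + 1-byte varint, empty script
    FIXED_OVERHEAD = 10   # 4 version + 4 locktime + 2 one-byte varints
    return max(0, (tx_size_bytes - FIXED_OVERHEAD) // MIN_OUTPUT_SIZE)
```

A tighter bound would subtract the input bytes as well, but this already caps how many vout indices are worth probing.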

Not sure how much of that data is available through the RPC in pruned mode. I'm very familiar with block chain analysis but I work directly with raw block data, never through the RPC. Maybe you are better off using the P2P layer in an attempt to query more relevant data.
sr. member
Activity: 261
Merit: 523
3) is a bad idea, since you should expect a DB engine to lock access to a single process by default. I would not base my code on this assumption, which implies you would need to bootstrap your own history DB without Core running, then use a different code path to maintain your own DB straight from block data.

2) is tedious and what are the chances that would be merged into Core?

1) is how I would do it, pull all blocks and check each for relevant UTXOs. This process can be very fast if you parallelize it, but you don't necessarily need to since it's just the original bootstrapping that will be resource intensive. Maintenance won't be nearly as costly and can use the exact same code path with bounds on block height.

Yes that's right. Core can open several RPC threads.

One issue I've just realised is the gettxout call also requires a numeric vout value. That's the number that goes with the txid.
There's no way to tell how many outputs a transaction has, so the best you could do is try all numbers from zero to about 30 or 40 (?). And then you're wasting a lot of time and might still miss outputs.
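Spelled out, that brute-force probe might look like this (with `rpc` as a hypothetical JSON-RPC helper; gettxout returns null both for spent outputs and for vout indices past the end, so the two cases are indistinguishable):

```python
def probe_utxos(rpc, txid, max_vout=40):
    """Probe gettxout for vout indices 0..max_vout-1 and collect the
    unspent outputs found.  Costs one RPC round trip per index and can
    still miss outputs at indices >= max_vout."""
    found = []
    for vout in range(max_vout):
        out = rpc("gettxout", txid, vout)
        if out is not None:  # None: spent, nonexistent, or out of range
            found.append((vout, out))
    return found
```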

Probably the easiest way is to run Bitcoin Core with the -privdb=0 flag and reopen the wallet.dat in shared mode. You'll still have to make RPC calls to populate addresses, but at least transactions can be queried live through the BerkeleyDB API.


Right now the wallet in Bitcoin Core is disabled if pruning is enabled, so I don't think that will work
legendary
Activity: 2128
Merit: 1073
Probably the easiest way is to run Bitcoin Core with the -privdb=0 flag and reopen the wallet.dat in shared mode. You'll still have to make RPC calls to populate addresses, but at least transactions can be queried live through the BerkeleyDB API.
legendary
Activity: 3766
Merit: 1364
Armory Developer
3) is a bad idea, since you should expect a DB engine to lock access to a single process by default. I would not base my code on this assumption, which implies you would need to bootstrap your own history DB without Core running, then use a different code path to maintain your own DB straight from block data.

2) is tedious and what are the chances that would be merged into Core?

1) is how I would do it, pull all blocks and check each for relevant UTXOs. This process can be very fast if you parallelize it, but you don't necessarily need to since it's just the original bootstrapping that will be resource intensive. Maintenance won't be nearly as costly and can use the exact same code path with bounds on block height.
sr. member
Activity: 261
Merit: 523
I'm working on a bitcoin application which talks to Bitcoin Core. It uses RPC commands to request information about which bitcoin addresses from an HD seed have UTXOs on them. I'm thinking about making this wallet synchronization work with a pruned node.

Given an HD seed, one way to synchronize the wallet would be to import a lot of watch-only addresses into Bitcoin Core and then restart it with -rescan. This would require the pruned node to redownload the entire blockchain [possibly multiple times(!) if I didn't import enough watch-only addresses] and then use the listunspent RPC.

So I'm thinking of querying the UTXO set. I wouldn't get transaction history information, but that's okay for these uses. I'd need a way to go from an address to its UTXOs.
1) One way would be to simply use the RPC calls getbestblockhash, getblock and gettxout to step through all outputs in the entire blockchain, checking if our addresses match any of them.
2) A second way would be to write some new RPC calls that allow dumping of the UTXO set database with pagination in the manner of listtransactions, then check if our addresses match.
3) A third way would be to open the UTXO BerkeleyDB database file (can this be done while Bitcoin Core is running?) and then check if our addresses match.
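For what it's worth, option 1) might be sketched roughly as below, with `rpc` as a hypothetical JSON-RPC helper. Since gettxout gives no output count, this probes a fixed range of vout indices per transaction, and on a pruned node getblock will fail once the walk reaches pruned blocks:

```python
def scan_chain_for_addresses(rpc, watched_addresses, max_vout=40):
    """Walk the chain from the tip backwards via getblock, probing each
    transaction's outputs with gettxout and keeping unspent outputs
    that pay a watched address."""
    watched = set(watched_addresses)
    utxos = []
    block_hash = rpc("getbestblockhash")
    while block_hash:
        block = rpc("getblock", block_hash)  # fails for pruned blocks
        for txid in block["tx"]:
            # gettxout gives no output count, so probe a fixed range
            for vout in range(max_vout):
                out = rpc("gettxout", txid, vout)
                if out is None:  # spent, nonexistent, or past the end
                    continue
                addrs = out.get("scriptPubKey", {}).get("addresses", [])
                if watched.intersection(addrs):
                    utxos.append((txid, vout, out["value"]))
        block_hash = block.get("previousblockhash")  # absent at genesis
    return utxos
```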

Another issue is that currently pruning disables the Bitcoin Core wallet, so -walletnotify probably won't work. This doesn't have to be a problem, as I could just poll getrawmempool and use -blocknotify instead.
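One polling pass over the mempool could look like this sketch (hypothetical `rpc` helper again; getrawtransaction works without -txindex for transactions still in the mempool, and the `addresses` field matches the decoderawtransaction output of Core versions of this era):

```python
def poll_mempool(rpc, watched_addresses, seen_txids):
    """One polling pass: decode mempool transactions not seen before
    and report outputs paying a watched address -- a stand-in for
    -walletnotify while pruning disables the wallet."""
    watched = set(watched_addresses)
    hits = []
    for txid in rpc("getrawmempool"):
        if txid in seen_txids:
            continue
        seen_txids.add(txid)
        raw = rpc("getrawtransaction", txid)  # mempool txs need no -txindex
        tx = rpc("decoderawtransaction", raw)
        for out in tx["vout"]:
            addrs = out.get("scriptPubKey", {}).get("addresses", [])
            if watched.intersection(addrs):
                hits.append((txid, out["n"], out["value"]))
    return hits
```

Calling it from a -blocknotify hook (or a timer) with a persistent `seen_txids` set covers new incoming payments.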

Thoughts? What's the best way to proceed?