BIP proposal: Automatic Wallet Backup scheme

bassmaster

copper member

Activity: 21

Merit: 3

Hi, whatever happened to this proposal?

oleganza

full member

Activity: 200

Merit: 104

Software design and user experience.

Now implemented in Objective-C, with test vectors specified.

https://github.com/oleganza/bitcoin-papers/blob/master/AutomaticEncryptedWalletBackups.md

instagibbs

member

Activity: 114

Merit: 12

Have you discussed this scheme with hardware wallet makers?

It'd be great if there were any way possible to get this to work with Trezor/Ledger, but I'm unaware of their ability to do symmetric encryption schemes.

Otherwise, it seems quite well thought out.

oleganza

full member

Activity: 200

Merit: 104

Software design and user experience.

I've updated the scheme:

1. It describes the data format and crypto in full detail.
2. Key derivation and signing are simpler (HMACs instead of ECDSA and BIP32).
3. Merkle tree support to allow efficient periodic "proof of storage" requests.
4. A method to efficiently timestamp backups on the blockchain, so you know which one is the latest.
5. A method to do incremental backups if they are unusually large.
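
To illustrate point 3, here is a rough sketch of how a Merkle tree can support "proof of storage" spot checks: the wallet keeps only the root, asks the server for one chunk plus its sibling hashes, and recomputes the root. The chunking, the plain SHA-256 hashing, the duplicate-last-node rule, and all function names are assumptions made for illustration; the linked document defines the actual construction.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(chunks):
    """Return all tree levels, leaves first. An odd node is paired with itself."""
    level = [h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd node at the end is paired with itself
        proof.append(level[sibling])
        index //= 2
    return proof


def verify_chunk(root, chunk, index, proof):
    """Recompute the root from one chunk and its proof (hypothetical client-side check)."""
    node = h(chunk)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

The wallet would store `merkle_levels(chunks)[-1][0]` locally and could then challenge the server with a random chunk index, which costs a logarithmic number of hashes instead of a full download.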

Let me know what you think:
https://github.com/oleganza/bitcoin-papers/blob/master/AutomaticEncryptedWalletBackups.md

Thanks!

ASIC-8Tile

sr. member

Activity: 279

Merit: 250

Sorry, I misinterpreted the "server" statement. Thank you for the clarification.

gmaxwell

staff

Activity: 4326

Merit: 8951

Quote from: ASIC-8Tile on September 19, 2014, 10:27:43 AM

ASIC-8Tile - We are trying to get away from servers.

You misunderstand the sense in which I'm using 'server' there. In the sense I'm using it there is always a server (the counterparty to the client).

ASIC-8Tile

sr. member

Activity: 279

Merit: 250

Could BitTorrent Sync be used for p2p storage?
We are actually in the process of using it for physical/encrypted invoicing: individual-to-business, business-to-business, etc.

Quote from: gmaxwell on August 06, 2014, 08:26:15 AM

Otherwise— this sounds useful! Should it perhaps specify more of the storage service? e.g. how much data can you expect to store, how would such a service be compensated? how would you know which service(s) you're using?

The last in particular seems to be a tough question... but in general we should probably try to specify a "minimum interoperable unit", and I'm not sure if the message alone is terribly interesting.

That would be nice, but it can be added as an additional server-side BIP after a couple of actual implementations (and if people really want to produce a generalized API).

Quote from: gmaxwell on August 06, 2014, 08:26:15 AM

[Hm. Wow, a synchronizing server would be super cool for this, if we had a good way of avoiding abuse.]

Maybe some proof of work would do? However, I'd prefer a built-in payment scheme. We could pay a little upfront for X uploads, which would give the server some incentive to stick around until we need to retrieve the data. Maybe the payment is better made afterwards, or with some sort of 2-of-2 bilateral deposit.
ASIC-8Tile - We are trying to get away from servers.

oleganza

full member

Activity: 200

Merit: 104

Software design and user experience.

Thanks for the feedback! I'm glad someone validated this idea.

Quote from: gmaxwell on August 06, 2014, 08:26:15 AM

Please do not use a #@$@ number without an assignment. Just call it BIP-oleganza-backup for the moment, until the text is ready. Otherwise we get a mess of number collisions and people calling things by colliding numbers they picked and not wanting to change them.

(this isn't nitpicking, it's happened multiple times)

Ok, noted.

Quote from: gmaxwell on August 06, 2014, 08:26:15 AM

Otherwise— this sounds useful! Should it perhaps specify more of the storage service? e.g. how much data can you expect to store, how would such a service be compensated? how would you know which service(s) you're using?

The last in particular seems to be a tough question... but in general we should probably try to specify a "minimum interoperable unit", and I'm not sure if the message alone is terribly interesting.

That would be nice, but it can be added as an additional server-side BIP after a couple of actual implementations (and if people really want to produce a generalized API).

Quote from: gmaxwell on August 06, 2014, 08:26:15 AM

WRT the spec. The IV really should be non-deterministic, it's already stored in the encrypted message. With a constant IV an observer can tell with AES block precision where the first modification to an updated copy was (and perhaps some more elaborate attacks, e.g. it would be trivially insecure if the cipher mode selected was CTR). There is no need for the IV to be deterministic that I'm aware of... If you're worried about embedded device RNG quality, you could recommend that the IV be constructed as H(time||other-random||pubkey).

The IV is deterministic, but not static. I've made this clearer in the BIP. For each backup, the wallet is supposed to pick the next index and derive another unpredictable IV. This is not mandatory (the IV is published anyway and can be random), but it gives us a good default that does not depend on RNGs and can be verified with test vectors.
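
As a minimal sketch of what "deterministic, but not static" can look like: derive each IV by keying an HMAC with a secret and feeding it the backup index. The HMAC-SHA256 construction, the 4-byte big-endian index, and the names `iv_key` and `derive_iv` are assumptions for illustration, not the exact derivation from the BIP text.

```python
import hashlib
import hmac


def derive_iv(iv_key: bytes, backup_index: int) -> bytes:
    """Derive a 16-byte IV for the backup with the given index (illustrative only)."""
    message = backup_index.to_bytes(4, "big")
    return hmac.new(iv_key, message, hashlib.sha256).digest()[:16]
```

Because the output depends only on the secret and the index, it is reproducible in test vectors, yet an observer who does not know `iv_key` cannot predict the next IV.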

Quote from: gmaxwell on August 06, 2014, 08:26:15 AM

You appear to have no length encoded for the plaintext. AES-CBC is only capable of encoding an integral number of blocks, so something must encode the plaintext length. I might suggest it use self-descriptive padding, e.g. there is always at least 1 byte of padding, and last byte says how many bytes of padding there are (up to 16, though perhaps some applications might want more padding to close a size sidechannel?). Another style of self-descriptive padding I've seen used is to pad with a 0 bit and then all ones until the end, and the receiver drops all trailing 1s and the last 0 (has the advantage of fewer decodings being invalid).

Thanks for noting this. I myself used PKCS7 padding which I think is exactly what you suggested. Now it's mentioned explicitly in the BIP.
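
For reference, a short sketch of PKCS7 padding for a 16-byte AES block: there is always at least one padding byte, and every padding byte holds the padding length. The function names are mine and purely illustrative.

```python
BLOCK_SIZE = 16


def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append 1..block_size bytes, each equal to the number of bytes appended."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Strip PKCS7 padding, rejecting malformed input."""
    if not padded or len(padded) % block_size:
        raise ValueError("invalid padded length")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("invalid padding")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]
```

A plaintext that is already a multiple of 16 bytes gains a full extra block of padding, which is what makes the length unambiguous. The signature should be checked before unpadding, so that padding errors cannot be used as a decryption oracle.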

Quote from: gmaxwell on August 06, 2014, 08:26:15 AM

The signature encoding can be made constant length, and probably should be, doing so will save at least one byte (and probably several, depending on how you were planning on having a variable length signature encoding).

Is there a reason to keep the AuthFingerprint? It can be derived from the message itself and the signature (e.g. how bitcoin's signed message works), omitting it would save ~19 bytes.

Good point. I've replaced the auth fingerprint, signature and its length prefix with a single 65-byte long compact signature.

Quote from: gmaxwell on August 06, 2014, 08:26:15 AM

Is there a particular motivation for using a digital signature instead of using a MAC? One reason I could see is that you might want to have multiple servers synchronizing their data without individually talking to the user, like the PGP SKS keyserver— but for that case you'd want to add a sequence number (so you know if an update you're getting is a newer message or not).

Should these encrypted data chunks have a good-until date coded in them? I'd say it could be provided out of band, but not if we wanted it to be authenticated by the signatures (for the imagined synchronization network).

Initially I had an idea about adding a timestamp and making the whole thing verifiable without access to the private keys, but it was not well thought out. Now I've clarified this: ~~auth key is non-hardened~~ the auth pubkey can be kept in memory or stored on disk unencrypted, so the wallet can verify various backup payloads without asking the user for their password. When a fresh valid backup is found (or the user selects one of the available backups), the wallet asks for a password or TouchID verification to unlock the private master key and derive the decryption keys.

Quote from: gmaxwell on August 06, 2014, 08:26:15 AM

[Hm. Wow, a synchronizing server would be super cool for this, if we had a good way of avoiding abuse.]

Maybe some proof of work would do? However, I'd prefer a built-in payment scheme. We could pay a little upfront for X uploads, which would give the server some incentive to stick around until we need to retrieve the data. Maybe the payment is better made afterwards, or with some sort of 2-of-2 bilateral deposit.

gmaxwell

staff

Activity: 4326

Merit: 8951

Please do not use a #@$@ number without an assignment. Just call it BIP-oleganza-backup for the moment, until the text is ready. Otherwise we get a mess of number collisions and people calling things by colliding numbers they picked and not wanting to change them.

(this isn't nitpicking, it's happened multiple times)

Otherwise— this sounds useful! Should it perhaps specify more of the storage service? e.g. how much data can you expect to store, how would such a service be compensated? how would you know which service(s) you're using?

The last in particular seems to be a tough question... but in general we should probably try to specify a "minimum interoperable unit", and I'm not sure if the message alone is terribly interesting.

WRT the spec. The IV really should be non-deterministic, it's already stored in the encrypted message. With a constant IV an observer can tell with AES block precision where the first modification to an updated copy was (and perhaps some more elaborate attacks, e.g. it would be trivially insecure if the cipher mode selected was CTR). There is no need for the IV to be deterministic that I'm aware of... If you're worried about embedded device RNG quality, you could recommend that the IV be constructed as H(time||other-random||pubkey).
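
A possible reading of the H(time||other-random||pubkey) suggestion, sketched with SHA-256 truncated to the 16-byte AES block size. The choice of hash, the field encodings, and the name `hashed_iv` are assumptions; the optional parameters exist only so the function can be exercised with fixed inputs.

```python
import hashlib
import os
import time


def hashed_iv(pubkey: bytes, now=None, extra_random=None) -> bytes:
    """Mix time, whatever randomness is available, and the pubkey into a 16-byte IV."""
    if now is None:
        now = time.time_ns()
    if extra_random is None:
        extra_random = os.urandom(32)
    material = now.to_bytes(8, "big") + extra_random + pubkey
    return hashlib.sha256(material).digest()[:16]
```

Even if the device RNG is weak, the timestamp still changes the IV between backups, so two uploads would not share an IV unless all three inputs repeat.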

You appear to have no length encoded for the plaintext. AES-CBC is only capable of encoding an integral number of blocks, so something must encode the plaintext length. I might suggest it use self-descriptive padding, e.g. there is always at least 1 byte of padding, and last byte says how many bytes of padding there are (up to 16, though perhaps some applications might want more padding to close a size sidechannel?). Another style of self-descriptive padding I've seen used is to pad with a 0 bit and then all ones until the end, and the receiver drops all trailing 1s and the last 0 (has the advantage of fewer decodings being invalid).
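
The second padding style can be sketched at byte granularity: a 0 bit followed by seven 1 bits is the byte 0x7F, and the remaining fill is 0xFF. This assumes the plaintext is a whole number of bytes; the function names are illustrative, and this byte-level decoder is stricter than a true bit-level one, since it rejects results that would not be byte-aligned.

```python
def bit_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append a 0 bit, then 1 bits up to the next block boundary."""
    pad_len = block_size - (len(data) % block_size)
    return data + b"\x7f" + b"\xff" * (pad_len - 1)


def bit_unpad(padded: bytes) -> bytes:
    """Drop all trailing 1 bits and the last 0 bit."""
    stripped = padded.rstrip(b"\xff")
    if not stripped or stripped[-1] != 0x7F:
        raise ValueError("invalid padding")
    return stripped[:-1]
```

Plaintext that itself ends in 0xFF or 0x7F bytes still round-trips, because the decoder only removes the final marker byte and the fill after it.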

The signature encoding can be made constant length, and probably should be, doing so will save at least one byte (and probably several, depending on how you were planning on having a variable length signature encoding).

Is there a reason to keep the AuthFingerprint? It can be derived from the message itself and the signature (e.g. how bitcoin's signed message works), omitting it would save ~19 bytes.

Is there a particular motivation for using a digital signature instead of using a MAC? One reason I could see is that you might want to have multiple servers synchronizing their data without individually talking to the user, like the PGP SKS keyserver— but for that case you'd want to add a sequence number (so you know if an update you're getting is a newer message or not).

Should these encrypted data chunks have a good-until date coded in them? I'd say it could be provided out of band, but not if we wanted it to be authenticated by the signatures (for the imagined synchronization network).

[Hm. Wow, a synchronizing server would be super cool for this, if we had a good way of avoiding abuse.]

oleganza

full member

Activity: 200

Merit: 104

Software design and user experience.

Hi,

My name is Oleg Andreev. I work on an iOS/OSX wallet and CoreBitcoin, a clean and well-documented Bitcoin toolkit in Objective-C.

As you all know, wallets are typically encrypted with a password (using some key stretching algorithm like PBKDF2 or Scrypt). Since the password is weaker than a purely random 128+ bit key, it's better if the user keeps their wallet in some private location that is relatively hard to access. Such a backup is better not left lying around on popular hosting services like Gmail or Dropbox. HD wallets (BIP32) improve the user experience by requiring the user to secure only the master key, and only once. The rest of the keys can be derived later to retrieve the funds.

The problem is that wallets may have extra metadata which cannot be derived from the master key: user notes, invoice info or, even more importantly, multisig pubkeys and P2SH scripts. To redeem a P2SH payment one needs to know the original script, which must be stored somewhere and securely backed up before any transaction involving that script is made. Asking the user to back up their password-protected wallet before each such transaction would be cumbersome.

I suggest an additional backup scheme where the user's wallet is encrypted using a truly unpredictable AES key derived from the wallet's master key. If the master key itself is not derived from a weak passphrase but has 128+ bits of entropy, the AES key will be equally strong. Therefore the wallet can be automatically encrypted and uploaded to one or more backup services without any user action. When the user needs to restore the backup, they first restore the original master key and then let the wallet connect to the backup servers and retrieve the most recent backup of the full wallet contents. Backup servers cannot feasibly decrypt wallets by brute force; they only need to provide reliable retrieval. The user's wallet may download the backup at regular intervals to detect whether one of the servers has lost the data or gone offline. In that case, another server may be used, or the user may be warned to make a manual backup as soon as possible.
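
A minimal sketch of the key-derivation idea, assuming HMAC-SHA256 with fixed labels as the derivation function. The labels, the separate authentication key, and the function names are hypothetical and are not taken from the proposal text; AES itself is left out because it is not in the Python standard library.

```python
import hashlib
import hmac


def derive_backup_keys(master_key: bytes):
    """Derive independent encryption and authentication keys from the master key."""
    enc_key = hmac.new(master_key, b"backup-encryption", hashlib.sha256).digest()
    auth_key = hmac.new(master_key, b"backup-authentication", hashlib.sha256).digest()
    return enc_key, auth_key


def tag_backup(auth_key: bytes, ciphertext: bytes) -> bytes:
    """Authenticate an encrypted backup blob before upload."""
    return hmac.new(auth_key, ciphertext, hashlib.sha256).digest()


def check_backup(auth_key: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Check a blob downloaded from a backup server, e.g. during a periodic poll."""
    return hmac.compare_digest(tag_backup(auth_key, ciphertext), tag)
```

Since both keys are recomputed from the master key alone, restoring the master key is enough to locate, verify and decrypt a backup, and a server that returns a corrupted or substituted blob fails `check_backup`.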

Proposal:
~~https://github.com/oleganza/bips/blob/master/bip-0081.mediawiki~~
~~UPD: https://github.com/oleganza/bips/blob/master/bip-oleganza-backups.mediawiki~~
UPD2: https://github.com/oleganza/bitcoin-papers/blob/master/AutomaticEncryptedWalletBackups.md

PS. I didn't want to create a pull request, as the text might change and I don't want to have trouble with rebasing (and accidentally lose the connection to the pull request). GitHub Issues seem to be disabled in the bitcoin/bips repo. So let's discuss it here for now.
