Sending large amounts of data using QR-codes is a pain. However, you can compress an unsigned transaction pretty well if both sides agree on a simple protocol.
...
Unless I have forgotten something crucial this should work.
Unfortunately, it can't be that simple. In fact, even for the simple case you describe, it is dramatically more complex. Why? Allow me to explain how offline wallets work in Armory, and why data sizes will never be that small.
Satoshi decided not to include the values of the inputs being spent, or an explicit declaration of the fee, in the transaction data to be signed. This sounds arbitrary, because that information is located in the blockchain, so the node only needs to look up the OutPoints in the blockchain to know. Right? Well, offline wallets can't do that. And part of the value of Armory for offline transactions is that the offline computer doesn't need the blockchain. You might say "okay, well just throw that information in with the tx to be signed so that the offline computer knows." If only it were that simple...
gmaxwell expressed concern, rightly, that if your online computer is infected, the next transaction you make might have a devastatingly malicious modification: it completes your transaction, but sends the rest of your wallet's balance as a transaction fee. But you don't know this, because the attacker also modified the "supplementary" information in the transaction, so that the offline computer thinks it's only signing a 1.01 BTC input, with 0.5 to the recipient, 0.5 to change, and 0.01 to fee. The attacker actually put a 300 BTC input on the tx-to-be-signed, but stated in the "supplemental" information that the input is only 1.01 BTC. The result will be the offline computer showing you that you are sending 0.5 BTC to the recipient with a 0.01 BTC fee. But when you send the transaction, it's actually a 299 BTC fee.
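To make the arithmetic of that attack concrete, here is a small sketch. Nothing in it is Armory code; `implied_fee` and the variable names are made up for illustration. The point is only that the fee is never stated anywhere in the transaction, so whatever input value the offline computer believes determines the fee it displays.

```python
from decimal import Decimal

def implied_fee(input_values, output_values):
    """A transaction never states its fee; it is sum(inputs) - sum(outputs)."""
    return sum(input_values) - sum(output_values)

# Outputs the user sees and approves: 0.5 to the recipient, 0.5 to change.
outputs = [Decimal("0.5"), Decimal("0.5")]

# Input value claimed by the (tampered) supplemental information.
claimed_inputs = [Decimal("1.01")]
# Input value the OutPoint really refers to on the blockchain.
actual_inputs = [Decimal("300")]

displayed_fee = implied_fee(claimed_inputs, outputs)  # what the offline screen shows
real_fee = implied_fee(actual_inputs, outputs)        # what the network will take

print("fee shown to user:", displayed_fee, "BTC")
print("fee actually paid:", real_fee, "BTC")
```

The signature covers the OutPoint and the outputs, not the input value, so both cases produce exactly the same signed transaction.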
THEREFORE: my BIP 0010 "protocol" includes the entirety of each transaction that supplies inputs to the transaction-to-be-signed. For each input in the tx-to-be-signed, Armory sees the OutPoint (txHash, txOutIndex), and verifies that it was passed a transaction with the same TxHash. From that transaction, it can verify the value of the input and the final tx fee.
- If the attacker changes the recipient or the amount sent to recipient -- the user should notice because they can see the list of recipients and values before they sign it
- If the attacker changes the value specified on the supplementary tx -- the suppl tx hash will no longer match the OutPoint on the tx-to-be-signed, verification will fail
- If the attacker changes the supplementary tx value and the OutPoint hash -- the transaction is no longer valid, because that OutPoint doesn't actually exist
In fact, that pretty much clears up every possible avenue for tricking the offline computer. Now, every piece of important information is verifiable by the offline computer. If there is manipulation, either the tx won't be valid, or the user will notice when they look at the transaction details.
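The check described above can be sketched roughly as follows. This is not Armory's implementation or the BIP 0010 format; it is a minimal illustration that assumes the classic (pre-segwit) transaction serialization, and the function names are hypothetical. The OutPoint hash is compared in the internal byte order used inside a serialized transaction, which is the raw double-SHA256 digest.

```python
import hashlib
import struct

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def read_varint(buf: bytes, pos: int):
    """Decode a Bitcoin variable-length integer; return (value, next_pos)."""
    first = buf[pos]
    if first < 0xfd:
        return first, pos + 1
    size = {0xfd: 2, 0xfe: 4, 0xff: 8}[first]
    return int.from_bytes(buf[pos + 1:pos + 1 + size], "little"), pos + 1 + size

def output_values(raw_tx: bytes):
    """Return the output values (satoshis) of a legacy-serialized transaction."""
    pos = 4                                   # skip 4-byte version
    n_in, pos = read_varint(raw_tx, pos)
    for _ in range(n_in):
        pos += 36                             # OutPoint: 32-byte hash + 4-byte index
        script_len, pos = read_varint(raw_tx, pos)
        pos += script_len + 4                 # scriptSig + 4-byte sequence
    n_out, pos = read_varint(raw_tx, pos)
    values = []
    for _ in range(n_out):
        values.append(struct.unpack_from("<q", raw_tx, pos)[0])
        pos += 8
        script_len, pos = read_varint(raw_tx, pos)
        pos += script_len
    return values

def verified_input_value(outpoint_hash: bytes, outpoint_index: int,
                         supporting_tx: bytes) -> int:
    """Return the input's value only if the supporting tx hashes to the OutPoint."""
    if double_sha256(supporting_tx) != outpoint_hash:
        raise ValueError("supporting tx does not match the OutPoint hash")
    return output_values(supporting_tx)[outpoint_index]

def make_dummy_tx(values):
    """Build a synthetic one-input transaction for the demo (not a real tx)."""
    tx = struct.pack("<I", 1) + b"\x01" + b"\x00" * 36 + b"\x00" + b"\xff" * 4
    tx += bytes([len(values)])
    for v in values:
        tx += struct.pack("<q", v) + b"\x01" + b"\x51"
    return tx + b"\x00" * 4

real_tx = make_dummy_tx([30_000_000_000])     # the real 300 BTC output
forged_tx = make_dummy_tx([101_000_000])      # attacker's claim of 1.01 BTC
outpoint_hash = double_sha256(real_tx)        # what the tx-to-be-signed spends

print(verified_input_value(outpoint_hash, 0, real_tx))
try:
    verified_input_value(outpoint_hash, 0, forged_tx)
except ValueError as err:
    print("rejected:", err)
```

Changing a single byte of the supporting transaction changes its hash, which is why the value cannot be altered without also altering the OutPoint.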
Okay, so that gets us back to the original question of "how much data do we have to transfer between online and offline computer?" Unfortunately, the simplest case is not relevant to this discussion: you have to design the protocol around the 99.9th-percentile case, which is someone with an offline donation address that they want to clear out. Let's say they have received 40 donations.
So the transaction will have 40 inputs and 2 outputs. The bulk of the data is the supporting transactions, which can be anything (they are created by the donors). Each one may itself have dozens of inputs, and the signatures are necessarily included! Let's assume 30 "standard" supporting transactions, and the other ten have 10 inputs each.
- Tx-to-be-signed: 40 inputs (unsigned) of 48 bytes each, and two outputs of 40 bytes each = 2 kB
- 30 standard supporting tx: 250 bytes each = 7.5 kB
- Ten larger tx: 180 bytes for each input (signed), so about 2 kB each = 20 kB
So the online computer needs to communicate about 30 kB to the offline computer in this case. And the offline computer needs to transfer back one signature per input, which is at least 2 kB. The "maximum" a QR code can handle is about 3 kB of binary, so that's 10 QR codes from online to offline, and 1-2 QR codes the other way.
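As a quick sanity check of that estimate, here is the same back-of-the-envelope calculation written out for the 40-input scenario. The constants are the rough figures from this post, not measured sizes, and the names are purely illustrative.

```python
import math

# Rough per-item sizes taken from the estimate in this post.
INPUTS = 40                  # one input per donation
UNSIGNED_INPUT_BYTES = 48
OUTPUTS = 2                  # recipient + change
OUTPUT_BYTES = 40
STANDARD_TX_COUNT = 30
STANDARD_TX_BYTES = 250
LARGE_TX_COUNT = 10
LARGE_TX_BYTES = 2000        # ~10 signed inputs at ~180 bytes each, rounded up
QR_CAPACITY_BYTES = 3000     # approximate binary capacity of the largest QR code

tx_to_sign = INPUTS * UNSIGNED_INPUT_BYTES + OUTPUTS * OUTPUT_BYTES
standard_support = STANDARD_TX_COUNT * STANDARD_TX_BYTES
large_support = LARGE_TX_COUNT * LARGE_TX_BYTES
total = tx_to_sign + standard_support + large_support

qr_codes = math.ceil(total / QR_CAPACITY_BYTES)

print(f"tx-to-be-signed:   {tx_to_sign} bytes")
print(f"supporting txs:    {standard_support + large_support} bytes")
print(f"total to transfer: {total} bytes in {qr_codes} QR codes")
```

Almost all of the payload is supporting transactions, so shrinking the tx-to-be-signed itself barely changes the number of QR codes.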
So the protocol should handle 30 kB without causing a lot of pain. If the user has to wait a little bit because of a slow communication rate, that's okay because this case is abnormal and waiting 60s for the transfer isn't the end of the world. But if they can't succeed because it's confusing, because they can't figure out how many and which QR codes have been scanned or which webcam they're supposed to be pointing at which device, and they're frustrated that there are wires everywhere, then there's a problem...
As you can tell, I'm very sensitive to the "convenience" of a given feature. I think the biggest barrier to security is inconvenience -- users just don't use things that are inconvenient. But I also don't want to sacrifice security, at all, no matter how much work it is for me. Which is why there are so many recommendations here that are great, but don't quite fit the bill. But I'm pretty sure a solution exists where the user can actually have both, in which case everyone wins.