I think I worked out a fairly simple way for a Bitcoin Signing Class device like the sigsafe to support BIP32 address chains. It just requires adding one additional optional argument to the "SignRawTransaction" APDU command. First, let's take a look at BIP32.
Why would BIP32 be useful for sigsafe?
The BIP32 spec was originally published by Pieter Wuille on 2012-02-11. It describes a method to create hierarchical deterministic (HD) keychains and wallets that "can be shared partially or entirely with different systems, each with or without the ability to spend coins."
Here's an example of how an HD keychain would be useful for a Bitcoin Signing Class device like the sigsafe:
Imagine that a bitcoin user at a brick-and-mortar coffee shop wants to pay for a latte with his sigsafe but, for privacy reasons, doesn't want to use the same bitcoin address over and over. When the point-of-sale (PoS) terminal issues the "GetInfo" APDU command, instead of returning a single bitcoin "spend address," the sigsafe could return a "public keychain" for spending. What is amazing about HD wallets is that with knowledge of the public keychain, the PoS terminal can figure out which bitcoin addresses the sigsafe can spend from (i.e., which addresses the sigsafe can calculate the private key for) without any sort of security breach. The PoS terminal would scan for unspent outputs across the keychain domain [1] and construct a transaction that pays the amount owing for the coffee to the merchant's address, returning change to a different bitcoin address in the same keychain. The PoS terminal would send this raw transaction to the sigsafe using "SignRawTransaction"--with one additional detail that I'll get to later. Provided none of the signing rules are violated, the sigsafe would produce the required ECDSA signatures and return the signed transaction to the PoS terminal, which would broadcast the TX to the network to complete the transaction.
What does a BIP32 keychain look like?
A bitcoin keypair consists of a private key, d, and a public key, Q. The public key can be determined from the private key using the equation
Q = d G
where G is a special point on the secp256k1 elliptic curve, given by its (x, y) coordinates and known as the "base point." Note that the product d G denotes elliptic curve scalar multiplication (adding G to itself d times on the curve), not ordinary multiplication. To sign a bitcoin transaction, knowledge of d is required, but to verify the signature only knowledge of Q is required. There is no feasible way to calculate d from Q.
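To make Q = d G concrete, here is a minimal sketch of secp256k1 scalar multiplication in plain Python (3.8+). The curve constants are the published secp256k1 parameters; the function names and the toy private key are my own, and the code is neither constant-time nor hardened, so treat it as an illustration only.

```python
# Published secp256k1 domain parameters: field prime, group order, base point G.
FIELD_P = 2**256 - 2**32 - 977
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % FIELD_P == 0:
        return None  # a and b are inverses of each other
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % FIELD_P, -1, FIELD_P) % FIELD_P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % FIELD_P, -1, FIELD_P) % FIELD_P
    x3 = (lam * lam - x1 - x2) % FIELD_P
    y3 = (lam * (x1 - x3) - y1) % FIELD_P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


d = 0x1234                # toy private key, for illustration only
Q = scalar_mult(d, G)     # corresponding public key, Q = d G
```

Going from d to Q takes a few hundred point additions; going from Q back to d is the elliptic curve discrete logarithm problem, which is why only Q can be handed out safely.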
In BIP32, instead of ECDSA public and private keys, we need to think in terms of public and private extended keys or "keychains". (Note that I am now deviating from Pieter Wuille's notation, in an effort to be consistent with NIST's FIPS 186-3, and IMO also abstract the notation in a useful way.)
public keychain : Q = (Qparent, c)
private keychain : d = (dparent, c)
The parameter c is called the "chain code," is constant for a given keychain, and is 256 bits long. I like to think of Q and d as vectors such that the ith element is the ith child key for that chain. So Q4 means the child ECDSA public key at index four, and d7 means the child ECDSA private key at index seven. The BIP32 spec describes how to calculate Qi as a function of (Qparent, c, i) and how to calculate di as a function of (dparent, c, i).
The magic of BIP32 is that (for some reason that I don't yet understand)
Qi = di G.
This is very significant because it means that an external device can calculate the public keys for a given keychain without gaining knowledge of the corresponding private keys!
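For what it's worth, the identity seems to follow from the way BIP32 derives (non-hardened) children: both sides compute the same 256-bit tweak t from HMAC-SHA512 keyed with c over the serialized Qparent and the index i, then the private side sets di = dparent + t (mod n) while the public side sets Qi = Qparent + t G. Multiplying the first by G gives di G = dparent G + t G = Qparent + t G = Qi. The sketch below checks this numerically; the helper names, the toy key, and the toy chain code are my own, and hardened derivation and the child chain code are left out.

```python
import hashlib
import hmac

FIELD_P = 2**256 - 2**32 - 977
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % FIELD_P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % FIELD_P, -1, FIELD_P) % FIELD_P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % FIELD_P, -1, FIELD_P) % FIELD_P
    x3 = (lam * lam - x1 - x2) % FIELD_P
    return (x3, (lam * (x1 - x3) - y1) % FIELD_P)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def tweak(q_parent, c, i):
    """Left 256 bits of HMAC-SHA512(c, compressed(Q_parent) || 4-byte index)."""
    if not 0 <= i < 2**31:
        raise ValueError("only non-hardened indices can be derived from a public key")
    x, y = q_parent
    compressed = bytes([2 + (y & 1)]) + x.to_bytes(32, "big")
    digest = hmac.new(c, compressed + i.to_bytes(4, "big"), hashlib.sha512).digest()
    t = int.from_bytes(digest[:32], "big")
    if t >= ORDER:
        raise ValueError("invalid tweak; BIP32 says to skip this index")
    return t


def child_private(d_parent, c, i):
    """d_i = d_parent + t (mod n); needs the parent private key."""
    return (d_parent + tweak(scalar_mult(d_parent, G), c, i)) % ORDER


def child_public(q_parent, c, i):
    """Q_i = Q_parent + t G; needs only public information."""
    return point_add(q_parent, scalar_mult(tweak(q_parent, c, i), G))


d_parent = 0xC0FFEE                                    # toy parent private key
chain_code = hashlib.sha256(b"toy chain code").digest()  # toy 256-bit chain code
q_parent = scalar_mult(d_parent, G)
matches = [child_public(q_parent, chain_code, i)
           == scalar_mult(child_private(d_parent, chain_code, i), G)
           for i in range(4)]
```

In this picture the PoS terminal would only ever run child_public, while child_private stays inside the sigsafe.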
Sigsafe support
A given keychain has 2^32 child keys and neither the sigsafe nor a point-of-sale terminal has a reliable way of knowing at which index unspent outputs might be lying. It would not be practical for a PoS terminal to scan 2^32 bitcoin addresses to look for spare coins!
My idea (and I'm not completely happy with it) is to use signing rules to limit the child index to the range 0 to N-1, where N can be no smaller than 2^8 = 256 (but could be specified as greater than 256 by the GetInfo packet). The sigsafe would only sign transactions that return change to indices less than N but could still spend outputs at higher indices. The interface device would know to look for unspent outputs at indices 0 through 255, but could optionally look for inputs outside of this range. It would just need to make sure, when it constructs the raw transaction, that all change outputs are returned to an index within the domain.
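As a rough sketch of how that signing rule could be expressed, assuming the sigsafe is told the child index behind every input and every change output (the function and constant names here are hypothetical, not part of any existing firmware):

```python
MIN_LIMIT = 2**8    # smallest N the proposal allows
MAX_INDEX = 2**32   # child indices are 32-bit values


def rule_allows(input_indices, change_indices, n=MIN_LIMIT):
    """Return True if the index rule permits signing.

    Inputs may sit at any child index, but change must return to an
    index in the range 0..n-1.
    """
    if n < MIN_LIMIT:
        raise ValueError("N may not be smaller than 2^8")
    inputs_ok = all(0 <= i < MAX_INDEX for i in input_indices)
    change_ok = all(0 <= i < n for i in change_indices)
    return inputs_ok and change_ok
```

For example, rule_allows([5000], [3]) is True because spending from a high index is fine, while rule_allows([1], [256]) is False unless N has been raised above 256.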
The next issue is that the sigsafe doesn't have a lot of computational power. If it were given a raw transaction that spends outputs on its keychain, it would still take it a long time (even with N=256) to figure out which private keys it needs to produce a valid signature. For this reason, I think the interface device should use an optional argument with "SignRawTransaction" to specify the keychain and index that any inputs or outputs in the raw transaction are controlled by or sent to. This way, the sigsafe can create the needed ECDSA keypairs at run-time, sign the transaction if authorized, and then forget that it ever happened.
TL;DR: I think we just need to add a P2=FF option to the SignRawTransaction APDU to specify the keychains and indices used by the TX:
SignRawTransaction APDU:
CLA = B0
INS = 80
P1 = 00 : standard BSC interpretation
   = xx : proprietary interpretation (P1=CC for sigsafe-only interpretations)
P2 = 00 : raw transaction in command data field expressed in binary (arg1)
   = 01 : optional parent transaction in command data field expressed in binary (arg2)
   = 02 : optional private key in command data field expressed in binary (arg3)
   = 03 : optional sighash specifier (arg4)
   = FF : optional keychain and index for outputs controlled by or sent to a BIP32 keychain (for P1=CC)
Lc = encodes number of bytes in "command data field"
command data
Le = encodes max number of response bytes allowed
response data
SW1
SW2
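To show what the proposed P2=FF call might look like on the wire, here is a small sketch that packs a short (ISO 7816-4 style) command APDU from the fields listed above. The CLA, INS, P1 and P2 values come from the listing; the function name is hypothetical, and the payload layout (a bare 4-byte big-endian child index) is purely an assumption, since no encoding for the keychain/index argument has been defined yet.

```python
CLA_BSC = 0xB0                   # CLA from the listing above
INS_SIGN_RAW_TRANSACTION = 0x80  # INS from the listing above


def build_sign_raw_transaction_apdu(p2, data, p1=0x00, le=0x00):
    """Pack CLA INS P1 P2 Lc <data> Le as a short command APDU.

    Short APDUs carry 1..255 data bytes; Le = 00 asks for up to 256
    response bytes.
    """
    if not 1 <= len(data) <= 255:
        raise ValueError("short APDU data field must be 1..255 bytes")
    header = bytes([CLA_BSC, INS_SIGN_RAW_TRANSACTION, p1, p2, len(data)])
    return header + data + bytes([le])


# Assumed payload: only the child index, as 4 big-endian bytes.
child_index = 7
apdu = build_sign_raw_transaction_apdu(0xFF, child_index.to_bytes(4, "big"), p1=0xCC)
```

A real raw transaction will often exceed 255 bytes, so arg1 would presumably need extended-length APDUs or command chaining; that part is left open here.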
[1] A reduced domain compared to the 2^32 possible child keys.