Hardware wallet wire protocol - page 4.

jim618

legendary

Activity: 1708

Merit: 1066

For the upgrade path of the protobuf format there are a few things we do in the bitcoinj wallet:
1) there is a wallet major and minor version. The major version gets updated on non backwards compatible changes. That way something like MultiBit can look at the version number and say: this is a wallet version 6 - I only know about wallets up to 5 so sorry I cannot load it.

2) also you can put in a new mandatory field to effectively jam a spanner in the machine. A protobuf parser cannot 'legally' load a mandatory field that it does not know about. You can use this to really really stop messages from the future being loaded.

3) in the bitcoinj wallet protobuf there is an extension message with a name/value pair. You can use it to add in additional information - it is designed so that, say, I want to put in some org.multibit fields I can add them in. Other clients will just read them and faithfully write them back (to avoid data loss) but not do anything with them.

2112

legendary

Activity: 2128

Merit: 1073

Unrelated to the above, there is one more design decision that you may need to make.

It seems that as it stands right now the protocol may be training the user to blindly press the button without actually looking at the LCD screen.

In the medical device field there is an ongoing discussion: what is the correct level of interaction required to dismiss a dialog about potentially dangerous and irreversible action.

1) pressing always the same button to approve after hearing a beep or always another button to cancel and hear another acknowledgement tone;

2) pressing exactly one of 4-10 buttons after reading the message, and where the message tells which button to press and the number of the button needs to change with repeated presentation of the same message. Pressing any other button than indicated in the message is interpreted as cancellation.

In case of the Piglet the button on the device would be still the same, single one, but the user would be required to look at the screen and the screen would tell the user which key to press on the computer's keyboard.

The medical field apparently still cannot agree which way to go on this.

2112

legendary

Activity: 2128

Merit: 1073

In my experience the response to the Ping message should contain a non-optional sequence number. This is particularly important in case of the device like yours that may require slight physical rotation to be able to read the LCD and/or may be jostled while pressing the button.

Theoretically the OS should issue USB disconnect/connect events in case of the brief interruption sensed on the USB port, Windows even plays a warning chime. But in practice this doesn't work 100% reliably and the USB device may get reset while being handled by the user to complete the required interaction.

There could be also an opposite situation where the USB device is plugged to the powered USB hub. The user interaction could cause hub disconnection, but the device's internal state would not be affected, despite the application receiving and handling the disconnect/connect events. If the application can detect that the device hadn't lost the internal state it may save the user some aggravation caused by a repeated initialization and retry.

someone42

member

Activity: 78

Merit: 11

Chris Chua

Quote from: slush on November 16, 2012, 01:49:25 PM

Forking protocol buffer standard? No way, at least not for me :-).

I'm not aware of any standard saying that protobuf parsers must skip unrecognised fields. In fact, modifying a parser to fail on unrecognised fields would make things more secure: it would protect against client software accidentally using an unsupported feature and then misleadingly reporting success to a user.

Quote from: slush on November 16, 2012, 01:49:25 PM

I'm affraid of "partial support" of some features. Once you support OTP/PIN messages or not. I'm writing test suite which should cover 100 % of device use cases, so I believe this test suite will help developers to be sure all "features" are well implemented.

It should not report support for some feature which is not fully implemented. As I said, we're handling with valuable information and we should not try to recover from unexpected state by heuristic or some guessing.

I agree that supporting OTP and PIN is a yes or no thing. But sometimes it is not so clear, and we can't anticipate what advanced features might be added to wallets in the future.

The ignore-on-0 convention would not encourage "unexpected state", as long as 0/false/"" is always defined to mean that a feature is disabled. For example, say an "is_hidden" field is added to the LoadDevice message. Wallets which do not support hidden volumes can safely ignore is_hidden = false. If such a wallet ever gets a message with is_hidden = true, it fails with a "feature unsupported" failure message.

Such a convention is safely used in microcontrollers, where reserved bits in peripheral hardware registers are often supposed to be set to 0. Then, if features are added to those registers, the designers will take 0 to mean disabled and 1 to mean enabled.

Edit: Here's a compromise: if a wallet sees an unrecognised field, it always fails with "feature unsupported", no matter what the value of the field. All new fields must be optional. If client software doesn't want to use a new feature, it must not include the corresponding fields.

slush

legendary

Activity: 1386

Merit: 1097

Quote from: someone42 on November 16, 2012, 01:31:34 PM

I would like to propose the convention that if the wallet sees a field in a protocol buffer that it does not recognise, if that field is 0, false or "" (depending on the type of the field), the wallet can safely ignore the field.

I'm affraid that there's no safe solution how to handle forward compatibility by the wallet. I think that if there will be some incompatible messages in the future, clients should ask for wallet version (part of Features message) and then pick appropriate set of messages. This may turn into the mess when wallet developers won't be sane enough, but I'm also affraid of wallet which will try to recover from some messages which it doesn't fully understand...

Quote

As it stands, nanopb currently doesn't allow this (it skips all fields it doesn't recognise). But I'm sure it's possible to modify the runtime decoder to follow this convention.

Forking protocol buffer standard? No way, at least not for me :-).

Quote

I see the approach slush has taken is to include a "Features" message which enumerates a wallet's features. I think a features list is somewhat open to interpretation.

I'm affraid of "partial support" of some features. Once you support OTP/PIN messages or not. I'm writing test suite which should cover 100 % of device use cases, so I believe this test suite will help developers to be sure all "features" are well implemented.

Quote

For example, what happens if a wallet only supports a subset of feature X? Does it report that it supports feature X or not?

It should not report support for some feature which is not fully implemented. As I said, we're handling with valuable information and we should not try to recover from unexpected state by heuristic or some guessing.

someone42

member

Activity: 78

Merit: 11

Chris Chua

I would like to propose the convention that if the wallet sees a field in a protocol buffer that it does not recognise, if that field is 0, false or "" (depending on the type of the field), the wallet can safely ignore the field. On the other hand, if an unrecognised field contains something other than 0, false or "", a wallet must not ignore the field and must return a "feature not supported" failure message. This convention makes it easier to add additional fields in the future.

As it stands, nanopb currently doesn't allow this (it skips all fields it doesn't recognise). But I'm sure it's possible to modify the runtime decoder to follow this convention.

I see the approach slush has taken is to include a "Features" message which enumerates a wallet's features. I think a features list is somewhat open to interpretation. For example, what happens if a wallet only supports a subset of feature X? Does it report that it supports feature X or not? Another advantage of the ignore-on-zero convention is that client software doesn't have to put as many ifs all over the place - "feature not supported" is handled like any other wallet error.

slush

legendary

Activity: 1386

Merit: 1097

Quote from: casascius on November 16, 2012, 01:23:56 PM

In a round-about way... please just don't make this specific to hardware supporting USB.

As far as Jim and Alan will be open to implement support for Pigeon transport, there's no limitation in protocol itself.

casascius

vip

Activity: 1386

Merit: 1140

The Casascius 1oz 10BTC Silver Round (w/ Gold B)

Quote from: slush on November 16, 2012, 12:41:16 PM

Firstly, don't forget that we're not inventing new internet, but just specialized protocol for talking with quite a dumb device. Secondly, protocol itself (PB) isn't specific for some particular transport. And integrity checking should be a part of the transport itself, not the part of application protocol built on top of it.

Shortly said, it is possible to use the same protocol as described above over TCP without any modification.

Your dumb device is actually a brilliant device and is probably the future of Bitcoin as we know it. In the interest of having as few competing protocols as possible, I'm just submitting to you my request that you do this one right, because the next person to make a device intended to be interoperable with yours will probably make some improvement to it that none of us can conceive of today, or run it over some transport that seems crazy here and now but makes total sense somewhere else (e.g. analog radio links in Africa, or DTMF over antiquated voice lines, or printed barcodes on pieces of paper), and you want his job to be as unencumbered as possible.

Simple integrity checking at the application level is commonplace and best practice. Bitcoin addresses have integrity checking in them, despite them most commonly being passed through TCP and other channels that also have error checking. Application-level error checking isn't supposed to reinvent transport-level error checking at the application level, it's there just to throw out corrupted requests as a failsafe and substituting a low-risk action (e.g. doing nothing) instead of taking uncommanded actions with costly consequences when you're dealing with real money. A simple 32-bit CRC and a practice of ignoring messages failing the CRC check would be a minimum that any hardware should have no problem handling without being an unreasonable burden on application-level code.

In a round-about way... please just don't make this specific to hardware supporting USB.

jgarzik

legendary

Activity: 1596

Merit: 1100

Good stuff. Keep up the good work!

I think there are USB simulators to be found, which would make testing this easier.

slush

legendary

Activity: 1386

Merit: 1097

I just added the chapter "How to encode protobuf message into the stream?"

slush

legendary

Activity: 1386

Merit: 1097

Quote from: casascius on November 16, 2012, 11:08:19 AM

I would suggest starting with a message protocol that would be workable over a simple RS232-like serial line. Start with the assumption that basic error detection and recovery will be needed, so the protocol isn't restricted to media that has it built in. Then worry about adding abstractions to it like chunking. USB-CDC and bluetooth can both abstract a serial line "out of the box". If you invent a protocol that's centered around USB-HID, then nobody who wants to extend that protocol to work over TCP/IP or radio waves or named pipes or tin can telephones or light sensors vs flashing boxes on a screen or any of the other useful ways to move data is going to want to deal with implementing "USB-HID" contingencies where they aren't needed.

Firstly, don't forget that we're not inventing new internet, but just specialized protocol for talking with quite a dumb device. Secondly, protocol itself (PB) isn't specific for some particular transport. And integrity checking should be a part of the transport itself, not the part of application protocol built on top of it.

Shortly said, it is possible to use the same protocol as described above over TCP without any modification.

slush

legendary

Activity: 1386

Merit: 1097

Quote from: jim618 on November 16, 2012, 10:38:36 AM

Timeout

Strictly speaking, device itself is just a state machine. I don't think defining some timeout is required. Computer can reset the device at any time by sending "Initialize" message, as I defined above.

From computer perspective, some "Ok, nothing happened" is completely implementation dependent and we don't need to specify this. Confirmation dialog can be opened for unlimited time and nothing wrong happen...

Quote

In 'Standard message flow, first paragraph' you mention that the computer initiates the conversation but with the PIN and OTP the device sends the PinRequest / OTPRequest. Do these occur as responses to the GetEntropy computer request only ?

PinRequest and OtpRequest can basically appear as a response for any sensitive call. GetEntropy was just an one example.

I personally implement Pin/Otp handling in bitkeylib (python) independently to the call itself. When device respond with PinRequest or OtpRequest, library simply ask user for input.

Quote

C: GetEntropy()
D: Entropy(entropy)

or

C: GetEntropy()
D: OtpRequest()
C: OtpAck(otp)
D: Entropy(entropy)

or

C: GetEntropy()
D: OtpRequest()
C: OtpAck(otp)
D: PinRequest()
C: PinAck(pin)
D: Entropy(entropy)

also, is it valid to have the PinRequest/PinAck before the OtpRequest/OtpAck ?

Yes, all these combinations can happen. Although asking for Otp before Pin make sense, it makes guessing/bruteforcing of the pin almost impossible, because very attempt requires unique OTP...

Quote

Does the bytestream for USB have guaranteed integrity ? IE is it necessary to have a checksum for each of the 64 byte message or is that taken care of in the transport protocol ? If required we could simply have 1 checksum byte = XOR(payload bytes) i.e.

USB HID is very simple protocol and yes, it guarantee the order and integrity. So no need for adding it manually.

Quote

Also, you state that the PB message is chunked into 64 byte packets. When they are 'dechunked' how do you know the boundaries of the packets to stitch them back together to recreate the PB messages? Do you need "this protobuf message is chunked over X packets' at the beginning of the PB message.

You're right, I completely forgot to describe "magic character" and "message size" while encoding PB message into the stream. I'll complete the documentation above now.

Quote

I am just thinking of how to make the 'chunking' and 'dechunking' as simple and unambigous as possible. Ideally you want it a dumb transport layer that understands nothing of what is in the messages.

That's exactly how I defined it. There are three layers:

1. Transport - it just transport bytes from one side to another. I want to have only one transport based on USB HID in the specs, but technically it is possible to use serial port or system sockets as well. As I said, I'm already testing the protocol using named pipes. Part of transport specification is (for example) that one byte defining message length (report id in term of HID).

2. Stream encoding/decoding. It contains magic character + PB message length, to make decoding of the stream as easy as possible. I'll complete the docs above.

3. Payload encoding/decoding. I proposed Protocol Buffers, which seems to be pretty good choice for our needs.

Whole messaging stack doesn't need to understand transferred messages themselves, so everything is very flexible.

someone42

member

Activity: 78

Merit: 11

Chris Chua

Quote from: jim618 on November 16, 2012, 10:38:36 AM

Checksum and Chunking/ Dechunking
Does the bytestream for USB have guaranteed integrity ? IE is it necessary to have a checksum for each of the 64 byte message or is that taken care of in the transport protocol ? If required we could simply have 1 checksum byte = XOR(payload bytes) i.e.

1B = payload length
up to 62B = payload
1B = checksum defined as XOR(payload)

Also, you state that the PB message is chunked into 64 byte packets. When they are 'dechunked' how do you know the boundaries of the packets to stitch them back together to recreate the PB messages? Do you need "this protobuf message is chunked over X packets' at the beginning of the PB message.

I am just thinking of how to make the 'chunking' and 'dechunking' as simple and unambigous as possible. Ideally you want it a dumb transport layer that understands nothing of what is in the messages.

USB does have some error checking (one CRC16 per packet), so there's probably no need for us to implement it anywhere.

As for how to determine the packet boundaries, protocol buffers appear to lack a terminator. So length information must be out of band:
1. One option is to include the length of the packet somewhere. For example, the length (eg. as a packed, little-endian 32 bit integer) can be prepended to the protocol buffer.
2. Another option is to rely on the chunk lengths. When you see a chunk with less than 63 bytes of payload, you know it's the end. Note that if the packet length is a multiple of 63, to avoid confusion, an extra 0-payload packet needs to be sent as well.

Lastly, a bit of detail: how is the message type sent? A single byte prepended to the protocol buffer will do; but does anyone have a better solution?

casascius

vip

Activity: 1386

Merit: 1140

The Casascius 1oz 10BTC Silver Round (w/ Gold B)

I think this should be invented layers first if it's intended to become a credible standard.

I would suggest starting with a message protocol that would be workable over a simple RS232-like serial line. Start with the assumption that basic error detection and recovery will be needed, so the protocol isn't restricted to media that has it built in. Then worry about adding abstractions to it like chunking. USB-CDC and bluetooth can both abstract a serial line "out of the box". If you invent a protocol that's centered around USB-HID, then nobody who wants to extend that protocol to work over TCP/IP or radio waves or named pipes or tin can telephones or light sensors vs flashing boxes on a screen or any of the other useful ways to move data is going to want to deal with implementing "USB-HID" contingencies where they aren't needed.

jim618

legendary

Activity: 1708

Merit: 1066

A few points and questions:

Timeout
You state:

Standard message flow:
3. Although some responses are sent by device instantly (like GetUUID), most of them are blocking, because they requires manual confirmation (pressing the button) by the user. Bitcoin client should be aware of this and implement communication with the device in separate thread, to not block client's UI or other functionality.

Is it worth adding in a timeout which is, from the computer's perspective, equivalent to a 'Cancel' operation ? i.e Device asks user to 'Confirm or Cancel'. User does nothing and after, say, 60 seconds computer says "Ok, that is a cancel" and moves to a more general "nothing happening state". It stops you getting stuck waiting forever for a button press that never comes.

PIN/ OTP exchanges
In 'Standard message flow, first paragraph' you mention that the computer initiates the conversation but with the PIN and OTP the device sends the PinRequest / OTPRequest. Do these occur as responses to the GetEntropy computer request only ?

i.e. valid exchanges are:

C: GetEntropy()
D: Entropy(entropy)

or

C: GetEntropy()
D: OtpRequest()
C: OtpAck(otp)
D: Entropy(entropy)

or

C: GetEntropy()
D: OtpRequest()
C: OtpAck(otp)
D: PinRequest()
C: PinAck(pin)
D: Entropy(entropy)

also, is it valid to have the PinRequest/PinAck before the OtpRequest/OtpAck ?

(just trying to pin down what are valid exchanges and which are invalid).

Checksum and Chunking/ Dechunking
Does the bytestream for USB have guaranteed integrity ? IE is it necessary to have a checksum for each of the 64 byte message or is that taken care of in the transport protocol ? If required we could simply have 1 checksum byte = XOR(payload bytes) i.e.

1B = payload length
up to 62B = payload
1B = checksum defined as XOR(payload)

Also, you state that the PB message is chunked into 64 byte packets. When they are 'dechunked' how do you know the boundaries of the packets to stitch them back together to recreate the PB messages? Do you need "this protobuf message is chunked over X packets' at the beginning of the PB message.

I am just thinking of how to make the 'chunking' and 'dechunking' as simple and unambigous as possible. Ideally you want it a dumb transport layer that understands nothing of what is in the messages.

stick

sr. member

Activity: 441

Merit: 268

Quote from: someone42 on November 16, 2012, 07:31:06 AM

I was going to come here and whinge about your choice of protocol buffers, but after a bit more research, I'm liking them more. Their encoding is quite lean; I was worried that it would be like XML. Also, there are tiny parser libraries (eg. nanopb) written in C out there.

Same here. I was a little bit sceptical when I saw the generated code from http://code.google.com/p/protobuf-c/ (pure C generator for Protocol Buffers, original implementation supports just C++), but I was quite happy when I saw the code that came out of nanopb - http://code.google.com/p/nanopb/

stick

sr. member

Activity: 441

Merit: 268

Quote from: someone42 on November 16, 2012, 07:17:20 AM

The only disadvantage I've come across in my research: apparently, in Linux, you need to detach a kernel driver in order to properly communicate with a HID class device. This can be done on most systems, but I worry that it will break on some Linux configurations. Please correct me if the situation is different now.

Detaching a kernel driver is just one function call from libusb or a similar library. Also some combinations of VID/PID (e.g the ones Silabs CP2110 use) are IIRC not interpreted in Linux kernel. So I don't think this is a big issue ...

slush

legendary

Activity: 1386

Merit: 1097

Quote from: deepceleron on November 16, 2012, 07:35:49 AM

Want crazy talk? How about Bluetooth?

No way because of battery...

Edit: However HID over Bluetooth is still possible, so even if we will standardize HID device, some future wallet devices can use Bluetooth and still have compatible protocol.

someone42

member

Activity: 78

Merit: 11

Chris Chua

Quote from: slush on November 16, 2012, 07:28:56 AM

FTDI don't work in Windows out of the box and the installation is a bit weird. I spent some time to get my BFL Single working on Win7 :-/.

We want to use USB-to-serial for the Raspberry Pi shield (as RPi don't have usable USB port for HID device), too. But from the computer side I think there should be just one supported transport - USB HID.

I was mistaken about FTDI drivers on Windows.

Looks like it is USB HID then.

deepceleron

legendary

Activity: 1512

Merit: 1036

Want crazy talk? How about Bluetooth?

Applications...
-Transfer of files, contact details, calendar appointments, and reminders between devices with OBEX.
-Replacement of previous wired RS-232 serial communications in test equipment, GPS receivers, medical equipment, bar code scanners, and traffic control devices.

Topic: Hardware wallet wire protocol - page 4. (Read 8757 times)