libsecp256k1 is a cryptographic library that implements operations on secp256k1, the elliptic curve used by Bitcoin. It aims to be faster and safer than OpenSSL, and it will be used by default in the Bitcoin Core 0.12 release. It has been designed specifically to avoid leaking private information via timing or power side channels. There is currently no way to test the latter, so I set out to build a continuous integration test that can automatically measure susceptibility to a power side channel.
Another excellent analysis was done for the TREZOR:
https://jochen-hoenicke.de/trezor-power-analysis/.
Here, to get repeatable results, I wanted to intentionally set up everything in the attacker's favor. I needed a chip, a measurement point, and a data collection device. Here's a picture of the setup:
https://i.imgur.com/xF2s7ki.jpg
For the chip, I chose an STM32F427, specifically the one on one of ST's Discovery kits. This is the largest chip offered by ST without any caches, and it is also very similar to the chip in the TREZOR. The chip is fully static, so I can run it at any clock rate I desire. The Discovery board also includes an off-chip DRAM that is currently unused. The chip is well supported by gcc, and the board has a handy on-board USB JTAG programmer.
The board has a current measurement point on the 3.3V rail to the STM32. The chip then has a few internal regulators that provide internal core voltages. Measuring current on these rails would be ideal, but there is no easy way to do so. The rails are exposed off-chip to external regulator capacitors, which could be measured, but for simplicity I have stuck to the 3.3V rail.
https://i.imgur.com/xqf144l.jpg
The signal from the current shunt is amplified by a cobbled-together instrumentation amp with a single-ended output. This goes into an LFRX board on a USRP software radio. The connection is DC coupled, which allows me to read absolute current, but I haven't determined whether this has any real value yet. The LFRX has both a D and a Q input; only the D input is currently used. I'm currently sampling at 1 MHz with the microcontroller running at 168 MHz, so each sample spans many clock cycles. The clocks are not synchronized. Ideally I would like to sample at a clock ratio of at least 1:1, with the two clocks synchronized.
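To put the clock ratio in numbers, here is a small sketch of the arithmetic using only the two rates quoted above. The function names are made up for illustration and are not part of any tool used in this setup.

```python
# Rates taken from the text above.
CPU_CLOCK_HZ = 168_000_000   # microcontroller clock
SAMPLE_RATE_HZ = 1_000_000   # software radio sample rate


def cycles_per_sample(cpu_hz: float, sample_hz: float) -> float:
    """How many CPU clock cycles elapse during one sample period."""
    return cpu_hz / sample_hz


def cpu_clock_for_ratio(sample_hz: float, target_cycles: float = 1.0) -> float:
    """CPU clock that would give `target_cycles` cycles per sample."""
    return sample_hz * target_cycles


if __name__ == "__main__":
    # Each sample currently lumps together this many clock cycles.
    print(cycles_per_sample(CPU_CLOCK_HZ, SAMPLE_RATE_HZ))
    # Clock the micro would need for a 1:1 ratio at the current sample rate.
    print(cpu_clock_for_ratio(SAMPLE_RATE_HZ, 1.0))
```

At the current settings every sample averages over 168 clock cycles, which is why either a faster sample rate or a slower micro is needed to approach 1:1.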
The setup currently works, though it is quite noisy. Here, 10 signing iterations are visible against the background USB noise:
https://i.imgur.com/FhUzJHh.png
The 64 iterations of ecmult_gen:
https://i.imgur.com/AiiTnD9.png
Next steps over the weekend:
- Connect the Q input to a GPIO to precisely slice and dice the measurements
- Build new amplifier with higher gain
- Use higher gain to increase sampling speed / slow down micro
- Figure out what sort of DC removal / normalization is necessary
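As a rough illustration of the first and last items, the sketch below shows one possible way to cut a power trace into per-operation segments using a second channel as a gate, and then remove the DC offset and scale each segment. Everything here is an assumption for illustration: the gate is taken to be high while an operation runs, the 0.5 threshold is arbitrary, and mean subtraction with scaling to unit standard deviation is only one candidate normalization, not necessarily the one this setup will end up needing.

```python
from statistics import mean, pstdev


def slice_by_gate(power, gate, threshold=0.5):
    """Split `power` into segments covering runs where `gate` > threshold.

    `power` and `gate` are equal-length sample sequences (hypothetically,
    the two receiver inputs captured together).
    """
    segments = []
    current = None
    for p, g in zip(power, gate):
        if g > threshold:
            if current is None:
                current = []          # gate just went high: start a segment
            current.append(p)
        elif current is not None:
            segments.append(current)  # gate just went low: close the segment
            current = None
    if current is not None:           # trace ended while the gate was high
        segments.append(current)
    return segments


def normalize(segment):
    """Remove the DC offset (mean) and scale to unit standard deviation."""
    mu = mean(segment)
    sigma = pstdev(segment)
    if sigma == 0:
        return [0.0 for _ in segment]  # flat segment: nothing left after DC removal
    return [(x - mu) / sigma for x in segment]


if __name__ == "__main__":
    # Two synthetic operations with the same shape but different DC offsets.
    power = [5, 5, 6, 8, 6, 5, 5, 7, 9, 7, 5]
    gate = [0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0]
    for seg in slice_by_gate(power, gate):
        print([round(x, 3) for x in normalize(seg)])
```

With the synthetic data, the two segments differ only by a constant offset, so they come out the same after normalization. Whether real traces need more than this (for example, drift removal within a segment) is still an open question.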