Recently I've seen a couple of posts about timing attacks against the trezor-crypto library, most notably this post:
http://www.reddit.com/r/Bitcoin/comments/2u1wea/trezor_code_no_longer_lgplv3_but_now_more/co4iomt
and the response to it, plus this image:
https://i.imgur.com/ON4FxD5.png
I'd like to explain here why I believe it's not an issue, and I'm looking forward to answers, especially from the people who claim this on reddit.
First of all, I want to acknowledge that the library reveals some timing information. No doubt about it. I would never use it in the multi-threaded environment of a web server. But I believe that exploiting it in Trezor is either impossible or too expensive to be worth the effort. A DPA attack would require capturing tens of thousands of signatures made with the same key, which contradicts how Trezor is used in practice. And an SPA attack is hard. Not impossible, but hard and expensive.
If the Trezor is stolen, you cannot sign transactions at all, and if you could, you wouldn't need to attack anything anymore. So let's talk about the remote attack. In this case I claim that you just don't have accurate enough data to mount an SPA attack. I saw the antenna recordings:
https://i.imgur.com/ON4FxD5.png from user 76951234, but guess what: if the library did not leak ANY side-channel information, the readings would look EXACTLY the same, so this shows nothing.
So let's talk about how precise the data would have to be for a successful SPA attack against Trezor. Basically, you would need to know, one by one, which elliptic curve points are being added. This is just one piece of code whose execution path you would need to recover:
1 : ldr r9, .L68
2 : ldr fp, .L68+4
3 :
4 : .L68:
5 : .word secp256k1_cp
6 : .word secp256k1_cp2
7 :
8 :
9 : tst r1, #1
10: beq .L49
11:
12: mla r0, r7, r4, fp
13: mov r1, r6
14: bl point_add
15: mov r4, r5
16: b .L46
17:
18: .L49:
19: mla r0, r7, r4, r9
20: mov r1, r6
21: bl point_add
22: .L46:
On line 9 there is a tst instruction, and the beq on line 10 then sends the code down one of two paths: 12, 13, 14, 15, 16, 22 OR 19, 20, 21, 22. Lines 14 and 21 are calls to the same function, point_add, once with an argument based on fp and the other time on r9 (set on lines 1 and 2).
In point_add you access memory at either fp or r9, so that may leak some timing as well, but it would be difficult to distinguish which memory is read, because all of that data is in one contiguous block. Also, point_add does not branch on the given data but rather on preprocessed values, so again it's difficult to decide from the timing of point_add which branch in this code was taken.
So it comes down to capturing whether the sequence was 12, 13, 14, 15, 16, 22 OR 19, 20, 21, 22. Since 13 = 20 and 14 = 21, and the instructions on lines 12 and 19 are similar, you basically need to read from the side channel whether lines 15 and 16 were executed or not.
I claim that if you can read such precise information from a side channel, it does not matter whether the code leaks timing information or not. If you can read data at the instruction level, then this is not fixable in code. I also think that if it's possible at all, such an attack would require some kind of EXTREME equipment. Any thoughts?
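For readers who don't read ARM assembly, here is a rough C-level reading of the branch in the listing above. It is only a sketch, not the actual trezor-crypto source: toy_point, table_cp, table_cp2, step and the table contents are names and values I made up for illustration, and point_add here is a stub that just accumulates numbers instead of doing real elliptic curve addition. The comments map each statement back to the numbered assembly lines.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a curve point (assumption: the real points are 256-bit). */
typedef struct { uint32_t x, y; } toy_point;

#define TABLE_LEN 4

/* Hypothetical stand-ins for the precomputed tables secp256k1_cp and
 * secp256k1_cp2; the values are arbitrary. */
static const toy_point table_cp[TABLE_LEN]  = {{1, 1}, {2, 2}, {3, 3}, {4, 4}};
static const toy_point table_cp2[TABLE_LEN] = {{10, 10}, {20, 20}, {30, 30}, {40, 40}};

/* Stub: NOT real point addition, it only makes the chosen table observable. */
static void point_add(const toy_point *p, toy_point *res)
{
    res->x += p->x;
    res->y += p->y;
}

/* One step of the loop: the low bit picks the table. Returns the index to
 * use afterwards, which only changes on the "bit set" path. */
static unsigned step(unsigned bit, unsigned idx, unsigned next_idx, toy_point *res)
{
    assert(idx < TABLE_LEN);
    if (bit & 1u) {                      /* line 9-10: tst / beq not taken */
        point_add(&table_cp2[idx], res); /* lines 12-14: base fp           */
        idx = next_idx;                  /* line 15: mov r4, r5            */
    } else {                             /* line 18: .L49                  */
        point_add(&table_cp[idx], res);  /* lines 19-21: base r9           */
    }
    return idx;                          /* line 22: .L46                  */
}
```

Both paths make exactly one call to point_add with a pointer into a table of the same layout, so the only thing that differs is the single index assignment and the jump that follows it. That is the two-instruction difference (lines 15 and 16) an attacker would have to pick out of the side channel.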