Topic: crypto software - writing the grotty bits.

hero member
Activity: 836
Merit: 1030
bits of proof
December 07, 2014, 11:47:49 PM
#14
I am not questioning the value of skills in the narrow scope Cryddit discussed.

I still think it is good advice to draw the boundary where skill dominates code quality as narrowly as possible.
Ignoring this leads to either a) poor system-level security or b) a slow pace of development.

There are countless exploits of a) in systems that actually use well-written crypto primitives. The world is in desperate need of highly complex systems to protect freedom and privacy, so the pace of development is not a negligible aspect either.
legendary
Activity: 1862
Merit: 1011
Reverse engineer from time to time
December 06, 2014, 09:32:09 AM
#13
Mark Karpeles did maintain libtomcrypt, something I only recently found out. Thought I'd share this.
staff
Activity: 4284
Merit: 8808
December 06, 2014, 08:17:50 AM
#12
Quote
On the last bit, I have to take grau's side. Porting low-level code to a high-level language may need a bit of effort, but the number of pointer-safety checks, fuzz tests and unit tests is tremendously reduced. Also, not every coder can write secure low-level code. If a project needs community contributions, being in a low-level language may be a burden for more novice programmers.
And certain kinds of safety, in particular the kinds of safety Cryddit was almost exclusively talking about, become impossible.  Meanwhile, in the kind of code Cryddit is talking about, many of the concerns you're hoping to address are already structurally impossible, and provably so... e.g. because it does _no_ dynamic memory allocation at all, because there are no data-dependent memory accesses at all, or the accesses are very efficiently bounds-checked via masking, etc.  This also can apply to different kinds of software than Cryddit was talking about, ones where worst-case memory or CPU complexity is a significant security constraint due to the need to resist attacks in the form of pathological inputs, or to suppress timing side-channels.
hero member
Activity: 508
Merit: 500
Jahaha
December 06, 2014, 07:52:12 AM
#11
Very informative discussion and I really appreciate the effort.

On the last bit, I have to take grau's side. Porting low-level code to a high-level language may need a bit of effort, but the number of pointer-safety checks, fuzz tests and unit tests is tremendously reduced. Also, not every coder can write secure low-level code. If a project needs community contributions, being in a low-level language may be a burden for more novice programmers.
hero member
Activity: 836
Merit: 1030
bits of proof
December 04, 2014, 05:06:20 AM
#10
Quote
I am always puzzled to see how the crypto community still insists on skills instead of technology when it comes to reliable implementation.

May I suggest some modern precautions:

Use a programming language that:
- is immune to stack manipulation and buffer overflows, and has no pointer arithmetic or zero-delimited strings, e.g. Java
- has immutable data instead of synchronization, e.g. Scala


Quote
You're talking about things that protect against semantic mistakes, which are good.  But cryptographic code has requirements that are not considered in semantic models; we have to avoid creating information leaks.


I understand that a language with manual memory management gives more control. I doubt, however, that the control is practically effective. Unless your crypto code is running embedded in a secure computing device, you cannot be sure that its memory does not leak information to some virtualization layer.

So yes, by using modern languages you lose some control, but in my opinion you gain much more reliability by excluding well-known sources of problems that are widespread in implementations written in lower-level languages.

BTW, those problems are not "just" semantics but rich sources of all kinds of exploits.
legendary
Activity: 924
Merit: 1132
December 04, 2014, 01:55:04 AM
#9
Quote
I am always puzzled to see how the crypto community still insists on skills instead of technology when it comes to reliable implementation.

May I suggest some modern precautions:

Use a programming language that:
- is immune to stack manipulation and buffer overflows, and has no pointer arithmetic or zero-delimited strings, e.g. Java
- has immutable data instead of synchronization, e.g. Scala


You're talking about things that protect against semantic mistakes, which are good.  But cryptographic code has requirements that are not considered in semantic models; we have to avoid creating information leaks.

I'm in favor of using more advanced languages, if we can find an implementation of one that promises to do no copying, never relocates variables, never allocates its own buffers over areas that might still contain private values, doesn't "optimize away" writes to variables whose values will never be read again (or at least lets us order it not to), lets us know when register spills write things to memory, etc.

I really LIKE nice languages that have good protections against semantic mistakes, and use them for nearly every non-crypto project.  But I have never found a language that both has good protections against semantic mistakes of the kind you're talking about *AND* sufficient control over low-level side channel leaks that it can be used for crypto code. 

If I have to do without one or the other, I'll do without the one that I can at least possibly make up for with sufficient care and testing.

In other news, someone has sent me a private message with a link I'd like to share with y'all:  It turns out we're not the first people to have this conversation.

https://cryptocoding.net/index.php/Coding_rules
hero member
Activity: 836
Merit: 1030
bits of proof
December 03, 2014, 10:27:52 PM
#8
I am always puzzled to see how the crypto community still insists on skills instead of technology when it comes to reliable implementation.

May I suggest some modern precautions:

Use a programming language that:
- is immune to stack manipulation and buffer overflows, and has no pointer arithmetic or zero-delimited strings, e.g. Java
- has immutable data instead of synchronization, e.g. Scala
legendary
Activity: 924
Merit: 1132
December 03, 2014, 04:24:15 PM
#7
Quote
It's hip these days to include buttons on your projects for test coverage. 100%! or 97.3%!. For the above reason, I feel like this is misleading and is an anti-pattern. Not that 100% test coverage is a bad thing, but that it leads to a false sense of security with regards to your codebase quality. 100% coverage is a pretty low standard to be achieving.

Heh.  I have actually had a manager come to me with push-back on code I wrote because it could not reach 100% coverage.  On investigation it turned out that the code the fuzz tester could not reach was pretty uniformly of the type,

Code:
/* you can't have two items with the same value in a permutation! */
if (P[x] == P[y]) {
    DEBUGPRINT("identical values detected in PRNG state permutation after return from PSWAP.\n");
    assert(1 == 0);
}
IE, it was code the fuzz tester *SHOULDN'T* ever be able to reach.  Eventually we settled for modifying the DEBUGPRINT macro to have #ifdef DEBUG blocks, so he could get his 100% coverage on the production build but I could still have it write a message and crash instantly on detecting an error in the debug build.
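(For illustration, one way that final arrangement can look -- a sketch only, not the project's actual macro: in the debug build DEBUGPRINT reports and the following assert crashes on the spot, while in the production build the macro expands to nothing.)
Code:
#include <stdio.h>   /* for fprintf in the debug variant */

#ifdef DEBUG
/* Debug build: write the message immediately; the assert that follows it
   in the caller then crashes at the point of detection. */
#define DEBUGPRINT(msg) do { fprintf(stderr, "%s", (msg)); } while (0)
#else
/* Production build: expands to nothing, so the "impossible" branch adds no
   unreachable statements to the coverage report. */
#define DEBUGPRINT(msg) do { (void)(msg); } while (0)
#endif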

By the way, "Write a message and crash instantly" is, by far, the easiest thing to debug.  Throwing exceptions that get handled far from detecting the error is NEVER a good idea.
staff
Activity: 4284
Merit: 8808
December 03, 2014, 12:15:56 PM
#6
Quote
I feel like this is misleading and is an anti-pattern. Not that 100% test coverage is a bad thing, but that it leads to a false sense of security with regards to your codebase quality. 100% coverage is a pretty low standard to be achieving.
Also, a focus on numbers is kind of silly; it's fine for the tool not to report 100% when it's really 100% of the actual code that runs... obsessing about the number is likely to lead to bad incentives, like removing error-handling code. Also, 100% line coverage is way different from 100% branch coverage.  I mention coverage because it's insanely useful to just look at it, regardless of the numbers, and because some powerful testing approaches I mentioned are not possible unless you have basically complete coverage.

For things like small security-critical cryptographic kernels and other focused high-risk work it's pretty reasonable to expect basically complete coverage; less so on other code bases.

(Basically complete:  Sometimes you may have error-handling code which you cannot trigger. You may _believe_ it is impossible to hit, but you are unable to _prove_ that it is, so you cannot remove the code and shouldn't even consider removing it. Or you have some harness goop that lowers your reported coverage. All these things should get careful review, but they shouldn't result in you feeling you failed to get complete coverage.)

Quote
I'm sure I'm not the only one that sometimes wonders how software manages to function at all. It often feels as if there are thousands of potential entry vectors on my machine and just one is enough. But that doesn't mean it's not worth trying our best.
It's pretty bad.   I mean, often programming tools are some of the most heavily tested and reliable pieces of software (after all, many of their users are well qualified to find, report, and fix bugs in them) ... and yet I don't consider my code well tested unless I've found a new toolchain or system library bug while testing it.

The priority comments are pretty good too.  The complete result is what counts.
member
Activity: 96
Merit: 10
esotericnonsense
December 03, 2014, 11:58:24 AM
#5
I have nothing to add here as someone who has (at best) a novice level of C/C++ knowledge.
But I want to thank you both for adding to that.
These are the issues that everyone should have on their minds when writing sensitive crypto code.

As Gavin says in a roundabout way: we can only do our best with the time we are given, but we do have a responsibility here.

I'm sure I'm not the only one that sometimes wonders how software manages to function at all. It often feels as if there are thousands of potential entry vectors on my machine and just one is enough. But that doesn't mean it's not worth trying our best.

edit: Just noticed this gem from gmaxwell's post:
Quote
It deserves the extra work to make it completely right, and the users who will depend on it deserve it too.

Emphasis on the latter point there, even if none of the technical points sink in.
You're engineering a car. Alice does not necessarily know and cannot be expected to know the potential failure points.
legendary
Activity: 1652
Merit: 2311
Chief Scientist
December 03, 2014, 08:45:50 AM
#4
Excellent advice.

I'd add: you never have infinite time, so you will have to prioritize.

Cryddit's original post talks a fair bit about preventing data leakage in side-channel attacks; I'll just say that if you only have time to either get 100% code path unit test coverage or hand-code some assembly to work around your compiler leaving a private key in memory instead of a register... I'd work on the test coverage.

And if the choice is between 100% test coverage versus 91% with support for threshold signatures on multiple devices-- I'd choose threshold signatures.

And, of course, the highest priority is creating a product or service that is both trustworthy and that people want to use.
hero member
Activity: 543
Merit: 501
December 03, 2014, 06:03:52 AM
#3
Overall I tried to look for various tools that would help with all of these guidelines. Turns out there are a bunch. Could we get a list of tools that are generally preferred when trying to verify code integrity? Got:

(don't quite know how to organize these)
valgrind
Coverity, clang-static-analysis, cppcheck, pc-lint
frama-c
AFL, KLEE

In particular I'm interested in tools relating to mutation testing.

It would probably also be useful to have links to a bunch of pages describing each of these topics, at least where they exist. For example, a blog post on mutation testing, another on integer promotion, etc. Originally I was going to supply them myself, but later I realized that I wouldn't know if I had picked a link with bad information.

On testing. BE SURE TO TEST THE CASES WHICH SHOULD NOT AND ESP. "CANNOT HAPPEN"  I have seen ... just dozens.. of programs fail to have any security at all because they were JUST NEVER TESTED WITH FAILING CASES.  agreed there.

It's hip these days to include buttons on your projects for test coverage. 100%! or 97.3%!. For the above reason, I feel like this is misleading and is an anti-pattern. Not that 100% test coverage is a bad thing, but that it leads to a false sense of security with regards to your codebase quality. 100% coverage is a pretty low standard to be achieving.
staff
Activity: 4284
Merit: 8808
December 03, 2014, 01:45:38 AM
#2
I guess the most important thing to know is that in C there are no promises that the compiler will not undermine you with leaks... the compiler is only obligated to produce visibly identical behaviour to the C abstract machine.

Good techniques are good and I thank you for sharing ... I'm just also sharing some caution because it's easy to get caught up in the ritual of techniques and miss that there are limits.

Consider, the compiler is free to spill registers to random local variables when it can prove that their value is irrelevant (e.g. because it'll never be read again, or because it will just be overwritten).  They do this.  So now you have memory with your secret data that your zeroizing functions don't hit.

It gets worse, of course data in the registers ends up being sensitive... and you call another function, and it calls pusha (or friends) and pushes all registers onto someplace else on the stack. ... even if this doesn't happen with your current toolchain, if you depend on it and you're not testing, ... well, you better not ever upgrade your toolchain.

And then when you're on a multitasking operating system the kernel can go copying around your current state at basically any point.

Use of volatile sounds interesting, but take care: until fairly recently volatile was fairly buggy in GCC and CLANG/LLVM because it's infrequently used; beyond often doing nothing at all, it was sometimes causing miscompilation.  I say was here primarily because more recent testing with randomly generated source code flushed out a lot of bugs... so I don't think I'd suggest this if you're targeting GCC prior to 4.8 (or any compiler which hasn't been subjected to the same extensive randomized testing that GCC and CLANG have been).

On testing. BE SURE TO TEST THE CASES WHICH SHOULD NOT AND ESP. "CANNOT HAPPEN"  I have seen ... just dozens.. of programs fail to have any security at all because they were JUST NEVER TESTED WITH FAILING CASES.  agreed there.

Fuzz test too: when you write all the tests yourself you'll fail to discover your misconceptions, and fuzzing can help.  Non-uniform randomness can be more useful; long runs of zeros and ones tend to trigger more behaviour in software. Whitebox fuzzers like AFL (and KLEE, though it's a pain to run) can get you much more powerful fuzzing when testing a complete system, though you generally don't need them for unit tests.

Instrument your code for testability. If there is some part that's hard to reach, make sure you have a way to test it.  Use branch coverage analysis tools like lcov to make sure you're covering things.

Write _lots_ of assertions. The computer is not a mind reader; it won't know if your expectations have been violated unless you express them. The asserts can be enabled only during testing, if you really want (e.g. if performance concerns or uptime matter more than security).

Assertions make all other testing more powerful, time spent on them has super-linear returns.

Test your tests by intentionally breaking the code, both manually and by just incrementally changing + to - or swapping 0 and 1 or > and <, etc.   You cannot use this kind of mutation testing successfully, however, until you have essentially 100% test coverage, since a mutation in code the tests never execute will always go undetected.

I have found many one in a billion input scale bugs by mutating and improving tests until they catch all the mutations.

There are tools for 'testcase reduction' built for finding compiler bugs, like http://embed.cs.utah.edu/creduce/ -- you can run one on your normal code and add tests until it's unable to remove anything but formatting.

Coverity, clang-static-analysis, cppcheck, and pc-lint are useful informal static analysis tools. You should be using one or all of them at least sometimes.

If the codebase is small, consider using sound/formal static analysis tools like frama-c, at least on parts of it.

Valgrind is a life-saver. Learn it. Love it. (Likewise for its ASan cousin in GCC and Clang.)

Don't leave things constantly warning; figure out ways to fix or silence the harmless warnings or you'll miss serious ones.

Test on many platforms, even ones you don't intend to target... differences in execution environment can reveal bugs in code that would otherwise go undetected but was still wrong. Plus, portability is a good hedge against the uncertainty of the future.  I've found real bugs on x86 by testing on ARM, PA-RISC, and Itanium: bugs that were latent on x86 but immediately detected on the other platforms because of small differences.

Unit tests are nice and important, but don't skip on the system tests. A lot of bugs arise in the interactions of otherwise correct parts.

Don't shy away from exhaustive testing. Have an important function with <=32 bits of input state space? You can test every value.
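For a concrete sketch of what that can look like, here is an exhaustive test of a made-up 16-bit routine against a slow but obviously correct reference (both functions are invented for the example; the point is the loop over every possible input):
Code:
#include <stdint.h>
#include <stdio.h>

/* Hypothetical routine under test: branchless popcount of a 16-bit value. */
static uint16_t fast_popcount16(uint16_t x)
{
    x = (uint16_t)(x - ((x >> 1) & 0x5555));
    x = (uint16_t)((x & 0x3333) + ((x >> 2) & 0x3333));
    x = (uint16_t)((x + (x >> 4)) & 0x0F0F);
    return (uint16_t)((x + (x >> 8)) & 0x001F);
}

/* Slow but obviously correct reference. */
static uint16_t reference_popcount16(uint16_t x)
{
    uint16_t n = 0;
    while (x) { n += x & 1; x >>= 1; }
    return n;
}

int main(void)
{
    uint32_t failures = 0;
    /* 16 bits of input: every one of the 65536 values gets checked. */
    for (uint32_t v = 0; v <= UINT16_MAX; v++) {
        if (fast_popcount16((uint16_t)v) != reference_popcount16((uint16_t)v)) {
            fprintf(stderr, "mismatch at input 0x%04x\n", (unsigned)v);
            failures++;
        }
    }
    return failures ? 1 : 0;
}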

Testing once isn't enough: all these things can be done from continuous integration tools. Every commit can be tested, and the random tests keep adding more coverage over time... not just wasted CPU cycles.  I've spent time proving code correct or extensively fuzzing it, only to later accept a patch that added an easily detected misbehaviour -- if only I'd redone the testing. Automating it makes it harder to forget to do it, or to lazy out of it when the change was "obviously safe".  Obviously safe code seldom is.

Quote
but there is no way to get around it because AFAIK no other language allows me to absolutely control when and whether copies are made,
Yes, it's the norm in every place other than C for any code you've not written to be effectively a non-deterministic black box. C++ can be made less than awful if you subset it enough, but by then you're almost no better off than with C.

Quote
A lot of crypto software makes extensive use of global variables for sensitive values.  They are fast, never get deallocated or (accidentally) copied during runtime
They can be copied at runtime.

Quote
I always use unsigned integers.
Do take care: a lot of C coders cut themselves on the promotion rules around unsigned. Use of unsigned in loop counters results in lots of bugs in my experience, esp. with less experienced developers. Take the time to really learn the promotion rules well.
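To make the trap concrete, a small sketch of the classic failure modes (and a safe down-counting idiom); the buffer and values are arbitrary:
Code:
#include <stddef.h>
#include <stdio.h>

static void classic_traps(const unsigned char *buf, size_t len)
{
    /* Trap 1: counting down with an unsigned counter.  i >= 0 is always true
       for an unsigned type, so this loop never terminates and wraps around:

       for (size_t i = len - 1; i >= 0; i--)
           printf("%u\n", buf[i]);
    */

    /* Trap 2: mixed signed/unsigned comparison.  The -1 is converted to
       UINT_MAX, so the condition is false and the branch is never taken. */
    unsigned int u = 1;
    if (u > -1)
        printf("never printed\n");

    /* A down-counting loop that is safe with an unsigned index: */
    for (size_t i = len; i-- > 0; )
        printf("%u\n", buf[i]);
}

int main(void)
{
    const unsigned char data[4] = { 1, 2, 3, 4 };
    classic_traps(data, sizeof data);
    return 0;
}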

Quote
You can't even check to see if undefined behavior has happened, because the compiler will go, "oh, that couldn't happen except for undefined behavior, and I can do whatever I want with undefined behavior.  I want to ignore it."
Yes, though you can check in advance whether it would happen, without causing it, and avoid it.  Though experience suggests that these tests are often incorrect.   GCC and CLANG now have -fsanitize=undefined, which instruments signed arithmetic and will make the program scream errors at runtime.  Not as good as being statically sure the undefined behaviour cannot happen.
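A sketch of the "check in advance" pattern: the test happens before the signed addition, uses no undefined behaviour itself, and the addition is only performed once it is known to be safe.
Code:
#include <limits.h>
#include <stdbool.h>

/* Stores x + y in *sum and returns true only when the signed addition
   cannot overflow; otherwise returns false without performing it. */
static bool checked_add(int x, int y, int *sum)
{
    if ((y > 0 && x > INT_MAX - y) ||
        (y < 0 && x < INT_MIN - y))
        return false;            /* would overflow; addition not performed */
    *sum = x + y;
    return true;
}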

Quote
I tend to use do/while loops.
I have used this same construct. Also with the masked variables.

I've intentionally made objects larger and zero-filled so that masked accesses are guaranteed safe, e.g. char foo[27]; foo[i] = 3; becomes char foo[32]; foo[i&31] = 3;.

Some other things:

Avoid recursion, and use none at all unless you can statically prove its depth.  Running out of stack is no joke, and it's a source of many embarrassing (and even life-threatening) bugs. It's not worth it.

Avoid function pointers.  They give you _data_ that controls the flow of your program... which may somehow be writable by attackers.   When a function pointer can't be avoided completely, try to make it a bit-masked index into a power-of-two-sized const array which is filled with zeros in the slack space.
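A sketch of that masked-dispatch shape (the handlers are invented for the example): the table is const, its size is a power of two, the index is masked, and the slack entries stay NULL so a bad index faults immediately instead of jumping somewhere attacker-chosen.
Code:
#include <stdint.h>
#include <stdio.h>

static void op_read(void)  { puts("read");  }
static void op_write(void) { puts("write"); }
static void op_close(void) { puts("close"); }

typedef void (*handler_fn)(void);

/* Power-of-two sized, const, with NULL in the slack slots. */
static const handler_fn handlers[8] = {
    op_read, op_write, op_close,
    /* remaining entries are implicitly NULL */
};

void dispatch(uint32_t opcode)
{
    handler_fn fn = handlers[opcode & 7u];   /* the mask keeps the index in bounds */
    if (fn != NULL)
        fn();                                /* NULL slack entries are never called */
}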

Side channels are very hard to avoid and even more likely to be undermined by the compiler into leaks.  First assume you can't prevent them, and try to be safe regardless.  Bitops can get you constant-time behaviour in practice, including loads: e.g. AND the thing you want to load with ~0 and the things you don't want to load with 0, then OR the results; it's tedious, and the compiler can still helpfully 'optimize' away your security... but the next option is writing things in assembly, which has its own risks.  Valgrind warns on any control-flow change (branches!) on 'uninitialized data'; there are macros you can use to mark any bytes you want as uninitialized, so you can make your secret data uninitialized and have Valgrind warn about branches on it (though it's not completely sound; some warnings get suppressed).
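For the AND/OR pattern specifically, a minimal sketch of a branchless select (with the caveat, as above, that a sufficiently clever compiler can turn it back into a branch, which is why you read the assembly):
Code:
#include <stdint.h>

/* Returns a when choose is nonzero and b when choose is zero, without
   branching on the secret 'choose' value. */
static uint32_t ct_select_u32(uint32_t choose, uint32_t a, uint32_t b)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(choose != 0);  /* all-ones or all-zeros */
    return (a & mask) | (b & ~mask);
}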

Get comfortable with gcc -S and objdump -d ... reading the assembly is the only way to know for sure what you're getting, and the only way you're going to discover that your presumed branchless code has been filled with jumps by the helpful compiler. Likewise, you can give your secrets a distinctive pattern and dump core when you think things should be clean, then confirm whether they actually are or not.

It's possible to make wrapper functions that call your real function and then call another dummy function that uses as much stack as your real function and zeros it all. This is one way you can stem the bleeding on stack data leaks.
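One possible shape of that wrapper trick, sketched with made-up functions and a guessed stack size (the scratch array has to be sized generously above whatever the real function actually uses):
Code:
#include <stddef.h>

/* Stand-in for the real function that handles secrets on its stack. */
static int do_sensitive_work(const unsigned char key[32])
{
    unsigned char tmp[64];
    for (size_t i = 0; i < sizeof tmp; i++)
        tmp[i] = (unsigned char)(key[i % 32] ^ (unsigned char)i);
    return tmp[0] & 1;
}

/* Dirty roughly the same stack region the real function used and clear it.
   Writing through a volatile array keeps the stores from being elided. */
static void scrub_stack(void)
{
    volatile unsigned char scratch[4096];    /* generously >= the real stack use */
    for (size_t i = 0; i < sizeof scratch; i++)
        scratch[i] = 0;
}

int do_sensitive_work_scrubbed(const unsigned char key[32])
{
    int result = do_sensitive_work(key);
    scrub_stack();                           /* overwrite what the callee left behind */
    return result;
}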

More recently I learned that dynamic array accesses are not constant time on all hardware, even if you're sure to always hit the same cache-line pattern, necessitating masked loads when the indexes would be secret data.

In some cases your memory map can be made sparse and your data surrounded by a sea of inaccessible pages, allowing your hardware MMU to do some of the boundary checking for free.  This class of approach is used by asm.js and the classic 'electric fence' memory debugging tool.
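On a POSIX system the guard-page idea can be sketched like this: the secret gets its own page(s) with PROT_NONE pages on both sides, so any stray access just before or after it faults immediately. (Error handling is minimal; this is an illustration, not a hardened allocator.)
Code:
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Allocate len bytes with inaccessible guard pages on either side. */
static void *guarded_alloc(size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t body = ((len + page - 1) / page) * page;     /* round up to whole pages */
    unsigned char *base = mmap(NULL, body + 2 * page, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    /* Only the middle region becomes readable and writable. */
    if (mprotect(base + page, body, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, body + 2 * page);
        return NULL;
    }
    return base + page;
}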

Data structures can have beginning and ending canary values. Set them after initializing the structure, check them whenever you use the data structure, and zeroize them when the data structure is freed... you'll find cases where code accidentally uses a freed or never-allocated data structure much more often, especially when crossing a boundary of who-wrote-this-code.
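For illustration, a minimal sketch of the canary pattern (the constants and struct are arbitrary):
Code:
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CANARY_LIVE 0x5AFE5AFEu

typedef struct {
    uint32_t canary_head;        /* set on init, checked on every use */
    unsigned char key[32];
    uint32_t canary_tail;
} keyslot_t;

static void keyslot_init(keyslot_t *ks)
{
    memset(ks->key, 0, sizeof ks->key);
    ks->canary_head = CANARY_LIVE;
    ks->canary_tail = CANARY_LIVE;
}

static void keyslot_check(const keyslot_t *ks)
{
    /* Catches use of freed or never-initialized slots and off-by-one overwrites. */
    assert(ks->canary_head == CANARY_LIVE && ks->canary_tail == CANARY_LIVE);
}

static void keyslot_destroy(keyslot_t *ks)
{
    /* Zeroizing kills the canaries too, so any later use trips keyslot_check(). */
    memset(ks, 0, sizeof *ks);
}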

GCC has annotations for function arguments which must not be null and for functions whose results must not be ignored. These can be used to turn silent mistakes into warnings.  But take care: non-null teaches the optimizer that the argument cannot be null and _will_ result in it optimizing out your not-null checks, so use the annotations in your headers but don't compile your own code with them (see libopus or libsecp256k1 for examples).
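A sketch of how those annotations typically appear in a public header (libopus and libsecp256k1 contain real examples; the names and the MYLIB_BUILD guard below are invented for illustration):
Code:
/* mylib.h -- hypothetical public header */

#if defined(__GNUC__) && !defined(MYLIB_BUILD)
/* Expose the attributes to users of the header only; when building the
   library itself they are disabled, so the optimizer never learns "this
   argument can't be NULL" and deletes our internal NULL checks. */
# define MYLIB_NONNULL(args)  __attribute__((nonnull args))
# define MYLIB_MUST_USE       __attribute__((warn_unused_result))
#else
# define MYLIB_NONNULL(args)
# define MYLIB_MUST_USE
#endif

MYLIB_MUST_USE int mylib_sign(unsigned char *sig64,
                              const unsigned char *msg32,
                              const unsigned char *seckey32)
                              MYLIB_NONNULL((1, 2, 3));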

Complexity is the enemy of security. Seek simplicity. They say that code is N times harder to debug than it is to write, so if you're coding at your limit you can't debug it.  When debugging, you at least have some positive evidence of the bad thing that happened... you know badness of some form was possible, at the very least. Making secure software is even harder, because nothing tells you that there was an issue-- until it is too late.

At times I suspect that it's (nearly) impossible to write secure code alone. Another set of eyes, which understands the problem space but doesn't share all your misunderstandings and preconceptions, can be incredibly powerful, if you are fortunate enough to find one. They can also help keep you honest about the shortcuts you take but shouldn't, and the test cases you skip writing. Embrace the nitpicking and be proud of what you're creating. It deserves the extra work to make it completely right, and the users who will depend on it deserve it too.

legendary
Activity: 924
Merit: 1132
December 02, 2014, 05:40:46 PM
#1
It occurs to me that the craft of designing and writing cryptographic software so that it doesn't leak sensitive information is, at best, a black art, and most of the necessary techniques aren't widely known, documented, or shared.

I bet there are a thousand tips and tricks that most of us have never written down because they are grotty details that aren't very
interesting mathematically.  I would love to hear the coding techniques that others have developed to control side channels and write software that is both reliable and doesn't leak information.

Here are a few of mine.

First, I write unit tests for everything.  Often even before writing the things.  It's astonishing how often crypto code can look like it's doing the right thing while it's actually doing something very subtly different that you'd probably never notice was different. Like having the wrong period for your implementation of some LFSR-based thing like the Mersenne Twister - get it wrong by one bit, and the output still looks good, but it'll be insecure and have a period way shorter than you expected, giving your software an exploitable bug.  Or checking for certificate revocation, then accepting a certificate that's been revoked anyway.

Or getting encrypt-decrypt that reproduces the plaintext, but doesn't produce the same output on test vectors as the standard implementation you're trying to be compatible with (and if it's not producing the same output, very likely the standard implementation is the more secure one....)  Or finding the key server unavailable and then falling back to accepting a certificate without displaying the error it was supposed to display about not knowing whether the certificate is good.  Anyway, write lots of unit tests.  Not only will they make sure you know what your routines are doing, they'll also tell you when you break something.

The debug build calls the unit tests, and they write a file of test results. The test results file, rather than the executable, is the usual makefile target, and the makefile instructions for building the test results end with 'cat results.txt', which appends the test result output (i.e., a blank line followed by listings of any unit test errors) directly to the compile output, just like any other errors that need fixing.

If I get any choice about it, I try to do all the crypto in a single-threaded, dedicated executable that does very little else.  It's
much easier to analyze and debug a single-threaded application.

Know your compiler options.  Use the diagnostic and standard-enforcing ones heavily.  Use those that enable security extensions peculiar to that particular compiler if it helps (such as stack canaries or an option to zero all memory allocated to your program on exit)  but do your best to write code that does not rely for its security solely on those extensions, because sooner or later someone will need to compile it on a different compiler.  Eschew any standard-breaking extensions that won't work (especially if they will also cause errors) in different environments.

I write in C using the standard libraries and the GMP bignum library. GMP has some not-very-widely known primitives that are specifically for cryptographic use and leave nothing on the stack  or in buffers. They're a little slower than the "normal" set of calls, but that's okay. The C standard libraries can mostly be trusted to do exactly what they say and nothing more.  This is kind of a shame because other languages have nice facilities, less "undefined" behavior, and considerably more protection from programmer mistakes, but there is no way to get around it because AFAIK no other language allows me to absolutely control when and whether copies are made, when and whether writes to variables actually happen, etc, as well. Which isn't high praise for C, because it's still hard and grotty and error prone.  But at least it's possible.

The runtimes, templates, and libraries that come with most languages (and here I include C++) can be trusted to do what they say, but without going over a million lines of difficult, heavily #ifdef'd template code with a fine toothed comb I can't trust that they do nothing more. Therefore I don't know how to write secure software in those languages and be certain that it won't leak information. They accomplish their semantic goals well, but do so while leaving copies, and fragments of copies, of everything they touch in their objects, in unused areas of their allocated buffers until those buffers are actually used for something else, and on the stack.

A lot of crypto software makes extensive use of global variables for sensitive values.  They are fast, never get deallocated or (accidentally) copied during runtime, new values always overwrite previous values instead of landing at new places in memory possibly leaving copies of the old values somewhere, and they avoid indirections that might leave pointers to them lying around.  The only thing even a little bit subtle is making sure that they get erased as soon as the program doesn't need them anymore, and again before the program exits, and that's not hard. A pattern emerges with a dedicated 'eraser' routine that sets them all to bytes read from /dev/random before program exit. The read from /dev/random can't be elided by the compiler because it is a system-level side effect.  But if you read from /dev/random into a buffer and then copy from the buffer to the sensitive variables, the compiler can elide that because those are writes to dead variables. What's necessary is to read bytes from /dev/random *directly* into the global variables, one at a time if necessary.  It helps if you define a singleton record type that holds them all; that way you can just overwrite the record instead of doing one at a time.
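A sketch of that pattern: all the sensitive globals live in one record, and the eraser reads from /dev/random directly into that record, so the overwrite is a system-level side effect the compiler cannot drop. (The field names are illustrative.)
Code:
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* All sensitive state lives in one global record. */
static struct sensitive {
    unsigned char master_key[32];
    unsigned char session_key[32];
} g_secret;

/* Overwrite the whole record with bytes read straight from /dev/random.
   Because read() writes into the record itself, there is no intermediate
   dead buffer whose copy the compiler could elide. */
static int erase_sensitive(void)
{
    int fd = open("/dev/random", O_RDONLY);
    if (fd < 0)
        return -1;
    size_t done = 0;
    while (done < sizeof g_secret) {
        ssize_t n = read(fd, (unsigned char *)&g_secret + done,
                         sizeof g_secret - done);
        if (n <= 0) { close(fd); return -1; }
        done += (size_t)n;
    }
    close(fd);
    return 0;
}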

That pattern is fine for globals that you clear once or twice per program run and again on program exit, but I/O, even from /dev/random, is too slow for using on return from every subroutine that handles sensitive variables. I kind of don't like using global variables. I don't like the idea that every part of the program has access to them. But I have certainly used them.

C also gives you variables with 'file' scope, which is another way to have something that you can't deallocate or lose track of and allows you to limit access to the sensitive variables to JUST the routines defined in one file.  That's probably a better pattern than globals.

I use the 'volatile' keyword a lot to designate local variables so that writes to those variables will never be skipped, even if writing to them is the last thing that happens before the procedure they're allocated in returns. I know of no other language besides C that allows that. 'volatile' lets me easily use local variables and avoid leaving anything sensitive on the stack, but do not use it for anything that's not sensitive.  For example, if you use a volatile variable to index a loop, the loop will run slowly because the code has to write that variable to memory on every iteration rather than just mapping it to a register. It's better to use a regular auto variable for that.
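For illustration, a sketch of a routine with a sensitive volatile local that is cleared before return, while the loop counter stays an ordinary auto variable (the function and its arithmetic are made up):
Code:
#include <stddef.h>
#include <stdint.h>

/* 'secret' is assumed to point at 32 bytes of key material. */
void derive_something(const uint8_t *secret, uint8_t *out, size_t outlen)
{
    volatile uint8_t scratch[64];   /* sensitive: writes to it can't be elided */
    size_t i;                       /* not sensitive: ordinary auto variable   */

    for (i = 0; i < sizeof scratch; i++)
        scratch[i] = (uint8_t)(secret[i % 32] ^ (uint8_t)i);

    for (i = 0; i < outlen && i < sizeof scratch; i++)
        out[i] = scratch[i];

    /* Clear before returning; because scratch is volatile these final writes
       are performed even though the variable is about to go out of scope. */
    for (i = 0; i < sizeof scratch; i++)
        scratch[i] = 0;
}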

A very good solution to the problem is to allocate a large 'security' buffer as a local variable in main(), and have the program manage its own stack for sensitive variables.  The way that works is that when main() calls anything, it gives it a pointer into the security buffer and the size of the buffer. A called routine checks that the buffer is large enough and uses as much of the buffer as it needs for its own sensitive locals by casting the pointer into the buffer to a pointer to a record type that contains its sensitive locals.  If it calls anything else that has sensitive local variables, it does so with a pointer just past the end of its record, and the size it got for the security buffer minus the size of its own sensitive-locals record.   As with globals, before program exit you call an 'eraser' that overwrites the entire buffer with bytes from /dev/random.

The benefit of this is that main() retains control of the buffer.  It doesn't make the system slower the way 'volatile' does.  And if multiple routines both read and write in the buffer, the compiler can never elide writes into it - so the routines can easily and reliably clear any sensitive vars that they *aren't* passing back to their caller before they return.  It's probably safer than using 'volatile' local variables, because it's possible to forget to clear a sensitive 'volatile' before returning but it is NEVER possible to forget to clear the security buffer before exiting - that would definitely break your unit tests. The downside of this is that handling the buffer tends to be a pain in the @$$, which is why I tend to use 'volatile' instead.  I do use a designated buffer for VERY sensitive variables, such as passwords. Absolutely no routine, anywhere, gets to make its own copy of a password that lives outside the buffer, and main() writes over that as soon as the key is set up.
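A compressed sketch of the security-buffer pattern (the structure layout, sizes, and names are invented for the example):
Code:
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sensitive locals of sign_message(), overlaid on the caller's buffer. */
struct sign_locals {
    uint8_t nonce[32];
    uint8_t expanded_key[64];
};

static int sign_message(uint8_t *secbuf, size_t secsize,
                        const uint8_t *msg, size_t msglen)
{
    if (secsize < sizeof(struct sign_locals))
        return -1;                                  /* buffer too small */
    struct sign_locals *loc = (struct sign_locals *)secbuf;

    /* ... work with loc->nonce and loc->expanded_key; anything called from
       here that has sensitive locals of its own gets secbuf + sizeof(*loc)
       and secsize - sizeof(*loc) ... */
    (void)msg; (void)msglen;

    /* Clear whatever isn't being returned to the caller before returning. */
    memset(loc, 0, sizeof *loc);
    return 0;
}

int main(void)
{
    uint8_t security_buffer[4096];                  /* the one buffer, local to main() */
    const uint8_t msg[] = "example";
    int rc = sign_message(security_buffer, sizeof security_buffer, msg, sizeof msg);
    /* Before exit the post overwrites the whole buffer with /dev/random reads,
       whose writes the compiler cannot elide; a plain memset stands in here. */
    memset(security_buffer, 0, sizeof security_buffer);
    return rc;
}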

I've seen crypto software where sensitive locals were allocated using the 'static' keyword to ensure that local variables with sensitive information are not deallocated when the function returns.  This prevents leaks during runtime, and with static variables the compiler can't USUALLY elide writes, so the called subroutine can usually clear the values of its sensitive locals before returning. But it's not a technique I trust, because compilers are ALLOWED to elide final writes to static variables if they can prove the written values can never matter to the routine whose scope they're in, and static locals are strictly worse than globals for leak prevention because main() has no way to overwrite them all before it exits. Every one of them gets released when the program exits, and you just don't know what's going to be the next program to allocate the block of memory where they were contained.

I always use unsigned integers.  In fact this is something I learned rather recently, due to an issue raised on the cryptography list.  The basic mathematical operators (addition, subtraction, multiplication) can overflow, and overflow on signed integers (with the usual wraparound semantics that can give a negative result of adding two positive numbers) is undefined behavior.  If you must use signed integers and check afterward for overflow/wraparound, you must add them as though they were unsigned integers, like this:

z = (unsigned int)x + (unsigned int)y;

because overflow on unsigned integers *is* defined behavior.

You can't even check to see if undefined behavior has happened, because the compiler will go, "oh, that couldn't happen except for undefined behavior, and I can do whatever I want with undefined behavior.  I want to ignore it."

It will then cut the checks and everything that depends on them out of your program as 'dead code'.  So you can have an integer that is in fact negative because of a wraparound that occurred while adding two positive numbers, and the program will jump straight past a check for a negative value without triggering it.

Crypto coding has taught me to use a few other unusual bits of coding style. In crypto, we tend to allocate buffers, permutations, etc that are 'round' numbers like 0x100 bytes or 0x10000 16-bit values.  Because we're using unsigned integers anyway, indexing into these buffers using uint8_t or uint16_t variables gives us an automatic range safety check.  But it is hard to write 'for' loops that exit if we're iterating over the whole buffer, so instead of 'for' loops I tend to use do/while loops.  If I want to do something like initializing a permutation with every value of a uint16_t for example, my usual idiom is to write something like

uint16_t count = 0;
do {
    permutation[count] = count;
} while (++count != 0);

This is also an example of a habit that suits me when in full-on paranoia mode to never leave any local variable (such as count) with a nonzero value if I can help it, whether *I* think that variable is sensitive or not. If I leave something with a nonzero value on routine exit, I try to leave the same constant value in it on every exit.

So that's a few bits about writing non-leaking crypto code.  I've been more concerned with preventing memory-based data leaks, obviously, than controlling other side channels like timing or power use.

Would anybody else here like to share some of the techniques they use?