I guess the most important thing to know is that in C there are no promises that the compiler will not undermine you with leaks... the compiler is only obligated to produce visibly identical behaviour to the C abstract machine.
Good techniques are good and I thank you for sharing ... I'm just also sharing some caution because it's easy to get caught up in the ritual of techniques and miss that there are limits.
Consider: the compiler is free to spill registers into the stack slots of local variables whenever it can prove their current values are irrelevant (e.g. because they'll never be read again, or because they'll just be overwritten). Compilers do this. So now you have memory holding your secret data that your zeroizing functions don't hit.
It gets worse: of course data in registers ends up being sensitive... then you call another function, and it executes pusha (or friends) and pushes all the registers somewhere else on the stack. Even if this doesn't happen with your current toolchain, if you depend on it not happening and you're not testing for it... well, you'd better not ever upgrade your toolchain.
And then when you're on a multitasking operating system the kernel can go copying around your current state at basically any point.
Use of volatile sounds interesting, but take care: until fairly recently volatile was fairly buggy in GCC and CLANG/LLVM because it's infrequently used; beyond often doing nothing at all, it sometimes caused miscompilation. I say "was" primarily because more recent testing with randomly generated source code flushed out a lot of bugs... so I don't think I'd suggest this if you're targeting GCC prior to 4.8 (or any compiler which hasn't been subjected to the same extensive randomized testing that GCC and CLANG have been).
On testing: agreed there. BE SURE TO TEST THE CASES WHICH SHOULD NOT AND ESPECIALLY "CANNOT" HAPPEN. I have seen... just dozens... of programs fail to have any security at all because they were JUST NEVER TESTED WITH FAILING CASES.
Fuzz test too; when you write all the tests yourself you'll fail to discover your misconceptions, and fuzzing can help. Non-uniform randomness can be more useful: long runs of zeros and ones tend to trigger more behaviour in software. Whitebox fuzzers like AFL (and KLEE, though it's a pain to run) can get you much more powerful fuzzing when testing a complete system, though for unit tests you generally don't need them.
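To make the non-uniform idea concrete, here's a minimal sketch of a biased input generator. parse_packet(), the buffer size, and the run lengths are all made up for illustration, and rand() is only acceptable here because this is test-input generation, not anything cryptographic:

    #include <stdlib.h>

    /* Hypothetical function under test. */
    int parse_packet(const unsigned char *buf, size_t len);

    /* Fill buf with runs of 0x00, 0xFF, or random bytes: run-heavy inputs
       tend to reach more branches than uniformly random bytes do. */
    static void biased_fill(unsigned char *buf, size_t len)
    {
        size_t i = 0;
        while (i < len) {
            size_t run = 1 + (size_t)(rand() % 16);  /* short random run length */
            int kind = rand() % 3;                   /* zeros, ones, or noise   */
            while (run > 0 && i < len) {
                buf[i++] = (kind == 0) ? 0x00
                         : (kind == 1) ? 0xFF
                         : (unsigned char)rand();
                run--;
            }
        }
    }

    void fuzz_iterations(unsigned long n)
    {
        unsigned char buf[512];
        while (n--) {
            size_t len = (size_t)(rand() % (int)sizeof(buf));
            biased_fill(buf, len);
            (void)parse_packet(buf, len);  /* asserts/sanitizers catch the failures */
        }
    }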
Instrument your code for testability. If there is some part that's hard to reach, make sure you have a way to test it. Use branch coverage analysis tools like lcov to make sure you're covering things.
Write _lots_ of assertions. The computer is not a mind reader; it won't know your expectations have been violated unless you express them. The asserts can be enabled only during testing, if you really want (e.g. if performance concerns or uptime matter more than security).
Assertions make all other testing more powerful, time spent on them has super-linear returns.
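A trivial sketch of what this looks like in practice (the ring-buffer struct and its invariants are made up for illustration):

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical ring-buffer state; the invariants below are the kind of
       expectations worth stating explicitly. */
    struct ring {
        unsigned char *data;
        size_t cap;    /* total capacity       */
        size_t head;   /* next write position  */
        size_t used;   /* bytes currently held */
    };

    static void ring_check(const struct ring *r)
    {
        assert(r != NULL);
        assert(r->data != NULL);
        assert(r->cap > 0);
        assert(r->head < r->cap);    /* head always stays in range     */
        assert(r->used <= r->cap);   /* can never hold more than cap   */
    }

    size_t ring_write(struct ring *r, const unsigned char *src, size_t len)
    {
        ring_check(r);               /* invariant holds on entry */
        (void)src;
        /* ... copy at most r->cap - r->used bytes ... */
        ring_check(r);               /* and again on exit        */
        return len;
    }

Compiling with -DNDEBUG turns the asserts into no-ops if you decide to ship without them.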
Test your tests by intentionally breaking the code, both manually and by incrementally changing + to - or swapping 0 and 1 or > and <, etc. You can't use this kind of mutation testing effectively, however, until you have 100% test coverage, since code the tests never execute can obviously be mutated without anything noticing.
I have found many bugs on the scale of one-in-a-billion inputs by mutating the code and improving the tests until they catch all the mutations.
There are tools for 'testcase reduction' built for finding compiler bugs, like:
http://embed.cs.utah.edu/creduce/ You can run one on your normal code and add tests until it's unable to remove anything but formatting.
Coverity, the clang static analyzer, cppcheck, and PC-lint are useful informal static analysis tools. You should be using one or all of them at least sometimes.
If the codebase is small, consider using sound/formal static analysis tools like frama-c, at least on parts of it.
Valgrind is a life-saver. Learn it. Love it. (Likewise for its ASan cousin in GCC and Clang.)
Don't leave things constantly warning; figure out ways to fix or silence the harmless warnings or you'll miss serious ones.
Test on many platforms, even ones you don't intend to target... differences in execution environment can reveal bugs in code that would otherwise go undetected but was still wrong. Plus, portability is a good hedge against the uncertainty of the future. I've found real bugs by testing on ARM, PA-RISC, and Itanium which were latent on x86 but immediately detected on the other platforms because of small differences.
Unit tests are nice and important, but don't skimp on the system tests. A lot of bugs arise in the interactions of otherwise correct parts.
Don't shy away from exhaustive testing. Have an important function with <=32 bits of input state space? You can test every value.
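A minimal sketch of what that can look like; popcount_fast here is just a stand-in (a GCC/Clang builtin so the sketch links), pretend it's your clever bit-trick code, compared against an obviously-correct reference:

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical optimized routine under test. */
    static uint32_t popcount_fast(uint32_t x)
    {
        return (uint32_t)__builtin_popcount(x);
    }

    /* Obviously-correct reference to compare against. */
    static uint32_t popcount_ref(uint32_t x)
    {
        uint32_t n = 0;
        while (x) { n += x & 1; x >>= 1; }
        return n;
    }

    int main(void)
    {
        uint32_t x = 0;
        do {
            assert(popcount_fast(x) == popcount_ref(x));
        } while (x++ != UINT32_MAX);   /* covers every 32-bit input exactly once */
        puts("all 2^32 inputs checked");
        return 0;
    }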
Testing once isn't enough: all these things can be done from continuous integration tools. Every commit can be tested, and the random tests keep adding coverage over time... not just wasted CPU cycles. I've spent time proving code correct or extensively fuzzing it, only to later accept a patch that added an easily detected misbehaviour, which I'd have caught if only I'd redone the testing. Automating it makes it harder to forget or to lazy out of it when the change was "obviously safe". Obviously safe code seldom is.
but there is no way to get around it because AFAIK no other language allows me to absolutely control when and whether copies are made,
Yes, it's the norm everywhere other than C for any code you've not written to be effectively a non-deterministic black box. C++ can be made less than awful if you subset it enough... though by then you're almost no better off than with C.
A lot of crypto software makes extensive use of global variables for sensitive values. They are fast, never get deallocated or (accidentally) copied during runtime
They can be copied at runtime.
I always use unsigned integers.
Do take care: a lot of C coders cut themselves on the promotion rules around unsigned. Use of unsigned loop counters results in lots of bugs in my experience, especially with less experienced developers. Take the time to really learn the promotion rules well.
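The classic trap looks something like this (buf and n are just illustrative names):

    #include <stddef.h>

    void zero_backwards_bad(unsigned char *buf, size_t n)
    {
        /* BUG: i is unsigned, so i >= 0 is always true; when i reaches 0 the
           i-- wraps to SIZE_MAX and the loop keeps writing forever.
           (And if n == 0, n - 1 already wraps before the loop even starts.) */
        for (size_t i = n - 1; i >= 0; i--)
            buf[i] = 0;
    }

    void zero_backwards_ok(unsigned char *buf, size_t n)
    {
        /* One safe idiom: count down while i is still nonzero, index i - 1. */
        for (size_t i = n; i > 0; i--)
            buf[i - 1] = 0;
    }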
You can't even check to see if undefined behavior has happened, because the compiler will go, "oh, that couldn't happen except for undefined behavior, and I can do whatever I want with undefined behavior. I want to ignore it."
Yes, though you can check in advance whether it would happen, and avoid it without ever causing it. Though experience suggests that these tests are often themselves incorrect. GCC and CLANG now have -fsanitize=undefined, which instruments signed arithmetic and will make the program scream errors at runtime. Not as good as being statically sure the undefined behaviour cannot happen.
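For example, a pre-check for signed addition that never performs the overflowing operation itself; the function name is just illustrative, but the check is the standard idiom:

    #include <limits.h>
    #include <stdbool.h>

    /* Returns true and stores a + b in *out only when the addition cannot
       overflow; the check itself uses no undefined operations. */
    bool add_checked(int a, int b, int *out)
    {
        if ((b > 0 && a > INT_MAX - b) ||
            (b < 0 && a < INT_MIN - b))
            return false;          /* would overflow: refuse    */
        *out = a + b;              /* now known to be in range  */
        return true;
    }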
I tend to use do/while loops.
I have used this same construct. Also with the masked variables.
I've intentionally made objects larger and null-filled so that masked accesses were guaranteed safe, e.g. char foo[27]; foo[i] = 3; becomes char foo[32]; foo[i&31] = 3;.
Some other things:
Avoid recursion, and use none at all unless you can statically prove its depth. Running out of stack is no joke, and it's a source of many embarrassing (and even life-threatening) bugs. It's not worth it.
Avoid function pointers. They give you _data_ that controls the flow of your program... which may somehow be writable by attackers. When a function pointer can't be avoided completely, try to make it a bit-masked index into a power-of-two sized const array which is filled with zeros in the slack space.
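A minimal sketch of that shape (the handler names and table size are made up; the unused slots are left zero/NULL exactly so a stray index can never reach attacker-chosen code):

    #include <stddef.h>

    typedef void (*handler_fn)(void);

    /* Hypothetical handlers. */
    void handle_ping(void);
    void handle_data(void);
    void handle_close(void);

    /* Power-of-two sized, const, with unused slots left as NULL (zero). */
    #define HANDLER_SLOTS 4u
    static handler_fn const handlers[HANDLER_SLOTS] = {
        handle_ping,
        handle_data,
        handle_close,
        /* slot 3 stays NULL */
    };

    void dispatch(unsigned msg_type)
    {
        handler_fn fn = handlers[msg_type & (HANDLER_SLOTS - 1u)]; /* masked index */
        if (fn)           /* NULL slack slot: refuse rather than jump wildly */
            fn();
    }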
Sidechannels are very hard to avoid, and even more likely to be undermined by the compiler into leaks. First assume you can't prevent them, and try to be safe regardless. Bitops can get you constant-time behaviour in practice, including loads: e.g. AND the thing you want to load with ~0 and the things you don't want to load with 0, and OR the results together; it's tedious, and the compiler can still helpfully 'optimize' away your security... but the next option is writing things in assembly, which has its own risks. Valgrind warns on any control flow change (branches!) on 'uninitialized data'; there are macros you can use to mark any bytes you want as uninitialized, so you can mark your secret data uninitialized and have valgrind warn about branches on it (though it's not completely sound: some warnings get suppressed).
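A minimal sketch of the bitop-select idea, scanning a whole table and picking one entry without a secret-dependent index. The names are illustrative, and as said above you still have to read the generated assembly, because the compiler is free to turn this back into a branch:

    #include <stdint.h>
    #include <stddef.h>

    /* Load table[secret_idx] by touching every entry: each entry is ANDed with
       either all-ones (the one we want) or all-zeros, and the results are ORed. */
    uint32_t ct_table_lookup(const uint32_t *table, size_t n, size_t secret_idx)
    {
        uint32_t result = 0;
        for (size_t i = 0; i < n; i++) {
            /* mask is all ones when i == secret_idx, else all zeros */
            uint32_t mask = (uint32_t)0 - (uint32_t)(i == secret_idx);
            result |= table[i] & mask;
        }
        return result;
    }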
Get comfortable with gcc -S and objdump -d ... reading the assembly is the only way to know for sure what you're getting, and the only way that you're going to discover that your presumed branchless code has been filled with jumps by the helpful compiler. Likewise, you can make your secrets have a distinctive pattern and dump core when you think things should be clean, and confirm if they actually are or not.
It's possible to make wrapper functions that call your real function, and then call another dummy function that uses as much stack as your real function and zeros it all. This is one way you can stem the bleeding on stack data leaks.
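A minimal sketch of that wrapper pattern, assuming a hypothetical secret_op() whose worst-case stack depth you've bounded yourself (STACK_SCRUB_BYTES is your estimate, e.g. from -fstack-usage, not something the compiler hands you), and using GCC/Clang-style noinline plus an asm barrier to keep the scrub from being optimized out:

    #include <string.h>

    #define STACK_SCRUB_BYTES 4096  /* assumed upper bound on secret_op()'s stack use */

    int secret_op(const unsigned char *key, size_t keylen);  /* the real worker */

    /* Never inlined, so it gets its own frame overlapping secret_op()'s old one. */
    static void __attribute__((noinline)) scrub_stack(void)
    {
        unsigned char pad[STACK_SCRUB_BYTES];
        memset(pad, 0, sizeof(pad));
        /* keep the memset from being treated as a dead store */
        __asm__ __volatile__("" : : "r"(pad) : "memory");
    }

    int secret_op_wrapped(const unsigned char *key, size_t keylen)
    {
        int ret = secret_op(key, keylen);
        scrub_stack();
        return ret;
    }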
More recently I learned that dynamic array accesses are not constant time on all hardware, even if you're sure to always hit the same cache-line pattern, necessitating masked loads when the indexes are secret data.
In some cases it can be useful to make your memory map sparse and surround your data with a sea of inaccessible pages, letting the hardware MMU do some of the boundary checking for free. This class of approach is used by asm.js and the classic 'electric fence' memory debugging tool.
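A minimal Linux/BSD-style sketch of the guard-page idea: one usable page bracketed by two PROT_NONE pages, so any access that runs off either end faults immediately (the function name is made up, and real code would want matching free/unmap handling):

    #include <sys/mman.h>
    #include <unistd.h>
    #include <stdlib.h>

    /* Allocate one page of usable memory with an inaccessible page on each side. */
    unsigned char *alloc_guarded_page(void)
    {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        unsigned char *base = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
            return NULL;
        /* Make the first and last page inaccessible: any overrun traps. */
        if (mprotect(base, page, PROT_NONE) != 0 ||
            mprotect(base + 2 * page, page, PROT_NONE) != 0) {
            munmap(base, 3 * page);
            return NULL;
        }
        return base + page;   /* the usable page in the middle */
    }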
Data structures can have beginning and ending canary values. Set them after initialization, check them whenever you use the data structure, and zeroize them when the data structure is freed... you'll catch cases where code accidentally uses a freed or never-allocated data structure much more often. Especially when crossing a boundary of who-wrote-this-code.
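A minimal sketch of the pattern (the canary constants and struct are made up for illustration):

    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    #define CTX_CANARY_HEAD 0x5afec0deUL   /* arbitrary illustrative values */
    #define CTX_CANARY_TAIL 0xdeadf00dUL

    struct ctx {
        unsigned long head_canary;
        unsigned char key[32];
        size_t used;
        unsigned long tail_canary;
    };

    struct ctx *ctx_new(void)
    {
        struct ctx *c = calloc(1, sizeof(*c));
        if (!c) return NULL;
        c->head_canary = CTX_CANARY_HEAD;  /* set after initialization */
        c->tail_canary = CTX_CANARY_TAIL;
        return c;
    }

    void ctx_check(const struct ctx *c)
    {
        /* Check on every use: catches freed, uninitialized, or overrun ctx. */
        assert(c && c->head_canary == CTX_CANARY_HEAD
                 && c->tail_canary == CTX_CANARY_TAIL);
    }

    void ctx_free(struct ctx *c)
    {
        if (!c) return;
        ctx_check(c);
        /* zeroize, canaries included, before free; note the earlier caveats
           about the compiler optimizing plain memset away apply here too */
        memset(c, 0, sizeof(*c));
        free(c);
    }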
GCC has annotations for function arguments which must not be null and for functions whose results must not be ignored. These can be used to turn silent mistakes into warnings. But take care: nonnull teaches the optimizer that the argument cannot be null and _will_ result in it optimizing out your own null checks, so use it in your headers but don't compile your own code with it. (See libopus or libsecp256k1 for examples.)
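The shape of those annotations, roughly as a public header might carry them. The function and macro names are made up; __attribute__((nonnull)) and __attribute__((warn_unused_result)) are the GCC/Clang spellings:

    /* Hypothetical public header. Callers compiled against this get warnings
       for passing NULL or ignoring the return value; per the caveat above,
       the library's own internal build should see a version of this header
       without the nonnull attribute so its runtime NULL checks survive. */
    #if defined(__GNUC__)
    # define MYLIB_NONNULL(args)   __attribute__((nonnull args))
    # define MYLIB_MUST_USE        __attribute__((warn_unused_result))
    #else
    # define MYLIB_NONNULL(args)
    # define MYLIB_MUST_USE
    #endif

    #include <stddef.h>

    MYLIB_MUST_USE MYLIB_NONNULL((1, 2))
    int mylib_sign(unsigned char *sig_out, const unsigned char *key, size_t keylen);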
Complexity is the enemy of security. Seek simplicity. They say that code is Nx harder to debug than it is to write, so if you're coding at your limit you can't debug it. When debugging, you at least have some positive evidence of the bad thing that happened... you know badness of some form was possible, at the very minimum. Making secure software is even harder, because nothing tells you there was an issue... until it's too late.
At times I suspect that it's (nearly) impossible to write secure code alone. Another set of eyes, which understands the problem space but doesn't share all your misunderstandings and preconceptions, can be incredibly powerful, if you're fortunate enough to find one. They can also help keep you honest about the shortcuts you take but shouldn't, and the test cases you skip writing. Embrace the nitpicking and be proud of what you're creating. It deserves the extra work to make it completely right, and the users who will depend on it deserve it too.