Sir, you are completely wrong. Don't get me wrong, I love bashing the XMR codebase, the state of which is pretty much the fault of the BCN devs - it's like a hobby. But CryptoNight needing a 64-bit system is NOT an example of the code's inefficiency, it's an example of the algorithm being designed to run quickly on systems that aren't ancient. Optimizing it for 32-bit arithmetic would slow it down by a lot, because each operation would only be able to process around half the data that it could otherwise (in general). So, the algorithm is made for 64-bit. That's not to say that CryptoNight can't be implemented with 32-bit arithmetic - it's just an ugly hack, and it's slow, and nobody wants to do the work to support a steadily decreasing number of users - those being the ones on 32-bit.
Here's an example: A very large part of CryptoNight is a loop that is executed 262,144 times. Inside that loop, two 64-bit multiplies are done. Now, on computers, multiplies are fucking slow. The only really common instruction that I know of that is consistently slower (by a lot) is the divide, which is ouch slow. Anyway, two 64-bit multiplies. And you can't shortcut it - both the high and low 64 bits of the 128-bit result are used. Now, on a 64-bit machine, I need to do one or two register loads and a multiply. Done. On 32-bit, not even counting the register loading (and the possible memory accesses you'd need due to register pressure, depending on what else happened to be stored in the registers at the time), you need somewhere around this much shit: 5 bit shift ops, 5 adds, two AND operations, two other logic operations, and the killer, four, yes - FOUR multiplies. Now, this isn't the absolute best implementation, it's what I have in front of me, but it's not absolute shit, either. You can knock off one multiply, I think, maybe a bit shift or two, but you're gonna have to do most of those ops, including the three slow-ass multiplies. This is an excellent example of how the limitations of the 32-bit platform cause issues.
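For anyone who wants to see roughly where those ops come from, here's a minimal sketch in C of the textbook way to build a 64x64 -> 128-bit multiply out of 32-bit halves. This is an assumption-laden illustration, not the code referred to above and not taken from the CryptoNight source: the name mul64x64_128 is made up, and the exact op count differs a little from the numbers quoted, though it lands in the same ballpark (four multiplies plus a pile of shifts, masks and adds).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from any real codebase): computes the full
 * 128-bit product of a and b using only 32x32 -> 64-bit partial products.
 * The high 64 bits go to *hi, the low 64 bits to *lo.
 */
static void mul64x64_128(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    /* Split each operand into 32-bit halves. */
    uint64_t a_lo = a & 0xFFFFFFFFu;
    uint64_t a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu;
    uint64_t b_hi = b >> 32;

    /* The four multiplies: each is 32x32 -> 64, so none can overflow. */
    uint64_t p0 = a_lo * b_lo;   /* contributes at bit 0  */
    uint64_t p1 = a_lo * b_hi;   /* contributes at bit 32 */
    uint64_t p2 = a_hi * b_lo;   /* contributes at bit 32 */
    uint64_t p3 = a_hi * b_hi;   /* contributes at bit 64 */

    /* Sum the pieces that land in bits 32..63; the overflow of this sum
       is the carry into the high word. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    *lo = (mid << 32) | (p0 & 0xFFFFFFFFu);
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}
```

On a 64-bit x86 machine the whole function collapses into a single widening multiply instruction, which is the point being made above. Note also that on a real 32-bit target the uint64_t adds and shifts in this sketch are themselves split into multiple instructions by the compiler, so the true cost is higher than the source suggests.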
Thanks for the reply. And I accept that 64-bit is obviously superior and even arguably needed for the reasons you state above. That said, what the 32-bit machines are currently crashing into is limitations caused by a database that is larger than necessary and fully loaded into RAM, right? So I think calling me "completely wrong" is a bit over the top.
I am not making an argument for 32-bit.
I am making an argument against kludgy databases and insulting the user.
But I am super excited to see that development is charging forward!