Do remember that VRMs have a hardwired range of voltages they can provide.
The optimal memory clocks depend chiefly on core clock and mining kernel options such as work size and vector width.
On a few of my cards, some particularly bad core/memory clock combinations result in hardware errors in addition to low hash rate.
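For what it's worth, here is a rough sketch of how one might pick a core/memory combination from a set of test runs while skipping the ones that throw hardware errors. Everything in it is an assumption for illustration: the `Sample` record, the `best_clean_combo` helper, and all the numbers are made-up placeholders, not measurements from any real card.

```python
from typing import NamedTuple, Optional, Sequence


class Sample(NamedTuple):
    """One test run at a fixed core/memory clock pair (hypothetical record)."""
    core_mhz: int
    mem_mhz: int
    hashrate: float   # whatever unit the miner reports
    hw_errors: int    # hardware errors counted during the run


def best_clean_combo(samples: Sequence[Sample]) -> Optional[Sample]:
    """Return the highest-hashrate sample that produced no hardware errors."""
    clean = [s for s in samples if s.hw_errors == 0]
    if not clean:
        return None
    return max(clean, key=lambda s: s.hashrate)


# Placeholder values only, chosen to show a "bad" combination being skipped.
samples = [
    Sample(900, 300, 410.0, 0),
    Sample(900, 1000, 395.0, 0),
    Sample(950, 300, 428.0, 0),
    Sample(950, 340, 402.0, 3),   # low hash rate and hardware errors
    Sample(975, 300, 433.0, 5),   # fastest, but not error-free
]

print(best_clean_combo(samples))
```

The point of filtering first is that the fastest run is not necessarily usable: a combination that reports hardware errors gets dropped even if its raw hash rate looks better.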
That might be the reason, I don't know. It's an interesting fact though, methinks. Not in a million years would I have imagined that the difference between 0.95 V and 0.951 V could be so important. I had heard something about power usage increasing quadratically with voltage (which is sort of true too), but nothing about this.
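If the VRM explanation is right, one way it could play out is that the regulator only offers discrete voltage steps, so a request just above a step gets bumped to the next one. The sketch below is pure guesswork on my part: the 6.25 mV step size, the round-up behaviour, and both function names are assumptions, and real controllers differ. The second function is the usual rough rule that dynamic power at a fixed clock scales with the square of the voltage, ignoring leakage.

```python
def snap_to_step_uv(requested_uv: int, step_uv: int = 6250,
                    round_up: bool = True) -> int:
    """Snap a requested voltage (in microvolts) to a multiple of step_uv.

    step_uv and round_up are ASSUMED values, not taken from any datasheet.
    Integers are used to avoid floating-point rounding surprises.
    """
    if round_up:
        steps = -(-requested_uv // step_uv)   # ceiling division
    else:
        steps = requested_uv // step_uv       # floor division
    return steps * step_uv


def relative_dynamic_power(v_new: float, v_old: float) -> float:
    """Approximate power ratio at a fixed clock, using P proportional to V^2."""
    return (v_new / v_old) ** 2


low = snap_to_step_uv(950_000)    # 0.950 V sits exactly on an assumed step
high = snap_to_step_uv(951_000)   # 0.951 V gets pushed to the next step
print(low, high)                  # 950000 956250
print(relative_dynamic_power(high / 1e6, low / 1e6))
```

Under those assumptions a 1 mV change in the setting turns into a 6.25 mV change at the chip, which is at least one way such a tiny difference could matter.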
Thanks for pointing out the relevance of kernel settings. I haven't tested that as thoroughly as the rest and totally forgot about it, but I believe vectors=2 and worksize=128 work best for me.