Hey man, as always, thanks for your testing and for sharing the results!
My two cents on your findings: for the 16+14 vs 15+15 on the Vega 64, it should still be the case that 15+15 is the better choice _unless_ you go for a 1500+ cclk. With CN/r and the pretty expensive random ops they added, the higher cclk (probably) kills efficiency in a different way than it does for CNv8. I haven't verified this in practice, but anything else would be most unexpected. For the V56, 14+14 is the best overall choice for me as well. Most of my V56s are reference cards flashed with V64 BIOSes, and 14+14 is still my optimal choice there.
The 14+16 vs 16+14 issue is very surprising, and a driver issue/feature for sure. The miner allocates exactly the same amount of memory in both cases; only the order in which certain larger blocks of memory are allocated differs. Somehow this triggers a different allocation path in this driver/config/setup. The numbers are typical of what you see under Windows when the driver allows an allocation but actually considers the GPU to be out of memory and instead maps part(s) of the allocated buffer to host-side memory, so every memory op on the GPU that misses the cache becomes a PCIe operation. That is at least my own working theory for what happens under Windows. I can't say if the same thing is happening here; I'm only guessing at this point.
May I ask about the efficiency compared to CNv8, assuming that's what you have tuned for?
No problem - happy to share...
Re 56 settings - mine is also a flashed ref, but i'm targeting 1400 cclock (tho only getting to ~1365), so maybe that's the difference. 14+16 was the clear winner for me (though I just realized I haven't tried 14+15.) As for the 14+16 vs 16+14, if the intention is for both to perform the same, I suppose it doesn't matter as long as one works - more of an oddity then.
Re efficiency - I would say it's roughly the same, but the comparison isn't clear-cut. My notes on cnv2 are only for a single nitro 64 on Windows - showing 2120 h/s @ 825mv for 163w at the wall (~13 h/s per watt). This 64 is a ref on linux, running 2140 h/s @ 831mv - i don't have a great wall reading yet. Both were targeting a 1400 effective cclock (w/ 1107 mclock), so for the most part i would only expect a 1-2% increase in power based on the mv bump.
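For anyone who wants to redo the arithmetic, here is a rough Python sketch. The first function just divides hashrate by wall power. The second assumes dynamic power scales with the square of core voltage at a fixed clock, which is a common rule of thumb and not something measured on these cards; the function names are made up for illustration.

```python
def hashes_per_watt(hashrate_hs, wall_watts):
    """Efficiency as hashes per second per watt at the wall."""
    return hashrate_hs / wall_watts


def relative_power_increase(v_old_mv, v_new_mv):
    """Estimated fractional power increase from a core voltage change.

    Assumption: power ~ V^2 at a fixed clock. This ignores static
    leakage, memory power, and PSU losses, so treat it as a rough guide.
    """
    return (v_new_mv / v_old_mv) ** 2 - 1.0


# Nitro 64 on Windows, figures from the post: 2120 h/s at 163 W
eff = hashes_per_watt(2120, 163)

# Voltage bump from 825 mV to 831 mV at the same target clock
bump = relative_power_increase(825, 831)

print(f"efficiency: {eff:.2f} h/s per watt")
print(f"estimated power increase: {bump * 100:.2f} %")
```

Under that assumption the voltage bump lands inside the 1-2% range mentioned above, but only a wall reading on the Linux card will settle it.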
I have a rig w/ 6 nitros and a couple other polaris which I'm about to swap out to make a full 8 x nitro 64 rig - I'll get a good avg once that's done and post back.