From my limited experience I think that both points apply to everyone. For what it's worth, I've been able to verify only the second one: every time I overclock the miner, a good number of cores are disabled during the first two minutes of the new cgminer session. Usually they're concentrated in one or two dies.
In my case dies are being disabled (1 die = 48 cores). If just cores are disabled then that is easy to fix - just apply more voltage to stabilize them.
On the other hand, I've never tried to verify the first point, mainly because I don't know how to do it. How do you know for sure that the clock has been reset? Do you look at the Amps, or is there any way to read the value of a PLL register?
Yes, voltage drops. For example you apply the overclock and you see a VRM working at 54A. You change the voltage setting just 1-2 values lower and after you hit apply the Amps drop from 54 to 44, which means the higher overclock frequency is no longer applied.
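The check described above (watch the VRM current before and after applying a setting) can be written down as a small rule of thumb. This is only a sketch: the function name and the 15% threshold are my own assumptions, not something the firmware provides, and the readings still have to be taken by hand from the VRM display.

```python
def clock_probably_reset(amps_before: float, amps_after: float,
                         drop_threshold: float = 0.15) -> bool:
    """Guess whether the overclock was lost, based on VRM current.

    drop_threshold is an assumed fraction (15% here); a small voltage
    change alone should not move the current this much.
    """
    if amps_before <= 0:
        raise ValueError("amps_before must be positive")
    relative_drop = (amps_before - amps_after) / amps_before
    return relative_drop > drop_threshold


# The example from the post: 54A before, 44A after lowering the voltage.
print(clock_probably_reset(54, 44))  # True (about an 18.5% drop)
```

The threshold would need tuning per board; it is only there to separate a large drop like 54A to 44A from the small change expected from a 1-2 step voltage adjustment.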
Instead of increasing the voltage to the maximum value, I just set it a little bit higher, taking into account how many cores are disabled in a particular die. Then I restart cgminer and wait 1 minute to see if the changes make any difference.
I use this approach because I don't want to cook my ASICs/VRMs.
I've tried that, but it doesn't work as effectively as applying max voltage. With max voltage the sleepy dies usually kick in immediately.
Also I have a 2nd theory, which I haven't tested much: if you supply sufficient voltage to a sleepy die, it might wake up after 2-6 hours. But I don't think there is any consistency in the results with this method, and I prefer to wake them up immediately with stress/shock voltage rather than waiting hours for them to wake up naturally, which is not guaranteed to happen.
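For comparison, the gentler approach from the quoted post (raise the voltage a little according to how many cores are disabled in a die, restart cgminer, wait a minute, check again) might be sketched like this. Everything here is hypothetical: the function name, the assumption that a higher setting value means a higher voltage, and the "one step per 4 disabled cores" rule are illustrative only and are not taken from any firmware.

```python
def next_voltage_setting(current: int, disabled_cores: int,
                         max_setting: int, cores_per_step: int = 4) -> int:
    """Pick the next voltage setting for one die.

    Assumptions: settings are integers, higher means more voltage,
    and one extra step is added per `cores_per_step` disabled cores.
    """
    if disabled_cores <= 0:
        return current  # nothing disabled, leave the die alone
    steps = -(-disabled_cores // cores_per_step)  # ceiling division
    return min(current + steps, max_setting)  # never exceed the cap


# Die with 2 of 48 cores off: nudge the setting up by one step.
print(next_voltage_setting(current=10, disabled_cores=2, max_setting=20))  # 11
```

The cap is the point of the exercise: a fully sleepy die (48 cores off) simply ends up at `max_setting`, which is the same place the max-voltage method starts from.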
I've also increased the SPI frequency because of what 'orama said in one of his last posts:
How much? I played with this and I usually stick to 256000Mhz. I tried even more, but I can't see any correlation between this and any results.
Another thing I do is take note of all the changes I apply along the way (a good habit is copying /config/adavanced.conf at different moments in time).
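Keeping those copies can be automated with a few lines. This is a minimal sketch, assuming Python is available wherever the config file can be read; the function name, the backup directory and the `changes.log` file are my own inventions, and the config path is passed in rather than hard-coded.

```python
import shutil
import time
from pathlib import Path


def snapshot_config(config_path, backup_dir, note=""):
    """Copy the config file to backup_dir with a timestamp suffix."""
    src = Path(config_path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.name}.{stamp}"
    shutil.copy2(src, dest)  # copy2 also preserves the modification time
    if note:
        # Append a one-line description of what was changed and why.
        with open(dest_dir / "changes.log", "a") as log:
            log.write(f"{stamp} {dest.name} {note}\n")
    return dest
```

Calling it before each change, with a note such as "raised voltage on board 2", gives a history that can be diffed later. Two snapshots taken within the same second would overwrite each other, which should not matter for manual tuning.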
To check the distribution of disabled cores I use a modified version of a Perl script included in bertmod. It is an ASCII version that only outputs temps and disabled cores per die, e.g.
Board 0: Temperature sensor: 47.5C
DIE 0 ON: 46 OFF: 2 95.8% OK
DIE 1 ON: 48 OFF: 0 100% OK
DIE 2 ON: 48 OFF: 0 100% OK
DIE 3 ON: 48 OFF: 0 100% OK
Board 2: Temperature sensor: 64.0C
DIE 0 ON: 47 OFF: 1 97.9% OK
DIE 1 ON: 48 OFF: 0 100% OK
DIE 2 ON: 48 OFF: 0 100% OK
DIE 3 ON: 48 OFF: 0 100% OK
Board 3: Temperature sensor: 55.0C
DIE 0 ON: 48 OFF: 0 100% OK
DIE 1 ON: 48 OFF: 0 100% OK
DIE 2 ON: 48 OFF: 0 100% OK
DIE 3 ON: 48 OFF: 0 100% OK
Board 4: Temperature sensor: 49.0C
DIE 0 ON: 48 OFF: 0 100% OK
DIE 1 ON: 48 OFF: 0 100% OK
DIE 2 ON: 48 OFF: 0 100% OK
DIE 3 ON: 48 OFF: 0 100% OK
How exactly did you do that?
You created an additional page within the lighttpd server?
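Whatever produces it, the plain-text report quoted above is easy to post-process. The following is a rough Python sketch (not the Perl script from bertmod, which is not shown in this thread) that parses lines in exactly the format pasted above and lists the dies with disabled cores; the function names are made up for illustration.

```python
import re

BOARD_RE = re.compile(r"^Board (\d+): Temperature sensor: ([\d.]+)C")
DIE_RE = re.compile(r"^DIE (\d+) ON: (\d+) OFF: (\d+)")


def parse_report(text):
    """Return {board: {"temp_c": float, "dies": {die: {"on", "off"}}}}."""
    boards = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        m = BOARD_RE.match(line)
        if m:
            current = int(m.group(1))
            boards[current] = {"temp_c": float(m.group(2)), "dies": {}}
            continue
        m = DIE_RE.match(line)
        if m and current is not None:
            die, on, off = (int(g) for g in m.groups())
            boards[current]["dies"][die] = {"on": on, "off": off}
    return boards


def dies_with_disabled_cores(boards):
    """List (board, die, disabled_cores) for every die with cores off."""
    return [(b, d, info["off"])
            for b, data in sorted(boards.items())
            for d, info in sorted(data["dies"].items())
            if info["off"] > 0]


SAMPLE = """Board 0: Temperature sensor: 47.5C
DIE 0 ON: 46 OFF: 2 95.8% OK
DIE 1 ON: 48 OFF: 0 100% OK
Board 2: Temperature sensor: 64.0C
DIE 0 ON: 47 OFF: 1 97.9% OK
"""
print(dies_with_disabled_cores(parse_report(SAMPLE)))  # [(0, 0, 2), (2, 0, 1)]
```

Saving the report after each cgminer restart and running it through something like this makes it easier to see whether the same dies keep losing cores.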