So I've just flashed my Vega 56s to the Vega 64 vBIOS and applied Hellae's "safe" reg file.
I've set some new frequencies in OverdriveNtool but CastXMR just keeps stopping on me after a few minutes. Does that mean I'm not giving the memory enough juice? Too much clock? Is it core related? Can anyone who's been through this help me troubleshoot the behaviour of the miner please?
...
Getting approx. 1900 H/s per card, 1050 W total at the wall (6 GPUs).
Any suggestions welcome!
Can you elaborate on "stops" a little? Does the app crash or close, or does it just stop printing output in the cmd shell? That kind of thing.
The reason I ask is that I see similar behavior, where it just stops sending output to the window (and no, it is not the Quick Edit setting). When my monitoring script tries to connect to port 7777, it picks up that the miner is hung and restarts the process.
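For anyone who wants to try the same kind of watchdog, here is a minimal sketch of the idea in Python. It is not the script mentioned above. It assumes the miner was started with remote access enabled so that something answers an HTTP GET on port 7777, and the names `miner_responds` and `watchdog`, the timeouts, and the launch command are all placeholders of mine.

```python
import socket
import subprocess
import time


def miner_responds(host="127.0.0.1", port=7777, timeout=5.0):
    """Return True if host:port sends back at least one byte after an HTTP GET.

    A hung process can still have its connection accepted by the OS,
    so a successful connect alone is not treated as "alive".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            conn.sendall(b"GET / HTTP/1.0\r\n\r\n")
            return len(conn.recv(1)) > 0
    except OSError:
        # Covers refused connections, timeouts and resets.
        return False


def watchdog(command, interval=60.0, grace=120.0):
    """Start `command` and restart it whenever it exits or stops answering.

    `command` is your own miner launch line as a list of arguments
    (hypothetical here). `grace` gives the miner time to load the cards
    before the first probe. Runs until interrupted.
    """
    proc = subprocess.Popen(command)
    time.sleep(grace)
    while True:
        if proc.poll() is None and miner_responds():
            time.sleep(interval)
            continue
        if proc.poll() is None:
            proc.kill()  # still running but not answering: treat as hung
        proc.wait()
        proc = subprocess.Popen(command)
        time.sleep(grace)


# Example call (placeholder command, adjust to your own launch line):
# watchdog(["path/to/your_miner.exe", "your", "usual", "arguments"])
```

Note that this only restarts the miner process. If a card has locked up at the driver level, a restart of the process may not be enough and the rig may need a reboot.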
As long as your Vegas are stable, Cast XMR almost never crashes or closes (save for the Quick Edit thing, which we mentioned before). From personal experience, my issues with the program crashing were totally gone once I found out how to stabilise the cards: I set the unmodded Vega 56 cards to 935 MHz instead of 950 MHz using OverdriveNTool.
Ok, so the culprit here is probably memory clock, since mem voltage seems fixed anyway. So far it's been hashing all night at 1050 MHz nicely. I'll try tweaking clocks some more today.
Well it did stop again after about 6 hours.
Restarted, leaving everything the same apart from lower core clocks. Let's see what happens. Worst case I'd have similar hashrates to what I had before the flash, but at 350W less at the wall. Which is positive I suppose.
EDIT: how can you figure out which GPU is causing trouble? Using Cast I have no visibility at all... No logs and all...
If the clocks are somewhat overkill for the cards, it's quite hard to say. Again, that's from personal experience. At higher than normal clocks they can work for relatively long periods of time, or stop right when the program is loading the cards (when the text appears in blue), meaning in less than 2 seconds.
Some of them lose track of their own settings: the fans can stop spinning at 3000 rpm and drop to 1000 rpm, and temps soar to 83 °C...
A pattern I always noticed when I was having random, unexplainable hangs is that GPU5 always had all its LEDs lit, as if it were working hard and normally, while the other 5 cards only had a single LED lit, as when they are idle.
I learnt that by going to where the rig is physically placed. The OS continued to work normally, so I still had an internet connection to the rig, but being physically there also gave clues.
Was GPU5 the culprit? Or was it one of the others? That's the question.
Right now I am getting the following using the stock Vega 56 BIOS on AMD's "Blockchain Drivers", and it's been stable for more than 24 hours:
Average Hashrate = 1923 H/s
Fan set to = 2000 RPM
Temp = 60 °C (Stable)
GPU Freq = 0%
Memory Freq = 935 MHz
Power Limit % = -10
Power draw at the wall (Measured) = 150W to 165W
That's great stuff!
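To put the numbers in this thread side by side, hashes per watt at the wall is a handy figure. The quick calculation below is only a rough sketch: it assumes the 150 W to 165 W reading is for a system running a single card (the post does not say), it uses the midpoint of that range, and the function name `hashes_per_watt` is just mine. Wall readings include the rest of the system, so treat the result as a ballpark.

```python
def hashes_per_watt(hashrate_hs, wall_watts):
    """Efficiency in H/s per watt measured at the wall."""
    return hashrate_hs / wall_watts


# Flashed rig from the first post: 6 cards at ~1900 H/s each, 1050 W total.
flashed_rig = hashes_per_watt(6 * 1900, 1050)

# Stock-BIOS setup: 1923 H/s, midpoint of 150 W to 165 W.
# Assumption: that wall reading covers one card's system.
stock_setup = hashes_per_watt(1923, (150 + 165) / 2)

print(f"flashed rig: {flashed_rig:.1f} H/s per W")   # 10.9
print(f"stock setup: {stock_setup:.1f} H/s per W")   # 12.2
```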