I've noticed that when Windows blocks CastXMR from accessing a GPU because of aggressive overclocking, the software doesn't handle it very gracefully. The affected GPU stops reporting its hashrate every five seconds, and after a few minutes the software hangs entirely and multiple ^C are needed to terminate the run, assuming that the system doesn't freeze entirely and need a warm reset.
A useful feature would be for CastXMR to recognize that if it is accessing a list of GPUs, each of them should report results periodically, and if one of them goes missing for, say, twice the running average time between results, then it should be considered dead, and a clean restart of CastXMR should be initiated.
Or better yet, CastXMR could perform a disable/enable of the problem card and then reinitialize it individually, so that none of the other cards have to be bothered.
With a Mining Expert 19-slot motherboard running a pile of aggressively overclocked Vegas, you're more likely to wedge one of the cards than usual, and being able to recover gracefully from that situation would substantially improve the average hashrate.
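For what it's worth, the "twice the running average time between results" rule suggested above could be sketched roughly like this in Python. This is only an illustration, not anything CastXMR does today: the names `HashrateWatchdog`, `report` and `stalled` are made up, and how the timestamps would be obtained from the miner (parsing its console output, for instance) is left out because that depends on details not given here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GpuStats:
    last_seen: float                        # timestamp of the most recent report
    mean_interval: Optional[float] = None   # None until two reports have arrived
    samples: int = 0                        # number of intervals averaged so far


@dataclass
class HashrateWatchdog:
    """Hypothetical helper: flags GPUs that have stopped reporting."""
    factor: float = 2.0                     # "twice the running average"
    gpus: Dict[int, GpuStats] = field(default_factory=dict)

    def report(self, gpu_id: int, now: float) -> None:
        """Record that gpu_id produced a result at time `now` (seconds)."""
        stats = self.gpus.get(gpu_id)
        if stats is None:
            self.gpus[gpu_id] = GpuStats(last_seen=now)
            return
        interval = now - stats.last_seen
        stats.samples += 1
        if stats.mean_interval is None:
            stats.mean_interval = interval
        else:
            # Incremental update of the running mean interval.
            stats.mean_interval += (interval - stats.mean_interval) / stats.samples
        stats.last_seen = now

    def stalled(self, now: float) -> List[int]:
        """Return GPUs silent for longer than factor * their mean interval."""
        return sorted(
            gpu_id
            for gpu_id, s in self.gpus.items()
            if s.mean_interval is not None
            and now - s.last_seen > self.factor * s.mean_interval
        )


if __name__ == "__main__":
    dog = HashrateWatchdog()
    for t in (0.0, 5.0, 10.0, 15.0):        # both cards report every 5 s
        dog.report(0, t)
        dog.report(1, t)
    dog.report(0, 20.0)                     # card 1 goes quiet after t = 15
    dog.report(0, 25.0)
    print(dog.stalled(26.0))                # card 1: 11 s of silence > 2 * 5 s
```

Whatever calls `stalled()` would then decide what to do with the list: restart the miner, or try to recover only the listed cards.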
It's not really a problem with CAST-XMR now, is it? That is you choosing to overdrive the card and then demanding that the tool fix it for you. Maybe you should spend more time accepting a stable hash rate per card and live with it. It's really a driver issue anyway, and when you overclock you are taking the risk of putting the system into an unrecoverable state.
I personally don't need a feature in CAST that disables the cards, and I don't think it would be helpful for people who have configured their systems correctly. I reboot once a week for Windows updates; other than that I don't really look at the rigs.
If you need these band-aid scripts to monitor your hash rate because it drops or your cards hang, you have other problems that are not associated with the miner. I would rather the dev spend time improving the miner's efficiency than adding some lame monitoring function that is completely unnecessary.
Oh and by the way I use Wattman. YIKES!
Really? Unbelievable!
I'm using cast_xmr_vega with 6x VEGA56. Every day, and very often, I need to restart the miner or the PC. The Vegas are not modded.
All I do before running the miner is run devcon.exe from the command line to disable/enable the GPUs, and then OverdriveNTool to apply the overclock settings.
Very often the output pauses until S is pressed, and after that all the messages are displayed at once with an OUTDATED result.
I had never been banned by the pool server before, but in the last 2 days it has happened 3 times! Sometimes the miner freezes after the "Difficulty changed" message.
https://ibb.co/n9Ra4w
When I press Q, 'quitting..' is displayed and nothing happens.
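For anyone wanting to script the devcon disable/enable step mentioned above, a minimal Python sketch might look like the following. Assumptions: `devcon.exe` is on the PATH and the script runs from an elevated prompt; the hardware ID `PCI\VEN_1002&DEV_687F` is what Vega 10 cards usually report, but check yours with `devcon hwids =display` first; the function names and the 5-second pause are made up for illustration. Applying the overclock profile afterwards is not shown.

```python
import subprocess
import time
from typing import Callable, List

# Assumption: hardware ID matching Vega 56/64 (AMD vendor 1002, device 687F).
VEGA_HWID = r"PCI\VEN_1002&DEV_687F"


def devcon_cycle_commands(hwid: str = VEGA_HWID,
                          devcon: str = "devcon.exe") -> List[List[str]]:
    """Build the disable and enable command lines, in that order."""
    return [[devcon, "disable", hwid], [devcon, "enable", hwid]]


def cycle_gpus(hwid: str = VEGA_HWID,
               run: Callable = subprocess.run,
               sleep: Callable[[float], None] = time.sleep,
               settle_seconds: float = 5.0) -> None:
    """Disable, wait, then re-enable every device matching hwid.

    `run` and `sleep` are injectable so the sequence can be dry-run
    without touching real hardware.
    """
    for index, command in enumerate(devcon_cycle_commands(hwid)):
        if index > 0:
            sleep(settle_seconds)      # give the driver time to unload
        # Arguments are passed as a list, so the '&' in the ID needs no quoting.
        run(command, check=True)       # a non-zero exit code raises an exception
```

Calling `cycle_gpus()` before launching the miner would reproduce the manual step. Note that this cycles every card matching the ID, not a single wedged one.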