Hi Everyone,
Sorry for the long post - I'm going to try to be thorough. What happened? I set up my first miner with 4x Vega 64s. Then I blew it: I got greedy and tried to add 3 more.
Initially, I was able to get the 4x running fine with both cast-xmr (version 060) and xmr-stak-amd-win64. I had some trouble with hash drops on one card with xmr-stak, so I stuck with cast-xmr and ran for weeks with no problems, using the standard 'hellae' install guidance from here:
https://www.reddit.com/r/MoneroMining/comments/74hjqn/monero_and_vega_the_definitive_guide/.
So when I tried adding the three new cards (with extra PSU and splitter), I decided to use the more recent guidance from here:
http://vega.miningguides.com/.
So far, so good. I set up everything from scratch following that guidance (with the exception of wiping the HD and reinstalling Windows).
I got all 7 Vegas running!! Good, high hash rates (1980-2000 H/s each), until... cast-xmr hung after about four hours. Since then, it only works for about 30 seconds before hanging.
I tried everything I could think of - following all steps in the mining guide again, starting with uninstalling all drivers. I tried the same hellae guidance from scratch (using Wattman, etc.).
I tried the older (e.g. alternating 2016 & 1800 intensity) and newer (1932 [edited] intensity) settings with versions 060 and 081. I tried new and old versions of xmr-stak-amd.
With every combination I try, I get this behavior:
1) Everything runs fine, spins up, fans at ~4000, temps reporting at 35-40C.
2) About 30-60 seconds in, I get a driver reset. When I check Event Viewer, I see this entry, coinciding with the screen flashing and what comes next: "Display driver amdkmdap stopped responding and has successfully recovered."
3) Then one or more cards start showing zero hashes, while the rest keep running fine
4) Then cast-xmr hangs (and xmr-stak hangs at the same point) - no more hashes, no more responses
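In case it helps narrow things down, here is a rough sketch of how I could work out which card drops to zero first in step 3, given hash rates noted down per GPU at regular intervals. The GPU labels, the readings, and the function names (`first_zero_sample`, `failure_order`) are all made up for illustration. None of this comes from cast-xmr or xmr-stak; the numbers would have to be copied from the miner output by hand.

```python
from typing import Dict, List, Optional, Tuple


def first_zero_sample(rates: List[float]) -> Optional[int]:
    """Return the index of the first reading that is zero, or None if the card never dropped."""
    for i, rate in enumerate(rates):
        if rate == 0:
            return i
    return None


def failure_order(samples: Dict[str, List[float]]) -> List[Tuple[str, int]]:
    """List the cards that hit zero, earliest drop first, as (label, sample index) pairs."""
    dropped = []
    for gpu, rates in samples.items():
        idx = first_zero_sample(rates)
        if idx is not None:
            dropped.append((gpu, idx))
    return sorted(dropped, key=lambda pair: pair[1])


# Hypothetical readings in H/s, one list per card, one entry per interval.
samples = {
    "GPU0": [1990, 1985, 1992, 1988],
    "GPU1": [1980, 1975, 0, 0],
    "GPU2": [2000, 1995, 1990, 0],
    "GPU3": [1985, 1990, 1987, 1991],
}

print(failure_order(samples))
```

If the same card always shows up first across several runs, that would point at that card, its riser, or its power cable rather than the driver install as a whole.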
I've used the driver uninstaller called out in the instructions. I've even done the full Windows reinstall and started everything from scratch. In the latest iteration, I never installed Wattman etc. I just followed the MiningGuide instructions: used the .bat/.ps script to reset the GPUs from Device Manager, then used OverDriveNT (and the .bat instructions) to set fan speeds, applied the specified regkey updates, etc.
And I get the same behavior no matter what I try - run for 30-60 seconds, driver failure, then a hang shortly after.
I did recently update the BIOS on my ASRock H110 BTC mobo. I have the latest Windows 10 Pro version (16299.192), I'm running 8 GB of RAM, and I have more than enough power to the mobo and cards (1000W PSU, never drawing more than 800W). It's just this new driver failure that I can't seem to shake.
Since I saw earlier in the thread (before I foolishly decided to try a 7-GPU setup) that more than 4 (or 5 or 6, depending on who's talking) is folly, I tried scaling back to 4. My total reset (with OS wipe) was with only 4 GPUs. The setup that worked before no longer works...
Any suggestions/ideas are welcome. I know that people have seen hangs before, and I saw several answers that suggested tweaking the power settings. The part that's bugging me is that the same cards worked before with the "stock" hellae settings. And now they don't... argh.
I've burned most of my weekend reinstalling everything multiple times... no dice. Any ideas on what else I could try?