Well, after much frustration with my rigs crashing, I thought I had it figured out, but the problem continued.
Originally I had set up SMOS, but I really didn't care for the simplicity of the interface (or should I say lack thereof), and then I found nvOC.
As my rigs are remote in another state, it was even more difficult to diagnose.
For ease, I had my brother-in-law load up SMOS on both rigs, and they have been running for over 4 days with undervolting (100W for the 1070 & 1070 Ti / 175W for the 1080 & 1080 Ti) and overclocking (-50 core on the 1070 & 1070 Ti, 200 core on the 1080 & 1080 Ti, with 1100 mem for the 1070 & 1070 Ti and 1000 for the 1080 & 1080 Ti). Mining ETH.
Whatever my issue was, it was an issue within nvOC. The question is why, and what was causing the crashes.
Later this week, I plan on loading up a fresh copy of nvOC to see if the issue continues. Unfortunately, I cleared the USB sticks we were using that had the nvOC OS on them. Had I thought about it, it would have been good to keep them to see if anyone could identify any issues. The only thing I really did was re-compile the miners.
It is difficult to pinpoint when the issue really showed up, as we were having to reset the miners at least once a week and then eventually they would crash once a day. Both rigs crashed at the same time, even on different algorithms. I tried multiple different mining programs with no overclock, half overclock, and full overclock. Nothing seemed to matter. Within 18-24 hours, a crash was inevitable.
I wish I had more information on this, but I wanted to get it into the forum in case this ever happens to anyone else. I will update once I reload nvOC and see what happens.
OK...
So, update time. It turns out that the issue all along was a faulty graphics card. After moving to SMOS, I had left the mining screen up on my computer and just happened to walk in on an error on GPU 12. The rig crashed, and it took the other rig out as well. We replaced the riser and it crashed again, so we removed the card completely.
It's been over 4 days of constant running for BOTH rigs and not a single crash or issue.
With this, we are moving back to a fresh copy of nvOC tomorrow, as I really do dislike SMOS. But I wanted to update everyone. So, check your GPUs if you have a crashing issue you can't figure out! We are going to try replacing all the other parts and re-inserting the card, and hope it's not an actual issue with the card. But I have a feeling it is, so it may be going back to TRY and get a replacement. I guess we will see.
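For anyone wanting a quicker way to check their GPUs than catching the error on screen by luck, here is a rough sketch that tallies NVIDIA driver "Xid" error lines from kernel log text by PCI address. This is not something I ran on these rigs; the function names are made up for illustration, and I'm assuming the usual `NVRM: Xid (PCI:...): <code>` line format. A bad card may or may not log Xid errors before it takes the rig down, so treat it as one clue, not a diagnosis.

```python
import re
from collections import Counter

# Assumed log format, e.g.:
#   NVRM: Xid (PCI:0000:0c:00): 79, GPU has fallen off the bus.
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9A-Fa-f:.]+)\): (\d+)")


def count_xid_errors(log_lines):
    """Count Xid events, keyed by (pci_address, xid_code)."""
    counts = Counter()
    for line in log_lines:
        match = XID_RE.search(line)
        if match:
            # Normalize the PCI address so 0C and 0c count as the same slot.
            pci = match.group(1).lower()
            code = int(match.group(2))
            counts[(pci, code)] += 1
    return counts


def worst_offender(counts):
    """Return the PCI address with the most Xid events, or None if there are none."""
    per_device = Counter()
    for (pci, _code), n in counts.items():
        per_device[pci] += n
    if not per_device:
        return None
    return per_device.most_common(1)[0][0]
```

You would feed it the lines of a saved kernel log (for example the output of `dmesg` written to a file) and then match the PCI address it reports against the bus IDs that `nvidia-smi` lists to find the physical card.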