Topic: [Mining OS] SimpleMining.net - Manage Your GPU farm the easy way! (30 days free) - page 469. (Read 835786 times)

newbie
Activity: 24
Merit: 0
@tytanick, I have been getting a small number of reboots on a couple of my rigs, maybe 2 or 3 reboots a day. My rigs come back up fine after about 30 secs, but is there a specific log I can look at to determine why this is happening?

I am running into this same issue, and I am actually running smOS on an SSD instead of a USB stick. Is there any way I can turn some logging on so I can look back at why my system reboots 3-6 times a day? Sometimes it's minutes apart, sometimes it's hours apart, and the screen scrolls so fast that there is no point in trying to catch what is happening anyway.
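
For anyone who manages to capture kernel messages somewhere that survives a reboot, here is a minimal Python sketch for finding what was logged right before each restart. It assumes the captured text uses dmesg-style `[seconds.micros]` prefixes, which count up from boot, so a timestamp that drops back toward zero is treated as a new boot. How to capture such a log on smOS is not covered in this thread, and `split_boot_sessions` is a hypothetical helper name, not something smOS ships.

```python
import re

# dmesg-style prefix, e.g. "[ 1084.587016] ..."; the value is seconds since boot.
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")

def split_boot_sessions(lines):
    """Group log lines into boot sessions.

    Assumption: timestamps rise within one boot, so a drop means a reboot.
    Lines without a timestamp are kept with the session in progress.
    """
    sessions = []
    current = []
    last_ts = None
    for line in lines:
        match = TS_RE.match(line)
        if match:
            ts = float(match.group(1))
            if last_ts is not None and ts < last_ts:
                sessions.append(current)
                current = []
            last_ts = ts
        current.append(line)
    if current:
        sessions.append(current)
    return sessions

captured = [
    "[    0.000000] boot one starts",
    "[ 1084.587016] last message before the first reboot",
    "[    0.000000] boot two starts",
    "[   12.500000] still running",
]
for number, session in enumerate(split_boot_sessions(captured), start=1):
    print(number, session[-1])
```

The last line of every session except the final one is the last thing the kernel logged before that reboot, which is usually the place to start looking.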
hero member
Activity: 1151
Merit: 528
Anyone experienced any errors like this?

[ 1084.587016] amdgpu 0000:07:00.0: GPU fault detected: 147 0x05708801
[ 1084.587016] amdgpu 0000:07:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00E0983C
[ 1084.587016] amdgpu 0000:07:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x06088002

I've walked the cards and even went as far as replacing the OS but it keeps coming back. So far, I'm only seeing this on the z170a board with the 6 470s.
Yes, my dmesg is spammed with that. I ignore it.

Yes, I have 7 plugged in with 6 working just fine at the moment, and I believe the lack of the 7th showing up is that I have not shorted out the second x16 slot - I am running old style risers from 2014.

I hope to short the pins during lunch today and have 7 running!

I'll get you the exact specs sometime today, work permitting :)

Please report whether this is a HW or SW problem. I am interested in this as well.
Sorry I haven't had time to resolve this yet. Busy busy always..
newbie
Activity: 26
Merit: 0
Anyone experienced any errors like this?

[ 1084.587016] amdgpu 0000:07:00.0: GPU fault detected: 147 0x05708801
[ 1084.587016] amdgpu 0000:07:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00E0983C
[ 1084.587016] amdgpu 0000:07:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x06088002

I've walked the cards and even went as far as replacing the OS but it keeps coming back. So far, I'm only seeing this on the z170a board with the 6 470s.
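
Since these faults are tagged with a PCI address, one way to check whether they always come from the same slot is to count them per address. A minimal Python sketch, assuming the log text has the same shape as the three lines above (`count_gpu_faults` is a hypothetical helper, and the regex is fitted to these sample lines only, not to every message amdgpu can print):

```python
import re
from collections import Counter

# Fitted to lines such as:
# [ 1084.587016] amdgpu 0000:07:00.0: GPU fault detected: 147 0x05708801
FAULT_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+amdgpu\s+"
    r"(?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]):\s+"
    r"GPU fault detected"
)

def count_gpu_faults(log_text):
    """Return a Counter mapping PCI address to the number of 'GPU fault detected' lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            counts[match.group("pci")] += 1
    return counts

sample = """\
[ 1084.587016] amdgpu 0000:07:00.0: GPU fault detected: 147 0x05708801
[ 1084.587016] amdgpu 0000:07:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00E0983C
[ 1084.587016] amdgpu 0000:07:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x06088002
"""
for pci, hits in count_gpu_faults(sample).items():
    print(pci, hits)
```

If one address dominates, that points at a single card or riser; if the faults are spread evenly across addresses, the board or driver is the more likely suspect.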
newbie
Activity: 26
Merit: 0
Well....

Spoke too soon. That 7th card causes all sorts of stability issues: random reboots and hangs. With the jumpers, I've got 6 cards installed and active, but the 7th just doesn't seem to want to work. Maybe it's still a riser issue, but I'll try again when the USB ones arrive later today.
newbie
Activity: 26
Merit: 0
All right guys, I beat on it till it finally submitted to my will! I tried newer versions of Ubuntu and even went down the Windows path before I came back to the physical layer being the culprit.

I'm using the older style risers like Soothaa but have the newer USB ones on order. This, I believe, was the source of all my ailments. Anyhow, here's what I did to get it working.

MSI Z170a Gaming M5
Bios Revision: 1.D0
Settings:
HD Audio - Disabled
PCI Versions - Auto
Integrated Graphics - IGD
4G Decoding - Enabled

With these settings alone, I'd get 4 or 5 cards active with 6 installed, or 4 cards active with 7 installed.

So, I started creating jumpers for each of my 1x PCIe slots. With 7 cards installed, I have jumpers on the 2nd and 3rd 1x PCIe slots and a 16x riser on my 2nd 16x PCIe slot. Lastly, I've got a newer style USB riser on the last 16x slot.

I was scraping the bottom of the barrel on risers to get this board running and it shows. Anyhow, it's now pumping away with 7 active GPUs and I'll replace all of them once the new risers arrive.
member
Activity: 116
Merit: 10
Yes, I have 7 plugged in with 6 working just fine at the moment, and I believe the lack of the 7th showing up is that I have not shorted out the second x16 slot - I am running old style risers from 2014.

I hope to short the pins during lunch today and have 7 running!

I'll get you the exact specs sometime today, work permitting :)

Please report whether this is a HW or SW problem. I am interested in this as well.

Same here. Have Z170a M5 boards, and I'd love to use smos!
newbie
Activity: 15
Merit: 0
Yes, I have 7 plugged in with 6 working just fine at the moment, and I believe the lack of the 7th showing up is that I have not shorted out the second x16 slot - I am running old style risers from 2014.

I hope to short the pins during lunch today and have 7 running!

I'll get you the exact specs sometime today, work permitting :)

Please report whether this is a HW or SW problem. I am interested in this as well.
newbie
Activity: 11
Merit: 0
I have 2 rigs, one running Claymore 8.1 and the other 9.3, and I am getting a lot of random restarts. Does Claymore generate logs like it does on Windows? If so, where would I find them?
hero member
Activity: 1151
Merit: 528
Yes, I have 7 plugged in with 6 working just fine at the moment, and I believe the lack of the 7th showing up is that I have not shorted out the second x16 slot - I am running old style risers from 2014.

I hope to short the pins during lunch today and have 7 running!

I'll get you the exact specs sometime today, work permitting :)
newbie
Activity: 26
Merit: 0
I was able to get my MSI Z170A GAMING M5 to boot with 6 GPUs plugged in and Above 4G decoding enabled with one of your modified kernels. I am still trying to trace down why only 5 are showing in Claymore, though.

Hi Sootha! Any more luck with this board? I just picked one up and it's been giving me fits.

So far, the best luck I've had is with the D0 bios and these settings:
PCI ver: Auto
4g Decoding: enabled
Integrated Graphics Enabled
HD audio disabled

I've got 6 cards plugged in but only 5 show in Claymore. I suspect that the 6th card isn't enabled due to BIOS issues or something else. Possibly it requires UEFI, since a lot of folks report success under Windows. I'm going to troubleshoot this a bit more this morning.

I'm running the latest build of smOS, v1095, and it appears to have the latest kernel and AMD driver versions already.

Anyone else running one of these z170a boards and been able to get them to see 6 or 8 cards in Smos?
hero member
Activity: 906
Merit: 507
Are we able to set custom clocks (1050, 1070, 1050) under the R OS like we can with the RX one?
legendary
Activity: 2660
Merit: 1096
Simplemining.net Admin
Grouping feature: I will let you know when I get to it.
I have two concepts. I will see which one to implement once I start working on them :P

Right now I am super busy with multiple projects. I want to implement a new dashboard, and there are a few things to fix before that.
member
Activity: 126
Merit: 10
let me guess, you're adding 0x8D 0x00 0xXX 0x00 to the VoltageObjectInfo init regulator section. What you might not know is that that section is missing on MANY GPUs. What you probably didn't know is that even if it is there, that will only work for MOST GPUs - that is, the ones that have IR3567B-compatible VRM controllers.

Would a solution be for him to search for "01 07 0C 00" in the BIOS ROM before applying his "low wattage mode"?

(and is there even a tool to read this in Linux?)

Well, I wrote my own toolset - so I don't bother with the VBIOS for voltage anymore. But no, that's not a solution. First - those bytes might show up randomly, say, in a command table. If you're going to try and parse the VBIOS, do it properly.

A proper AtomBIOS parser is a nasty thing to write.
On the other hand, it is a very nice thing to have.
For example, you can discover a scheme to downvolt cards with the NCP81022 just by parsing a lot of BIOSes and checking whether vendors do something with it. :D
Or you can make a universal BIOS editor which already supports the unreleased Vega.

Second, he should first check VoltageObjectInfo for a voltage object of type 3 (init regulator), and go from there. But there's still a problem - he doesn't know for sure what VRM controller it is.

Other things should be checked too.
And I think in the current generation of cards, IR3567B vs NCP81022 vs none can be distinguished in the BIOS.
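
To illustrate the objection to the "01 07 0C 00" search above: a raw byte search only reports offsets, and nothing in a hit says which table it landed in. A minimal Python sketch of that naive search follows. This is not an AtomBIOS parser, `find_all` is a hypothetical helper, and `fake_rom` is made-up data rather than a real VBIOS dump.

```python
def find_all(data, pattern):
    """Return every offset at which pattern occurs in data (overlaps included)."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets

# The byte sequence suggested above.
PATTERN = bytes.fromhex("01 07 0C 00")

# Made-up buffer with the pattern planted twice, standing in for a ROM image.
fake_rom = b"\x00" * 16 + PATTERN + b"\xff" * 8 + PATTERN + b"\x00" * 4

print(find_all(fake_rom, PATTERN))
```

Both hits look identical to the search. Telling a real init regulator entry from the same bytes in a command table takes walking the table structure, which is the "do it properly" part.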
hero member
Activity: 1151
Merit: 528
I was able to get my MSI Z170A GAMING M5 to boot with 6 GPUs plugged in and Above 4G decoding enabled with one of your modified kernels. I am still trying to trace down why only 5 are showing in Claymore, though.
newbie
Activity: 10
Merit: 0
Heya. Been using your OS on a 6 GPU rig for a while now and it works great. I tried to move an older 4 GPU rig over from Win 7 to SMoS, but GPU4 always goes to 0MH within a couple of minutes of starting. Not sure if it is the motherboard or what, but I don't have any issues with it under Windows at all.

My primary rig was all at 1150/1940 for a while, and I've been slowly increasing memory clocks while testing stability with no problems. But I'm using a PRO BTC board there. I really liked your OS, so I tried to switch over my older rig too, but had problems.

On my older rig, I am using a Gigabyte GA-Z68X-UD3H-B3 board with 4 powered risers and 4 RX470s. In Windows, I run 2 cards at 1100/1920 and 2 at 1100/1950, all with -20mV under Trixx. In SMoS, it won't set the core to 1100 properly. I tried powerstages 5 and 6, and I tried 1150 too, but no matter what I did, GPU4 kept going to 0MH.

All cards in both systems are ASUS STRIX RX 470 4GB cards.
newbie
Activity: 26
Merit: 0
Tytanik,

I've switched back to sgminer-gm version 5.5.5-gm-a that supports the ethash-new.cl kernel by pushing them manually to all of my systems. Can I provide you the updated version of Sgminer and the kernel so that your next release doesn't overwrite them or could you refresh the compile on your side please?

Thank you!
newbie
Activity: 6
Merit: 0
Hi guys, probably a n00b question (I'm not a Linux admin):

I would like to set up remote access to my rig using SSH+OpenVPN. But what if I restrict access to the rig to OpenVPN-connected computers only? Will the SMOS dashboard work with no problems? Or will I have to allow access for some SMOS servers?
full member
Activity: 154
Merit: 100
@tytanick, I have been getting a small number of reboots on a couple of my rigs, maybe 2 or 3 reboots a day. My rigs come back up fine after about 30 secs, but is there a specific log I can look at to determine why this is happening?
full member
Activity: 129
Merit: 100
Is there any folder where I can check Claymore's logs? I just want to see whether some GPU is producing incorrect shares so I can underclock it.
newbie
Activity: 2
Merit: 0
Hi tytanik

The first run with the new image works perfectly, but after a reboot my system hangs. Do I need to change something on the mobo?
If I use F9 to choose the boot drive, it also works.
Please help.
