Hello miners,
I would appreciate your help with an issue I am currently experiencing with my RIG.
I will start from scratch and try to explain my problem simply. Thank you in advance for your time! :-)
RIG Specs:
MB ASRock H81 Pro BTC
GPU 4x Sapphire 290 Tri-X OC
PSU 1x Corsair AX1200i Platinum
CPU Pentium
RAM 2x2GB 1333 MHz
3 GPUs are on PCIe x1 --> x16 risers
The system is assembled and boots SMOS BAMT from a USB stick. With the SMOS config suggested on their site it runs for about 11-36 hours and then suddenly freezes with a black screen, no SSH and no web dashboard. The only solution is to physically power off the RIG. I have also noticed that after the reboot it will not start mining successfully with the overclocked cgminer.conf: I have to load a default one from backup, reboot, and then put the overclocked config back, which gives another 11-36 hours of successful mining.
{
  "pools" : [
    {
      "url" : "stratum+tcp://middlecoin.com:3333",
      "user" : "1CQRt6cwJxdcFN2EAp3JLE1SgBHJbqXYkm",
      "pass" : "123"
    },
    {
      "url" : "stratum+tcp://eu.middlecoin.com:3333",
      "user" : "1CQRt6cwJxdcFN2EAp3JLE1SgBHJbqXYkm",
      "pass" : "123"
    },
    {
      "url" : "stratum+tcp://amsterdam.middlecoin.com:3333",
      "user" : "1CQRt6cwJxdcFN2EAp3JLE1SgBHJbqXYkm",
      "pass" : "123"
    },
    {
      "url" : "stratum+tcp://asia.middlecoin.com:3333",
      "user" : "1CQRt6cwJxdcFN2EAp3JLE1SgBHJbqXYkm",
      "pass" : "123"
    }
  ],
  "api-listen" : true,
  "intensity" : "20",
  "vectors" : "1",
  "worksize" : "256",
  "kernel" : "scrypt",
  "auto-fan" : true,
  "gpu-fan" : "55-85",
  "temp-cutoff" : "85",
  "temp-overheat" : "75",
  "temp-target" : "70",
  "expiry" : "30",
  "gpu-dyninterval" : "7",
  "log" : "5",
  "queue" : "1",
  "retry-pause" : "5",
  "scan-time" : "30",
  "scrypt" : true,
  "temp-hysteresis" : "3",
  "shares" : "0",
  "shaders" : "1792",
  "thread-concurrency" : "24768",
  "gpu-thread" : "1",
  "gpu-engine" : "1045",
  "sharethreads" : "32",
  "lookup-gap" : "2",
  "gpu-vddc" : "1.000",
  "gpu-powertune" : "20",
  "gpu-memclock" : "1500"
}
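To rule out a simple mistake in the thermal part of the config above, here is a minimal Python sketch that checks the temperature and fan values against each other. The embedded values are copied from the config; the function name `check_thermal_settings` and the rules it applies (target below overheat below cutoff, target plus hysteresis staying under overheat, fan range within 0-100) are my own assumptions about what a sane ordering looks like, not something taken from the cgminer documentation.

```python
import json

# Thermal-related subset of the cgminer.conf quoted above.
# The values are stored as strings in the config, so they are converted below.
CONFIG_TEXT = """
{
  "gpu-fan": "55-85",
  "temp-cutoff": "85",
  "temp-overheat": "75",
  "temp-target": "70",
  "temp-hysteresis": "3"
}
"""


def check_thermal_settings(conf):
    """Return a list of problems found; an empty list means the ordering looks sane.

    The rules here are assumed sanity checks, not official cgminer requirements.
    """
    target = int(conf["temp-target"])
    overheat = int(conf["temp-overheat"])
    cutoff = int(conf["temp-cutoff"])
    hysteresis = int(conf["temp-hysteresis"])
    fan_min, fan_max = (int(part) for part in conf["gpu-fan"].split("-"))

    problems = []
    if not target < overheat < cutoff:
        problems.append("expected temp-target < temp-overheat < temp-cutoff")
    if target + hysteresis >= overheat:
        problems.append("temp-target plus temp-hysteresis reaches temp-overheat")
    if not 0 <= fan_min <= fan_max <= 100:
        problems.append("gpu-fan range should be min-max within 0-100")
    return problems


problems = check_thermal_settings(json.loads(CONFIG_TEXT))
print(problems or "thermal settings are consistently ordered")
```

With the values from my config the list comes back empty, so at least the thresholds do not contradict each other.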
So far I have done some investigation, with the following results:
1) Assuming a power consumption issue: I ran the RIG with 3 GPUs and got the same result. The system works perfectly fine for ~11 h and then freezes.
2) Assuming an overclocking issue: I left the RIG running with 3/4 GPUs on the default cgminer config, with the very same result.
3) I have also tested the memory and can confirm it is fine.
4) GPU temps are 62-69 C with fan speed between 55% and 85%.
5) From the logs I can observe the following error:
root@smos-1:~# tail /var/log/messages
Mar 5 20:02:20 smos-1 kernel: [ 58.821275] atitweak[3090]: segfault at 2e6e6570 ip b7511cad sp bf85f700 error 6 in libc-2.11.3.so[b74a2000+140000]
Mar 5 20:02:20 smos-1 start_mining[2992]: generating per gpu munin plugins for template gputemp...
Mar 5 20:02:20 smos-1 start_mining[2992]: done generating munin config, starting munin-node
Mar 5 20:02:20 smos-1 kernel: [ 59.000845] atitweak[3191]: segfault at 1d ip b75e3cad sp bfd397b0 error 6 in libc-2.11.3.so[b7574000+140000]
Mar 5 20:02:24 smos-1 start_mining[3536]: starting cgminer with cmd: cd /opt/miners/cgminer;/usr/bin/screen -d -m -S cgminer /opt/miners/cgminer/cgminer -c /etc/bamt/cgminer.conf
Mar 5 20:02:24 smos-1 start_mining[3547]: post mining tasks...
Mar 5 20:02:24 smos-1 start_mining[3547]: gathering post-mine stats for offline mode
Mar 5 20:03:01 smos-1 kernel: [ 99.895369] atitweak[3905]: segfault at 50a0556 ip b753fcad sp bfd83260 error 6 in libc-2.11.3.so[b74d0000+140000]
Mar 5 20:03:01 smos-1 kernel: [ 99.958959] atitweak[3910]: segfault at 64356441 ip b7459cad sp bf862480 error 6 in libc-2.11.3.so[b73ea000+140000]
Mar 5 20:05:01 smos-1 kernel: [ 220.020975] atitweak[4060]: segfault at 776f64c2 ip b7590cad sp bff36820 error 6 in libc-2.11.3.so[b7521000+140000]
I have spotted that in one of the hangs GPU 0 was reported as DEAD/SICK. How can I confirm which slot GPU 0 is installed in?
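To see whether the atitweak segfaults in the log above are one repeating crash or several different ones, here is a minimal Python sketch that parses the kernel lines and subtracts the library base address from the instruction pointer. The log lines are copied from the output above; the regular expression and the function name `parse_segfaults` are my own and assume the kernel message layout stays exactly as shown.

```python
import re
from collections import Counter

# The five kernel segfault lines from /var/log/messages quoted above.
LOG_LINES = """
Mar 5 20:02:20 smos-1 kernel: [ 58.821275] atitweak[3090]: segfault at 2e6e6570 ip b7511cad sp bf85f700 error 6 in libc-2.11.3.so[b74a2000+140000]
Mar 5 20:02:20 smos-1 kernel: [ 59.000845] atitweak[3191]: segfault at 1d ip b75e3cad sp bfd397b0 error 6 in libc-2.11.3.so[b7574000+140000]
Mar 5 20:03:01 smos-1 kernel: [ 99.895369] atitweak[3905]: segfault at 50a0556 ip b753fcad sp bfd83260 error 6 in libc-2.11.3.so[b74d0000+140000]
Mar 5 20:03:01 smos-1 kernel: [ 99.958959] atitweak[3910]: segfault at 64356441 ip b7459cad sp bf862480 error 6 in libc-2.11.3.so[b73ea000+140000]
Mar 5 20:05:01 smos-1 kernel: [ 220.020975] atitweak[4060]: segfault at 776f64c2 ip b7590cad sp bff36820 error 6 in libc-2.11.3.so[b7521000+140000]
"""

# Assumed layout: proc[pid]: segfault at ADDR ip IP sp SP error N in LIB[BASE+SIZE]
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfaults(text):
    """Return (process, library, offset) tuples, where offset = ip - library base."""
    crashes = []
    for match in SEGFAULT_RE.finditer(text):
        offset = int(match["ip"], 16) - int(match["base"], 16)
        crashes.append((match["proc"], match["lib"], offset))
    return crashes


crashes = parse_segfaults(LOG_LINES)
for (proc, lib, offset), count in Counter(crashes).items():
    print(f"{proc} crashed {count}x in {lib} at offset {offset:#x}")
```

All five lines resolve to the same offset (0x6fcad) inside libc-2.11.3.so, so atitweak seems to crash at the same place every time it runs, and the differing addresses are only the library being loaded at a different base. As far as I understand the page-fault error code, error 6 means a user-mode write to an unmapped page. I cannot tell whether this is related to the freezes.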
Could you please advise what I should focus on to investigate the RIG issue further?
Really appreciate your help!