
Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0

sr. member
Activity: 309
Merit: 250
Did you read my advice about fan control and 5970s?

Yeah, I tried your latest git, which you said had the fix for 5970 autofan. I then tried kano's latest with the extra monitoring... it just says the threads are idle. Then I tried manually setting the fans, which still had no effect. I tried another pool and only had 3 restarts in an hour instead of 12 like on gpumax... that seemed more reasonable, and I adjusted the overclock accordingly, but didn't run long enough to see if that stabilized it completely. Finally, I started messing around with different options while on gpumax. The only one that helped was setting GPU threads to 1.

Since I did that, I have been stable for 8 hours with no restarts, so it looks promising, but we'll see if it lasts... Any disadvantage to only running 1 thread per GPU? Hash rates seem about the same.
hero member
Activity: 642
Merit: 500
zefir,

Are you running a 32 or a 64 bit kernel?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
#define ADL_ERR_DISABLED_ADAPTER      -10

Looks like something is wrong in your setup. I assume you have X running and have permission to access X with the user you're logged in as?
Nice to see you're back, Con.

X is up and running on the machine with me logged in. Before I started testing the 7970, it worked perfectly with 3*6950. Confused...

Code:
zefir@miner0:~/work/cgminer$ aticonfig --list-adapters
* 0. 07:00.0 AMD Radeon HD 7900 Series
  1. 04:00.0 AMD Radeon HD 7900 Series

* - Default adapter

Any way to explicitly enable the second card if it really is disabled?

When you started experimenting with the 7970, you wrote that it helped to swap PCIe slots, right?

I'll try that and test it with 7970+6950 meanwhile. Thanks.

I'm not back. I drop in once a week to answer the accumulated questions. I'll be back once my PSU is replaced. Developing without hardware that can run the software ends up leading to issues, which leads to more unnecessary development, which leads to more issues, and so on...

I had to put the 7970 into the first slot and the 6970s in the remaining slots. It doesn't look like your first card is even working properly, since cgminer is detecting only one device and then saying it can't use it. Your xorg setup looks fine. The usual thing to point the finger at, after doing the obvious things, is the ati driver/sdk combo. Try an older or different one. The first driver they released for the 7970 is, I hear, the least worst one so far, so try downgrading to that.
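
For reference, the probe that's failing does roughly the following. This is a sketch only: these entry points are exported by AMD's libatiadlxx.so (cgminer actually resolves them at runtime with dlopen()/dlsym()), and the ADL_Main_Control_Create() init boilerplate is omitted:
Code:
/* Sketch of what the failing probe above is doing. Entry points are
 * exported by AMD's libatiadlxx.so; ADL_Main_Control_Create() must
 * have been called first (init omitted here). */
#include <stdio.h>

#define ADL_ERR_DISABLED_ADAPTER -10	/* the error code zefir is seeing */

extern int ADL_Adapter_NumberOfAdapters_Get(int *lpNumAdapters);
extern int ADL_Adapter_Active_Get(int iAdapterIndex, int *lpStatus);
extern int ADL_Adapter_ID_Get(int iAdapterIndex, int *lpAdapterID);

static void probe_adapters(void)
{
	int num = 0, i;

	ADL_Adapter_NumberOfAdapters_Get(&num);
	for (i = 0; i < num; i++) {
		int active = 0, id = 0, rc;

		ADL_Adapter_Active_Get(i, &active);
		rc = ADL_Adapter_ID_Get(i, &id);
		if (rc == ADL_ERR_DISABLED_ADAPTER)
			printf("adapter %d: disabled by the driver\n", i);
		else
			printf("adapter %d: id=%d active=%d rc=%d\n", i, id, active, rc);
	}
}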
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
I've put a few commits in my git:
 https://github.com/kanoi/cgminer/
that add a simple device history that is accessible via the new API command 'notify'

When compiled, my git reports itself as 2.3.1k

You can see it with
Code:
echo -n notify | nc 127.0.0.1 4028 ; echo

The base code change adds a few extra fields and counters to the device structure (all reported by the API).
Per device these include: last well time, last not well time, last not well reason, and a counter for each reason recording how many times it has happened (e.g. Device Over Heat count and Device Thermal Cutoff count, among others)

I ran for 30 minutes at stock gpu clocks and several gpu threads restarted... again, on the GPU Management screen, the timestamps showed that only gpu 5 had been re-initialized. Seems to be random cards... first time I looked it was 2, 3 and 5. I restarted and this time it's 1, 3, 4, and 5... after 30 minutes at stock gpu clock.

Here's the output of your command:
Code:
STATUS=S,When=1332519326,Code=60,Msg=Notify,Description=cgminer 2.3.1k|NOTIFY=0,Name=GPU,ID=0,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=1,Name=GPU,ID=1,Last Well=1332519326,Last Not Well=1332518925,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=2,Name=GPU,ID=2,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=3,Name=GPU,ID=3,Last Well=1332519325,Last Not Well=1332518862,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=4,Name=GPU,ID=4,Last Well=1332519326,Last Not Well=1332517934,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=2,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=5,Name=GPU,ID=5,Last Well=1332519326,Last Not Well=1332518716,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|

I'll try running on a different pool other than gpumax again and see what happens.
Did you read my advice about fan control and 5970s?
... and in case you weren't sure, boozer: my git changes were added on top of the changes ckolivas made after that version was released, so the version you compiled will have the changes ckolivas mentioned, and you can do as he suggested with it.

... and yeah, you can clearly see which GPUs are getting the 60s-idle sick problem (1, 3, 4 & 5), and GPU 4 has had it twice.

I've updated miner.php in my git to show the notify command (it shows the times as H:M:S and colours any warnings/errors orange/red)

The change is on by default; you can switch it off by setting '$notify = false;' - see the comment near the top of the code.
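
For the curious, the per-device history kano describes amounts to roughly the following. This is a sketch with illustrative names, not the actual commit (cgminer's real device struct is cgpu_info; only REASON_THERMAL_CUTOFF appears verbatim in the compile error quoted in the next post):
Code:
/* Sketch of the per-device history described above. Field and enum
 * names are illustrative, not cgminer's actual code. */
#include <time.h>

enum dev_reason {
	REASON_NONE,
	REASON_THREAD_FAIL_INIT,
	REASON_THREAD_ZERO_HASH,
	REASON_THREAD_FAIL_QUEUE,
	REASON_DEV_SICK_IDLE_60,
	REASON_DEV_DEAD_IDLE_600,
	REASON_DEV_NOSTART,
	REASON_DEV_OVER_HEAT,
	REASON_THERMAL_CUTOFF,
	REASON_COUNT
};

struct dev_history {
	time_t last_well;			/* last time the device was seen healthy */
	time_t last_not_well;			/* last time a problem was recorded */
	enum dev_reason reason_not_well;	/* why it was last marked not well */
	unsigned int count[REASON_COUNT];	/* per-reason occurrence counters */
};

/* Called whenever the watchdog flags a device as not well. */
static void mark_not_well(struct dev_history *h, enum dev_reason r)
{
	h->last_not_well = time(NULL);
	h->reason_not_well = r;
	h->count[r]++;
}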
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
Kano, just tried to compile your latest, getting this:

Code:
bitforce.c: In function ‘bitforce_scanhash’:
bitforce.c:310: error: ‘REASON_THERMAL_CUTOFF’ undeclared (first use in this function)
bitforce.c:310: error: (Each undeclared identifier is reported only once
bitforce.c:310: error: for each function it appears in.)

Haven't looked into it myself yet, but thought I'd post it straight away...
Oops - forgot to do another ./configure to catch that.
Fixed it now.
donator
Activity: 919
Merit: 1000
#define ADL_ERR_DISABLED_ADAPTER      -10

Looks like something is wrong in your setup. I assume you have X running and have permission to access X with the user you're logged in as?
Nice to see you're back, Con.

X is up and running on the machine with me logged in. Before I started testing the 7970, it worked perfectly with 3*6950. Confused...

Code:
zefir@miner0:~/work/cgminer$ aticonfig --list-adapters
* 0. 07:00.0 AMD Radeon HD 7900 Series
  1. 04:00.0 AMD Radeon HD 7900 Series

* - Default adapter

Any way to explicitly enable the second card if it really is disabled?

When you started experimenting with the 7970, you wrote that it helped to swap PCIe slots, right?

I'll try that and test it with 7970+6950 meanwhile. Thanks.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Folks, is there anything that needs to be considered when running cgminer on a multi-7970 setup?

I set up a rig with an XFX-7970 Black Edition and a Gigabyte GV7970-OC, installed the latest APP and drivers, but can't get them running.

Code:
zefir@miner0:~/work/cgminer$ export DISPLAY=:0 && ./cgminer -n -D
[2012-03-23 23:32:22] CL Platform 0 vendor: Advanced Micro Devices, Inc.
[2012-03-23 23:32:22] CL Platform 0 name: AMD Accelerated Parallel Processing
[2012-03-23 23:32:22] CL Platform 0 version: OpenCL 1.2 AMD-APP (923.1)
[2012-03-23 23:32:22] Platform 0 devices: 1
[2012-03-23 23:32:22] Failed to ADL_Adapter_ID_Get. Error -10
[2012-03-23 23:32:22] Failed to ADL_Adapter_ID_Get. Error -10
[2012-03-23 23:32:22] GPU 0 AMD Radeon HD 7900 Series hardware monitoring enabled
[2012-03-23 23:32:22] 1 GPU devices max detected


I've read of people driving up to 4*7970 setups with cgminer under Linux, so I must be missing something really obvious.

Any ideas?
#define ADL_ERR_DISABLED_ADAPTER      -10

Looks like something is wrong in your setup. I assume you have X running and have permission to access X with the user you're logged in as?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
So I was testing overheat protection on my water-cooled farm ...

Code:

 cgminer version 2.3.1 - Started: [2012-03-23 15:45:20]
--------------------------------------------------------------------------------
 (10s):2507.1 (avg):2573.0 Mh/s | Q:3286  A:1932  R:26  HW:0  E:59%  U:35.30/m
 TQ: 8  ST: 9  SS: 21  DW: 1506  NB: 7  LW: 3806  GF: 0  RF: 0
 Connected to http://192.168.0.189:9332 with LP as user user/1000+1
 Block: 000006e1c8f6fcf1aa1e1f358d344831...  Started: [16:36:55]
--------------------------------------------------------------------------------
 [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  58.0C  960RPM | 324.2/335.3Mh/s | A:263 R:2 HW:0 U:   4.81/m I: 8
 GPU 1:  59.0C  960RPM | REST  /333.1Mh/s | A:261 R:6 HW:0 U:   4.77/m I: 8
 GPU 2:  60.0C  960RPM | REST  /225.7Mh/s | A:188 R:4 HW:0 U:   3.44/m I: 8
 GPU 3:  60.5C  960RPM | REST  /328.6Mh/s | A:246 R:5 HW:0 U:   4.49/m I: 8
 GPU 4:  56.0C  960RPM | 324.5/358.6Mh/s | A:217 R:3 HW:0 U:   3.96/m I: 8
 GPU 5:  59.5C  960RPM | REST  /330.7Mh/s | A:239 R:0 HW:0 U:   4.37/m I: 8
 GPU 6:  59.5C  960RPM | REST  /330.4Mh/s | A:261 R:3 HW:0 U:   4.77/m I: 8
 GPU 7:  58.5C  960RPM | REST  /333.3Mh/s | A:262 R:3 HW:0 U:   4.79/m I: 8

Notice the 10s average hashrate is inaccurate. It looks like when a card goes idle due to overheat, its last hashrate is still added to the global average.
It's just the asynchronous way the counter is updated. It starts reading wrong when the card is not reporting in, and only corrects once it knows what's going on with the unresponsive card. I can make the counter more accurate, but I'm loath to make it use more CPU time. I went out of my way to make all these counters use hardly any CPU, and since multiple threads report to the global counter, a lot of locking is required to update it, which is a potential source of overhead if you update it often.
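
In code terms, the trade-off ck describes is roughly this; a sketch with made-up names, not cgminer's actual counter code:
Code:
/* Sketch of the trade-off described above: every mining thread folds
 * its hashes into one global counter behind a mutex, so reporting more
 * often means taking the lock more often. Names are illustrative. */
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t total_hashes;

/* Called by each GPU thread after finishing a batch of work. If a card
 * stops calling in (e.g. while RESTing from over heat), the 10s average
 * keeps reflecting its last reported rate until the watchdog notices
 * the card is unresponsive. */
static void report_hashes(uint64_t done)
{
	pthread_mutex_lock(&hash_lock);
	total_hashes += done;
	pthread_mutex_unlock(&hash_lock);
}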
donator
Activity: 919
Merit: 1000
Folks, is there anything that needs to be considered when running cgminer on a multi-7970 setup?

I set up a rig with an XFX-7970 Black Edition and a Gigabyte GV7970-OC, installed the latest APP and drivers, but can't get them running.

Code:
zefir@miner0:~/work/cgminer$ lspci | grep VGA
04:00.0 VGA compatible controller: ATI Technologies Inc Device 6798
07:00.0 VGA compatible controller: ATI Technologies Inc Device 6798

Code:
zefir@miner0:~/work/cgminer$ cat /etc/X11/xorg.conf
Section "Monitor"
Identifier   "aticonfig-Monitor[0]-0"
Option     "VendorName" "ATI Proprietary Driver"
Option     "ModelName" "Generic Autodetecting Monitor"
Option     "DPMS" "true"
EndSection

Section "Monitor"
Identifier   "aticonfig-Monitor[1]-0"
Option     "VendorName" "ATI Proprietary Driver"
Option     "ModelName" "Generic Autodetecting Monitor"
Option     "DPMS" "true"
EndSection

Section "Device"
Identifier  "aticonfig-Device[0]-0"
Driver      "fglrx"
BusID       "PCI:4:0:0"
EndSection

Section "Device"
Identifier  "aticonfig-Device[1]-0"
Driver      "fglrx"
BusID       "PCI:7:0:0"
EndSection

Section "Screen"
Identifier "aticonfig-Screen[0]-0"
Device     "aticonfig-Device[0]-0"
Monitor    "aticonfig-Monitor[0]-0"
DefaultDepth     24
SubSection "Display"
Viewport   0 0
Depth     24
EndSubSection
EndSection

Section "Screen"
Identifier "aticonfig-Screen[1]-0"
Device     "aticonfig-Device[1]-0"
Monitor    "aticonfig-Monitor[1]-0"
DefaultDepth     24
SubSection "Display"
Viewport   0 0
Depth     24
EndSubSection
EndSection

Code:
zefir@miner0:~/work/cgminer$ export DISPLAY=:0 && ./cgminer -n -D
[2012-03-23 23:32:22] CL Platform 0 vendor: Advanced Micro Devices, Inc.
[2012-03-23 23:32:22] CL Platform 0 name: AMD Accelerated Parallel Processing
[2012-03-23 23:32:22] CL Platform 0 version: OpenCL 1.2 AMD-APP (923.1)
[2012-03-23 23:32:22] Platform 0 devices: 1
[2012-03-23 23:32:22] Failed to ADL_Adapter_ID_Get. Error -10
[2012-03-23 23:32:22] Failed to ADL_Adapter_ID_Get. Error -10
[2012-03-23 23:32:22] GPU 0 AMD Radeon HD 7900 Series hardware monitoring enabled
[2012-03-23 23:32:22] 1 GPU devices max detected


I've read of people driving up to 4*7970 setups with cgminer under Linux, so I must be missing something really obvious.

Any ideas?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I've put a few commits in my git:
 https://github.com/kanoi/cgminer/
that add a simple device history that is accessible via the new API command 'notify'

When compiled, my git reports itself as 2.3.1k

You can see it with
Code:
echo -n notify | nc 127.0.0.1 4028 ; echo

The base code change adds a few extra fields and counters to the device structure (all reported by the API).
Per device these include: last well time, last not well time, last not well reason, and a counter for each reason recording how many times it has happened (e.g. Device Over Heat count and Device Thermal Cutoff count, among others)

I ran for 30 minutes at stock gpu clocks and several gpu threads restarted... again, on the GPU Management screen, the timestamps showed that only gpu 5 had been re-initialized. Seems to be random cards... first time I looked it was 2, 3 and 5. I restarted and this time it's 1, 3, 4, and 5... after 30 minutes at stock gpu clock.

Here's the output of your command:
Code:
STATUS=S,When=1332519326,Code=60,Msg=Notify,Description=cgminer 2.3.1k|NOTIFY=0,Name=GPU,ID=0,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=1,Name=GPU,ID=1,Last Well=1332519326,Last Not Well=1332518925,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=2,Name=GPU,ID=2,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=3,Name=GPU,ID=3,Last Well=1332519325,Last Not Well=1332518862,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=4,Name=GPU,ID=4,Last Well=1332519326,Last Not Well=1332517934,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=2,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=5,Name=GPU,ID=5,Last Well=1332519326,Last Not Well=1332518716,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|

I'll try running on a different pool other than gpumax again and see what happens.
Did you read my advice about fan control and 5970s?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Isn't there a way in OpenCL to check which version of the SDK the .bin was compiled with, and re-compile only if it doesn't match the currently installed one? Or encode the version into the .bin filename. Sorry for nitpicking.
Doing this would defeat the trick people use of keeping a bin built from an older SDK like 2.1 under a newer installed SDK: it lets them keep the older SDK's performance indefinitely while installing newer drivers, without losing their hashrate.
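
For completeness, the check being proposed would look something like this; a sketch only, with "phatk" as a placeholder kernel name, and deliberately not what cgminer does:
Code:
/* Sketch of the proposed SDK check: fold the platform's version string
 * into the .bin filename so a kernel built with a different SDK would
 * be rebuilt. "phatk" is just a placeholder kernel name. */
#include <CL/cl.h>
#include <ctype.h>
#include <stdio.h>

static int bin_name(cl_platform_id plat, char *buf, size_t len)
{
	char ver[64];
	size_t i;

	if (clGetPlatformInfo(plat, CL_PLATFORM_VERSION,
			      sizeof(ver), ver, NULL) != CL_SUCCESS)
		return -1;
	for (i = 0; ver[i]; i++)	/* sanitise for use in a filename */
		if (!isalnum((unsigned char)ver[i]) && ver[i] != '.')
			ver[i] = '_';
	/* e.g. "OpenCL 1.2 AMD-APP (923.1)" -> "phatk_OpenCL_1.2_AMD_APP__923.1_.bin" */
	return snprintf(buf, len, "phatk_%s.bin", ver);
}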
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Code:
[2012-03-22 22:46:43] Started cgminer 2.3.1

[2012-03-22 22:46:43] Started cgminer 2.3.1
[2012-03-22 22:46:43] Probing for an alive pool
[2012-03-22 22:46:44] Long-polling activated for http://mining.eligius.st:8337/LP
Segmentation fault
root@ds-r:~/cgminer-2.3.1-2#

Running on Debian squeeze, SDK 2.4, newer-ish fglrx

I get the same error from both a self-built and the pre-built ubuntu binary.
Partial install of 2 different SDKs.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/

I would assume this is a false positive, but I guess it doesn't hurt to ask official advice. MS Security Essentials didn't pick up on anything but AVG did.


Read the increasingly unread FAQ in the readme included in the zip file about cgminer being a virus.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Hi, I'm sure I'm doing something really stupid, but I can't get my 3x 7970s over 200 Mhash/s each.

Running 12.2 driver and the newest version of cgminer.

Is there anything extra I need to do for the 7970? I used cgminer for all my other cards. Thanks.
Presumably that's the dodgy SDK that the diablo kernel doesn't work with. Try -k poclbm if you are passing a kernel choice to it.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Is it efficient to use balance or rotation mode? I would like to get text alerts from a pool, but I don't like pointing all my hashing power there. Balance seems like a good option if LP still works correctly. Does it matter how many pools are used?
It is a little less efficient to use multiple pools at the same time, because pools disagree by a few seconds about when a block changes and multiple long polls have to be run, which discards more work. That said, it's only 2 or 3 seconds' worth of work every 10 minutes (roughly 0.5%), so it doesn't amount to much, but it will be visible if you watch the stats in cgminer at the time.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I've been getting this error on 2 of my miners lately. They run about 6 hours every time, too; they've been crashing like clockwork for a few days now.

[regular log stuff, then suddenly this]
[2012-03-20 14:04:09] Failed to create submit_work_thread
[lots of statistics]
[2012-03-20 14:04:09] API failed (Socket Error: (10004) Interrupted system call) - API will not be available
[2012-03-20 14:04:09] longpoll failed for http://api.bitcoin.cz:8408, sleeping for 30s
and that's the end; they are then sitting at "press any key to continue . . ."

Same exact error on both.

they are both running 2.3.1, no special flags aside from autoclock and autofan settings.

My 6870 is Win7 64-bit, 11.11 driver, 2.5 SDK, and the 6770 is Vista 32, 12.1, 2.3 SDK. Clean installs; I never reuse bins, and I delete bins when upgrading drivers and/or SDKs. cgminer is always installed fresh, never over itself.

My 5830 on 11.4, 2.1 and XP with cgminer 2.2.7 runs perfectly. Same pool settings as the 6870 and 6770, so if it's the longpoll failure that's killing the 2.3.1 miners, the 2.2.7 version is OK with whatever it is... Is longpoll handled differently between 2.2.7 and 2.3.1?

Any ideas?
Failing to create submit_work_thread is the key here. It has nothing to do with how it fails after that. Inability to create the thread suggests a system resource problem, like running out of memory or too many threads starting up for some reason.
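
A sketch of the failure mode being pointed at (illustrative names, not cgminer's actual code): pthread_create() returns its error code directly, and EAGAIN is the classic sign of resource exhaustion:
Code:
/* Sketch of the failure described above. Illustrative only. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static void *submit_work_thread(void *arg)
{
	(void)arg;	/* work submission would happen here */
	return NULL;
}

static int spawn_submit_thread(void)
{
	pthread_t t;
	int err = pthread_create(&t, NULL, submit_work_thread, NULL);

	if (err) {	/* error code returned directly, not via errno */
		fprintf(stderr, "Failed to create submit_work_thread: %s\n",
			strerror(err));
		return -1;	/* EAGAIN here means out of memory/threads */
	}
	pthread_detach(t);
	return 0;
}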
full member
Activity: 373
Merit: 100
Code:
[2012-03-22 22:46:43] Started cgminer 2.3.1

[2012-03-22 22:46:43] Started cgminer 2.3.1
[2012-03-22 22:46:43] Probing for an alive pool
[2012-03-22 22:46:44] Long-polling activated for http://mining.eligius.st:8337/LP
Segmentation fault
root@ds-r:~/cgminer-2.3.1-2#

Running on Debian squeeze, SDK 2.4, newer-ish fglrx

I get the same error from both a self-built and the pre-built ubuntu binary.

I'm getting a similar segfault with 12.x fglrx if I use an SDK older than 2.6 (Debian testing, though). The best solution I found was to remove the APP SDK completely, install the packages "opencl-headers", "amd-libopencl1" and "amd-opencl-icd", re-compile cgminer, and run it with "GPU_USE_SYNC_OBJECTS=1" set. Otherwise, fglrx 11.x was the only alternative I found.
But then, I compile cgminer myself and have never gotten the opencl version mismatch.
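
The workaround can also be wrapped up as a tiny launcher if that's more convenient; a sketch, with the cgminer arguments left as placeholders for whatever you normally pass:
Code:
/* Tiny launcher expressing the workaround above: export
 * GPU_USE_SYNC_OBJECTS=1, then exec cgminer from $PATH. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	if (setenv("GPU_USE_SYNC_OBJECTS", "1", 1) != 0) {
		perror("setenv");
		return 1;
	}
	execlp("cgminer", "cgminer", (char *)NULL);
	perror("execlp cgminer");	/* reached only if exec fails */
	return 1;
}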
donator
Activity: 1218
Merit: 1079
Gerald Davis
So I was testing overheat protection on my water-cooled farm ...

Code:

 cgminer version 2.3.1 - Started: [2012-03-23 15:45:20]
--------------------------------------------------------------------------------
 (10s):2507.1 (avg):2573.0 Mh/s | Q:3286  A:1932  R:26  HW:0  E:59%  U:35.30/m
 TQ: 8  ST: 9  SS: 21  DW: 1506  NB: 7  LW: 3806  GF: 0  RF: 0
 Connected to http://192.168.0.189:9332 with LP as user user/1000+1
 Block: 000006e1c8f6fcf1aa1e1f358d344831...  Started: [16:36:55]
--------------------------------------------------------------------------------
 [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  58.0C  960RPM | 324.2/335.3Mh/s | A:263 R:2 HW:0 U:   4.81/m I: 8
 GPU 1:  59.0C  960RPM | REST  /333.1Mh/s | A:261 R:6 HW:0 U:   4.77/m I: 8
 GPU 2:  60.0C  960RPM | REST  /225.7Mh/s | A:188 R:4 HW:0 U:   3.44/m I: 8
 GPU 3:  60.5C  960RPM | REST  /328.6Mh/s | A:246 R:5 HW:0 U:   4.49/m I: 8
 GPU 4:  56.0C  960RPM | 324.5/358.6Mh/s | A:217 R:3 HW:0 U:   3.96/m I: 8
 GPU 5:  59.5C  960RPM | REST  /330.7Mh/s | A:239 R:0 HW:0 U:   4.37/m I: 8
 GPU 6:  59.5C  960RPM | REST  /330.4Mh/s | A:261 R:3 HW:0 U:   4.77/m I: 8
 GPU 7:  58.5C  960RPM | REST  /333.3Mh/s | A:262 R:3 HW:0 U:   4.79/m I: 8

Notice the 10s average hashrate is inaccurate. It looks like when a card goes idle due to overheat, its last hashrate is still added to the global average.
sr. member
Activity: 309
Merit: 250
I've put a few commits in my git:
 https://github.com/kanoi/cgminer/
that add a simple device history that is accessible via the new API command 'notify'

When compiled, my git reports itself as 2.3.1k

You can see it with
Code:
echo -n notify | nc 127.0.0.1 4028 ; echo

The base code change adds a few extra fields and counters to the device structure (all reported by the API).
Per device these include: last well time, last not well time, last not well reason, and a counter for each reason recording how many times it has happened (e.g. Device Over Heat count and Device Thermal Cutoff count, among others)

I ran for 30 minutes at stock gpu clocks and several gpu threads restarted... again, on the GPU Management screen, the timestamps showed that only gpu 5 had been re-initialized. Seems to be random cards... first time I looked it was 2, 3 and 5. I restarted and this time it's 1, 3, 4, and 5... after 30 minutes at stock gpu clock.

Here's the output of your command:
Code:
STATUS=S,When=1332519326,Code=60,Msg=Notify,Description=cgminer 2.3.1k|NOTIFY=0,Name=GPU,ID=0,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=1,Name=GPU,ID=1,Last Well=1332519326,Last Not Well=1332518925,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=2,Name=GPU,ID=2,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=3,Name=GPU,ID=3,Last Well=1332519325,Last Not Well=1332518862,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=4,Name=GPU,ID=4,Last Well=1332519326,Last Not Well=1332517934,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=2,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=5,Name=GPU,ID=5,Last Well=1332519326,Last Not Well=1332518716,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|

I'll try running on a different pool other than gpumax again and see what happens.