Has anyone else noticed that the order of temperatures and fan speeds is sometimes incorrect? For instance, this is a rig with 3 cards: a 6870, a 5970, and a 5970 with one bad GPU.
GPU 0 is the 6870. The one with the exceedingly low temperature and no fan speed is the one with the bad GPU.
The GPU order corresponds with what I see in clocktweak.
Reading data:
Adapter#:0 Temp:75 Load:99 Fan:67 Level:2 CoreL0:250 CoreL1:399 CoreL2:900 MemL0:198 MemL1:799 MemL2:800 mVoltL0:950 mVoltL1:999 mVoltL2:1150
Adapter#:3 Temp:62 Load:99 Fan:NA Level:2 CoreL0:250 CoreL1:500 CoreL2:750 MemL0:198 MemL1:199 MemL2:200 mVoltL0:950 mVoltL1:999 mVoltL2:1000
Adapter#:4 Temp:43 Load:99 Fan:NA Level:2 CoreL0:157 CoreL1:399 CoreL2:750 MemL0:198 MemL1:199 MemL2:200 mVoltL0:950 mVoltL1:999 mVoltL2:1000
Adapter#:5 Temp:60 Load:99 Fan:86 Level:2 CoreL0:157 CoreL1:400 CoreL2:750 MemL0:198 MemL1:199 MemL2:200 mVoltL0:950 mVoltL1:999 mVoltL2:1000
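For anyone who wants to compare these readings against what the miner reports, here is a rough Python sketch that splits each line of the dump above into key/value pairs and lists the adapters with no fan reading. The function names (parse_line, missing_fan) are made up for illustration and are not part of clocktweak or cgminer; the only assumption is that each field is a "Key:Value" token separated by spaces, as in the dump.

```python
# Sample lines copied from the dump above.
SAMPLE = """\
Adapter#:0 Temp:75 Load:99 Fan:67 Level:2 CoreL0:250 CoreL1:399 CoreL2:900 MemL0:198 MemL1:799 MemL2:800 mVoltL0:950 mVoltL1:999 mVoltL2:1150
Adapter#:3 Temp:62 Load:99 Fan:NA Level:2 CoreL0:250 CoreL1:500 CoreL2:750 MemL0:198 MemL1:199 MemL2:200 mVoltL0:950 mVoltL1:999 mVoltL2:1000
Adapter#:4 Temp:43 Load:99 Fan:NA Level:2 CoreL0:157 CoreL1:399 CoreL2:750 MemL0:198 MemL1:199 MemL2:200 mVoltL0:950 mVoltL1:999 mVoltL2:1000
Adapter#:5 Temp:60 Load:99 Fan:86 Level:2 CoreL0:157 CoreL1:400 CoreL2:750 MemL0:198 MemL1:199 MemL2:200 mVoltL0:950 mVoltL1:999 mVoltL2:1000
"""


def parse_line(line):
    """Turn one 'Key:Value Key:Value ...' line into a dict.

    Numeric values become ints; anything else (such as 'NA') becomes None.
    """
    fields = {}
    for token in line.split():
        key, _, value = token.partition(":")
        fields[key] = int(value) if value.lstrip("-").isdigit() else None
    return fields


def missing_fan(rows):
    """Return the adapter indices whose fan speed could not be read."""
    return [row["Adapter#"] for row in rows if row["Fan"] is None]


rows = [parse_line(line) for line in SAMPLE.splitlines() if line.strip()]

for row in rows:
    print(row["Adapter#"], row["Temp"], row["Fan"])
print("No fan reading on adapters:", missing_fan(rows))
```

Note that the dump is keyed by adapter number (0, 3, 4, 5), not by GPU number, so the parsed rows say nothing by themselves about which GPU each reading belongs to.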
In this case, I changed the clock speed on the 5970 with only one GPU (Adapter #4) to 400MHz, and as you can see the hashrate of GPU1 went down instead of GPU2.
It's not just this configuration of cards either. I've noticed this before with 4 GPUs in the system but the cards in different orders.
Check the debug log file. Sometimes ADL does return -1 from the fan speed/temp APIs. I've had people report it with 6000 series cards (not in cgminer, but in my akbash watchdog). BTW, -1 means "Most likely one or more of the Escape calls to the driver failed".
Not sure why; maybe it is "not always supported" (as their ADL docs say)? I raised this issue with AMD support and am waiting for their response.
Not sure how re-ordering would help: the ADL APIs use the adapter index, not the OpenCL GPU #.
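To make the adapter index vs. GPU number distinction concrete, here is a small Python sketch. It is only an illustration, not how cgminer or ADL map devices: the "naive" table simply assumes the Nth adapter in the dump is GPU N, and the "observed" table holds the one pairing reported in this thread (underclocking Adapter #4 slowed GPU1). All names here are hypothetical.

```python
# Adapter indices as they appear in the dump earlier in the thread.
ADAPTERS = [0, 3, 4, 5]

# Naive assumption: the Nth adapter listed corresponds to OpenCL GPU N.
naive = {adapter: gpu for gpu, adapter in enumerate(ADAPTERS)}

# Pairings established by experiment. The only one reported in the thread:
# changing the clock on Adapter #4 lowered the hashrate of GPU1.
observed = {4: 1}


def gpu_label(adapter):
    """Describe which GPU an adapter maps to, preferring observed pairings."""
    if adapter in observed:
        return f"GPU {observed[adapter]} (observed)"
    return f"GPU {naive[adapter]} (assumed from list order)"


def conflicts():
    """Adapters whose observed GPU differs from the naive list-order guess."""
    return [a for a, gpu in observed.items() if naive[a] != gpu]


for adapter in ADAPTERS:
    print(f"Adapter {adapter} -> {gpu_label(adapter)}")
print("Adapters where list order is wrong:", conflicts())
```

Any adapter that shows up in conflicts() is one whose temperature and fan readings would be displayed against the wrong GPU if list order were trusted.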