Topic: first GPU out of 4 is hotter and causes hardware errors

full member
Activity: 196
Merit: 100
I get the same issue, minus the HW errors: my GPU0 runs hotter than the others, and the hashrate drops as the clock speed drops due to heat.

Is your GPU0 attached to a display? I found that mine, attached to a 1080p display, was hashing slower than the others, but when I dropped the display resolution down to 640x480 it helped a lot.
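For a rough sense of scale, here is a minimal sketch of the arithmetic behind that resolution change. It assumes "1080p" means 1920x1080, and it only compares pixel counts; whether the desktop load on the card scales with pixel count is an assumption, not something measured in this thread.

```python
# Compare how many pixels the display-attached card has to drive
# in each mode. The helper name pixel_count is made up for this example.
def pixel_count(width, height):
    return width * height

full_hd = pixel_count(1920, 1080)  # assumed meaning of "1080p"
vga = pixel_count(640, 480)
ratio = full_hd / vga

print(full_hd, vga, ratio)
```

So the 640x480 desktop has a bit under one seventh of the pixels of the 1080p one.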
sr. member
Activity: 840
Merit: 255
Use GPU-Z to check the PCI-e speed and version at which GPU0 is running. PCI-e v3.0 is recent.
The motherboard may also be overvolting the PCI-e slot, given how much competition there is among OC-friendly motherboards.

Then see if you can change the PCI-e frequency and latency in the BIOS or with tweaking software. Disable sound, FireWire and whatever else is not required for mining.

What are your thread-concurrency settings? Some values will give HW errors right at startup, as you describe.


sr. member
Activity: 336
Merit: 250
Can you confirm that the fan spins on the card that is in the GPU0 slot? Maybe there is something wrong with the signal lines or pins in the socket? Try having a fan blow directly on the card. If that helps, you know there is something wrong with the cooling in that spot. Other than that, if you can post a picture, maybe we can spot something.
hero member
Activity: 736
Merit: 508
You mentioned it has powered risers. Could there be an issue with the power supply on the riser? I.e., did you also switch the risers when you switched the cards' positions?
The other thought I had: is GPU0 near the CPU? Could heat from the CPU be overheating the card?


Unfortunately, I also switched the riser cables, with no effect.

No, the GPU0 slot is near the CPU, but the card itself is far from it.
sr. member
Activity: 336
Merit: 250
You mentioned it has powered risers. Could there be an issue with the power supply on the riser? I.e., did you also switch the risers when you switched the cards' positions?
The other thought I had: is GPU0 near the CPU? Could heat from the CPU be overheating the card?
hero member
Activity: 736
Merit: 508
It could be a number of things. My first suggestion is to see if it's the card or its position in your rig. EDIT: oh, I see that you have tried that...
See what happens when you switch positions with one of the other cards. If the issue moves with the card, you may just have a "weak" GPU. I have run across a few of those myself that I had to replace.

Otherwise, make sure the intake fan is not sucking hot air from a nearby component and that the airflow is not blocked in any way. Take a small case fan and try to blow directly on the card to make sure it's getting enough fresh air.

Thank you for your help.

By the way, the position of the card is not the problem: the same card in the same position runs much cooler and stops generating HW errors if it is not detected as GPU0 : \
sr. member
Activity: 336
Merit: 250
It could be a number of things. My first suggestion is to see if it's the card or its position in your rig. EDIT: oh, I see that you have tried that...
See what happens when you switch positions with one of the other cards. If the issue moves with the card, you may just have a "weak" GPU. I have run across a few of those myself that I had to replace.

Otherwise, make sure the intake fan is not sucking hot air from a nearby component and that the airflow is not blocked in any way. Take a small case fan and try to blow directly on the card to make sure it's getting enough fresh air.
hero member
Activity: 736
Merit: 508
Hi,

I have a mining rig with 4x 280X on an ASRock 970 Extreme4, connected with powered 16x-to-1x riser cables. Two cards are connected to the 2 PCI-E 16x slots, the remaining two to PCI-E 1x slots.

They mine at an average of 730 kH/s each, but the behaviour of GPU0 is very different from the others:

- GPU 0 has a temperature of about 80°C, the rest 72-73°C.
- GPU 0 gets hardware errors. Only a few, and usually right after cgminer is started. The other GPUs NEVER generate HW errors with the same settings.
- No matter which card ends up as GPU 0, switching cables and slots among the cards doesn't solve anything. GPU 0, the one attached to the PCI-E 16x slot nearest the CPU, always shows the same behaviour.


Any ideas? Some suggestions?
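
When repeating the slot-swap test, it can help to note the readings after each swap and check them the same way every time. Below is a minimal sketch of that check. The function name `flag_hot_gpus`, the 5-degree margin and the sample readings are all assumptions for illustration (the readings just mirror the rough figures above); nothing here talks to cgminer.

```python
from statistics import median

def flag_hot_gpus(temps_c, margin_c=5.0):
    """Return the indices of GPUs running more than margin_c degrees
    above the median temperature of the other GPUs."""
    flagged = []
    for i, temp in enumerate(temps_c):
        others = temps_c[:i] + temps_c[i + 1:]
        if others and temp - median(others) > margin_c:
            flagged.append(i)
    return flagged

# Illustrative readings in degrees C, typed in by hand after a swap.
readings = [80.0, 72.0, 73.0, 73.0]
print(flag_hot_gpus(readings))
```

If index 0 keeps getting flagged no matter which physical card is GPU 0, that points at the slot or the GPU 0 role rather than at a single card.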