
Topic: [0.25 BTC bounty] Radeon 6990 2nd gpu overheat (Read 5328 times)

newbie
Activity: 42
Merit: 0
On the underside of the card (PCB) are several screws with springs underneath. These screws control the pressure between the cooler and the GPU. On my 6970 a couple of them got loose during shipping. Had the same problem - one card was 20-30C hotter than the other one. After tightening the screws (carefully) both cards have the same temp profiles, except for minor differences resulting from positioning inside the case.
full member
Activity: 168
Merit: 100
If that number is in C:
...I don't know what to say, except that your ambient temperature might be just ridiculous.

If that number is in F:
You need a heatsink remount.
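
For anyone unsure which unit a reading is in, here is a minimal conversion sketch. It assumes the number in question is the 78.2 heatgun reading quoted in this thread; the function names are just illustrative.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def celsius_to_fahrenheit(temp_c: float) -> float:
    """Convert a Celsius reading to Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


# The reading quoted in the thread, interpreted both ways.
reading = 78.2
print(f"If {reading} is in F, that is {fahrenheit_to_celsius(reading):.1f} C")
print(f"If {reading} is in C, that is {celsius_to_fahrenheit(reading):.1f} F")
```

Read as Fahrenheit, the value is close to room temperature. Read as Celsius, it is far hotter than any normal ambient air.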
sr. member
Activity: 252
Merit: 251
Heatgun says 78.2
full member
Activity: 168
Merit: 100
Interesting, I have never seen that before...
Have you taken those temp measurements yet?
sr. member
Activity: 252
Merit: 251
or buy an aftermarket heatsink for 6990

That's the most obvious solution. Looked into it the first time this occurred.

Of course, it just happens the 6990 is the only graphics card in existence with no aftermarket coolers available. Sure there's some ref. waterblocks, but no custom air coolers.

Quote
All dual GPU cards have a single heatsink

Well the 6990 is the first card with two separate heatsinks with vapor chambers beneath.
Here are the schematics
full member
Activity: 168
Merit: 100
I don't know what your card looks like, but my guess is that it has a reference cooler and that it is blowing hot air from one GPU past the other GPU's heatsink. That was a minor problem in some HD5970 models.

Try to buy a card with the fan positioned in the center between the GPUs, or buy an aftermarket heatsink for the 6990.

The ultimate solution might be installing water cooling on your video card. Forget about the special thermal compound, in my opinion it's marketing bullshit. Properly installed Arctic Silver 5 or similar high-end compound will be just as good.

All dual GPU cards have a single heatsink with two GPU mounting bases.
That heat transfer issue is not what causes one GPU to be hotter than the other. Each GPU is transferring an equal amount of heat to the same heatsink, which would just make the heatsink heat up faster than a single-GPU heatsink.

My guess is that the TIM application on the hot GPU is bad and the application on the cooler GPU is fine.

But a spread with any quality aftermarket TIM will improve temps on both GPUs anyways (provided you do it correctly)
legendary
Activity: 1512
Merit: 1049
Death to enemies!
I don't know what your card looks like, but my guess is that it has a reference cooler and that it is blowing hot air from one GPU past the other GPU's heatsink. That was a minor problem in some HD5970 models.

Try to buy a card with the fan positioned in the center between the GPUs, or buy an aftermarket heatsink for the 6990.

The ultimate solution might be installing water cooling on your video card. Forget about the special thermal compound, in my opinion it's marketing bullshit. Properly installed Arctic Silver 5 or similar high-end compound will be just as good.
full member
Activity: 168
Merit: 100
What AMD says is BS, switch it out with some IC7 anyways

Also, have you measured the exhaust temps?
sr. member
Activity: 252
Merit: 251
It's a reference Sapphire 6990 running at 100% fan speed in an open case.

I changed the paste on my other cards to IC Diamond 7, but AMD says 6990s have a special phase-change TIM which should not be changed because it already has better cooling properties than synthetic diamonds or silver.
So I left that alone. Besides, the temps are very cool outside of mining.

Vcore is at stock; Afterburner won't allow tweaking it.
full member
Activity: 168
Merit: 100
Some misinformation here...

Furmark is the one thing that will push your card the hardest. Nothing will use more resources than Xtreme Burn-in mode in Furmark. Period.

If Phoenix yields higher temperatures, you probably did not have Burn-in selected during your Furmark run.

That's not how Crossfire multi-GPU rendering works--each card works in an alternate pattern meaning GPU1 does the first portion of the work and GPU2 does the second portion, and it repeats.
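
To make the alternating pattern concrete, here is a toy sketch of handing out work round-robin to two GPUs. It only illustrates the pattern described above and is not how the driver actually schedules anything; the function name and the GPU labels are made up for the example.

```python
from itertools import cycle


def assign_alternating(work_items, gpus=("GPU1", "GPU2")):
    """Hand out work items to the GPUs in a repeating alternate pattern."""
    schedule = {gpu: [] for gpu in gpus}
    # cycle() repeats GPU1, GPU2, GPU1, GPU2, ... for as long as there is work.
    for item, gpu in zip(work_items, cycle(gpus)):
        schedule[gpu].append(item)
    return schedule


# Six hypothetical portions of work, numbered 0 to 5.
schedule = assign_alternating(range(6))
for gpu, items in schedule.items():
    print(gpu, items)
```

In this model both GPUs end up with the same share of the work, so on its own it gives no reason for one core to run hotter than the other.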

What's your fan speed?

And what is the exact model of your card? Reference or non-reference?

What case? Or open case environment?

Have you tried reseating the heatsink with a fresh TIM application? (Preferably Shin-Etsu, IC Diamond, or anything of the like.)

Edit:
What's your vcore?

And to the above poster: 99C is definitely NOT safe. Yes, your GPU will not die at that temperature, but running it at that temperature for such a prolonged time will definitely shorten the lifespan of your GPU or damage its components. While you're at it, check VRM temps too to make sure you're not frying those as well. My GPU cores are at about 45C while mining (two 5850s).

Also, do this for me, will you:
Measure the temperature of the air coming out of the back of the card, in Celsius.

I've been a computer hardware enthusiast since way before the whole mining hype began, so I would know.
full member
Activity: 168
Merit: 100
I'm not sure about the 6990s, but most other 6x series cards are stable at a 300 MHz memory clock without hurting hashing speeds. I run multiple 6970s, which don't run much hotter than a 6990, and this lowers my temps by almost 10C.

On top of this, go into your Catalyst control panel -> Performance -> Overdrive and reduce power consumption until you notice drops in temp without stability issues. I don't know 100% if this will help temps (it should), but I used to keep my 6970s at -50% power, without having to underclock anything, so that my 650 watt PSU could power 2 cards without lockups.

Let me know if this helps.
sr. member
Activity: 252
Merit: 251
The temperature difference is because data moves in a stream in the GPU:

input -> GPU 1 -> GPU 2 -> output

GPU 1 will have a continuous stream of work, while GPU 2 will get the work that GPU 1 hasn't done.

That's why GPU 1 is running at full capacity while GPU 2 is slower. Chances are if you turn up your aggression, it'll increase the temp.

As for your temp, 99C is a safe temperature for the GPU, so you're fine. Try increasing circulation in your case to drop a few degrees, but it won't really matter.


if i've answered your q, http://payb.tc/kookiekrak/ =D

That stream of work theory definitely makes sense.

Any idea why it would only happen while mining though? The load is spread pretty much 50/50 in everything else that stresses the GPUs. During mining both cores have same hash rates but only the 2nd core overheats.

Or maybe it's GUIminer/Poclbm?
full member
Activity: 238
Merit: 100
The temperature difference is because data moves in a stream in the GPU:

input -> GPU 1 -> GPU 2 -> output

GPU 1 will have a continuous stream of work, while GPU 2 will get the work that GPU 1 hasn't done.

That's why GPU 1 is running at full capacity while GPU 2 is slower. Chances are if you turn up your aggression, it'll increase the temp.

As for your temp, 99C is a safe temperature for the GPU, so you're fine. Try increasing circulation in your case to drop a few degrees, but it won't really matter.


if i've answered your q, http://payb.tc/kookiekrak/ =D
sr. member
Activity: 252
Merit: 251
Did you try to reduce the memory clock in order to get a lower temperature?

Works fine for me...

Tried lowering it to 840 MHz ages ago; works great for the other core. It also puts temps in games etc. at about 65C for both cores, both equally loaded to 99-100%.

The 2nd core still goes up to 99C while mining.
newbie
Activity: 57
Merit: 0
Did you try to reduce the memory clock in order to get a lower temperature?

Works fine for me...
legendary
Activity: 980
Merit: 1003
I'm not just any shaman, I'm a Sha256man
I'd like to note that, although it's not recommended, my 6990 runs stably at 105C, but I don't even run it at that temp after the initial stress test.
sr. member
Activity: 252
Merit: 251
Ok, good to hear it's normal then. I've been running 5 other 6990s since March, but none with temp fluctuations this big.

Fans are constantly at 90-100% due to that one core. The 1st gpu temp is just as it should be around 70-75, but the 2nd goes nuts.
And it only happens while mining so it can't be misapplied thermal paste etc. (temps stay within 1-2c of each other in different apps)
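
A rough way to put numbers on that comparison, as a sketch only: the temperatures below are the ones mentioned in this thread (about 65C on both cores in games, roughly 75C and 99C while mining), and the 5C threshold is an arbitrary assumption, not a spec value.

```python
def core_temp_gap(temps_c):
    """Return the gap in degrees C between the hottest and coolest core."""
    return max(temps_c) - min(temps_c)


def is_imbalanced(temps_c, max_gap_c=5.0):
    """Flag a workload whose core-to-core gap exceeds the chosen threshold.

    max_gap_c is a hypothetical cutoff picked for illustration.
    """
    return core_temp_gap(temps_c) > max_gap_c


# Per-workload (GPU 1, GPU 2) temperatures as described in the thread.
readings = {"gaming": (65, 65), "mining": (75, 99)}
for workload, temps in readings.items():
    status = "imbalanced" if is_imbalanced(temps) else "ok"
    print(f"{workload}: gap {core_temp_gap(temps)}C -> {status}")
```

Logging the gap per workload like this makes it easier to show that the problem only appears under mining load.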
legendary
Activity: 980
Merit: 1003
I'm not just any shaman, I'm a Sha256man
My Radeon 6990, made by Diamond, is in overclocked mode, and I basically have to have the fans running at 95-100% (usually a constant 100%) unless it is night. It gets freezing where I live, so I usually open a window around that time and can set the fans to around 65-80%.
Anyway, what are your fan speeds at?

P.S. My GPU temps are weird like that as well, even with the fans at full blast.
sr. member
Activity: 547
Merit: 253
Good point, I forgot... um, has it always done this since you got it? Maybe the thermal paste job on the hot GPU was not done very well. My Diamond 5770 had a HORRIBLE paste job and it ran in the mid 90s under load. I took it apart and there was paste all over the chip, not just on the die. Long story short, I cleaned it and put some AS5 paste on it, and it dropped down to the normal 70-80C range.
sr. member
Activity: 252
Merit: 251
It's actually just 1 card, but as you know it houses two GPU's on the board. It's the only card in the system.