My initial Radeon HD 7970 mining benchmarks - page 8.

1onevvolf

newbie

Activity: 43

Merit: 0

Quote from: luo demin on January 09, 2012, 01:26:28 PM

I can't wait till the 7990; that is going to be impressive but expensive :(

I might have missed this, but what is the heat like when hashing overclocked? And what fan speed?

Overclocked @ 1125/975MHz with automatic fan speed, I'm getting temperatures hovering around 81-83C, and the fan runs at 47-49% speed. You can see some screencaps on one of the earlier pages. But since I prefer lower temperatures and am worried about VRM and memory temps not yet being reported by GPU-Z, I usually run it at 60% fan speed and get temps around 72C. The blower fan at 60% speed is quite loud (it's a reference design from Sapphire).

At 100% fan speed, the overclocked card gets below 60C while mining, but you can hear it from outside of the house at this point :P, so as lovely as these temps are, this is not an option for me, as it is also my gaming and work PC.

terrytibbs

hero member

Activity: 560

Merit: 501

Quote from: ?? on ??

okay, i will fly to Singapore and pick one up if it all makes you happy....

i got a girl there:P

Is it Mrs. Zhou Tong?

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: 1onevvolf on January 09, 2012, 12:55:18 PM

Quote from: DeathAndTaxes on January 09, 2012, 10:20:35 AM

Hey OP do you have a kill-a-watt you could purchase locally. If you are in the states Home Depot and Lowes carry them. If you can find one locally I am sure we could get together the 3 or 4 BTC to get some accurate power readings.

The kill-a-watt brand doesn't appear to be sold here in Europe, and I've been searching for an equivalent device locally each time I've had a chance to head out to a store for the past couple of days, but no luck so far.

Well that sucks. A more universal albeit expensive tool is a clamp meter.

1onevvolf

newbie

Activity: 43

Merit: 0

Quote from: DiabloD3 on January 09, 2012, 01:10:19 PM

Wait wait wait. Are we sure uint16 is such a good idea? Last time I tried >4 (which was before 2.6, btw, I haven't tested with 2.6), it would crash in the compiler. Also, does anyone have a count on the number of registers per CU? There might not be enough registers to handle that.

I'm not sure if it's a good idea or not, so I wanted to measure it ;)

GCN has 64KB worth of registers per CU, and like you said, I'm not sure if that's enough. The reason for my curiosity is that GCN's compute units each contain 4 SIMD units with a width of 16 elements (the same size as Larrabee and Intel's MIC, coincidentally), and I recall reading somewhere that each of these SIMD units can retire one 16-way instruction every 4 cycles, so those 16-element vectors kind of jumped out at me. I also wanted to get familiar with the OpenCL bitcoin mining code and thought it would be a neat exercise (which it was!). Nice code, by the way.

I can say for sure that 16-element vectors DO compile with the drivers that came with the card.
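
As a rough back-of-the-envelope check on the figures above, here is a small Python sketch. The 32 compute units is an assumption taken from the published HD 7970 specification rather than from this thread, the 64KB register figure is simply the one quoted above, and the single resident group of 64 work-items is a deliberate simplification, so treat the register budget as illustrative only.

```python
# Back-of-the-envelope GCN arithmetic (illustrative, not measured).
SIMDS_PER_CU = 4                  # from the post above
LANES_PER_SIMD = 16               # from the post above
COMPUTE_UNITS = 32                # assumption: published HD 7970 CU count
REGISTER_FILE_BYTES = 64 * 1024   # figure quoted in the post, not verified here
BYTES_PER_REGISTER = 4            # one 32-bit register
UINT16_COMPONENTS = 16            # a uint16 vector holds 16 x 32-bit values

lanes_per_cu = SIMDS_PER_CU * LANES_PER_SIMD
total_lanes = lanes_per_cu * COMPUTE_UNITS

# Simplification: assume exactly one work-item per lane is resident.
registers_total = REGISTER_FILE_BYTES // BYTES_PER_REGISTER
registers_per_work_item = registers_total // lanes_per_cu
live_uint16_values = registers_per_work_item // UINT16_COMPONENTS

print("lanes per CU:", lanes_per_cu)
print("total lanes:", total_lanes)
print("32-bit registers per work-item:", registers_per_work_item)
print("uint16 values that fit per work-item:", live_uint16_values)
```

Under those assumptions the lane total lines up with the 2048 shader count mentioned elsewhere in this thread, and each work-item would have room for only a handful of live uint16 values, which is the concern DiabloD3 raised.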

The -ds code dump for 16 element vectors came out nice and clean, although the last few lines where the result is stored in output seem a bit branchy. It looks something like this:

Code:

if(XG2.s0 == 0x136032ED) { output[Xnonce.s0 & 0xF] = Xnonce.s0; }
if(XG2.s1 == 0x136032ED) { output[Xnonce.s1 & 0xF] = Xnonce.s1; }
if(XG2.s2 == 0x136032ED) { output[Xnonce.s2 & 0xF] = Xnonce.s2; }
...
...
if(XG2.sd == 0x136032ED) { output[Xnonce.sd & 0xF] = Xnonce.sd; }
if(XG2.se == 0x136032ED) { output[Xnonce.se & 0xF] = Xnonce.se; }
if(XG2.sf == 0x136032ED) { output[Xnonce.sf & 0xF] = Xnonce.sf; }

I tried replacing it with a branch-less expression using shuffle() and vstore16() but haven't managed to get it working. What I've come up with looks something like this:

Code:

x mask = Xnonce & 0xF;
x temp = shuffle(select(Xnonce, 0, selection), mask);
vstore16(temp, 0, output);

Anyhow I'm sure that my code modifications are doing all sorts of dumb things. I'm still learning how it all works so please ignore.
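
One possible reason the shuffle() attempt behaves differently from the if chain, sketched in plain Python rather than OpenCL: the if chain is a scatter (each matching lane picks its own destination slot), while OpenCL's shuffle(x, mask) is a gather (result[i] = x[mask[i]]). The helper names below are hypothetical, and I am assuming `selection` is meant to mark the matching lanes. It may also be worth checking the argument order of select(), since in OpenCL select(a, b, c) returns b in the lanes where c is set.

```python
def shuffle(x, mask):
    """Emulate OpenCL shuffle(): a gather, result[i] = x[mask[i]].
    Only the low bits of each mask element are used."""
    n = len(x)
    return [x[m & (n - 1)] for m in mask]


def branchy_store(nonces, hits, output):
    """Emulate the if chain: each matching lane writes to output[nonce & 0xF]."""
    for nonce, hit in zip(nonces, hits):
        if hit:
            output[nonce & 0xF] = nonce
    return output


def shuffle_store(nonces, hits):
    """Emulate: zero the non-matching lanes, shuffle by (nonce & 0xF),
    then store all 16 lanes at once (as vstore16 would)."""
    kept = [n if hit else 0 for n, hit in zip(nonces, hits)]
    mask = [n & 0xF for n in nonces]
    return shuffle(kept, mask)


# Hypothetical work item: 16 consecutive nonces whose base is NOT a
# multiple of 16, with a single matching hash in lane 5.
base = 0x1003
nonces = [base + i for i in range(16)]
hits = [i == 5 for i in range(16)]

scattered = branchy_store(nonces, hits, [0] * 16)
gathered = shuffle_store(nonces, hits)

print("if chain stores the nonce in slot", scattered.index(0x1008))
print("shuffle version stores it in slot", gathered.index(0x1008))
```

In this emulation the two versions only agree when the base nonce is a multiple of 16. Note also that a vstore16 writes all 16 slots, zeros included, whereas the if chain leaves the other slots untouched.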

Quote from: DiabloD3 on January 09, 2012, 01:10:19 PM

Also, check some of the larger -vs, -v 40 is two sets of uint4 and -v 44 does three uint4s (unlike cgminer, -v 4 does two uint2s).

I've tried all of the different -v settings available (according to the source) but haven't been able to get any higher than the 666MH/s with the default settings and 3 compute threads.

luo demin

newbie

Activity: 70

Merit: 0

I can't wait till the 7990; that is going to be impressive but expensive :(

I might have missed this, but what is the heat like when hashing overclocked? And what fan speed?

DiabloD3

legendary

Activity: 1162

Merit: 1000

DiabloMiner author

Quote from: 1onevvolf on January 09, 2012, 12:55:18 PM

Quote from: DeathAndTaxes on January 09, 2012, 10:20:35 AM

Hey OP do you have a kill-a-watt you could purchase locally. If you are in the states Home Depot and Lowes carry them. If you can find one locally I am sure we could get together the 3 or 4 BTC to get some accurate power readings.

The kill-a-watt brand doesn't appear to be sold here in Europe, and I've been searching for an equivalent device locally each time I've had a chance to head out to a store for the past couple of days, but no luck so far.

I also took a stab at modifying DiabloMiner and managed to get it to use 16-component vectors, which is what GCN is supposed to be tuned for, but performance isn't what I expected, and it's really hard to profile/debug the Tahiti since I could not find any development tools that specifically support it yet.

Wait wait wait. Are we sure uint16 is such a good idea? Last time I tried >4 (which was before 2.6, btw, I haven't tested with 2.6), it would crash in the compiler. Also, does anyone have a count on the number of registers per CU? There might not be enough registers to handle that.

Also, check some of the larger -vs, -v 40 is two sets of uint4 and -v 44 does three uint4s (unlike cgminer, -v 4 does two uint2s).

1onevvolf

newbie

Activity: 43

Merit: 0

Quote from: DeathAndTaxes on January 09, 2012, 10:20:35 AM

Hey OP do you have a kill-a-watt you could purchase locally. If you are in the states Home Depot and Lowes carry them. If you can find one locally I am sure we could get together the 3 or 4 BTC to get some accurate power readings.

The kill-a-watt brand doesn't appear to be sold here in Europe, and I've been searching for an equivalent device locally each time I've had a chance to head out to a store for the past couple of days, but no luck so far.

I also took a stab at modifying DiabloMiner and managed to get it to use 16-component vectors, which is what GCN is supposed to be tuned for, but performance isn't what I expected, and it's really hard to profile/debug the Tahiti since I could not find any development tools that specifically support it yet.

stoppots

sr. member

Activity: 271

Merit: 250

man wat a beast

DiabloD3

legendary

Activity: 1162

Merit: 1000

DiabloMiner author

Quote from: rjk on January 09, 2012, 10:26:56 AM

Quote from: DiabloD3 on January 09, 2012, 10:24:20 AM

(because large parts of the chip shut off).

I know that the shaders are used to do the hashing, but is it possible to utilize more of the chip, even if it were at dramatically lower efficiency?

No. I already tried to abuse the texture/memory fetch units, but couldn't figure out a useful way of doing it. It's all fixed-function hardware and it's not particularly interesting for what we do. Although I may go try that again; SDK 2.6 seems to be a much better compiler in some areas.

rjk

sr. member

Activity: 448

Merit: 250

1ngldh

Quote from: DiabloD3 on January 09, 2012, 10:24:20 AM

(because large parts of the chip shut off).

I know that the shaders are used to do the hashing, but is it possible to utilize more of the chip, even if it were at dramatically lower efficiency?

DiabloD3

legendary

Activity: 1162

Merit: 1000

DiabloMiner author

Quote from: DeathAndTaxes on January 09, 2012, 10:20:35 AM

You think it will be 200 watts w/ a 20% overclock? I wish the OP had a kill-a-watt.

That's at stock clocks, obviously. I don't know what the mining values will be; all the cards draw less than their full wattage at stock speeds when mining (because large parts of the chip shut off). I imagine the 79xx may even get a larger efficiency boost from this because of AMD's work on power saving, but without a kill-a-watt test, no one knows.

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: DiabloD3 on January 09, 2012, 10:16:37 AM

Quote from: poppyh on January 09, 2012, 08:50:50 AM

In the UK currently :

5870 costs 170 GBP and gets 440 mhash/s so about 2.6 mhash/GBP

7970 will cost roughly 430 GBP and get 666 mhash/s so about 1.5 mhash/GBP

Thus, the 5870 is still much better and you can also get a 5970 that gets 850 mhash/s for about 400 GBP.

Power figures ?

7970 is going to be 200 watts I believe, and the 5870 is 188 (both at stock clocks). This is where the 7970 suddenly shines. Even if the 7970 is 250 watts, that's still a jump in efficiency.

You think it will be 200 watts w/ a 20% overclock? I wish the OP had a kill-a-watt.

Hey OP do you have a kill-a-watt you could purchase locally. If you are in the states Home Depot and Lowes carry them. If you can find one locally I am sure we could get together the 3 or 4 BTC to get some accurate power readings.

DiabloD3

legendary

Activity: 1162

Merit: 1000

DiabloMiner author

Quote from: poppyh on January 09, 2012, 08:50:50 AM

In the UK currently :

5870 costs 170 GBP and gets 440 mhash/s so about 2.6 mhash/GBP

7970 will cost roughly 430 GBP and get 666 mhash/s so about 1.5 mhash/GBP

Thus, the 5870 is still much better and you can also get a 5970 that gets 850 mhash/s for about 400 GBP.

Power figures ?

7970 is going to be 200 watts I believe, and the 5870 is 188 (both at stock clocks). This is where the 7970 suddenly shines. Even if the 7970 is 250 watts, that's still a jump in efficiency.
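
A quick sketch of the efficiency comparison in Python, using only numbers posted in this thread. The wattages are the guesses given above, not measurements, and the 666 mhash/s figure was reported with the card overclocked, so this is an illustration rather than a result. The dictionary and function names are my own.

```python
# (mhash/s, watts) pairs taken from the posts in this thread.
# The wattages are estimates, not wall-socket measurements.
cards = {
    "5870": (440, 188),
    "7970 at 200 W": (666, 200),
    "7970 at 250 W": (666, 250),
}


def mhash_per_watt(mhash, watts):
    """Hashing efficiency in mhash/s per watt."""
    return mhash / watts


efficiency = {name: mhash_per_watt(*spec) for name, spec in cards.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.2f} mhash/s per watt")
```

Even with the pessimistic 250 W guess, the 7970 comes out ahead of the 5870 on these numbers, which is the point being made above. Real power readings could change the picture.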

wndrbr3d

hero member

Activity: 914

Merit: 500

Very interesting results! The only missing piece is the power draw from the wall.

My only hesitations at this point are:

1) Price Point/Performance is still super high when compared to used 58xx series cards

2) Lack of optimization in Miners for any new features in GCN/SDK 2.6. Current Miners are heavily optimized for VLIW4/5, so obviously there's going to need to be some re-working for full GCN support.

The only way I can see this card being a viable miner is that it needs to outperform 5970/6990 in performance per watt and $/mhash, otherwise it's just a good excuse to see more 58xx's hitting eBay since gamers will be upgrading...

Thanks for the initial benchmarks though OP!

poppyh

newbie

Activity: 10

Merit: 0

In the UK currently :

5870 costs 170 GBP and gets 440 mhash/s so about 2.6 mhash/GBP

7970 will cost roughly 430 GBP and get 666 mhash/s so about 1.5 mhash/GBP

Thus, the 5870 is still much better and you can also get a 5970 that gets 850 mhash/s for about 400 GBP.

Power figures ?
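
The price comparison above can be sketched in a few lines of Python. The prices and hash rates are the ones quoted in the post; the dictionary and function names are just for illustration.

```python
# (price in GBP, mhash/s) as quoted in the post above.
uk_cards = {
    "5870": (170, 440),
    "7970": (430, 666),
    "5970": (400, 850),
}


def mhash_per_gbp(price_gbp, mhash):
    """Hash rate bought per pound spent on the card."""
    return mhash / price_gbp


value_for_money = {
    name: mhash_per_gbp(price, mhash) for name, (price, mhash) in uk_cards.items()
}

# Print the cards from best to worst value.
for name in sorted(value_for_money, key=value_for_money.get, reverse=True):
    print(f"{name}: {value_for_money[name]:.1f} mhash/s per GBP")
```

This only covers purchase price. It ignores electricity, which is why the power figures matter.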

sadpandatech

hero member

Activity: 504

Merit: 500

Quote from: 1onevvolf on January 04, 2012, 09:46:23 PM

With those two changes to the default configuration of cgminer, hashes start to get accepted, but the 290MH/s hashing performance with the default settings (-g 2 -v 2 -w 128) for this kernel was slower than the 310MH/s from the trusty OC'd HD5850 that this new card replaced, so I played around with the --gpu-threads, --vectors and --worksize settings, and here's a small table with the results:
--gpu-threads 1 --vectors 2 --worksize 32 : 141MH/s
--gpu-threads 1 --vectors 2 --worksize 64 : 285MH/s
--gpu-threads 1 --vectors 2 --worksize 128 : 283MH/s
--gpu-threads 1 --vectors 2 --worksize 256 : 284MH/s

--gpu-threads 1 --vectors 4 --worksize 32 : 66MH/s
--gpu-threads 1 --vectors 4 --worksize 64 : 133MH/s
--gpu-threads 1 --vectors 4 --worksize 128 : 133MH/s
--gpu-threads 1 --vectors 4 --worksize 256 : 133MH/s

Not that it might matter much at this point, but with vectors 4 (and, I believe, 2 to some extent) there is a need to adjust the memory clock in order to optimize it. I am not sure it would even help, this being GCN. But if you get time, I'd check it out. Sadly, I've no clue where that thread is at this time. :/

Quote from: 1onevvolf on January 04, 2012, 09:46:23 PM

**** UPDATE ****

Someone suggested that I give a recent version of the DiabloMiner a try since it should have decent support for GCN, so I did.

~650MH/s with the default diablominer settings and the card OC'd @ 1125/975MHz:

~530MH/s at standard clocks:

pretty freakin awesome, if you ask me. Now if they can just sell the things for <$400 I'd be happy. Do you have any TDP numbers for this card?

chromeguy

newbie

Activity: 28

Merit: 0

Quote from: bulanula on January 09, 2012, 06:41:52 AM

6990 is dual GPU so has total of 3072 shaders gets about 800 mhash/s using two cores total.
7970 is single GPU so has total of 2048 shaders get about 666 mhash/s using one core total.

i stand corrected, thought it was also dual.
i feel the need for read

DiabloD3

legendary

Activity: 1162

Merit: 1000

DiabloMiner author

Quote from: bulanula on January 09, 2012, 06:41:52 AM

6990 is dual GPU so has total of 3072 shaders gets about 800 mhash/s using two cores total.
7970 is single GPU so has total of 2048 shaders get about 666 mhash/s using one core total.

Get some sleep dude and stay off SR !

This.

bulanula

hero member

Activity: 518

Merit: 500

Quote from: chromeguy on January 09, 2012, 06:36:09 AM

Quote from: bulanula on January 09, 2012, 04:45:58 AM

Quote from: chromeguy on January 09, 2012, 04:40:31 AM

now this card is out.. how about some ++real++ benchmarks?
this thing has 500 more sp than the 6990 - how can it possibly be slower???

LOL you don't know a thing about mining, do you ?

6990 has 2*1536 or 3072 shaders

5970 has 2*1600 or 3200 shaders

7970 has 2048 shaders

5870 has 1600 shaders

7990 supposedly has 4096 shaders ?

thanks for repeating what i said.
7970 has 2048 SP
6990 has 1536 SP
that's 500 more. per core, whatever, it's more, MORE. so why is the card putting out so much less? maybe if you put your effort into explaining a decent answer (such as 'because the miners need re-optimisation') instead of being a flabberfinger - we could have all benefited.

6990 is dual GPU so has total of 3072 shaders gets about 800 mhash/s using two cores total.
7970 is single GPU so has total of 2048 shaders get about 666 mhash/s using one core total.

Get some sleep dude and stay off SR !

chromeguy

newbie

Activity: 28

Merit: 0

Quote from: bulanula on January 09, 2012, 04:45:58 AM

Quote from: chromeguy on January 09, 2012, 04:40:31 AM

now this card is out.. how about some ++real++ benchmarks?
this thing has 500 more sp than the 6990 - how can it possibly be slower???

LOL you don't know a thing about mining, do you ?

6990 has 2*1536 or 3072 shaders

5970 has 2*1600 or 3200 shaders

7970 has 2048 shaders

5870 has 1600 shaders

7990 supposedly has 4096 shaders ?

thanks for repeating what i said.
7970 has 2048 SP
6990 has 1536 SP
that's 500 more. per core, whatever, it's more, MORE. so why is the card putting out so much less? maybe if you put your effort into explaining a decent answer (such as 'because the miners need re-optimisation') instead of being a flabberfinger - we could have all benefited.

Topic: My initial Radeon HD 7970 mining benchmarks - page 8. (Read 46846 times)