Topic: More potential for 7970? (Read 1783 times)

vip
Activity: 571
Merit: 504
I still <3 u Satoshi
June 28, 2012, 07:00:25 PM
#13
oh look, it's this thread

thanks 1onevvolf ;)
member
Activity: 72
Merit: 10
June 28, 2012, 05:12:54 PM
#12
You can probably get a bit more out of it, yes. How much, though, I do not know.
member
Activity: 75
Merit: 10
June 28, 2012, 02:16:51 PM
#11
Awesome writeup! To add my 2 Satoshis:

For comparison, I'm running a Radeon 6970 at stock clocks (880 MHz core, 1375 MHz memory) and getting just over 389 MHash/s sustained. Re-running the calculation gives 880*1536/389 = ~3,475 cycles per hash, which is only about 3% above the ideal 3,375, but still less efficient than the 7970 (way to go, AMD!). I'm probably getting those last few MHash/s from having a faster processor than the people on the hardware comparison page: I use my computer for a variety of things, not just bitcoin, so I put some money into the CPU, an Intel Core i7-3930K overclocked to 4.4 GHz. Not a huge leap, and probably not a fair comparison unless OP is using a similar CPU, but it adds another data point.
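
For anyone who wants to re-run that arithmetic, here is a minimal Python sketch. It assumes the ~3375 operations-per-hash figure from 1onevvolf's writeup as the ideal, and the names (IDEAL_CYCLES_PER_HASH, cycles_per_hash) are made up for illustration, not taken from any miner.

```python
# Sketch only: 3375 is the estimated op count per hash quoted in this thread,
# treated here as the "ideal" cycles per hash (an assumption, not a spec).
IDEAL_CYCLES_PER_HASH = 3375


def cycles_per_hash(stream_processors, core_mhz, mhash_per_s):
    """Average stream-processor cycles per hash (the MHz and MH/s units cancel)."""
    return stream_processors * core_mhz / mhash_per_s


# Same 6970 at stock clocks, two different measured hashrates.
chart = cycles_per_hash(1536, 880, 370)  # hardware comparison chart figure
mine = cycles_per_hash(1536, 880, 389)   # the data point from this post

over_ideal_pct = (mine / IDEAL_CYCLES_PER_HASH - 1) * 100
gain_vs_chart_pct = (389 / 370 - 1) * 100

print(f"chart: {chart:.0f} cycles/hash, this card: {mine:.0f} cycles/hash")
print(f"over ideal: {over_ideal_pct:.1f}%, faster than chart: {gain_vs_chart_pct:.1f}%")
```

The comparison only says how many cycles each hash costs on average; it does not say whether the CPU, the driver, or the kernel settings explain the gap to the chart figure.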
newbie
Activity: 16
Merit: 0
March 30, 2012, 11:02:36 PM
#10
Nice write up 1onevvolf!

Thanks for taking the time.

-Tossil
legendary
Activity: 1148
Merit: 1008
If you want to walk on water, get out of the boat
March 29, 2012, 02:03:38 PM
#9
Nice explanation
legendary
Activity: 2044
Merit: 1000
March 29, 2012, 01:28:39 PM
#8
Awesome.  This is why I love this forum.....so many enlightened people willing to take the time to share vast quantities of knowledge for nothing more than the love of learning. 
full member
Activity: 239
Merit: 100
March 29, 2012, 12:32:10 PM
#7
Why do you prefer the 7970 over the 6990?

I just replied to the same question on a different thread: https://bitcointalksearch.org/topic/m.826106
newbie
Activity: 29
Merit: 0
March 29, 2012, 12:17:15 PM
#6
Why do you prefer the 7970 over the 6990?
newbie
Activity: 13
Merit: 0
March 27, 2012, 06:05:28 PM
#5
Those 7970s are amazing
full member
Activity: 239
Merit: 100
March 27, 2012, 06:01:30 PM
#4
Wow, 1onevvolf, thanks for the writeup! I wish there was a best-of I could nominate you to :)
newbie
Activity: 43
Merit: 0
January 17, 2012, 06:38:20 AM
#3
Am I to understand that none of the software out there for mining is taking advantage of the new GCN arch?

and if so, could I expect more performance out of my card?

Short answer: Yes, slightly better performance is possible.

Long answer: You can expect a little more performance, but unless there's a detail I'm not aware of, there is really not much left to gain (1-2% ideally). Let me explain why.

To the best of my knowledge, the closest estimate of the number of mathematical operations required to compute 1 hash is ~3375 (according to Phateus). And if we consider an ideally efficient processor to be one that computes mathematical operations at a rate of one operation per cycle, then hashing would take ~3375 cycles on this ideally efficient processor.

Now let's take a look at what kind of performance we can measure with today's kernels. The 7970 has 2048 stream processors and a stock frequency of 925MHz, and with the best known kernels it computes 550MH/s. Knowing this, we can estimate the average number of cycles each stream processor takes to compute one hash using the following equation:

Code:
               Stream Processor Count x GPU Frequency    2048 x 925MHz
Cycles/Hash =  -------------------------------------- =  ------------- = ~3444 cycles
                         Hashes per second                  550 MH/s

Now if we consider that each stream processor can at best perform one ALU instruction per cycle, then the 7970 is extremely efficient (in cycles per hash), since this 3444-cycle measurement is really close to the ideal value of 3375 cycles at one instruction per cycle. That is only ~2% off ideal and might even be due to measurement error. It's so efficient that, to the best of my knowledge, ~550MH/s at stock clocks is pretty much all we're ever going to get, unless one of three things happens: there is a breakthrough that reduces the number of operations required per hash, there's some new GCN instruction I'm unaware of that lets the GPU compute several steps of the hashing function in one cycle, or kernels are modified to take advantage of fixed-function hardware somehow.
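
To make the "not much left to gain" claim concrete, here is a rough Python sketch of the same equation plus the hashrate ceiling it implies. It assumes the ~3375 estimate is exact and that one ALU operation per stream processor per cycle is the hard limit; both are assumptions from this post, and the function names are hypothetical.

```python
# Sketch only: IDEAL_CYCLES_PER_HASH is the ~3375 estimate quoted above, and
# one ALU op per stream processor per cycle is an assumption, not a spec.
IDEAL_CYCLES_PER_HASH = 3375


def cycles_per_hash(stream_processors, core_mhz, mhash_per_s):
    """Average stream-processor cycles per hash (the MHz and MH/s units cancel)."""
    return stream_processors * core_mhz / mhash_per_s


def ceiling_mhash(stream_processors, core_mhz, ideal=IDEAL_CYCLES_PER_HASH):
    """Hashrate in MH/s if every hash cost exactly `ideal` cycles."""
    return stream_processors * core_mhz / ideal


# 7970 at stock clocks: 2048 stream processors, 925 MHz, 550 MH/s measured.
measured = cycles_per_hash(2048, 925, 550)
ceiling = ceiling_mhash(2048, 925)
headroom_pct = (ceiling / 550 - 1) * 100

print(f"{measured:.0f} cycles/hash, ceiling {ceiling:.0f} MH/s, "
      f"headroom {headroom_pct:.1f}%")
```

Under those assumptions the ceiling works out to roughly 561 MH/s at 925MHz, so about 2% above the 550MH/s that current kernels already reach.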

To give you an idea of how efficient the 7970 is at computing hashes, we can compare its efficiency (in cycles per hash) with a 6970, which has 1536 stream processors, a stock frequency of 880MHz, and a highest reported hashrate of 370MH/s at that frequency (from the mining hardware comparison chart):

Code:
               Stream Processor Count x GPU Frequency    1536 x 880MHz
Cycles/Hash =  -------------------------------------- =  ------------- = ~3653 cycles
                         Hashes per second                  370 MH/s

At an estimate of 3653 cycles, a 6970 stream processor takes ~6% more cycles per hash than a 7970 stream processor at the same frequency, and ~8% more than the ideal 1 instruction per cycle processor.

Now let's compare to a 5870, which has a highest reported hash rate of 379MH/s with its 1600 stream processors and a stock speed of 850MHz:

Code:
               Stream Processor Count x GPU Frequency    1600 x 850MHz
Cycles/Hash =  -------------------------------------- =  ------------- = ~3588 cycles
                         Hashes per second                  379 MH/s

This makes the 5870 roughly 2% more efficient (in cycles per hash) than the 6970, but it still uses ~4% more cycles per hash than the 7970, and ~6% more than the ideal processor. So we can conclude that AMD's GCN is already making ~98% efficient use of its stream processors for hashing, which is better than the VLIW4 (6970) and VLIW5 (5870) architectures of the previous two generations and close to the ideal. This more efficient stream processor usage, along with the increased number of stream processors and higher stock frequency, explains the increased hashing performance compared to the previous generations of GPUs.
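
The three-card comparison can be reproduced with a short Python sketch like the one below. The card figures are the ones quoted in this post; the CARDS table and the compare helper are hypothetical names, and "efficiency" here just means ideal cycles divided by measured cycles, assuming the ~3375 estimate holds.

```python
# Sketch only: figures are the stock clocks and best reported hashrates quoted
# above; 3375 is the estimated ideal cycles per hash (an assumption).
IDEAL_CYCLES_PER_HASH = 3375

# name: (stream processors, core clock in MHz, hashrate in MH/s)
CARDS = {
    "7970": (2048, 925, 550),
    "6970": (1536, 880, 370),
    "5870": (1600, 850, 379),
}


def cycles_per_hash(stream_processors, core_mhz, mhash_per_s):
    """Average stream-processor cycles per hash (the MHz and MH/s units cancel)."""
    return stream_processors * core_mhz / mhash_per_s


def compare(cards, ideal=IDEAL_CYCLES_PER_HASH):
    """Return {name: (cycles/hash, % over ideal, % efficiency)} for each card."""
    rows = {}
    for name, (sp, mhz, mhs) in cards.items():
        cph = cycles_per_hash(sp, mhz, mhs)
        rows[name] = (cph, (cph / ideal - 1) * 100, ideal / cph * 100)
    return rows


rows = compare(CARDS)
for name, (cph, over_pct, eff_pct) in rows.items():
    print(f"{name}: {cph:.0f} cycles/hash, {over_pct:.1f}% over ideal, "
          f"{eff_pct:.0f}% efficient")
```

Swapping in your own clock and hashrate is just a matter of adding a row to CARDS, which makes it easy to check an overclocked card against the same ideal.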

Disclaimer: I'm not a GPU programming expert (yet), so please take my answer with a grain of salt. But for what it's worth, I develop HPC software for a living that solves problems running on thousands of nodes in parallel.
full member
Activity: 210
Merit: 100
January 13, 2012, 05:58:44 AM
#2
DiabloMiner is being optimized for GCN (https://bitcointalksearch.org/topic/diablominer-gpu-miner-1721)
With time, the drivers will be refined as well. There should be plenty of unused potential in the new architecture.

Check these threads out:
https://bitcointalksearch.org/topic/3x7970-mining-results-57410
https://bitcointalksearch.org/topic/my-initial-radeon-hd-7970-mining-benchmarks-56630

Mind you, it's not pure performance that matters but power efficiency (performance for any given power usage).
vip
Activity: 571
Merit: 504
I still <3 u Satoshi
January 13, 2012, 01:07:15 AM
#1
I'm getting 668MH/s out of my 7970@1125

Am I to understand that none of the software out there for mining is taking advantage of the new GCN arch?

and if so, could I expect more performance out of my card?

yay first post.