Topic: optimal eth mining core:mem ratio 0.56

legendary
Activity: 1151
Merit: 1001
October 27, 2016, 03:26:26 AM
#10
I guess older cards (7xxx) have a bigger optimal ratio, because their memory subsystem implementation (cache? latency?) is worse.
For me the optimal 7950/7970 ratio is near 2:3, definitely above 0.6, maybe 0.62-0.64 (like 950/1500).
sr. member
Activity: 588
Merit: 251
October 26, 2016, 06:34:17 PM
#9
With a 280X, setting a core clock of 800MHz reduces the speed to about 10MH/s from 17MH/s.

The 280x has a 384-bit memory interface, not 256.

So what's the formula for the 280x? 290?

I only ever had one 280x and sold it months ago.  I have a 290x and have had some 290s before, and with a Stilt BIOS, the optimal ratio is close to 1:1.  I have my 290x clocked at 950/1050, which gives me ~26MH/s with Genoil's miner and a fraction over 27 with Wolf's sgminer fork.
sr. member
Activity: 588
Merit: 251
October 26, 2016, 06:25:59 PM
#8
But that ratio also depends on the memory timing. With lower latency, the ratio can be increased further.

With a stock BIOS and slower memory timing, you can get away with 0.54 or 0.55.  With a tuned BIOS, going over 0.56 is a waste of power.  It's impossible to get more than 24MH/s with a 1500MHz memory clock, no matter how much you tweak the memory timing.  And once you add the overhead for memory refresh and memory bus contention between the compute units, 22.5MH/s is about the max you can actually get.


How did you work out the ratio?

Trial and error with some interpolation, starting months ago when I wanted to reduce the power use for some R9 380 cards.  I remember starting at around 750MHz with a couple of R9 380 MSI cards running stock BIOS with memory at 1500.  Despite the >20% drop in core clock from 980MHz, hashrate only dropped ~10% (to around 19.5 from 21.5MH/s).  Doing a little math suggested the optimal core clock was in the low 800s, so I tried 10MHz increments from 800 to 850, finding that ~820 was the best.  With another R9 380 that had better stock memory timings, 840/1500 was the best (giving 22-22.5MH/s).  After tweaking the BIOS memory timings, with memory at 1600, 900 was the optimal core.  I recently got an R9 385 and an RX 470, and the 0.56 ratio worked for them as well.

I suspect that with your private GCN assembler kernel the optimal ratio would be lower, in the 0.4-0.5 range.
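
The post does not spell out the "little math" behind the low-800s estimate. One plausible reading, sketched in Python below, assumes that hashrate scales linearly with core clock while the card is core-bound and stays flat once it is memory-bound. The function name and parameters are hypothetical, chosen only for illustration.

```python
def estimate_knee_core_clock(core_low_mhz, hash_low, hash_plateau):
    """Estimate the core clock at which a card stops being core-bound.

    Assumption (not stated in the thread): below the knee, hashrate is
    proportional to core clock; above it, hashrate sits at hash_plateau.
    """
    if core_low_mhz <= 0 or hash_low <= 0:
        raise ValueError("core clock and hashrate must be positive")
    hash_per_mhz = hash_low / core_low_mhz  # slope of the core-bound region
    return hash_plateau / hash_per_mhz      # clock where the slope meets the plateau


# Figures from the post: ~19.5 MH/s at 750 MHz, ~21.5 MH/s at 980 MHz (plateau).
knee = estimate_knee_core_clock(750, 19.5, 21.5)
print(f"estimated knee: {knee:.0f} MHz")
```

With the numbers from the post this lands at roughly 827MHz, which fits the "low 800s" starting point and the ~820 found by stepping in 10MHz increments.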
legendary
Activity: 3808
Merit: 1723
October 26, 2016, 02:11:46 PM
#7
With a 280X, setting a core clock of 800MHz reduces the speed to about 10MH/s from 17MH/s.

The 280x has a 384-bit memory interface, not 256.


So what's the formula for the 280x? 290?
hero member
Activity: 751
Merit: 517
Fail to plan, and you plan to fail.
October 26, 2016, 01:55:14 PM
#6
Can confirm: a lot of trial and error over weeks got me to 1050 core/1870 mem as the most efficient setting for my RX 470s, and it fits the 0.56 ratio.
sr. member
Activity: 588
Merit: 251
October 26, 2016, 01:16:21 PM
#5
But that ratio also depends on the memory timing. With lower latency, the ratio can be increased further.

With a stock BIOS and slower memory timing, you can get away with 0.54 or 0.55.  With a tuned BIOS, going over 0.56 is a waste of power.  It's impossible to get more than 24MH/s with a 1500MHz memory clock, no matter how much you tweak the memory timing.  And once you add the overhead for memory refresh and memory bus contention between the compute units, 22.5MH/s is about the max you can actually get.
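
The 24MH/s ceiling is roughly what a raw bandwidth estimate gives. Here is a minimal Python sketch, assuming GDDR5 moves four transfers per pin per reported memory clock, a 256-bit bus, and 64 x 128 bytes of DAG reads per hash as described in the opening post. The function name and parameters are made up for illustration, and refresh and bus contention are ignored.

```python
def bandwidth_limited_hashrate(mem_clock_mhz, bus_width_bits=256,
                               transfers_per_clock=4,
                               bytes_per_hash=64 * 128):
    """Upper bound on ethash hashrate (MH/s) from memory bandwidth alone.

    Assumptions: GDDR5 quad data rate relative to the reported memory
    clock, and every DAG read goes out to memory (no caching benefit).
    """
    bytes_per_second = (mem_clock_mhz * 1e6 * transfers_per_clock
                        * bus_width_bits / 8)
    return bytes_per_second / bytes_per_hash / 1e6


for mem in (1500, 1750):
    print(f"{mem} MHz memory: {bandwidth_limited_hashrate(mem):.1f} MH/s at most")
```

At 1500MHz this works out to about 23.4MH/s, and at 1750MHz to about 27.3MH/s, in line with the 24 and 28 figures quoted in the thread.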
sr. member
Activity: 588
Merit: 251
October 26, 2016, 01:09:32 PM
#4
With a 280X, setting a core clock of 800MHz reduces the speed to about 10MH/s from 17MH/s.

The 280x has a 384-bit memory interface, not 256.
newbie
Activity: 38
Merit: 0
October 26, 2016, 11:52:09 AM
#3
But that ratio also depends on the memory timing. With lower latency, the ratio can be increased further.
legendary
Activity: 3808
Merit: 1723
October 26, 2016, 11:51:31 AM
#2
With a 280X, setting a core clock of 800MHz reduces the speed to about 10MH/s from 17MH/s.
sr. member
Activity: 588
Merit: 251
October 26, 2016, 11:44:32 AM
#1
Eth mining uses the ethash algorithm, which does a keccak hash (essentially sha3-512), 64 x 128-byte random DAG reads, then another keccak hash.  The keccak hashing is GPU intensive, while the DAG reads are memory intensive.  All the public miners use a similar OpenCL implementation, and so have similar GPU/MEM requirements.  For AMD cards with a 256-bit memory interface (R9 380, RX 470/480...), the optimal core:mem clock ratio works out to 0.56.  So with a 1500MHz memory clock, the optimal core clock is 840MHz.  Increasing the core clock beyond 840 will not increase hashrate, and ends up using more power.  Similarly, an RX 470 with a 1750MHz memory clock should have a core clock of 980MHz.
One additional condition is that the GPU needs enough compute units to saturate the memory bandwidth.  With the publicly available miners, that minimum is around 22-24 compute units, so cards like the R7 370 do not max out their memory bandwidth.

This ratio is not a fundamental limit of the ethash algorithm, so a kernel with a highly optimized keccak implementation (such as Wolf's private kernel) likely has a lower ratio and a lower minimum number of compute units required for optimal performance.  This means that while you won't see any miner get more than 28MH/s from an RX 470 at 1750MHz, someone could release a miner that reaches its maximum hashrate at a core clock much lower than 980MHz, and therefore reduces power consumption.
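
The clock arithmetic above can be written as a small Python helper. This is only a sketch of the rule of thumb from this post: the function name is hypothetical, and the 0.56 default applies to the 256-bit AMD cards and public miners discussed here, not to other hardware or kernels.

```python
def optimal_core_clock(mem_clock_mhz, ratio=0.56):
    """Core clock (MHz) suggested by the core:mem rule of thumb.

    ratio=0.56 is the empirical figure from this post for 256-bit AMD
    cards running the public OpenCL miners; treat it as a starting point.
    """
    return round(mem_clock_mhz * ratio)


for mem in (1500, 1750):
    print(f"mem {mem} MHz -> core {optimal_core_clock(mem)} MHz")
```

This reproduces the 840MHz and 980MHz figures for 1500MHz and 1750MHz memory. Passing a different `ratio` lets you try other values for a given BIOS or kernel.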
