
Topic: Custom RAM Timings for GPU's with GDDR5 - DOWNLOAD LINKS - UPDATED - page 11. (Read 155645 times)

sr. member
Activity: 652
Merit: 266
Green? nVidia tuning?

Sorry if it's a newbie question but... I'm really interested. If it's true, I can run naked in the street to celebrate  Grin

The Green GPUs are of more interest. Meanwhile, even if pushed to 33 MH/s and undervolted to 100-110 W, RX 400/500 isn't that interesting anymore.

Nvidia timing mods are rather restricted for GTX at the moment, but I'm still collecting a list of interested people:

https://pastebin.com/63RWr85T
Are you sure those are restricted? Last time I checked, Nvidia had published it publicly Cheesy

legendary
Activity: 1302
Merit: 1068
Green? nVidia tuning?

Sorry if it's a newbie question but... I'm really interested. If it's true, I can run naked in the street to celebrate  Grin

The Green GPUs are of more interest. Meanwhile, even if pushed to 33 MH/s and undervolted to 100-110 W, RX 400/500 isn't that interesting anymore.

Nvidia timing mods are rather restricted for GTX at the moment, but I'm still collecting a list of interested people:

https://pastebin.com/63RWr85T
member
Activity: 81
Merit: 1002
It was only the wind.
Explain, please, how can I save original bios (aka atiflash in win), modify timings (aka polarisbioseditor in win) and flash a new bios to card (aka atiflash in win) in ubuntu server, without gui?
You might flash the bios without gui, but there's obviously no way to edit the timings since you need a gui software for that. You'll need to find a windows machine to do that.

No you don't - PBE is shit anyways.
jr. member
Activity: 196
Merit: 1
damn I've been out of the GPU game for past few months and you guys are making quite the progress on this front.
Green team is more interesting lately Cheesy

Quote
--> CONFIG0
   RC = 88,  RFC = 220,  RAS = 60,  RP = 28,  RESERVED = 0

--> CONFIG1
   CL = 25,  WL = 5,  RD_RCD = 30,  WR_RCD = 18,  RESERVED = 10

--> CONFIG2
   RPRE = 0,  WPRE = 0,  CDLR = 10,  PAD0 = 0,  WR = 28,  PAD1 = 0,  W2R_BUS = 8,  R2W_BUS = 8

--> CONFIG3
   PDEX = 12,  PDEN2PDEX = 10,  FAW = 32,  AOND = 0,  CCDL = 3,  CCDS = 2

--> CONFIG4
   REFRESH_LO = 2,  REFRESH = 6,  RRD = 8,  DELAY0 = 44,  RESERVED = 0

--> CONFIG5
   ADR_MIN = 2,  PAD0 = 0,  WRCRC = 14,  PAD1 = 0,  OFFSET0 = 39,  INTRP_MSB = 0,  OFFSET1 = 11,  OFFSET2 = 6,  INTRP = 10

Green? nVidia tuning?

Sorry if it's a newbie question but... I'm really interested. If it's true, I can run naked in the street to celebrate  Grin
sr. member
Activity: 652
Merit: 266
If my suspicion about the labels being reversed is correct, and 32AW is a multiplier for FAW, then 6/8 is perfectly fine.  FAW would be 6*RRD, and 32AW would be 8*6*RRD.
That also would better explain the observation that setting what they think is 32AW to zero causes instability.

As for the tool I'm working on, you can get an idea of how it works by looking at my fork of amdmeminfo.
https://github.com/nerdralph/amdmeminfo
I modified it to show runtime strap values.  My tool doesn't just read them, it writes them too.
I believe Yurio is ahead of me in this process with his Windows miner now being able to do runtime timing modifications.
https://github.com/zawawawa/GatelessGateSharp




That formula seems wrong, or I am missing something.

Elpida is as follows:

1500: 6/10/7
1625: 7/12/8
1750: 7/12/8
2000: 7/12/8

No matter how you put it, neither tFAW nor tFAW32 can be a multiple of tRRD.
Currently using tFAW = 0 and tFAW32 = 4.

From what I remember, if tFAW is set to 0, it's actually read as 8. I don't know whether each single value counts as 8 or whether it's just a timing property. If it's the former, that would make sense for the 1625-2000 MHz straps. It would make the following possible: tFAW = 7 * 8, which would also fall in line with what you said.

I said, "32AW is a multiplier for FAW", not a "multiple of".  So programming memory controller to use 8 for 32AW means 8*FAW, and a value of 6 for FAW means 6*RRD.
Using anything lower than 8 for 32AW should be irrelevant, as should using anything lower than 4 for FAW.
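
To make the "multiplier" reading concrete, here is a minimal Python sketch. It assumes the interpretation proposed in this post (the FAW field counts in units of tRRD, and the 32AW field counts in units of tFAW), which is a hypothesis and not documented controller behaviour. The function name is hypothetical, and the tRRD value of 5 is only an example.

```python
def effective_windows(rrd, faw_field, aw32_field):
    """Return (tFAW, t32AW) in memory clocks under the 'multiplier' hypothesis.

    Assumption (from the post, not from a datasheet): the FAW field is a
    multiplier on tRRD, and the 32AW field is a multiplier on tFAW.
    """
    t_faw = faw_field * rrd
    t_32aw = aw32_field * t_faw
    return t_faw, t_32aw


# Samsung-style 6/8 setting with an example tRRD of 5 clocks.
t_faw, t_32aw = effective_windows(rrd=5, faw_field=6, aw32_field=8)
print(t_faw, t_32aw)
```

Under this reading, a FAW field below 4 could never be the limiting constraint, since a fifth activate spaced tRRD apart from the previous ones already arrives 4*tRRD after the first; that would be consistent with the remark that lower values should be irrelevant.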

Another complication in finding optimal timings is that they vary slightly according to the memory controller characteristics.  Newer GPUs like Tonga and Polaris have improved memory controllers, likely with deeper queues, which allows more memory accesses to be grouped together in the same bank, reducing the impact of FAW & 32AW.  These newer cards tend to benefit from slightly looser timing (CAS, RRD & FAW) with a higher memory clock.  Older GPUs like Pitcairn seem to be better with a slightly lower memory clock but tighter timing.


The GDDR5 JEDEC standard (page 58) says it clearly,
although there are some workarounds that are vendor specific.
sr. member
Activity: 652
Merit: 266
damn I've been out of the GPU game for past few months and you guys are making quite the progress on this front.
Green team is more interesting lately Cheesy

Quote
--> CONFIG0
   RC = 88,  RFC = 220,  RAS = 60,  RP = 28,  RESERVED = 0

--> CONFIG1
   CL = 25,  WL = 5,  RD_RCD = 30,  WR_RCD = 18,  RESERVED = 10

--> CONFIG2
   RPRE = 0,  WPRE = 0,  CDLR = 10,  PAD0 = 0,  WR = 28,  PAD1 = 0,  W2R_BUS = 8,  R2W_BUS = 8

--> CONFIG3
   PDEX = 12,  PDEN2PDEX = 10,  FAW = 32,  AOND = 0,  CCDL = 3,  CCDS = 2

--> CONFIG4
   REFRESH_LO = 2,  REFRESH = 6,  RRD = 8,  DELAY0 = 44,  RESERVED = 0

--> CONFIG5
   ADR_MIN = 2,  PAD0 = 0,  WRCRC = 14,  PAD1 = 0,  OFFSET0 = 39,  INTRP_MSB = 0,  OFFSET1 = 11,  OFFSET2 = 6,  INTRP = 10
sr. member
Activity: 2632
Merit: 328
Does anyone have an idea why the 280X, with its huge memory bandwidth, has such poor ethash performance?
legendary
Activity: 2174
Merit: 1401
damn I've been out of the GPU game for past few months and you guys are making quite the progress on this front.
jr. member
Activity: 39
Merit: 2
Any users here with an R9 380 4GB? I modified the BIOSes a month ago with a hex editor (strings from an RX 470 at 1500 MHz and 1625 MHz) and they worked fine, at about 22.6 MH/s on ETH. A few hours ago, one of my cards stopped mining: 2 minutes after sgminer starts, it is declared DEAD. I also tried different OC settings, with no success.

If I put in other strings, this card starts working again, but with a lower hashrate (about 20 MH/s).

What's the problem? Can anybody help me with new strings? I can upload the stock BIOS and the modded BIOS here ...
member
Activity: 340
Merit: 29

If my suspicion about the labels being reversed is correct, and 32AW is a multiplier for FAW, then 6/8 is perfectly fine.  FAW would be 6*RRD, and 32AW would be 8*6*RRD.
That also would better explain the observation that setting what they think is 32AW to zero causes instability.

As for the tool I'm working on, you can get an idea of how it works by looking at my fork of amdmeminfo.
https://github.com/nerdralph/amdmeminfo
I modified it to show runtime strap values.  My tool doesn't just read them, it writes them too.
I believe Yurio is ahead of me in this process with his Windows miner now being able to do runtime timing modifications.
https://github.com/zawawawa/GatelessGateSharp


Of course - that makes more sense...

Thanks for the links - time to grab a beer and start reading code! Smiley
sr. member
Activity: 588
Merit: 251
If my suspicion about the labels being reversed is correct, and 32AW is a multiplier for FAW, then 6/8 is perfectly fine.  FAW would be 6*RRD, and 32AW would be 8*6*RRD.
That also would better explain the observation that setting what they think is 32AW to zero causes instability.

As for the tool I'm working on, you can get an idea of how it works by looking at my fork of amdmeminfo.
https://github.com/nerdralph/amdmeminfo
I modified it to show runtime strap values.  My tool doesn't just read them, it writes them too.
I believe Yurio is ahead of me in this process with his Windows miner now being able to do runtime timing modifications.
https://github.com/zawawawa/GatelessGateSharp




That formula seems wrong, or I am missing something.

Elpida is as follows:

1500: 6/10/7
1625: 7/12/8
1750: 7/12/8
2000: 7/12/8

No matter how you put it, neither tFAW nor tFAW32 can be a multiple of tRRD.
Currently using tFAW = 0 and tFAW32 = 4.

From what I remember, if tFAW is set to 0, it's actually read as 8. I don't know whether each single value counts as 8 or whether it's just a timing property. If it's the former, that would make sense for the 1625-2000 MHz straps. It would make the following possible: tFAW = 7 * 8, which would also fall in line with what you said.

I said, "32AW is a multiplier for FAW", not a "multiple of".  So programming memory controller to use 8 for 32AW means 8*FAW, and a value of 6 for FAW means 6*RRD.
Using anything lower than 8 for 32AW should be irrelevant, as should using anything lower than 4 for FAW.

Another complication in finding optimal timings is that they vary slightly according to the memory controller characteristics.  Newer GPUs like Tonga and Polaris have improved memory controllers, likely with deeper queues, which allows more memory accesses to be grouped together in the same bank, reducing the impact of FAW & 32AW.  These newer cards tend to benefit from slightly looser timing (CAS, RRD & FAW) with a higher memory clock.  Older GPUs like Pitcairn seem to be better with a slightly lower memory clock but tighter timing.

jr. member
Activity: 194
Merit: 4
I recently decided to take another look at RAM timings now that many people have been playing with custom timings for several months.

Instead of zeroing FAW/32AW as I do in my strapmod utility, I think it is better to set them low.  7/10 seems to be standard for Hynix straps, while 6/8 is typical for Samsung.  My testing so far using 6/8 on Hynix gives the same performance for eth mining as 0/0, while potentially being more stable at higher memory clocks.
I also suspect the field labels may be reversed, so FAW is really 32AW, and vice versa.  In my testing, when I set "FAW" to 8 or 10 while leaving "32AW" at zero, I see no slowdown in ethash speed.

I may update my strapmod when I'm done, but my goal is to finally finish a Linux tool I started working on last year that tweaks timings at runtime.  This allows for tuning the memory timing for ethash, equihash, or cryptonight without reflashing the BIOS and rebooting.  It has already greatly improved strap testing, as I have been able to modify timings on the fly while mining; I don't even have to restart the miner.  I still crash the GPU a lot, as dynamically switching between two sets of good timing will still cause a hang depending on which timings you switch in what order.


I have sort of been under the assumption that in the presence of an invalid value, memory controllers must be calculating t32aw automatically.  Every strap I've ever seen (including from mfgs) is out of spec, given that t32aw should always be >= 8 x tfaw, as i understand it.

Btw, would love to take a peek at your tool Smiley



If my suspicion about the labels being reversed is correct, and 32AW is a multiplier for FAW, then 6/8 is perfectly fine.  FAW would be 6*RRD, and 32AW would be 8*6*RRD.
That also would better explain the observation that setting what they think is 32AW to zero causes instability.

As for the tool I'm working on, you can get an idea of how it works by looking at my fork of amdmeminfo.
https://github.com/nerdralph/amdmeminfo
I modified it to show runtime strap values.  My tool doesn't just read them, it writes them too.
I believe Yurio is ahead of me in this process with his Windows miner now being able to do runtime timing modifications.
https://github.com/zawawawa/GatelessGateSharp




That formula seems wrong, or I am missing something.

Elpida is as follows:

1500: 6/10/7
1625: 7/12/8
1750: 7/12/8
2000: 7/12/8

No matter how you put it, neither tFAW nor tFAW32 can be a multiple of tRRD.
Currently using tFAW = 0 and tFAW32 = 4.

From what I remember, if tFAW is set to 0, it's actually read as 8. I don't know whether each single value counts as 8 or whether it's just a timing property. If it's the former, that would make sense for the 1625-2000 MHz straps. It would make the following possible: tFAW = 7 * 8, which would also fall in line with what you said.
sr. member
Activity: 588
Merit: 251
I recently decided to take another look at RAM timings now that many people have been playing with custom timings for several months.

Instead of zeroing FAW/32AW as I do in my strapmod utility, I think it is better to set them low.  7/10 seems to be standard for Hynix straps, while 6/8 is typical for Samsung.  My testing so far using 6/8 on Hynix gives the same performance for eth mining as 0/0, while potentially being more stable at higher memory clocks.
I also suspect the field labels may be reversed, so FAW is really 32AW, and vice versa.  In my testing, when I set "FAW" to 8 or 10 while leaving "32AW" at zero, I see no slowdown in ethash speed.

I may update my strapmod when I'm done, but my goal is to finally finish a Linux tool I started working on last year that tweaks timings at runtime.  This allows for tuning the memory timing for ethash, equihash, or cryptonight without reflashing the BIOS and rebooting.  It has already greatly improved strap testing, as I have been able to modify timings on the fly while mining; I don't even have to restart the miner.  I still crash the GPU a lot, as dynamically switching between two sets of good timing will still cause a hang depending on which timings you switch in what order.


I have sort of been under the assumption that in the presence of an invalid value, memory controllers must be calculating t32aw automatically.  Every strap I've ever seen (including from mfgs) is out of spec, given that t32aw should always be >= 8 x tfaw, as i understand it.

Btw, would love to take a peek at your tool Smiley

If my suspicion about the labels being reversed is correct, and 32AW is a multiplier for FAW, then 6/8 is perfectly fine.  FAW would be 6*RRD, and 32AW would be 8*6*RRD.
That also would better explain the observation that setting what they think is 32AW to zero causes instability.

As for the tool I'm working on, you can get an idea of how it works by looking at my fork of amdmeminfo.
https://github.com/nerdralph/amdmeminfo
I modified it to show runtime strap values.  My tool doesn't just read them, it writes them too.
I believe Yurio is ahead of me in this process with his Windows miner now being able to do runtime timing modifications.
https://github.com/zawawawa/GatelessGateSharp


jr. member
Activity: 194
Merit: 4
I have found a new, better memory strap for Elpida RAM. More than 1 kH/s on cryptonight (XMR)! Cheesy

RX 470 : 1020 h/s  , sgminer @1250 core, @2100 mem, 66W (gpu-z)

                   942 h/s , @1150 core, @2050 mem, 52W (gpu-z)

777000000000000022AA1C00AC615B3CA0550F142C8C1506006004007C041420CA8980A9020004C01712262B612B3715

I don't know the ETH hash speed.  Huh Maybe somebody can check it?
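
For readers who want to sanity-check a strap like the one above before flashing it, here is a minimal Python sketch. It assumes the strap is 48 bytes (96 hex characters) made of twelve 32-bit registers, and that the last two registers are ARB_DRAM_TIMING and ARB_DRAM_TIMING2 with one byte per field. That layout is an assumption for illustration (the helper name is hypothetical), although the decoded values do agree with the field dump posted further down in this thread.

```python
RAW_STRAP = (
    "777000000000000022AA1C00AC615B3CA0550F142C8C1506"
    "006004007C041420CA8980A9020004C 01712262B612B3715"
)


def split_strap(raw):
    """Strip whitespace (e.g. forum line wrapping) and split into 4-byte words."""
    data = bytes.fromhex("".join(raw.split()))
    if len(data) != 48:
        raise ValueError(f"expected 48 bytes, got {len(data)}")
    return [data[i:i + 4] for i in range(0, len(data), 4)]


words = split_strap(RAW_STRAP)

# Assumed layout: one byte per field, in the order the fields are listed.
arb_timing = dict(zip(("ACTRD", "ACTWR", "RASMACTRD", "RASMACTWR"), words[10]))
arb_timing2 = dict(zip(("RAS2RAS", "RP", "WRPLUSRP", "BUS_TURN"), words[11]))

print(arb_timing)
print(arb_timing2)
```

A length check like this catches the most common copy-paste damage (a dropped or extra character) before the string gets anywhere near a BIOS editor.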

This strap is bad. Both of my cards BSOD hard with it. It's also slower and draws more power. TRFC and RAS2RAS are too tight.

I often see TRFC and RAS2RAS for Elpida straps at that value.

It's actually the default value for the 1500 MHz strap.
Make it better

My bad, I didn't say that properly. TRFC and RAS2RAS are nearly useless for cryptonight; you lose about 10 h/s when going to 150, for example, but power consumption drops by 5 watts. That's for cryptonight.

Also, the TRCD values are not set properly: each A value should be the same as the non-A one.

Make it better! I have tested so many straps, and these are the best I could find.

Okay...

####SEQ_RAS_TIMING####
TRCDW = 12
TRCDWA = 13
TRCDR = 24
TRCDRA = 22
TRRD = 5
TRC = 60
####SEQ_CAS_TIMING####
TNOPW = 0
TNOPR = 0
TR2W = 26
TCCDL = 2
TCCDS = 5
TW2R = 15
TCL = 20
####SEQ_MISC_TIMING####
TRP_WRA = 44
TRP_RDA = 24
TRP = 22
TRFC = 97
####SEQ_MISC_TIMING2####
PA2RDATA = 0
PA2WDATA = 0
TFAW = 0
TCRCRL = 3
TCRCWL = 4
T32AW = 0
TWDATATR = 0
####ARB_DRAM_TIMING####
ACTRD = 23
ACTWR = 18
RASMACTRD = 38
RASMACTWR = 43
####ARB_DRAM_TIMING2####
RAS2RAS = 97
RP = 43
WRPLUSRP = 55
BUS_TURN = 21


This is yours.


Below is yours, with the fixes applied.

####SEQ_RAS_TIMING####
TRCDW = 12
TRCDWA = 12
TRCDR = 24
TRCDRA = 24
TRRD = 5
TRC = 60
####SEQ_CAS_TIMING####
TNOPW = 0
TNOPR = 0
TR2W = 26
TCCDL = 2
TCCDS = 5
TW2R = 15
TCL = 20
####SEQ_MISC_TIMING####
TRP_WRA = 44
TRP_RDA = 24
TRP = 22
TRFC = 97
####SEQ_MISC_TIMING2####
PA2RDATA = 0
PA2WDATA = 0
TFAW = 0
TCRCRL = 2
TCRCWL = 2
T32AW = 4
TWDATATR = 0
####ARB_DRAM_TIMING####
ACTRD = 23
ACTWR = 18
RASMACTRD = 38
RASMACTWR = 43
####ARB_DRAM_TIMING2####
RAS2RAS = 97
RP = 43
WRPLUSRP = 55
BUS_TURN = 21


See the difference?

TRFC and RAS2RAS are left like that, since you use the same strap for Ethash. Otherwise, you increase them by 50% for Cryptonight.
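
For anyone who does not want to compare the two dumps by eye, here is a minimal Python sketch that parses both and prints only the fields that changed. The parsing approach and function names are my own illustration, not part of any strap tool; the values are copied from the two dumps above, one section per line to save space.

```python
import re

ORIGINAL = """
####SEQ_RAS_TIMING#### TRCDW = 12 TRCDWA = 13 TRCDR = 24 TRCDRA = 22 TRRD = 5 TRC = 60
####SEQ_CAS_TIMING#### TNOPW = 0 TNOPR = 0 TR2W = 26 TCCDL = 2 TCCDS = 5 TW2R = 15 TCL = 20
####SEQ_MISC_TIMING#### TRP_WRA = 44 TRP_RDA = 24 TRP = 22 TRFC = 97
####SEQ_MISC_TIMING2#### PA2RDATA = 0 PA2WDATA = 0 TFAW = 0 TCRCRL = 3 TCRCWL = 4 T32AW = 0 TWDATATR = 0
####ARB_DRAM_TIMING#### ACTRD = 23 ACTWR = 18 RASMACTRD = 38 RASMACTWR = 43
####ARB_DRAM_TIMING2#### RAS2RAS = 97 RP = 43 WRPLUSRP = 55 BUS_TURN = 21
"""

FIXED = """
####SEQ_RAS_TIMING#### TRCDW = 12 TRCDWA = 12 TRCDR = 24 TRCDRA = 24 TRRD = 5 TRC = 60
####SEQ_CAS_TIMING#### TNOPW = 0 TNOPR = 0 TR2W = 26 TCCDL = 2 TCCDS = 5 TW2R = 15 TCL = 20
####SEQ_MISC_TIMING#### TRP_WRA = 44 TRP_RDA = 24 TRP = 22 TRFC = 97
####SEQ_MISC_TIMING2#### PA2RDATA = 0 PA2WDATA = 0 TFAW = 0 TCRCRL = 2 TCRCWL = 2 T32AW = 4 TWDATATR = 0
####ARB_DRAM_TIMING#### ACTRD = 23 ACTWR = 18 RASMACTRD = 38 RASMACTWR = 43
####ARB_DRAM_TIMING2#### RAS2RAS = 97 RP = 43 WRPLUSRP = 55 BUS_TURN = 21
"""

# Matches either a "####SECTION####" header or a "NAME = value" pair.
TOKEN = re.compile(r"####(\w+)####|(\w+)\s*=\s*(\d+)")


def parse_dump(text):
    """Return {(section, field): value} for a timing dump."""
    fields, section = {}, None
    for match in TOKEN.finditer(text):
        if match.group(1):
            section = match.group(1)
        else:
            fields[(section, match.group(2))] = int(match.group(3))
    return fields


def diff_dumps(old, new):
    """Return {(section, field): (old_value, new_value)} for changed fields."""
    return {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}


changes = diff_dumps(parse_dump(ORIGINAL), parse_dump(FIXED))
for (section, name), (before, after) in changes.items():
    print(f"{section}.{name}: {before} -> {after}")
```

Only five fields differ: the two "A" TRCD values are matched to their non-A counterparts, TCRCRL and TCRCWL are lowered, and T32AW goes from 0 to 4.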

I already gave you criticism; now I'm even giving you an example of how it could be better in terms of stability and speed.
jr. member
Activity: 194
Merit: 4
Heh, got my Elpida strap stable.

At 1200/2000 MHz, I am getting 975 h/s with 896/8/2 and Compute Mode.

At 1280/2000 MHz - 1020 h/s.

With the other card, I am getting 1068 h/s at 1380/2000, or 1038 h/s at 1280/2000.

Can you share your straps?

It's not finished, and not 100% stable on both cards. I don't have enough free time to update it.

Another tweak I've wondered about is monitoring memory die temps.  GDDR5 chips have internal thermal diodes similar to the GPU, and they can be enabled by setting a bit in mode register 7.  Knowing the temperature of each memory die could provide useful diagnostic info for problems such as a heatsink not having good contact with the chip.


Yes!  Mem temp management is critical to getting Vegas stable - I'm sure it would be hugely useful for stabilizing polaris as well.  I now use a rule of thumb of core generally being about 10º cooler than mem, but clearly this is a very rough estimate, and depends significantly on heat sink design (or lack thereof.)

Also might give some sense of the 'asic quality' of the mem (if that's a thing...)

From personal experience, TFAW = 0 is not an issue, but setting TFAW32 = 0 alongside it causes instability, often seen with monitor sleep and background refreshes. I'm using TFAW32 = 4.
I recently decided to take another look at RAM timings now that many people have been playing with custom timings for several months.

Instead of zeroing FAW/32AW as I do in my strapmod utility, I think it is better to set them low.  7/10 seems to be standard for Hynix straps, while 6/8 is typical for Samsung.  My testing so far using 6/8 on Hynix gives the same performance for eth mining as 0/0, while potentially being more stable at higher memory clocks.
I also suspect the field labels may be reversed, so FAW is really 32AW, and vice versa.  In my testing, when I set "FAW" to 8 or 10 while leaving "32AW" at zero, I see no slowdown in ethash speed.

I may update my strapmod when I'm done, but my goal is to finally finish a Linux tool I started working on last year that tweaks timings at runtime.  This allows for tuning the memory timing for ethash, equihash, or cryptonight without reflashing the BIOS and rebooting.  It has already greatly improved strap testing, as I have been able to modify timings on the fly while mining; I don't even have to restart the miner.  I still crash the GPU a lot, as dynamically switching between two sets of good timing will still cause a hang depending on which timings you switch in what order.


I have sort of been under the assumption that in the presence of an invalid value, memory controllers must be calculating t32aw automatically.  Every strap I've ever seen (including from mfgs) is out of spec, given that t32aw should always be >= 8 x tfaw, as i understand it.

Btw, would love to take a peek at your tool Smiley

Well, 8*0 is 0 Cheesy
There is probably something like that as well, but there is clearly a difference for me, higher instability when both are 0.
member
Activity: 340
Merit: 29
I recently decided to take another look at RAM timings now that many people have been playing with custom timings for several months.

Instead of zeroing FAW/32AW as I do in my strapmod utility, I think it is better to set them low.  7/10 seems to be standard for Hynix straps, while 6/8 is typical for Samsung.  My testing so far using 6/8 on Hynix gives the same performance for eth mining as 0/0, while potentially being more stable at higher memory clocks.
I also suspect the field labels may be reversed, so FAW is really 32AW, and vice versa.  In my testing, when I set "FAW" to 8 or 10 while leaving "32AW" at zero, I see no slowdown in ethash speed.

I may update my strapmod when I'm done, but my goal is to finally finish a Linux tool I started working on last year that tweaks timings at runtime.  This allows for tuning the memory timing for ethash, equihash, or cryptonight without reflashing the BIOS and rebooting.  It has already greatly improved strap testing, as I have been able to modify timings on the fly while mining; I don't even have to restart the miner.  I still crash the GPU a lot, as dynamically switching between two sets of good timing will still cause a hang depending on which timings you switch in what order.


I have sort of been under the assumption that in the presence of an invalid value, memory controllers must be calculating t32aw automatically.  Every strap I've ever seen (including from mfgs) is out of spec, given that t32aw should always be >= 8 x tfaw, as i understand it.

Btw, would love to take a peek at your tool Smiley
member
Activity: 340
Merit: 29

Another tweak I've wondered about is monitoring memory die temps.  GDDR5 chips have internal thermal diodes similar to the GPU, and they can be enabled by setting a bit in mode register 7.  Knowing the temperature of each memory die could provide useful diagnostic info for problems such as a heatsink not having good contact with the chip.


Yes!  Mem temp management is critical to getting Vegas stable - I'm sure it would be hugely useful for stabilizing polaris as well.  I now use a rule of thumb of core generally being about 10º cooler than mem, but clearly this is a very rough estimate, and depends significantly on heat sink design (or lack thereof.)

Also might give some sense of the 'asic quality' of the mem (if that's a thing...)
full member
Activity: 504
Merit: 122
Heh, got my Elpida strap stable.

At 1200/2000 MHz, I am getting 975 h/s with 896/8/2 and Compute Mode.

At 1280/2000 MHz - 1020 h/s.

With the other card, I am getting 1068 h/s at 1380/2000, or 1038 h/s at 1280/2000.

Can you share your straps?
full member
Activity: 364
Merit: 106
ONe Social Network.
I'm looking for better Samsung cryptonight straps than ubermix 3.1 that give a significant speed boost at a 1150-1200 core clock. I don't want to overclock to 1300+ because that means 50% more power draw for 100 h/s.

Would trade for Hynix straps with 940 h/s @ 1180 core / 2080 mem clock on an RX 580.
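
The power-versus-speed trade-off mentioned here can be put into numbers as hashes per watt. Below is a minimal Python sketch using the RX 470 Elpida figures quoted elsewhere in this thread (1020 h/s at 66 W and 942 h/s at 52 W). Those wattages are GPU-Z readings and probably do not reflect wall power, so treat the result as a rough comparison only; the function name is hypothetical.

```python
def hashes_per_watt(hashrate_hs, power_w):
    """Hashrate divided by reported power draw."""
    return hashrate_hs / power_w


high = (1020, 66)  # h/s, W at 1250 core / 2100 mem
low = (942, 52)    # h/s, W at 1150 core / 2050 mem

eff_high = hashes_per_watt(*high)
eff_low = hashes_per_watt(*low)

# Extra hashes gained per extra watt when moving from the low to the high setting.
marginal = (high[0] - low[0]) / (high[1] - low[1])

print(f"high: {eff_high:.2f} h/s per W")
print(f"low:  {eff_low:.2f} h/s per W")
print(f"marginal: {marginal:.2f} h/s per extra W")
```

The marginal figure is the useful one when deciding whether a higher core clock is worth it: if it is far below the average efficiency, the last few hashes are expensive.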




sr. member
Activity: 2632
Merit: 328
I have found a new, better memory strap for Elpida RAM. More than 1 kH/s on cryptonight (XMR)! Cheesy

RX 470 : 1020 h/s  , sgminer @1250 core, @2100 mem, 66W (gpu-z)

                   942 h/s , @1150 core, @2050 mem, 52W (gpu-z)

777000000000000022AA1C00AC615B3CA0550F142C8C1506006004007C041420CA8980A9020004C01712262B612B3715

I don't know the ETH hash speed.  Huh Maybe somebody can check it?

This strap is bad. Both of my cards BSOD hard with it. It's also slower and draws more power. TRFC and RAS2RAS are too tight.

I often see TRFC and RAS2RAS for Elpida straps at that value.

It's actually the default value for the 1500 MHz strap.
Make it better

My bad, I didn't say that properly. TRFC and RAS2RAS are nearly useless for cryptonight; you lose about 10 h/s when going to 150, for example, but power consumption drops by 5 watts. That's for cryptonight.

Also, the TRCD values are not set properly: each A value should be the same as the non-A one.

Make it better! I have tested so many straps, and these are the best I could find.

Good idea Smiley

And it's a shame you have no merit despite helping people a lot, so let me give you your first one.