
Topic: Custom RAM Timings for GPU's with GDDR5 - DOWNLOAD LINKS - UPDATED - page 31. (Read 155645 times)

sr. member
Activity: 588
Merit: 251
Anyway to know which rx gpu models have mem heatsinks? and how much do the heatsinks matter?

I'd say active cooling for the RAM makes a 5% difference.  i.e. overclocking to 2100 vs 2000 for 1750-rated RAM.
sr. member
Activity: 588
Merit: 251
I finally got my hands on the JEDEC GDDR5 spec, and am now convinced that there are still some mistakes (or at least mis-naming) in the strap decoding previously discussed.  One thing I noticed a few weeks ago was that tR2R is not mentioned in the Hynix datasheet.  It's not in the JEDEC spec either (which unsurprisingly is the source of much of the Hynix datasheet).  The closest thing I can find is tCCDS and tCCDL.  However if the field labeled tR2R is 5 clocks, then it can't be either CCDS or CCDL, since CCDS is always 2 clocks, and CCDL is either 2 (bank groups disabled) or 3/4 (bank groups enabled).


No, it's right. TR2R is not mentioned because it's ALWAYS going to be 5 for GDDR5.

OK, so what exactly is it?  If it's not the number of cycles between reads on different banks (since that is tCCD), is it the number of cycles required between 2 reads from the same open page?


I think so, as I know it's calculated from burst length, but I'd have to look up the exact formula again.

My Sapphire Rx470/K4G4 has been running for about a half hour with tR2R=4.  No change in hashrate.  Still at a loss as to what it actually is.

I've also thought more about what tRRD is, and it must be tRRDL.  tRRDS doesn't seem to be in the strap, and is possibly fixed at 4.
sr. member
Activity: 430
Merit: 254
Anyway to know which rx gpu models have mem heatsinks? and how much do the heatsinks matter?

Try here:

http://www.overclock.net/t/1605802/official-polaris-owners-club/
sr. member
Activity: 689
Merit: 253
Anyway to know which rx gpu models have mem heatsinks? and how much do the heatsinks matter?
sr. member
Activity: 588
Merit: 251
nerdralph's goddecode edit :

Code:
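#include <stdint.h>

/* MC_SEQ_MISC_TIMING layout; on the usual little-endian compilers the first member is the lowest bits. */
typedef struct _SEQ_MISC_TIMING_FORMAT
{
    uint32_t TRP_WRA : 7;  /* write with auto-precharge to active */
    uint32_t TRP_RDA : 7;  /* read with auto-precharge to active */
    uint32_t TRP : 6;      /* precharge command period */
    uint32_t TRFC : 9;     /* auto-refresh command period */
    uint32_t Pad0 : 3;     /* unused */
} SEQ_MISC_TIMING_FORMAT;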

Niko2004 :

Code:
MC_SEQ_MISC_TIMING_RX=BitStruct("MC_SEQ_MISC_TIMING_RX", #last field is lowest bits
  Bits("unused3", 3), #Unused
  Bits("TRFC", 9),    #Auto-refresh command period - 1
  Bits("unused2", 1), #Unused
  Bits("TRP", 5),     #Precharge command period - 1
  Bits("unused1", 1), #Unused
  Bits("TRP_RDA", 6), #From read with auto-precharge to active - 1
  Bits("TRP_WRA", 7), #From write with auto-precharge to active - 1
)

TRP_RDA
TRP

Not same length.
Which one is the real one?

Mine are correct.   From a chip-design perspective, masking off the "unused" bits would actually add complexity.  The simple, and most logical conclusion is that the fields can store larger values than are required by the specs for currently-produced GDDR5 RAM.  I also suspect if anyone took the time to look at the JEDEC GDDR5 spec, you'd at least see tRP needs to support more than 5 bits (max 31).

To be 100% certain, on my Rx470 (Samsung K4G4) I changed MC_SEQ_MISC_TIMING to 0x09D82033:
TRP_WRA=51 TRP_RDA=64 TRP=32 TRFC=157 Pad0=0

With my tweaked strap using 0x09D50CB3 I was getting 29.3Mh/s, and with only the high bit of TRP_RDA and TRP set, hashrate dropped to 23.1Mh/s.  I let it run for 15 minutes with those timings and it was stable.

I suspect for GCN1.2 devices (i.e. Tonga) that use the R9 SEQ_MISC format that TRP_RDA is actually 7 bits, not 6 plus a 1-bit pad before tRP.
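
A minimal Python sketch of the comparison in this post, not taken from either tool: it decodes the two register values quoted here under both proposed layouts. The field widths come from the two definitions quoted above; the `decode` helper and the layout list names are made up for illustration, and it assumes the first listed field sits in the lowest bits.

```python
# Two candidate layouts for MC_SEQ_MISC_TIMING, lowest bits first (assumed ordering).
NERDRALPH = [("TRP_WRA", 7), ("TRP_RDA", 7), ("TRP", 6), ("TRFC", 9), ("Pad0", 3)]
NIKO2004 = [("TRP_WRA", 7), ("TRP_RDA", 6), ("unused1", 1), ("TRP", 5),
            ("unused2", 1), ("TRFC", 9), ("unused3", 3)]


def decode(value, layout):
    """Split a 32-bit register value into named fields, starting at bit 0."""
    fields = {}
    shift = 0
    for name, width in layout:
        fields[name] = (value >> shift) & ((1 << width) - 1)
        shift += width
    return fields


if __name__ == "__main__":
    for reg in (0x09D50CB3, 0x09D82033):
        print(f"0x{reg:08X} 7/7/6 layout    :", decode(reg, NERDRALPH))
        print(f"0x{reg:08X} 6+1/5+1 layout  :", decode(reg, NIKO2004))
```

Under the 7/7/6 layout the test value reads TRP_RDA=64 and TRP=32; under the 6+1/5+1 layout the same two bits land in unused1 and unused2 while TRP_RDA and TRP read 0.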

hero member
Activity: 2548
Merit: 626
I did not test it, but I want to write a tool for myself, so that's why I'm asking :)

If you just need redistributable tool for windows you can try cx_Freeze on my python code.
It will give you one executable file with all libraries bundled-in.


Thanks mate, but this way I learn :)
Cheers
member
Activity: 126
Merit: 10
I did not test it, but I want to write a tool for myself, so that's why I'm asking :)

If you just need redistributable tool for windows you can try cx_Freeze on my python code.
It will give you one executable file with all libraries bundled-in.
hero member
Activity: 2548
Merit: 626
I did not test it, but I want to write a tool for myself, so that's why I'm asking :)
member
Activity: 126
Merit: 10
Not same length.
Which one is the real one?

Mine was inherited from the Linux kernel (at least for R9).
It seems correct, since these registers overflow (https://bitcointalksearch.org/topic/m.18280243),
at least in some official tool for BIOS generation.
I do not know, however, whether it is OK for the BIOS to overflow these registers.
I collected >100 different timing tables with >800 different timing strings, and in all cases the high bits of these registers are set to 0.

You can test it (I didn't do it myself).
Set the high unused bit of TRP/TRP_RDA to 1 and all the low bits to 0 for some high strap.
If the card crashes, then overflow is not good from the BIOS side either.
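
The suggested test value can be built with a few lines of Python. This is only a sketch under assumptions: the `pack` helper is hypothetical, the layout is the 6+1/5+1 one with the last-listed BitStruct field in the lowest bits, and the TRP_WRA and TRFC values (51 and 157) are borrowed from the Rx470 example elsewhere in this thread purely for illustration.

```python
# Layout with explicit "unused" bits, lowest bits first (assumed ordering).
LAYOUT = [("TRP_WRA", 7), ("TRP_RDA", 6), ("unused1", 1), ("TRP", 5),
          ("unused2", 1), ("TRFC", 9), ("unused3", 3)]


def pack(fields, layout=LAYOUT):
    """Build a 32-bit register value from named fields; missing fields are 0."""
    value = 0
    shift = 0
    for name, width in layout:
        field = fields.get(name, 0)
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name}={field} does not fit in {width} bits")
        value |= field << shift
        shift += width
    return value


# High "unused" bit above TRP_RDA and TRP set, their low bits left at 0.
test_value = pack({"TRP_WRA": 51, "unused1": 1, "unused2": 1, "TRFC": 157})
print(f"0x{test_value:08X}")
```

The range check is there because a value like TRP=32 cannot be expressed in a 5-bit field; it only fits if the neighbouring "unused" bit is really part of the field.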
sr. member
Activity: 652
Merit: 266
sr. member
Activity: 430
Merit: 254
Well I tried Equihash and it's really good at finding unstable memory clocks! HWiNFO reported 100 GPU memory errors or so right away on both cards at 1500MHz.

This makes way more sense, because before I bumped VDDCI/AUX up to 1.025 from 1.000, I used to only be able to run at about 1375MHz on both cards without errors. That is why I was using stilt's and 1125MHz straps, because even though they would crash above 1400, the IMC isn't good enough to run that high anyway. I wasn't getting any reported memory errors when I was mining ETH last night at 1525 though (just crashes if I went too high).

At the moment, with nerdified stock 1250MHz straps, I am getting ~385 Sol/s on the second (BFR) card @ 1100/1425
On the primary (AFR) card (it is driving my monitors atm) I am getting ~345 Sol/s @ 1075/1400

I will try reflashing with nerdified 1125MHz straps and maybe I will try The Stilt's strap again too.

EDIT: Seems to be pretty good at finding unstable core clocks as well. Just got a "GPU returned incorrect data" on the card running at 1100 core. I knew that freq was unstable because I get artifacting if I actually do any gaming or firestrike or whatever at those clocks. I will also clock these down to 1075.

EDIT2: Getting ~378 S/s @ 1075/1365 on the BFR card w/ nerdified stock 1125MHz strap
          Getting ~350 S/s @ 1075/1380 on the AFR card w/ nerdified stock 1125MHz strap <---remember this one is running my monitors so would prob be higher if ran as a secondary card.
          1075 was an unstable core frequency. The AFR card also gets ~378 S/s @ 1050/1375. Driving the monitors doesn't seem to be bothering it (during general browsing at least). I am really impressed by how good of a stability test Claymore's Zcash miner is!

https://i.imgur.com/NprszQp.jpg

Now I'm getting ~379 and ~355 respectively, so obviously it fluctuates a bit. Only 1 memory error after ~20 mins so that's acceptable.

https://i.imgur.com/glWShA8.jpg

I think it's time to start trying even lower straps. Next up: Nerdified 1000MHz straps.

EDIT3: Tried nerdified 1000MHz strap on the BFR card. It was getting a lower Sol rate @ 1250MHz and crashed at 1275MHz. I went back to nerdified 1125MHz strap and it's getting 375 - 380 Sol/s @ 1075/1370
The AFR card is doing better with the nerdified 1000MHz strap. It gets errors at 1300MHz, but @ 1075/1275 it is getting 368 - 375 Sol/s.
1075 was unstable for the core. It does better with the nerdified 1125MHz strap (~378 Sol/s @ 1050/1375).
1000MHz straps are just too tight to be useful.
newbie
Activity: 33
Merit: 0
Hi

I have tuned some RX480 cards with 8GB Samsung memory.

In the end I used this custom strap:
Code:
777000000000000022CC1C00106A5D4DD0571016B90D060C0060070014051420FA8900A0030000001011333DC0303A17

I use it on three different cards.
- MSI Armor
- Asus Strix OC
- PowerColor Red Devil

On both the Asus and the MSI I already get a few memory errors above 2000MHz; only about 1 run in 15 completes 24h at 2100MHz without memory errors.
On the Red Devil every run does 2100MHz without any memory errors over 24h. So the Red Devil gets 30.5MH/s, the others only 29MH/s.

Are there different Samsung memory chips?

Original straps here:
Asus Strix RX480 8GB OC
Code:
1500 555000000000000022CC1C00AD595B41C0570E14B00B450A0068C70003011420FA8900A003000000170E2B34A42A3116
1625 555000000000000022CC1C00CE616C47D0570F15B48C250B006AE7000B031420FA8900A003000000190F2F39B22D3517
1750 777000000000000022CC1C00106A6D4DD0571016B90D060C006AE70014051420FA8900A0030000001B11333DC0303A17
2000 777000000000000022CC1C0031F67E57F05711183FCFB60D006C070124081420FA8900A0030000001E123A46DB354019

MSI Armor RX480 8GB
Code:
1500 555000000000000022CC1C00AD595B41C0570E15B00B450A0068C70003011420FA8900A003000000170E2B34A42A3116
1625 555000000000000022CC1C00CE616C47D0570F16B48C250B006AE7000B031420FA8900A003000000190F2F39B22D3517
1750 777000000000000022CC1C00106A6D4DD0571017B90D060C006AE70014051420FA8900A0030000001B11333DC0303A17
2000 777000000000000022CC1C0031F67E57F05711193FCFB60D006C070124081420FA8900A0030000001E123A46DB354019

PowerColor Red Devil
Code:
1500 555000000000000022CC1C00AD595B41C0570E14B00B450A0068C70003011420FA8900A003000000170E2B34A42A3116
1625 555000000000000022CC1C00CE616C47D0570F15B48C250B006AE7000B031420FA8900A003000000190F2F39B22D3517
1750 777000000000000022CC1C00106A6D4DD0571016B90D060C006AE70014051420FA8900A0030000001B11333DC0303A17
2000 777000000000000022CC1C0031F67E57F05711183FCFB60D006C070124081420FA8900A0030000001E123A46DB354019

I have marked the difference in the 2000 straps: the 20th byte is 19 on the MSI and 18 on the other two.

Devil   77 70 00 00 00 00 00 00 22 CC 1C 00 31 F6 7E 57 F0 57 11 18 3F CF B6 0D 00 6C 07 01 24 08 14 20 FA 89 00 A0 03 00 00 00 1E 12 3A 46 DB 35 40 19
MSI     77 70 00 00 00 00 00 00 22 CC 1C 00 31 F6 7E 57 F0 57 11 19 3F CF B6 0D 00 6C 07 01 24 08 14 20 FA 89 00 A0 03 00 00 00 1E 12 3A 46 DB 35 40 19
ASUS    77 70 00 00 00 00 00 00 22 CC 1C 00 31 F6 7E 57 F0 57 11 18 3F CF B6 0D 00 6C 07 01 24 08 14 20 FA 89 00 A0 03 00 00 00 1E 12 3A 46 DB 35 40 19
CUSTOM  77 70 00 00 00 00 00 00 22 CC 1C 00 10 6A 5D 4D D0 57 10 16 B9 0D 06 0C 00 60 07 00 14 05 14 20 FA 89 00 A0 03 00 00 00 10 11 33 3D C0 30 3A 17
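
For anyone comparing straps by eye, a byte-wise diff is less error-prone. A minimal Python sketch, assuming the straps are plain hex strings of equal length; the `diff_bytes` name and the dictionary are made up for illustration, and the strings are the 2000 straps listed above.

```python
# The three stock 2000 straps as posted above.
STRAPS_2000 = {
    "Devil": "777000000000000022CC1C0031F67E57F05711183FCFB60D006C070124081420FA8900A0030000001E123A46DB354019",
    "MSI": "777000000000000022CC1C0031F67E57F05711193FCFB60D006C070124081420FA8900A0030000001E123A46DB354019",
    "ASUS": "777000000000000022CC1C0031F67E57F05711183FCFB60D006C070124081420FA8900A0030000001E123A46DB354019",
}


def diff_bytes(strap_a, strap_b):
    """Return (offset, byte_a, byte_b) for every byte that differs (offsets are 0-based)."""
    a = bytes.fromhex(strap_a)
    b = bytes.fromhex(strap_b)
    if len(a) != len(b):
        raise ValueError("straps must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    reference = STRAPS_2000["Devil"]
    for name, strap in STRAPS_2000.items():
        for offset, ref_byte, other in diff_bytes(reference, strap):
            print(f"{name}: offset {offset}: {ref_byte:02X} -> {other:02X}")
```

With these three strings the only line printed is for the MSI strap, at offset 19 (the 20th byte), 18 versus 19. If the strap is read as a run of little-endian 32-bit registers, which is an assumption here, that is the top byte of the fifth register.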



Can you help me out? I need a slightly better strap for the MSI and Asus. Many thanks.
sr. member
Activity: 430
Merit: 254
I've never mined cryptonight and I haven't done equihash since Nov/Dec. I might try it again some time but who knows.

BFR straps were just from putting stock straps through nerdralph's tool so thank him.

EDIT: This is where I got the 1125MHz BFR strap from in case you want other ones:

http://www.overclock.net/t/1554360/tahiti-memory-timings-patch-for-hynix-vram/#post_24933352
legendary
Activity: 1176
Merit: 1015
Yeah I just played around with different numbers using the fools tool and figured it out heh. Already did my testing.

Did you try equihash or cryptonight? I have seen much bigger gains there with hawaii. Also memory clocks higher than 1250 give no more speed @dagger without REALLY high core oc. I never tried if really tight timings@1250 would give more eth speed.

Thanks for those bfr straps, going to try them later.

edit: Your nerdified stock 1250MHz bfr timings copied to 1500 made it, now those bfr cards I have been fighting with are doing closer what they should on equihash, +12%.

edit2: Thanks for that link, I have read it thousand times before but still couldn't see it before you linked it to me.
sr. member
Activity: 430
Merit: 254
Yeah I just played around with different numbers using the fools tool and figured it out heh. Already did my testing.
sr. member
Activity: 652
Merit: 266
But I've already been using RRD = 4 for months because I've been using Stilt's strap. I am going to test both, I just want to know how to modify the output of the tool to change RRD=5 to RRD=4 so I can start testing.

Btw my 290's max mem clock is ~1500 so I probably need tighter than usual timings. The 390s can do over 1625 so they probably need RRD to be higher.

EDIT: nvm I just realized there's a windoze version of the ohgodadecode tool so I'll just check myself.

EDIT2: mmmkay that tool says that TRRD = 0 when i input 77713320000000000839472A50550C0B242045040046C40022BB1C005C0B14204A8900A000000120120C211E51192613

I'm guessing it only works for RX timings.

I'm just going to assume my original...uhh...assumption is correct.
It is Correct, indeed:

Quote
--> HEX strap: 77713320000000000839572A50550C0B242045040040040022BB1C005C0B14204A8900A000000120100C211E51192613

--> MC_SEQ_WR_CTL_D0
    DAT_DLY = 7,   DQS_DLY = 7,  DQS_XTR = 1,  DAT_2Y_DLY = 0,  ADR_2Y_DLY = 0,    CMD_2Y_DLY = 0,  OEN_DLY = 7,  OEN_EXT = 3

--> MC_SEQ_WR_CTL_D1
    DAT_DLY = 0,   DQS_DLY = 0,  DQS_XTR = 0,  DAT_2Y_DLY = 0,  ADR_2Y_DLY = 0,    CMD_2Y_DLY = 0,  OEN_DLY = 0,  OEN_EXT = 0

--> MC_SEQ_RAS_TIMING
    TRCDW = 8,  TRCDWA = 8,  TRCDR = 14,  TRCDRA = 14,  TRRD = 5,  TRC = 42,  Pad0 = 0

--> MC_SEQ_CAS_TIMING
    TNOPW = 0,  TNOPR = 0,  TR2W = 21, TCCLD = 2,  TR2R = 5,  Pad0 = 0,  TW2R = 12,  TCL = 11,  Pad1 = 0

--> MC_SEQ_MISC_TIMING
    TRP_WRA = 36, Pad0 = 0,  TRP_RDA = 32, Pad1 = 0,  TRP = 10,  TRFC = 68,  Pad2 = 0

--> MC_SEQ_MISC_TIMING2
    PA2RDATA = 0,  Pad0 = 0,  PA2WDATA = 0,  Pad1 = 0,  FAW = 0,  TREDC = 2,  TWEDC = 4,  T32AW = 0,  Pad2 = 0,  TWDATATR = 0

--> MC_SEQ_PMG_TIMING
    TCKSRE = 2,  Pad0 = 0,  TCKSRX = 2,  Pad1 = 0,  TCKE_PULSE = 11,  TCKE = 11,  SEQ_IDLE = 7,  Pad2 = 0,  TCKE_PULSE_MSB = 0, SEQ_IDLE_SS = 0




--> MC_ARB_DRAM_TIMING
    ACTRD = 16,  ACTWR = 12,  RASMACTRD = 33,  RASMACTWR = 30

--> MC_ARB_DRAM_TIMING2
    RAS2RAS = 81,  RP = 25,  WRPLUSRP = 38,  BUS_TURN = 19

Quote
--> HEX strap: 77713320000000000839472A50550C0B242045040040040022BB1C005C0B14204A8900A000000120100C211E51192613

--> MC_SEQ_WR_CTL_D0
    DAT_DLY = 7,   DQS_DLY = 7,  DQS_XTR = 1,  DAT_2Y_DLY = 0,  ADR_2Y_DLY = 0,    CMD_2Y_DLY = 0,  OEN_DLY = 7,  OEN_EXT = 3

--> MC_SEQ_WR_CTL_D1
    DAT_DLY = 0,   DQS_DLY = 0,  DQS_XTR = 0,  DAT_2Y_DLY = 0,  ADR_2Y_DLY = 0,    CMD_2Y_DLY = 0,  OEN_DLY = 0,  OEN_EXT = 0

--> MC_SEQ_RAS_TIMING
    TRCDW = 8,  TRCDWA = 8,  TRCDR = 14,  TRCDRA = 14,  TRRD = 4,  TRC = 42,  Pad0 = 0

--> MC_SEQ_CAS_TIMING
    TNOPW = 0,  TNOPR = 0,  TR2W = 21, TCCLD = 2,  TR2R = 5,  Pad0 = 0,  TW2R = 12,  TCL = 11,  Pad1 = 0

--> MC_SEQ_MISC_TIMING
    TRP_WRA = 36, Pad0 = 0,  TRP_RDA = 32, Pad1 = 0,  TRP = 10,  TRFC = 68,  Pad2 = 0

--> MC_SEQ_MISC_TIMING2
    PA2RDATA = 0,  Pad0 = 0,  PA2WDATA = 0,  Pad1 = 0,  FAW = 0,  TREDC = 2,  TWEDC = 4,  T32AW = 0,  Pad2 = 0,  TWDATATR = 0

--> MC_SEQ_PMG_TIMING
    TCKSRE = 2,  Pad0 = 0,  TCKSRX = 2,  Pad1 = 0,  TCKE_PULSE = 11,  TCKE = 11,  SEQ_IDLE = 7,  Pad2 = 0,  TCKE_PULSE_MSB = 0, SEQ_IDLE_SS = 0




--> MC_ARB_DRAM_TIMING
    ACTRD = 16,  ACTWR = 12,  RASMACTRD = 33,  RASMACTWR = 30

--> MC_ARB_DRAM_TIMING2
    RAS2RAS = 81,  RP = 25,  WRPLUSRP = 38,  BUS_TURN = 19
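
The TRRD change between the two decodes above can also be made directly on the hex string. A minimal Python sketch, with loud assumptions: it treats the R9 strap as little-endian 32-bit words, takes MC_SEQ_RAS_TIMING to be the third word (byte offset 8) and TRRD to be bits 20-23 of it. Those positions are inferred only from the two decodes quoted here (0839572A reads TRRD = 5, 0839472A reads TRRD = 4); the function names are made up.

```python
import struct

RAS_TIMING_OFFSET = 8   # assumed byte offset of MC_SEQ_RAS_TIMING in an R9 strap
TRRD_SHIFT = 20         # assumed position of the 4-bit TRRD field
TRRD_MASK = 0xF


def get_trrd(strap_hex):
    """Read TRRD from a strap given as a hex string (whitespace is ignored)."""
    raw = bytes.fromhex("".join(strap_hex.split()))
    (ras,) = struct.unpack_from("<I", raw, RAS_TIMING_OFFSET)
    return (ras >> TRRD_SHIFT) & TRRD_MASK


def set_trrd(strap_hex, trrd):
    """Return a copy of the strap with TRRD replaced, as upper-case hex."""
    if not 0 <= trrd <= TRRD_MASK:
        raise ValueError("TRRD must fit in 4 bits")
    raw = bytearray.fromhex("".join(strap_hex.split()))
    (ras,) = struct.unpack_from("<I", raw, RAS_TIMING_OFFSET)
    ras = (ras & ~(TRRD_MASK << TRRD_SHIFT)) | (trrd << TRRD_SHIFT)
    struct.pack_into("<I", raw, RAS_TIMING_OFFSET, ras)
    return raw.hex().upper()


if __name__ == "__main__":
    strap = ("77713320000000000839572A50550C0B242045040040040022BB1C00"
             "5C0B14204A8900A000000120100C211E51192613")
    print(get_trrd(strap))
    print(set_trrd(strap, 4))
```

Stripping whitespace first means a strap pasted with a stray space from the forum still parses. This does not apply to RX straps, which use a different register order.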
sr. member
Activity: 430
Merit: 254
R9 290, 1.025VDDCI/AUX, 1100/1525, Hynix H5GQ2H24AFR

stock timings (1501 - 1625MHz strap)
7771332000000000105A7B41805511112EA59906004C060122119D086C0F14206A8900A00000012017112B3169262F15
30.83 - 30.95MH/s

nerdified stock 1500MHz timings
7771332000000000CE515A3B705510102BA218060040060022009D08640E14206A8900A000000120100F272D61232C14
31.00 - 31.07MH/s

nerdified stock 1375MHz timings
7771332000000000ADCD593770550F10292198050040050022EE1C08640D14205A8900A000000120100E242A59222A14
~31.03MH/s

nerdified stock 1250MHz timings
77713320000000008CC5583160550F0F251E17050040040022CC1C085C0B14204A8900A000000120100D2025511F2613
~31.03MH/s @ 1450MHz...seemed to decrease when I raised mem freq more than this

nerdified stock 1125MHz timings
55513320000000006BBD572D60550D0E229C96040020030022BB1C08530A1420BA8800A000000120100C1E22491D2312
30.99 - 31.07MH/s @ 1500MHz...crashed at 1525

stilt's AFR timings
77713320000000000839472A50550C0B242045040046C40022BB1C005C0B14204A8900A000000120120C211E51192613
30.83 - 30.90MH/s @ 1400MHz...crashes above 1400

stilt's nerdified AFR timings
77713320000000000839572A50550C0B242045040040040022BB1C005C0B14204A8900A000000120100C211E51192613
30.95 - 31.04MH/s @ 1400MHz....crashes above 1400





R9 290, 1.025VDDCI/AUX, 1100/1525, Hynix H5GC2H24BFR

stock 1250MHz timings (there aren't any higher straps in the BIOS)
77713320000000008CC55835805510112C9418050048C50022FF1C086C0F14205A8900A000000120120D242951242D15
31.00 - 31.03MH/s

nerdified stock 1125MHz timings (got these from a Tahiti BIOS with BFR...my card doesn't have this strap)
77713320000000006BBD572F70550F10289297040040040022EE1C08640D14204A8900A000000120100C202449212914
31.02 - 31.05 @ 1375MHz...crashes at higher mem freq

nerdified stock 1250MHz timings
77713320000000008CC55835805510112C9418050040050022FF1C086C0F14205A8900A000000120100D242951242D15
~31.03MH/s @ 1525MHz
~31.73MH/s @ 1125/1525
~32.43MH/s @ 1150/1525

TL;DR I gained as much as 0.2MH/s over stock by modifying Hynix timings on Hawaii. I can gain about 0.7MH/s per every +25MHz core frequency.
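
The per-step figure in the TL;DR can be checked against the three BFR readings with the nerdified 1250MHz timings. A small Python sketch; it assumes the ~31.03MH/s reading was taken at the 1100MHz core clock from that card's header line, and the function name is made up.

```python
# (core MHz, MH/s) for the BFR card at 1525MHz memory, as listed above.
READINGS = [(1100, 31.03), (1125, 31.73), (1150, 32.43)]


def gain_per_step(readings, step_mhz=25):
    """Average hashrate gain per `step_mhz` of core clock between the first and last reading."""
    (core_lo, rate_lo), (core_hi, rate_hi) = readings[0], readings[-1]
    return (rate_hi - rate_lo) / (core_hi - core_lo) * step_mhz


if __name__ == "__main__":
    print(f"{gain_per_step(READINGS):.2f} MH/s per +25MHz core")
```

Three points over a 50MHz range say nothing about whether the scaling continues at higher core clocks.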