Topic: Custom RAM Timings for GPU's with GDDR5 - DOWNLOAD LINKS - UPDATED - page 49. (Read 155485 times)

legendary
Activity: 980
Merit: 1001
aka "whocares"
I see at least a couple people have written strap decoding programs, but I can't find any publicly released.  I was going to write one and release it publicly, but I figured if someone else has already written one...


So it's not as simple as using atombios.h to dump the fields in ATOM_MEMORY_TIMING_FORMAT_V2.
https://raw.githubusercontent.com/torvalds/linux/master/drivers/gpu/drm/radeon/atombios.h

Straps for GCN cards are 52 bytes long (3 bytes for memory clock, 1 byte for memory type, 48 bytes for strap), but sizeof(ATOM_MEMORY_TIMING_FORMAT_V2) = 40 bytes.

So is it just a matter of old-fashioned reverse engineering?  i.e. looking at different straps and reading through GDDR5 data sheets to figure out the strap offsets for different values?


Hah, you don't know the format and you're going to make a public tool? Your threats are like skate park swimming pools - empty :P

Looks like you haven't read the rest of the thread.  It took less than an hour to figure it out from the Linux drm code.


Not quite - they tell you part of the story - but look at MISC1, for example :3

lol - ssshhh... "fools rush in where angels fear to tread"
sr. member
Activity: 588
Merit: 251
I see at least a couple people have written strap decoding programs, but I can't find any publicly released.  I was going to write one and release it publicly, but I figured if someone else has already written one...


So it's not as simple as using atombios.h to dump the fields in ATOM_MEMORY_TIMING_FORMAT_V2.
https://raw.githubusercontent.com/torvalds/linux/master/drivers/gpu/drm/radeon/atombios.h

Straps for GCN cards are 52 bytes long (3 bytes for memory clock, 1 byte for memory type, 48 bytes for strap), but sizeof(ATOM_MEMORY_TIMING_FORMAT_V2) = 40 bytes.

So is it just a matter of old-fashioned reverse engineering?  i.e. looking at different straps and reading through GDDR5 data sheets to figure out the strap offsets for different values?


Hah, you don't know the format and you're going to make a public tool? Your threats are like skate park swimming pools - empty :P

Looks like you haven't read the rest of the thread.  It took less than an hour to figure it out from the Linux drm code.
newbie
Activity: 25
Merit: 0
You are a beast man, now I need to buy a dev rig...
sr. member
Activity: 588
Merit: 251
The first working beta of my strap decoding software is ready.  Brief instructions:
1. Read the python and C source.
2. If you can't understand, goto step 1.

https://github.com/nerdralph/strapread

p.s. as others have mentioned, it seems the strap layout for Polaris is different than the previous generation cards.  So trying to decode the straps from a Tonga BIOS gives wrong values.

sr. member
Activity: 588
Merit: 251
Fucking little-endian byte order had me confused for over an hour, but now I'm almost done:

SEQ_PMG_TIMING
tCKSRE:2 tCKSRX:2 tCKE_PULSE:8 tCKE:24 SEQ_IDLE:7 tCKE_PULSE_MSB:1 SEQ_IDLE_SS:0
SEQ_RAS_TIMING
tRCDW:19 tRCDWA:19 tRCDR:27 tRCDRA:27 tRRD:8 tRC:83
SEQ_CAS_TIMING
tNOPW:0 tNOPR:0 tR2W:24 tCCDL:2 tR2R:5 tW2R:21 tCL:19
SEQ_MISC_TIMING
tRP_WRA:62 tRP_RDA:79 tRP:13 tRFC:197
SEQ_MISC_TIMING2
PA2RDATA:2 PA2WDATA:0 FAW:0 tREDC:0 tWEDC:17 t32AW:1 tWDATATR:2
ARB_DRAM_TIMING2
RAS2RAS:197 RP:48 WRPLUSRP:63 BUS_TURN:23

I'm ignoring some of the mask values since I think all 32 bits are being written to the memory controller registers.

sr. member
Activity: 588
Merit: 251
I wonder what's your background to have such knowledge. I haven't touched C in years so I barely have any idea about what's happening in this snippet.
For example, what does RAS2RAS:8 mean?
And ARB_DRAM_TIMING2 t8 = {0x922A3217}; What's happening there? Are you instantiating a type with some hex?

I am curious because it's been a long time since I had to deal with this kind of low level programming.

Google c bitfields.
newbie
Activity: 25
Merit: 0
I wonder what's your background to have such knowledge. I haven't touched C in years so I barely have any idea about what's happening in this snippet.
For example, what does RAS2RAS:8 mean?
And ARB_DRAM_TIMING2 t8 = {0x922A3217}; What's happening there? Are you instantiating a type with some hex?

I am curious because it's been a long time since I had to deal with this kind of low level programming.
hero member
Activity: 751
Merit: 517
Fail to plan, and you plan to fail.
I've got a working prototype, and should have full decoding of at least 6 strap registers working by the end of the day.

Praise be upon you.
sr. member
Activity: 588
Merit: 251
I've got a working prototype, and should have full decoding of at least 6 strap registers working by the end of the day.

strap.h:
Code:
typedef union  _ARB_DRAM_TIMING2 {
  u32 data;
  struct {
    u32 RAS2RAS:8;
    u32 RP:8;
    u32 WRPLUSRP:8;
    u32 BUS_TURN:8;
  } fields;
} ARB_DRAM_TIMING2;

parse.c:
Code:
#include <stdio.h>
#include <stdint.h>

typedef uint32_t u32;

#include "strap.h"

ARB_DRAM_TIMING2 t8 = {0x922A3217};

int main(void)
{
  printf("RAS2RAS: %d\n", t8.fields.RAS2RAS);
}

RAS2RAS: 23
jr. member
Activity: 144
Merit: 2
Decoding is already done by many of us.
The hard part is finding the best values, that is not simply guessing.

An encoding/decoding tool could help that process very much.
Maybe someone publishes his own, I don't have time to write one.

What I wonder the most about is how custom straps improve hashrate through reduced latency and not through increased bandwidth. I think for example ZEC mining is much more sensitive to latency than bandwidth. Obviously fragmented memory operations rely more on latency than bw.
I'd like to get some insight on these different approaches in timing adjustments and their relation.
sr. member
Activity: 652
Merit: 266
I see at least a couple people have written strap decoding programs, but I can't find any publicly released.  I was going to write one and release it publicly, but I figured if someone else has already written one...


So it's not as simple as using atombios.h to dump the fields in ATOM_MEMORY_TIMING_FORMAT_V2.
https://raw.githubusercontent.com/torvalds/linux/master/drivers/gpu/drm/radeon/atombios.h

Straps for GCN cards are 52 bytes long (3 bytes for memory clock, 1 byte for memory type, 48 bytes for strap), but sizeof(ATOM_MEMORY_TIMING_FORMAT_V2) = 40 bytes.

So is it just a matter of old-fashioned reverse engineering?  i.e. looking at different straps and reading through GDDR5 data sheets to figure out the strap offsets for different values?


After reviewing the thread discussion again, and more digging through the kernel code, it seems atombios.h is some ancient relic of video cards gone by.
The straps for GCN cards are 32-bit values that get written to the GPU memory controller registers.
https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/amd/include/asic_reg/gmc/gmc_8_1_sh_mask.h

4 bytes each for:
MC_SEQ_RAS_TIMING
MC_SEQ_CAS_TIMING
MC_SEQ_MISC_TIMING
MC_SEQ_MISC_TIMING2
MC_SEQ_PMG_TIMING

That accounts for 20 bytes of the strap.  Training and PHY timing may be part of the strap as well, but decoding the 5 register values above should be all that is needed to make tangible improvements in the timing.

Well... you are not wrong :)
sr. member
Activity: 588
Merit: 251
I see at least a couple people have written strap decoding programs, but I can't find any publicly released.  I was going to write one and release it publicly, but I figured if someone else has already written one...


So it's not as simple as using atombios.h to dump the fields in ATOM_MEMORY_TIMING_FORMAT_V2.
https://raw.githubusercontent.com/torvalds/linux/master/drivers/gpu/drm/radeon/atombios.h

Straps for GCN cards are 52 bytes long (3 bytes for memory clock, 1 byte for memory type, 48 bytes for strap), but sizeof(ATOM_MEMORY_TIMING_FORMAT_V2) = 40 bytes.

So is it just a matter of old-fashioned reverse engineering?  i.e. looking at different straps and reading through GDDR5 data sheets to figure out the strap offsets for different values?


After reviewing the thread discussion again, and more digging through the kernel code, it seems atombios.h is some ancient relic of video cards gone by.
The straps for GCN cards are 32-bit values that get written to the GPU memory controller registers.
https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/amd/include/asic_reg/gmc/gmc_8_1_sh_mask.h

4 bytes each for:
MC_SEQ_RAS_TIMING
MC_SEQ_CAS_TIMING
MC_SEQ_MISC_TIMING
MC_SEQ_MISC_TIMING2
MC_SEQ_PMG_TIMING

That accounts for 20 bytes of the strap.  Training and PHY timing may be part of the strap as well, but decoding the 5 register values above should be all that is needed to make tangible improvements in the timing.
legendary
Activity: 980
Merit: 1001
aka "whocares"
I see at least a couple people have written strap decoding programs, but I can't find any publicly released.  I was going to write one and release it publicly, but I figured if someone else has already written one...

With all due respect, if you have such a tool then public release means less profits for you.
No sane (=selfish) miner would do that. At least I wouldn't with all the work involved.
And yes, I have written such a tool to decode/encode the RAS, CAS, PMG, MISC, MISC2, ARB_DRAM, ARB_DRAM2 parts of the timing string and edit the BIOS with it
(for all cards starting from HD 7xxx).

It is not as simple as reading atombios.h (I didn't read this one and it seems mostly irrelevant for the task),
but this thread gives a lot of hints how to do it.


I started this thread to try and help people but quickly realized that trying to help people became "Can you do this for me" and my PM inbox was overrun with it.  So I took a backseat for a bit.

On a side note niko2004x, nerdralph would have zero problems figuring it out and writing it.  His knowledge about programming etc. FAR exceeds both yours and mine.
member
Activity: 126
Merit: 10
I see at least a couple people have written strap decoding programs, but I can't find any publicly released.  I was going to write one and release it publicly, but I figured if someone else has already written one...

With all due respect, if you have such a tool then public release means less profits for you.
No sane (=selfish) miner would do that. At least I wouldn't with all the work involved.
And yes, I have written such a tool to decode/encode the RAS, CAS, PMG, MISC, MISC2, ARB_DRAM, ARB_DRAM2 parts of the timing string and edit the BIOS with it
(for all cards starting from HD 7xxx).

It is not as simple as reading atombios.h (I didn't read this one and it seems mostly irrelevant for the task),
but this thread gives a lot of hints how to do it.
legendary
Activity: 980
Merit: 1001
aka "whocares"
I know myself and wolf0 both wrote our own programs/scripts to do it.  Even with that I mostly use an old-fashioned spreadsheet like Excel or Numbers to do it.
sr. member
Activity: 588
Merit: 251
I see at least a couple people have written strap decoding programs, but I can't find any publicly released.  I was going to write one and release it publicly, but I figured if someone else has already written one...


So it's not as simple as using atombios.h to dump the fields in ATOM_MEMORY_TIMING_FORMAT_V2.
https://raw.githubusercontent.com/torvalds/linux/master/drivers/gpu/drm/radeon/atombios.h

Straps for GCN cards are 52 bytes long (3 bytes for memory clock, 1 byte for memory type, 48 bytes for strap), but sizeof(ATOM_MEMORY_TIMING_FORMAT_V2) = 40 bytes.

So is it just a matter of old-fashioned reverse engineering?  i.e. looking at different straps and reading through GDDR5 data sheets to figure out the strap offsets for different values?
hero member
Activity: 751
Merit: 517
Fail to plan, and you plan to fail.
I see at least a couple people have written strap decoding programs, but I can't find any publicly released.  I was going to write one and release it publicly, but I figured if someone else has already written one...


I haven't seen a publicly available program yet, would be great if you could do that man :)
sr. member
Activity: 588
Merit: 251
I see at least a couple people have written strap decoding programs, but I can't find any publicly released.  I was going to write one and release it publicly, but I figured if someone else has already written one...
sr. member
Activity: 652
Merit: 266
Thanks, now I see you get 28.5-29 on the MSI Armor 470 4GB; that is what I get on the same card using the 1750 strap. I was wondering if it could get better using custom timings...
Actually I haven't pushed it too much. The last unstable result was 30.4 on Linux, and I just lowered to 1100/1950 and got 28.8 stable, plus a bonus: 75W power consumption ATW :)
newbie
Activity: 31
Merit: 0
Thanks, now I see you get 28.5-29 on the MSI Armor 470 4GB; that is what I get on the same card using the 1750 strap. I was wondering if it could get better using custom timings...