
Topic: Custom RAM Timings for GPU's with GDDR5 - DOWNLOAD LINKS - UPDATED - page 44. (Read 155695 times)

legendary
Activity: 1050
Merit: 1294
Huh?
Are there any publicly available custom memory strap timings for the RX 400 series (Samsung, Elpida, or Hynix)?

I posted a basic mod for Elpida and Hynix a few pages back.
sr. member
Activity: 689
Merit: 253
Are there any publicly available custom memory strap timings for the RX 400 series (Samsung, Elpida, or Hynix)?
legendary
Activity: 1050
Merit: 1294
Huh?
Kinda OT since this is a RAM timings thread, but I'll ask anyway:

For Sapphire RX 470 4GB (Ref) cards, the GPU core voltage offset is at A992, correct?
Now for the cards with Hynix memory, I find the default 04, which is 4 x 6.25 or +25 mV, which seems legit.
But for the cards with Samsung memory, the value at A992 is FF, which is -1 x 6.25 or -6.25 mV, so something looks off.
Do different memory versions of the card have different default offset values? Or is the location different?

Any help/guidance would be appreciated.

Seems like the Samsung one doesn't have a global offset.

Stock ROMs with a global offset have either '03' (+18.75 mV) or '04' (+25 mV) as the VDDC offset.
I've never seen anything else, let alone a negative offset, and I've opened up a lot of them ;-) But correct me if I'm wrong.

Greetings.
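
For anyone who wants to check the offset arithmetic from the question by script, here is a minimal sketch. It assumes the byte is a signed 8-bit count of 6.25 mV steps, which is the reading the question itself uses when it treats FF as -1. The function name is made up for illustration, and it says nothing about whether A992 really is the offset location on a given ROM.

```python
VDDC_STEP_MV = 6.25  # step size quoted in the posts above


def vddc_offset_mv(raw_byte: int) -> float:
    """Read one byte as a signed (two's complement) count of 6.25 mV steps.

    Assumption: the offset byte is signed, so 0xFF means -1 step.
    """
    if not 0 <= raw_byte <= 0xFF:
        raise ValueError("expected a single byte (0x00-0xFF)")
    signed = raw_byte - 0x100 if raw_byte >= 0x80 else raw_byte
    return signed * VDDC_STEP_MV


if __name__ == "__main__":
    for raw in (0x03, 0x04, 0xFF):
        print(f"0x{raw:02X} -> {vddc_offset_mv(raw):+.2f} mV")
```

A value like FF decoding to a small negative offset fits the suggestion above that the Samsung ROM may simply not carry a global offset at that location, but the script cannot tell those two cases apart.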
hero member
Activity: 751
Merit: 517
Fail to plan, and you plan to fail.
Kinda OT since this is a RAM timings thread, but I'll ask anyway:

For Sapphire RX 470 4GB (Ref) cards, the GPU core voltage offset is at A992, correct?
Now for the cards with Hynix memory, I find the default 04, which is 4 x 6.25 or +25 mV, which seems legit.
But for the cards with Samsung memory, the value at A992 is FF, which is -1 x 6.25 or -6.25 mV, so something looks off.
Do different memory versions of the card have different default offset values? Or is the location different?

Any help/guidance would be appreciated.
sr. member
Activity: 588
Merit: 251

So I need to update MC_SEQ_MISC1, offset 54 in the hex string of the strap (offset 27 in bytes).  Are the 3 hex chars at offset 55-57 the 12 bits for MR0, or is that MR1 and MR0 is 59-61?
I know I could figure it out by comparing different straps and seeing how the bits map to the register values, but since you seem to have already figured it out...


Original Samsung 4G ( your particular GPU ) 1625
555000000000000022CC1C00CE596B44D0570F1531CB2409004AE700 [ 0B03 | 1420 ] 7A8900A003000000170F2E36922A3217

Quote
--> MC_SEQ_MISC1
 -- MR0
    WL = 3,  CL = 22,  TM = 0,  WR = 23,  BA0 = 0,  BA1 = 0,  BA2 = 0,  BA3 = 0
 -- MR1
    DS = 0,  DT = 1,  ADR = 1,  CAL = 0,  PLL = 0,  RDBI = 0,  WDBI = 0,  ABI = 0,
    RES = 0,  BA0 = 0,  BA1 = 1,  BA2 = 0,  BA3 = 0
Original Samsung 4G 1750
777000000000000022CC1C0010626C49D0571016B50BD509004AE700 [ 1405 | 1420 ] 7A8900A003000000191131399D2C3617

Quote
--> MC_SEQ_MISC1
 -- MR0
    WL = 4,  CL = 23,  TM = 0,  WR = 25,  BA0 = 0,  BA1 = 0,  BA2 = 0,  BA3 = 0
 -- MR1
    DS = 0,  DT = 1,  ADR = 1,  CAL = 0,  PLL = 0,  RDBI = 0,  WDBI = 0,  ABI = 0,
    RES = 0,  BA0 = 0,  BA1 = 1,  BA2 = 0,  BA3 = 0

I think you are off by +1 with the MR0 CAS latency.  SEQ_CAS_TIMING has CL=21(0x15) for the 1625 strap, and CL=22(0x16) for the 1750 strap.
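
A rough sketch of how the two straps above can be split up to compare the ASIC-side tCL with the MR0 bits. Assumptions, none of them confirmed in this thread: the strap is twelve little-endian 32-bit words, SEQ_CAS_TIMING is word 4 with tCL in bits 24-28, MC_SEQ_MISC1 is word 7 (the bracketed part, hex index 56 counting from zero) with MR0 in the low 16 bits and MR1 in the high 16, and MR0 is laid out as WL in bits 0-2, CL in bits 3-6, TM in bit 7, WR in bits 8-11. The function names are hypothetical.

```python
import struct

# The two stock Samsung straps quoted above, with the brackets removed.
STRAP_1625 = ("555000000000000022CC1C00CE596B44D0570F1531CB2409004AE700"
              "0B031420"
              "7A8900A003000000170F2E36922A3217")
STRAP_1750 = ("777000000000000022CC1C0010626C49D0571016B50BD509004AE700"
              "14051420"
              "7A8900A003000000191131399D2C3617")


def strap_dwords(strap_hex):
    """Split a 48-byte strap into twelve little-endian 32-bit words."""
    raw = bytes.fromhex(strap_hex)
    if len(raw) != 48:
        raise ValueError("expected a 96-hex-character strap")
    return struct.unpack("<12I", raw)


def decode_cas_and_mr0(strap_hex):
    """Return the raw (un-offset) fields; the layout is an assumption."""
    words = strap_dwords(strap_hex)
    cas_timing, misc1 = words[4], words[7]
    mr0 = misc1 & 0xFFFF          # low half of MC_SEQ_MISC1
    mr1 = misc1 >> 16             # high half of MC_SEQ_MISC1
    return {
        "tcl": (cas_timing >> 24) & 0x1F,
        "misc1": misc1,
        "mr0": mr0,
        "mr1": mr1,
        "wl": mr0 & 0x7,
        "cl_field": (mr0 >> 3) & 0xF,
        "tm": (mr0 >> 7) & 0x1,
        "wr_field": (mr0 >> 8) & 0xF,
    }


if __name__ == "__main__":
    for name, strap in (("1625", STRAP_1625), ("1750", STRAP_1750)):
        print(name, decode_cas_and_mr0(strap))
```

Under this layout the raw CL and WR fields come out much smaller than the CL and WR numbers the decode tool prints, so the tool is evidently applying an offset or extension bit on top of the raw field. This sketch does not try to reproduce that step, so compare raw fields only against other raw fields.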
full member
Activity: 152
Merit: 100
Seriously guys, leave aside cherry-picked cards and post something that can run for at least 48 hours with 5 or 6 similar cards all running exactly the same clocks and hashing at similar speeds.
Getting that is the farm administrator's work; they're already posting lots of hard-to-get info.
member
Activity: 130
Merit: 10
Seriously guys, leave aside cherry-picked cards and post something that can run for at least 48 hours with 5 or 6 similar cards all running exactly the same clocks and hashing at similar speeds.
legendary
Activity: 1050
Merit: 1294
Huh?
Original Samsung 4G ( your particular GPU ) 1625
555000000000000022CC1C00CE596B44D0570F1531CB2409004AE700 [ 0B03 | 1420 ] 7A8900A003000000170F2E36922A3217

Quote
--> MC_SEQ_MISC1
 -- MR0
    WL = 3,  CL = 22,  TM = 0,  WR = 23,  BA0 = 0,  BA1 = 0,  BA2 = 0,  BA3 = 0
 -- MR1
    DS = 0,  DT = 1,  ADR = 1,  CAL = 0,  PLL = 0,  RDBI = 0,  WDBI = 0,  ABI = 0,
    RES = 0,  BA0 = 0,  BA1 = 1,  BA2 = 0,  BA3 = 0
Original Samsung 4G 1750
777000000000000022CC1C0010626C49D0571016B50BD509004AE700 [ 1405 | 1420 ] 7A8900A003000000191131399D2C3617

Quote
--> MC_SEQ_MISC1
 -- MR0
    WL = 4,  CL = 23,  TM = 0,  WR = 25,  BA0 = 0,  BA1 = 0,  BA2 = 0,  BA3 = 0
 -- MR1
    DS = 0,  DT = 1,  ADR = 1,  CAL = 0,  PLL = 0,  RDBI = 0,  WDBI = 0,  ABI = 0,
    RES = 0,  BA0 = 0,  BA1 = 1,  BA2 = 0,  BA3 = 0

Bingo!
sr. member
Activity: 652
Merit: 266
You have to loosen it on the DRAM, too - you're loosening the tCL on the ASIC, but not the DRAM, throwing them off.

Interesting.  So the memory controller (or driver) isn't smart enough to take tCL from SEQ_CAS_TIMING and use the same value for the MR0 CAS latency?
edit: I don't even understand how this would work at all.  If the controller is expecting the data 22 cycles after the read, but MR0 is programmed for 21, then wouldn't that cause a 100% error rate?


It actually seems to have a tolerance of one value up or down before it stops working entirely.

So I need to update MC_SEQ_MISC1, offset 54 in the hex string of the strap (offset 27 in bytes).  Are the 3 hex chars at offset 55-57 the 12 bits for MR0, or is that MR1 and MR0 is 59-61?
I know I could figure it out by comparing different straps and seeing how the bits map to the register values, but since you seem to have already figured it out...


Original Samsung 4G ( your particular GPU ) 1625
555000000000000022CC1C00CE596B44D0570F1531CB2409004AE700 [ 0B03 | 1420 ] 7A8900A003000000170F2E36922A3217

Quote
--> MC_SEQ_MISC1
 -- MR0
    WL = 3,  CL = 22,  TM = 0,  WR = 23,  BA0 = 0,  BA1 = 0,  BA2 = 0,  BA3 = 0
 -- MR1
    DS = 0,  DT = 1,  ADR = 1,  CAL = 0,  PLL = 0,  RDBI = 0,  WDBI = 0,  ABI = 0,
    RES = 0,  BA0 = 0,  BA1 = 1,  BA2 = 0,  BA3 = 0
Original Samsung 4G 1750
777000000000000022CC1C0010626C49D0571016B50BD509004AE700 [ 1405 | 1420 ] 7A8900A003000000191131399D2C3617

Quote
--> MC_SEQ_MISC1
 -- MR0
    WL = 4,  CL = 23,  TM = 0,  WR = 25,  BA0 = 0,  BA1 = 0,  BA2 = 0,  BA3 = 0
 -- MR1
    DS = 0,  DT = 1,  ADR = 1,  CAL = 0,  PLL = 0,  RDBI = 0,  WDBI = 0,  ABI = 0,
    RES = 0,  BA0 = 0,  BA1 = 1,  BA2 = 0,  BA3 = 0
legendary
Activity: 1050
Merit: 1294
Huh?
Just wanted to ask whether anybody has managed to get over 31 MH/s, but I got my answer before asking.

That's rather easy, but keeping it there without setting insanely high clocks, that's the interesting part.

Plus, I would take a perfectly fine-tuned setup with very low power consumption over high speed any day.

I almost hit 1000 H/s on a 480 8G (XMR), but I prefer 850 H/s stable with very low power consumption.

Greetings!
sr. member
Activity: 588
Merit: 251
You have to loosen it on the DRAM, too - you're loosening the tCL on the ASIC, but not the DRAM, throwing them off.

Interesting.  So the memory controller (or driver) isn't smart enough to take tCL from SEQ_CAS_TIMING and use the same value for the MR0 CAS latency?
edit: I don't even understand how this would work at all.  If the controller is expecting the data 22 cycles after the read, but MR0 is programmed for 21, then wouldn't that cause a 100% error rate?


It actually seems to have a tolerance of one value up or down before it stops working entirely.

So I need to update MC_SEQ_MISC1, offset 54 in the hex string of the strap (offset 27 in bytes).  Are the 3 hex chars at offset 55-57 the 12 bits for MR0, or is that MR1 and MR0 is 59-61?
I know I could figure it out by comparing different straps and seeing how the bits map to the register values, but since you seem to have already figured it out...
hero member
Activity: 2548
Merit: 626
Just wanted to ask whether anybody has managed to get over 31 MH/s, but I got my answer before asking.
member
Activity: 81
Merit: 1002
It was only the wind.
Since everyone is sharing now, I suppose I'll put what I've come up with out here. I'm running Sapphire 470 Nitro 8GB cards with Samsung memory at -125 mV. On ETH they hit between 28.5 MH/s and 29.2 MH/s at 1140 core and 2100 mem, pulling around 920 watts at the wall with 6 GPUs per rig. On XMR they hit 785 H/s to 795 H/s at 1170 core and 2100 mem, pulling around 660 watts at the wall with 6 GPUs per rig. I'm also running ethOS 1.2.0. I'm here to learn more about the mistakes I made on the mod and to see what others in the community have come up with.

Here is the strap I've put together:
777000000000000022CC1C00106A5B47C0570E16B08C05090068C70014051420FA8900A003000000190D2F399D2D2E17

The timings, as decoded by the VBIOS decode tool that wolf and ohgodagirl released:
TRCDW = 16
TRCDWA = 16
TRCDR = 26
TRCDRA = 22
TRRD = 5
TRC = 71
Pad0 = 0

TRP_WRA = 48
Pad0 = 2
TRP_RDA = 12
TRP = 22
TRFC = 144

PA2RDATA = 0
Pad0 = 0
PA2WDATA = 0
Pad1 = 0
TFAW = 8
TCRCRL = 3
TCRCWL = 7
TFAW32 = 6

MC_SEQ_MISC1: 0x20140514

MC_SEQ_MISC3: 0xA00089FA

MC_SEQ_MISC8: 0x00000003

ACTRD = 25
ACTWR = 13
RASMACTRD = 47
RASMACTWR = 57

RAS2RAS = 157
RP = 45
WRPLUSRP = 46
BUS_TURN = 23

Looking forward to others' input!

TRCDR & TRCDRA should be the same.
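
To check that on any strap before flashing, something like the sketch below works. It assumes SEQ_RAS_TIMING is the fourth little-endian 32-bit word of the strap and uses a field layout (widths counted from the least significant bit) chosen because it reproduces the decode posted above; treat the layout and the function names as assumptions, not as the decode tool's actual source.

```python
import struct

# The strap posted above, written in 8-character words; a stray space from
# forum wrapping is stripped in decode_ras_timing() in case it is pasted in.
STRAP = ("77700000" "00000000" "22CC1C00" "106A5B47" "C0570E16" "B08C0509"
         "0068C700" "14051420" "FA8900A0" "03000000" "190D2F39" "9D2D2E17")

# (field name, width in bits), starting at bit 0. Assumed layout.
RAS_FIELDS = (
    ("TRCDW", 5), ("TRCDWA", 5), ("TRCDR", 5), ("TRCDRA", 5),
    ("TRRD", 4), ("TRC", 7), ("Pad0", 1),
)


def decode_ras_timing(strap_hex):
    """Unpack the SEQ_RAS_TIMING word (bytes 12-15) into named fields."""
    raw = bytes.fromhex(strap_hex.replace(" ", ""))
    if len(raw) != 48:
        raise ValueError("expected a 96-hex-character strap")
    (value,) = struct.unpack_from("<I", raw, 12)
    fields = {}
    shift = 0
    for name, width in RAS_FIELDS:
        fields[name] = (value >> shift) & ((1 << width) - 1)
        shift += width
    return fields


def trcdr_matches_trcdra(strap_hex):
    """True when TRCDR and TRCDRA carry the same value."""
    ras = decode_ras_timing(strap_hex)
    return ras["TRCDR"] == ras["TRCDRA"]


if __name__ == "__main__":
    print(decode_ras_timing(STRAP))
    print("TRCDR == TRCDRA:", trcdr_matches_trcdra(STRAP))
```

On the strap above this flags the mismatch (TRCDR 26 against TRCDRA 22); which of the two values to settle on is a tuning decision the script does not make.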
sr. member
Activity: 588
Merit: 251
Sometimes loosening the timings is better, and clocking higher - specifically on Eth.

I've seen you mention loosening the CAS timings.  I tried bumping up tCL by 1, but still get crashes on the K4G4 at 2100.  So is it just loosening tCL that usually does the trick, or something else too?


You have to loosen it on the DRAM, too - you're loosening the tCL on the ASIC, but not the DRAM, throwing them off.

Interesting.  So the memory controller (or driver) isn't smart enough to take tCL from SEQ_CAS_TIMING and use the same value for the MR0 CAS latency?
edit: I don't even understand how this would work at all.  If the controller is expecting the data 22 cycles after the read, but MR0 is programmed for 21, then wouldn't that cause a 100% error rate?

Here's a quote from the datasheet:
During READ bursts, the first valid data-out element will be available after the CAS latency (CL). The CAS latency is defined as CLmrs * tCK + tWCK2CKPIN + tWCK2CK + tWCK2DQO, where CLmrs is the number of clock cycles programmed in MR0, tWCK2CKPIN is the phase offset between WCK and CK at the pins when phase aligned at phase detector, tWCK2CK is the alignment error between WCK and CK at the GDDR5 SGRAM phase detector, and tWCK2DQO is the WCK to DQ/DBI#/EDC offset as measured at the DRAM pins.
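
The datasheet formula quoted above can be put into numbers with a small sketch like this one. The three offset terms default to zero only as placeholders, since the thread gives no values for them, and treating the memory clock figure people post (2000, 2100, ...) as the CK frequency in MHz is an assumption. The function name is made up.

```python
def cas_latency_ns(cl_mrs, ck_mhz,
                   t_wck2ckpin_ns=0.0, t_wck2ck_ns=0.0, t_wck2dqo_ns=0.0):
    """CAS latency in ns: CLmrs * tCK + tWCK2CKPIN + tWCK2CK + tWCK2DQO.

    cl_mrs  -- clock cycles programmed in MR0
    ck_mhz  -- CK frequency in MHz (assumed equal to the posted memory clock)
    t_*_ns  -- offset terms from the datasheet; placeholders, default 0
    """
    if ck_mhz <= 0:
        raise ValueError("clock must be positive")
    t_ck_ns = 1000.0 / ck_mhz  # one CK period in nanoseconds
    return cl_mrs * t_ck_ns + t_wck2ckpin_ns + t_wck2ck_ns + t_wck2dqo_ns


if __name__ == "__main__":
    for cl in (21, 22):
        print(f"CL={cl} at 2000 MHz: {cas_latency_ns(cl, 2000):.2f} ns")
```

With the offsets left at zero, one step of CLmrs moves the result by exactly one tCK, while the three offset terms are analog quantities added on top of that.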
sr. member
Activity: 652
Merit: 266
How about going in the other direction (NSFW): https://ottrbutt.com/tmp/ethwolf-03212017.jpg ?

See GPU 4.

Is that running AMDGPU-Pro 16.60 on kernel 4.10?

You can't run amdgpu-pro 16.60 on 4.4+; it won't compile without patching.
You run 4.10/4.11 with the OSS amdgpu driver plus the OpenCL packages from amdgpu-pro 16.60.
Quote
clinfo-amdgpu-pro                               install
libdrm-amdgpu-pro-amdgpu1:amd64                 install
libdrm-amdgpu-pro-dev:amd64                     install
libdrm-amdgpu-pro-utils                         install
libdrm-amdgpu1:amd64                            install
libdrm2-amdgpu-pro:amd64                        install
libegl1-amdgpu-pro:amd64                        install
libgbm1-amdgpu-pro:amd64                        install
libgbm1-amdgpu-pro-base                         install
libgbm1-amdgpu-pro-dev:amd64                    install
libgl1-amdgpu-pro-appprofiles                   install
libopencl1-amdgpu-pro:amd64                     install
opencl-amdgpu-pro-icd:amd64                     install
Linux ucm64xd 4.11.0-rc2-nvamd #1 SMP Mon Mar 13 20:30:11 EET 2017 x86_64 x86_64 x86_64 GNU/Linux
sr. member
Activity: 588
Merit: 251
How about going in the other direction (NSFW): https://ottrbutt.com/tmp/ethwolf-03212017.jpg ?

See GPU 4.

Is that running AMDGPU-Pro 16.60 on kernel 4.10?
sr. member
Activity: 652
Merit: 266
How about going in the other direction (NSFW): https://ottrbutt.com/tmp/ethwolf-03212017.jpg ?

See GPU 4.
My friend... this is pure terror over the hardware.
I sent a screenshot to ElioVP a while ago doing almost 33 at 1211/2225, but that 1300/2275 is insane!
sr. member
Activity: 652
Merit: 266
Some hynix tests at doktor83's clocks:



One of my experimental straps:



Both with set tRAS
sr. member
Activity: 652
Merit: 266
Don't be rude, you were too generic: "If u get 29MH on 2000 it means your timings are too tight and won't be as stable"
I apologize, it was a moment of weakness. I see now that I didn't specify that I meant the specific strap he posted.
Of course you can get 29+ even below 2000, but not with these specific timings.
hero member
Activity: 2548
Merit: 626
Don't be rude, you were too generic: "If u get 29MH on 2000 it means your timings are too tight and won't be as stable"