Custom RAM Timings for GPU's with GDDR5 - DOWNLOAD LINKS - UPDATED - page 44.

Eliovp

legendary

Activity: 1050

Merit: 1294

Huh?

Quote from: Truthchanter on March 22, 2017, 04:40:17 PM

Are there any publicly available custom memory strap timings for the RX 400 series (Samsung, Elpida, or Hynix)?

I posted a basic mod for Elpida and Hynix a few pages back.

Truthchanter

sr. member

Activity: 689

Merit: 253

Are there any publicly available custom memory strap timings for the RX 400 series (Samsung, Elpida, or Hynix)?

Eliovp

legendary

Activity: 1050

Merit: 1294

Huh?

Quote from: deadsix on March 22, 2017, 04:09:41 PM

Kinda OT since this is a RAM timings thread, but I'll ask anyway:

For Sapphire RX 470 4GB (Ref) cards, the GPU core voltage offset is at A992, correct?
For the cards with Hynix memory, I find the default 04, which is 4 x 6.25 or +25 mV, which seems legit.
But for the cards with Samsung memory, the value at A992 is FF, which is -1 x 6.25 or -6.25 mV, so something looks off.
Do different memory versions of the card have different default offset values? Or is the location different?

Any help/guidance would be appreciated.

Seems like the Samsung one doesn't have a global offset.

Stock ROMs with a global offset have either '03' (+18.75 mV) or '04' (+25 mV) as the VDDC offset.
I've never seen anything else, let alone a negative offset, and I've opened up a lot of them ;-) But correct me if I'm wrong.

Greetings.
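
The arithmetic in this exchange treats the offset byte as a signed 8-bit count of 6.25 mV steps. A minimal Python sketch of that conversion, assuming the two's-complement reading used in the question (the function name is made up for illustration; the byte's location in any given ROM is not something this code knows):

```python
STEP_MV = 6.25  # step size used in the posts above

def vddc_offset_mv(raw: int) -> float:
    """Interpret one byte as a signed 8-bit step count and return millivolts."""
    if not 0 <= raw <= 0xFF:
        raise ValueError("expected a single byte")
    steps = raw - 0x100 if raw & 0x80 else raw  # two's complement
    return steps * STEP_MV

for raw in (0x03, 0x04, 0xFF):
    print(f"{raw:02X} -> {vddc_offset_mv(raw):+.2f} mV")
```

A value of FF decoding to -6.25 mV is consistent with the suspicion that the byte at that address is not the offset field on that ROM.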

deadsix

hero member

Activity: 751

Merit: 517

Fail to plan, and you plan to fail.

Kinda OT since this is a RAM timings thread, but I'll ask anyway:

For Sapphire RX 470 4GB (Ref) cards, the GPU core voltage offset is at A992, correct?
For the cards with Hynix memory, I find the default 04, which is 4 x 6.25 or +25 mV, which seems legit.
But for the cards with Samsung memory, the value at A992 is FF, which is -1 x 6.25 or -6.25 mV, so something looks off.
Do different memory versions of the card have different default offset values? Or is the location different?

Any help/guidance would be appreciated.

nerdralph

sr. member

Activity: 588

Merit: 251

Quote from: laik2 on March 22, 2017, 09:38:14 AM

Quote from: nerdralph on March 22, 2017, 09:10:26 AM

So I need to update MC_SEQ_MISC1, offset 54 in the hex string of the strap (offset 27 in bytes). Are the 3 hex chars at offset 55-57 the 12 bits for MR0, or is that MR1 and MR0 is 59-61?
I know I could figure it out by comparing different straps and seeing how the bits map to the register values, but since you seem to have already figured it out...

Original Samsung 4G ( your particular GPU ) 1625
555000000000000022CC1C00CE596B44D0570F1531CB2409004AE700 [ 0B03 | 1420 ] 7A8900A003000000170F2E36922A3217

Quote

--> MC_SEQ_MISC1
-- MR0
   WL = 3, CL = 22, TM = 0, WR = 23, BA0 = 0, BA1 = 0, BA2 = 0, BA3 = 0
-- MR1
   DS = 0, DT = 1, ADR = 1, CAL = 0, PLL = 0, RDBI = 0, WDBI = 0, ABI = 0,
   RES = 0, BA0 = 0, BA1 = 1, BA2 = 0, BA3 = 0

Original Samsung 4G 1750
777000000000000022CC1C0010626C49D0571016B50BD509004AE700 [ 1405 | 1420 ] 7A8900A003000000191131399D2C3617

Quote

--> MC_SEQ_MISC1
-- MR0
   WL = 4, CL = 23, TM = 0, WR = 25, BA0 = 0, BA1 = 0, BA2 = 0, BA3 = 0
-- MR1
   DS = 0, DT = 1, ADR = 1, CAL = 0, PLL = 0, RDBI = 0, WDBI = 0, ABI = 0,
   RES = 0, BA0 = 0, BA1 = 1, BA2 = 0, BA3 = 0

I think you are off by +1 with the MR0 CAS latency. SEQ_CAS_TIMING has CL=21(0x15) for the 1625 strap, and CL=22(0x16) for the 1750 strap.
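
The two numbers being compared here can be pulled out of the same strap. A rough Python sketch, assuming tCL sits in bits 24-28 of SEQ_CAS_TIMING, that the MR0 CL field is bits 3-6 of SEQ_MISC1 with a base of 5, and that bit 0 of SEQ_MISC8 adds 16 cycles. These encodings are assumptions chosen because they reproduce the values quoted above; the word values are the little-endian reads of the bytes in the two Samsung straps.

```python
def asic_tcl(seq_cas_timing: int) -> int:
    """tCL as seen by the memory controller (assumed bits 24-28)."""
    return (seq_cas_timing >> 24) & 0x1F

def dram_cl(seq_misc1: int, seq_misc8: int) -> int:
    """CAS latency programmed into the DRAM via MR0 (assumed encoding)."""
    cl_field = (seq_misc1 >> 3) & 0xF   # MR0 bits 3-6
    cl_ext = seq_misc8 & 0x1            # assumed extension bit: +16 cycles
    return cl_field + 5 + 16 * cl_ext

# (SEQ_CAS_TIMING, SEQ_MISC1, SEQ_MISC8) read from the straps quoted above
straps = {
    1625: (0x150F57D0, 0x2014030B, 0x00000003),
    1750: (0x161057D0, 0x20140514, 0x00000003),
}

for clock, (cas, misc1, misc8) in straps.items():
    print(clock, "ASIC tCL =", asic_tcl(cas), "MR0 CL =", dram_cl(misc1, misc8))
```

Under these assumptions both stock straps show the MR0 value one higher than tCL, which is the +1 in question; whether that is a decode error or how the stock straps are actually programmed is not settled by this sketch.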

lpedretti

full member

Activity: 152

Merit: 100

Quote from: kemo6600 on March 22, 2017, 11:24:51 AM

Seriously guys, leave aside cherry-picked cards and post something that can run for at least 48 hours with 5/6 similar cards, all running exactly the same clocks and hashing at similar speeds.

Getting that is the farm administrator's work; they're already posting lots of hard-to-get info :D

kemo6600

member

Activity: 130

Merit: 10

Seriously guys, leave aside cherry-picked cards and post something that can run for at least 48 hours with 5/6 similar cards, all running exactly the same clocks and hashing at similar speeds.

Eliovp

legendary

Activity: 1050

Merit: 1294

Huh?

Quote from: laik2 on March 22, 2017, 09:38:14 AM

Original Samsung 4G ( your particular GPU ) 1625
555000000000000022CC1C00CE596B44D0570F1531CB2409004AE700 [ 0B03 | 1420 ] 7A8900A003000000170F2E36922A3217

Quote

--> MC_SEQ_MISC1
-- MR0
   WL = 3, CL = 22, TM = 0, WR = 23, BA0 = 0, BA1 = 0, BA2 = 0, BA3 = 0
-- MR1
   DS = 0, DT = 1, ADR = 1, CAL = 0, PLL = 0, RDBI = 0, WDBI = 0, ABI = 0,
   RES = 0, BA0 = 0, BA1 = 1, BA2 = 0, BA3 = 0

Original Samsung 4G 1750
777000000000000022CC1C0010626C49D0571016B50BD509004AE700 [ 1405 | 1420 ] 7A8900A003000000191131399D2C3617

Quote

--> MC_SEQ_MISC1
-- MR0
   WL = 4, CL = 23, TM = 0, WR = 25, BA0 = 0, BA1 = 0, BA2 = 0, BA3 = 0
-- MR1
   DS = 0, DT = 1, ADR = 1, CAL = 0, PLL = 0, RDBI = 0, WDBI = 0, ABI = 0,
   RES = 0, BA0 = 0, BA1 = 1, BA2 = 0, BA3 = 0

Bingo!

laik2

sr. member

Activity: 652

Merit: 266

Quote from: nerdralph on March 22, 2017, 09:10:26 AM

Quote from: Wolf0 on March 22, 2017, 08:37:42 AM

Quote from: nerdralph on March 22, 2017, 08:30:00 AM

Quote from: Wolf0 on March 21, 2017, 09:32:06 PM

You have to loosen it on the DRAM, too - you're loosening the tCL on the ASIC, but not the DRAM, throwing them off.

Interesting. So the memory controller (or driver) isn't smart enough to take tCL from SEQ_CAS_TIMING and use same value for MR0 Cas Latency?
edit: I don't even understand how this would work at all. If the controller is expecting the data 22 cycles after the read, but MR0 is programmed for 21, then wouldn't that cause a 100% error rate?

It actually seems to have a tolerance of one value up or down before it stops working entirely.

So I need to update MC_SEQ_MISC1, offset 54 in the hex string of the strap (offset 27 in bytes). Are the 3 hex chars at offset 55-57 the 12 bits for MR0, or is that MR1 and MR0 is 59-61?
I know I could figure it out by comparing different straps and seeing how the bits map to the register values, but since you seem to have already figured it out...

Original Samsung 4G ( your particular GPU ) 1625
555000000000000022CC1C00CE596B44D0570F1531CB2409004AE700 [ 0B03 | 1420 ] 7A8900A003000000170F2E36922A3217

Quote

--> MC_SEQ_MISC1
-- MR0
   WL = 3, CL = 22, TM = 0, WR = 23, BA0 = 0, BA1 = 0, BA2 = 0, BA3 = 0
-- MR1
   DS = 0, DT = 1, ADR = 1, CAL = 0, PLL = 0, RDBI = 0, WDBI = 0, ABI = 0,
   RES = 0, BA0 = 0, BA1 = 1, BA2 = 0, BA3 = 0

Original Samsung 4G 1750
777000000000000022CC1C0010626C49D0571016B50BD509004AE700 [ 1405 | 1420 ] 7A8900A003000000191131399D2C3617

Quote

--> MC_SEQ_MISC1
-- MR0
   WL = 4, CL = 23, TM = 0, WR = 25, BA0 = 0, BA1 = 0, BA2 = 0, BA3 = 0
-- MR1
   DS = 0, DT = 1, ADR = 1, CAL = 0, PLL = 0, RDBI = 0, WDBI = 0, ABI = 0,
   RES = 0, BA0 = 0, BA1 = 1, BA2 = 0, BA3 = 0
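
For the offset question, one way to check is to split the strap into words and index them. A minimal Python sketch, assuming the strap is twelve little-endian 32-bit words in the order listed below (the register names and order are an assumption based on the decode output posted in this thread), with MR0 in the low half of SEQ_MISC1 and MR1 in the high half:

```python
import struct

# Assumed word order of a 48-byte strap
REGS = ["SEQ_WR_CTL_D1", "SEQ_WR_CTL_2", "SEQ_PMG_TIMING", "SEQ_RAS_TIMING",
        "SEQ_CAS_TIMING", "SEQ_MISC_TIMING", "SEQ_MISC_TIMING2", "SEQ_MISC1",
        "SEQ_MISC3", "SEQ_MISC8", "ARB_DRAM_TIMING", "ARB_DRAM_TIMING2"]

def split_strap(strap_hex: str) -> dict:
    """Return {register name: 32-bit value} for a 96-character strap."""
    raw = bytes.fromhex(strap_hex)
    if len(raw) != 48:
        raise ValueError("expected 96 hex characters")
    return dict(zip(REGS, struct.unpack("<12I", raw)))

# The 1625 strap from the post above, with the brackets removed
strap_1625 = ("555000000000000022CC1C00CE596B44D0570F1531CB2409004AE700"
              "0B031420"
              "7A8900A003000000170F2E36922A3217")

regs = split_strap(strap_1625)
misc1 = regs["SEQ_MISC1"]
hex_offset = REGS.index("SEQ_MISC1") * 8   # position in the hex string
mr0 = misc1 & 0xFFF                        # assumed: MR0 address bits
mr1 = (misc1 >> 16) & 0xFFF                # assumed: MR1 address bits

print(f"SEQ_MISC1 = 0x{misc1:08X} at hex offset {hex_offset}")
print(f"MR0 = 0x{mr0:03X}, MR1 = 0x{mr1:03X}")
```

Under these assumptions SEQ_MISC1 starts at hex character 56 (byte 28), which matches where the bracketed bytes sit in the strap as posted. The MR0 bits are in the first two bytes (characters 56-59, byte-swapped) and the MR1 bits in the next two (60-63).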

Eliovp

legendary

Activity: 1050

Merit: 1294

Huh?

Quote from: doktor83 on March 22, 2017, 08:40:19 AM

Just wanted to ask if anybody managed to get over 31 MH/s, but I got my answer before asking :D

That's rather easy, but keeping it there without setting insanely high clocks, that's the interesting part.

Plus, I would take a perfectly fine-tuned setup with very low power consumption over high speed any day.

I almost hit 1000 H/s on a 480 8G (XMR), but I prefer 850 stable with very low power consumption.

Greetings!

nerdralph

sr. member

Activity: 588

Merit: 251

Quote from: Wolf0 on March 22, 2017, 08:37:42 AM

Quote from: nerdralph on March 22, 2017, 08:30:00 AM

Quote from: Wolf0 on March 21, 2017, 09:32:06 PM

You have to loosen it on the DRAM, too - you're loosening the tCL on the ASIC, but not the DRAM, throwing them off.

Interesting. So the memory controller (or driver) isn't smart enough to take tCL from SEQ_CAS_TIMING and use same value for MR0 Cas Latency?
edit: I don't even understand how this would work at all. If the controller is expecting the data 22 cycles after the read, but MR0 is programmed for 21, then wouldn't that cause a 100% error rate?

It actually seems to have a tolerance of one value up or down before it stops working entirely.

So I need to update MC_SEQ_MISC1, offset 54 in the hex string of the strap (offset 27 in bytes). Are the 3 hex chars at offset 55-57 the 12 bits for MR0, or is that MR1 and MR0 is 59-61?
I know I could figure it out by comparing different straps and seeing how the bits map to the register values, but since you seem to have already figured it out...

doktor83

hero member

Activity: 2548

Merit: 626

Just wanted to ask if anybody managed to get over 31 MH/s, but I got my answer before asking :D

Wolf0

member

Activity: 81

Merit: 1002

It was only the wind.

Quote from: tharp on March 21, 2017, 09:53:56 AM

Since everyone is sharing now, I suppose I'll put what I've come up with out here. I'm running Sapphire Nitro 470 8GB cards with Samsung memory at -125 mV. On ETH they hit between 28.5 MH/s and 29.2 MH/s at 1140 core and 2100 mem, pulling around 920 watts at the wall with 6 GPUs per rig. On XMR they hit 785 H/s to 795 H/s at 1170 core and 2100 mem, pulling around 660 watts at the wall with 6 GPUs per rig. Also running ethOS 1.2.0. I'm here to learn more about the mistakes I made on the mod and see what others in the community have come up with.

Here is the strap I've put together:
777000000000000022CC1C00106A5B47C0570E16B08C05090068C70014051420FA8900A003000000190D2F399D2D2E17

The timings from wolf and ohgodagirl's VBIOS decode tool release:
TRCDW = 16
TRCDWA = 16
TRCDR = 26
TRCDRA = 22
TRRD = 5
TRC = 71
Pad0 = 0

TRP_WRA = 48
Pad0 = 2
TRP_RDA = 12
TRP = 22
TRFC = 144

PA2RDATA = 0
Pad0 = 0
PA2WDATA = 0
Pad1 = 0
TFAW = 8
TCRCRL = 3
TCRCWL = 7
TFAW32 = 6

MC_SEQ_MISC1: 0x20140514

MC_SEQ_MISC3: 0xA00089FA

MC_SEQ_MISC8: 0x00000003

ACTRD = 25
ACTWR = 13
RASMACTRD = 47
RASMACTWR = 57

RAS2RAS = 157
RP = 45
WRPLUSRP = 46
BUS_TURN = 23

Looking forward to others' input! :D

TRCDR & TRCDRA should be the same.
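
The mismatch is easy to confirm from the strap itself. A short Python sketch, assuming SEQ_RAS_TIMING is the fourth little-endian 32-bit word of the strap and that its fields are packed low bit first with the widths below (an assumed layout, chosen because it reproduces the decoded values in the quoted post):

```python
import struct

# Assumed SEQ_RAS_TIMING layout, low bit first: (field name, width in bits)
RAS_FIELDS = [("TRCDW", 5), ("TRCDWA", 5), ("TRCDR", 5),
              ("TRCDRA", 5), ("TRRD", 4), ("TRC", 7)]

def decode_ras_timing(strap_hex: str) -> dict:
    """Unpack the assumed SEQ_RAS_TIMING fields from a 96-character strap."""
    raw = bytes.fromhex(strap_hex)
    (word,) = struct.unpack_from("<I", raw, 12)  # fourth 32-bit word
    fields, shift = {}, 0
    for name, width in RAS_FIELDS:
        fields[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return fields

# The strap from the quoted post, written without whitespace
strap = ("777000000000000022CC1C00106A5B47C0570E16B08C05090068C700"
         "14051420FA8900A003000000190D2F399D2D2E17")

ras = decode_ras_timing(strap)
print(ras)
print("TRCDR == TRCDRA:", ras["TRCDR"] == ras["TRCDRA"])
```

With this strap the check prints False (26 versus 22), which is the pair being flagged.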

nerdralph

sr. member

Activity: 588

Merit: 251

Quote from: Wolf0 on March 21, 2017, 09:32:06 PM

Quote from: nerdralph on March 21, 2017, 07:43:29 PM

Quote from: Wolf0 on March 21, 2017, 05:22:02 PM

Sometimes loosening the timings is better, and clocking higher - specifically on Eth.

I've seen you mention loosening the CAS timings. I tried bumping up tCL by 1, but still get crashes on the K4G4 at 2100. So is it just loosening tCL that usually does the trick, or something else too?

You have to loosen it on the DRAM, too - you're loosening the tCL on the ASIC, but not the DRAM, throwing them off.

Interesting. So the memory controller (or driver) isn't smart enough to take tCL from SEQ_CAS_TIMING and use same value for MR0 Cas Latency?
edit: I don't even understand how this would work at all. If the controller is expecting the data 22 cycles after the read, but MR0 is programmed for 21, then wouldn't that cause a 100% error rate?

Here's a quote from the datasheet:
During READ bursts, the first valid data‐out element will be available after the CAS latency (CL). The CAS Latency is defined as CLmrs * tCK + tWCK2CKPIN + tWCK2CK + tWCK2DQO, where CLmrs is the number of clock cycles programed in MR0, tWCK2CKPIN is the phase offset between WCK and CK at the pins when phase aligned at phase detector, tWCK2CK is the alignment error between WCK and CK at the GDDR5 SGRAM phase detector, and tWCK2DQO is the WCK to DQ/DBI#/EDC offset as measured at the DRAM pins.
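
The datasheet formula can be written out directly. A small Python sketch; the 0.5 ns clock period and the zero defaults for the three offset terms are illustrative placeholders, not datasheet values:

```python
def cas_latency_ns(cl_mrs: int, tck_ns: float,
                   twck2ckpin_ns: float = 0.0,
                   twck2ck_ns: float = 0.0,
                   twck2dqo_ns: float = 0.0) -> float:
    """CL = CLmrs * tCK + tWCK2CKPIN + tWCK2CK + tWCK2DQO, in nanoseconds."""
    return cl_mrs * tck_ns + twck2ckpin_ns + twck2ck_ns + twck2dqo_ns

TCK_NS = 0.5  # hypothetical clock period, for illustration only

for cl in (21, 22):
    print(f"CLmrs = {cl}: first data after {cas_latency_ns(cl, TCK_NS)} ns")
```

Changing CLmrs by one moves the first data element by exactly one tCK; the other three terms are offsets that do not depend on the MR0 setting.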

laik2

sr. member

Activity: 652

Merit: 266

Quote from: nerdralph on March 22, 2017, 07:28:22 AM

Quote from: Wolf0 on March 22, 2017, 07:04:36 AM

How about going in the other direction (NSFW): https://ottrbutt.com/tmp/ethwolf-03212017.jpg ?

See GPU 4.

Is that running AMDGPU-Pro 16.60 on kernel 4.10?

You can't run amdgpu-pro 16.60 on 4.4+; it won't compile without patching.
You run 4.10/4.11 with the OSS amdgpu driver + the OpenCL package from amdgpu-pro 16.60.

Quote

clinfo-amdgpu-pro install
libdrm-amdgpu-pro-amdgpu1:amd64 install
libdrm-amdgpu-pro-dev:amd64 install
libdrm-amdgpu-pro-utils install
libdrm-amdgpu1:amd64 install
libdrm2-amdgpu-pro:amd64 install
libegl1-amdgpu-pro:amd64 install
libgbm1-amdgpu-pro:amd64 install
libgbm1-amdgpu-pro-base install
libgbm1-amdgpu-pro-dev:amd64 install
libgl1-amdgpu-pro-appprofiles install
libopencl1-amdgpu-pro:amd64 install
opencl-amdgpu-pro-icd:amd64 install

Linux ucm64xd 4.11.0-rc2-nvamd #1 SMP Mon Mar 13 20:30:11 EET 2017 x86_64 x86_64 x86_64 GNU/Linux

nerdralph

sr. member

Activity: 588

Merit: 251

Quote from: Wolf0 on March 22, 2017, 07:04:36 AM

How about going in the other direction (NSFW): https://ottrbutt.com/tmp/ethwolf-03212017.jpg ?

See GPU 4.

Is that running AMDGPU-Pro 16.60 on kernel 4.10?

laik2

sr. member

Activity: 652

Merit: 266

Quote from: Wolf0 on March 22, 2017, 07:04:36 AM

How about going in the other direction (NSFW): https://ottrbutt.com/tmp/ethwolf-03212017.jpg ?

See GPU 4.

My friend... this is pure terror over the hardware :D

I sent a screenshot to Eliovp a while ago doing almost 33 at 1211/2225, but that 1300/2275 is insane!

laik2

sr. member

Activity: 652

Merit: 266

Some hynix tests at doktor83's clocks:

One of my experimental straps:

Both with set tRAS

laik2

sr. member

Activity: 652

Merit: 266

Quote from: doktor83 on March 22, 2017, 02:42:54 AM

Don't be rude, you were too generic: "If u get 29MH on 2000 it means your timings are too tight and won't be as stable"

I apologize, it was a moment of weakness. I see now that I didn't specify that I meant the specific strap he posted.
Of course you can get 29+ even below 2000, but not with these specific timings.

doktor83

hero member

Activity: 2548

Merit: 626

Don't be rude, you were too generic: "If u get 29MH on 2000 it means your timings are too tight and won't be as stable"

Topic: Custom RAM Timings for GPU's with GDDR5 - DOWNLOAD LINKS - UPDATED - page 44. (Read 155721 times)