
Topic: AMD Mem Tweak XL - Read/modify timings/pp/straps on the fly - page 59. (Read 58905 times)

legendary
Activity: 1050
Merit: 1294
Huh?
@eliovp, thank you for this tool.
Do you or anybody else know if there is a tool available in Linux to count GPU memory errors?  Much like HWiNFO64 in Windows?

Hey Bobben2,

Yeah, it's not as obvious as on Windows, that's a fact. Try checking the dmesg log; it gives you a lot of info as well.
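
For anyone who wants a rough counter out of that, here is a minimal Python sketch that filters kernel log text for amdgpu lines that look like errors. The keyword list is my own guess, not something the driver documents, and the hypothetical `read_dmesg` helper assumes the `dmesg` command is on your PATH and that you are allowed to read the kernel log (often root only). It is not the same thing as the memory error counter in HWiNFO64.

```python
import re
import subprocess

# Assumption: these keywords are illustrative guesses; adjust them to what your log shows.
ERROR_PATTERN = re.compile(r"amdgpu.*\b(error|fault|timeout|failed)\b", re.IGNORECASE)


def gpu_error_lines(log_text):
    """Return the log lines that mention amdgpu together with an error keyword."""
    return [line for line in log_text.splitlines() if ERROR_PATTERN.search(line)]


def read_dmesg():
    """Read the kernel ring buffer by running the dmesg command."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout


def main():
    """Print how many suspicious amdgpu lines the kernel log holds, plus the last few."""
    hits = gpu_error_lines(read_dmesg())
    print(f"{len(hits)} suspicious amdgpu lines")
    for line in hits[-10:]:
        print(line)
```

Call `main()` on the rig itself, or feed `gpu_error_lines` a saved log file if you would rather not run it as root.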


>>Posted by: dragonmike

>> Has nobody ported this to Windows yet?

>> I'd be happy to do all testing myself and share my findings in this and other threads... But I'm not going to install Ubuntu and all that shizzle for that sole purpose. I suck at Linux unfortunately...
________

I've been trying to, but my plate is pretty full at the moment. It is something I have been working on, though. Sadly I'm not having any luck finding somebody to help, so it all depends on when I can get time.

No worries, I hoped it would be a nice incentive for people to get the *nix vibes :)

Who knows, maybe a Windows version will pop up one of these days :D ;)

Changing memory timings on the fly is a very good idea!
It's possible to access GPU MMIO registers in Windows too.
It's a bit tricky to implement it properly even if you have good skills and that's the main problem.
Register offsets can be found in the ROCm sources, and typedefs in the OhGodADecode utility or in ROCm again (though it contains only bitmasks, so it would take some time to convert them).
Anyway, this tool gave me an idea for a new feature, thanks. It was also funny to play with registers and see some BSODs :)
I donated 0.25BTC to the address in first post.

Quote
It's possible to access GPU MMIO registers in Windows too.
It's a bit tricky to implement it properly even if you have good skills and that's the main problem.
Yeah, I'm very much aware that it's possible, it's the tricky part that's tricky :p

Thank you for your donation. It's very much appreciated!

I also noticed I've been getting some hash from someone.
I would like to thank that person as well!

Also, I would again like to point out that I had hoped to see some more test results (with screenshots) here.
It doesn't hurt to help each other out...

Cheers!
donator
Activity: 1610
Merit: 1325
Miners developer
Changing memory timings on the fly is a very good idea!
It's possible to access GPU MMIO registers in Windows too.
It's a bit tricky to implement it properly even if you have good skills and that's the main problem.
Register offsets can be found in the ROCm sources, and typedefs in the OhGodADecode utility or in ROCm again (though it contains only bitmasks, so it would take some time to convert them).
Anyway, this tool gave me an idea for a new feature, thanks. It was also funny to play with registers and see some BSODs :)
I donated 0.25BTC to the address in first post.
jr. member
Activity: 36
Merit: 7
>>Posted by: dragonmike

>> Has nobody ported this to Windows yet?

>> I'd be happy to do all testing myself and share my findings in this and other threads... But I'm not going to install Ubuntu and all that shizzle for that sole purpose. I suck at Linux unfortunately...
________

I've been trying to, but my plate is pretty full at the moment. It is something I have been working on, though. Sadly I'm not having any luck finding somebody to help, so it all depends on when I can get time.
newbie
Activity: 42
Merit: 0
Dear sir, can you share your PPT settings on Windows for the CN/R algo? I have some questions: ① Is it necessary to flash a 56 to the 64 BIOS? Can a card with Hynix memory be flashed to 64? ② Which driver do you use?

My Vega cannot get below 863mV with my settings on the stock BIOS (see the picture I uploaded). What should I do to lower its power consumption? Thanks.

I'm using linux, so my ppts are different from windows.  

No, it's not necessary to flash your 56, and I wouldn't recommend it for cn/r.  Some algos make better use of high bandwidth (high clocks), but cn/r is not really one of them - CN in general likes low latency, so the stock 56 bios is fine (and will save you some power.)

I'm guessing you're using 19.3.x drivers?  If so, I'm betting your problem is the stock 900mv set in your mem p2 state. In that case, regardless of your other settings, you will always run at a minimum of 900mv (minus droop) unless you disable the state (for example, in OverdriveNTool) or use a PPT which lowers the setting.

Thanks, it works
full member
Activity: 279
Merit: 104
@eliovp, thank you for this tool.
Do you or anybody else know if there is a tool available in Linux to count GPU memory errors?  Much like HWiNFO64 in Windows?
full member
Activity: 585
Merit: 106
Has anyone tried it on Radeon VII?
member
Activity: 340
Merit: 29
Dear sir, can you share your PPT settings on Windows for the CN/R algo? I have some questions: ① Is it necessary to flash a 56 to the 64 BIOS? Can a card with Hynix memory be flashed to 64? ② Which driver do you use?

My Vega cannot get below 863mV with my settings on the stock BIOS (see the picture I uploaded). What should I do to lower its power consumption? Thanks.

I'm using linux, so my ppts are different from windows.  

No, it's not necessary to flash your 56, and I wouldn't recommend it for cn/r.  Some algos make better use of high bandwidth (high clocks), but cn/r is not really one of them - CN in general likes low latency, so the stock 56 bios is fine (and will save you some power.)

I'm guessing you're using 19.3.x drivers?  If so, I'm betting your problem is the stock 900mv set in your mem p2 state. In that case, regardless of your other settings, you will always run at a minimum of 900mv (minus droop) unless you disable the state (for example, in OverdriveNTool) or use a PPT which lowers the setting.
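
To make the floor behaviour concrete, here is a tiny Python sketch of the rule as described above. It is a simplified model, not the driver's actual logic: the function name and parameters are hypothetical, and droop is ignored.

```python
def effective_vddc_mv(core_state_mv, mem_state_floor_mv):
    """Simplified model: the mem state voltage acts as a floor under the core voltage.

    Assumption: both values are in millivolts and droop is ignored.
    """
    return max(core_state_mv, mem_state_floor_mv)


# Core state asks for 837 mV, but the stock mem P2 entry is still 900 mV.
print(effective_vddc_mv(837, 900))  # 900

# Mem state lowered well below the active core state, so the core value wins.
print(effective_vddc_mv(837, 800))  # 837
```

This is why lowering only the core state voltage does nothing while the mem state entry sits above it.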
newbie
Activity: 11
Merit: 0
If you want it easy, go for HiveOS. It's very simple and this tool is already included. You don't have to install it; you can run it from a USB drive.
This version is specifically for the RX Vega series: https://download.hiveos.farm/[email protected]
You should update it at first boot. It's just a matter of one click.
hero member
Activity: 1274
Merit: 556
Has nobody ported this to Windows yet?

I'd be happy to do all testing myself and share my findings in this and other threads... But I'm not going to install Ubuntu and all that shizzle for that sole purpose. I suck at Linux unfortunately...
jr. member
Activity: 144
Merit: 2
Testing stability the past two days.
The light timing with ETH 51mh is rock solid.
The edgier timings for CN algos need to be relaxed, because mining stopped after a few hours. It was 2130h/s until then.

BTW I tried kernel 4.16, but the behaviour is exactly the same as with 4.15. The main problem is that I can't set p7 above 1700MHz to get ~1500 at low voltage.
newbie
Activity: 11
Merit: 0
Ethash benefits from memory bandwidth, so it's no surprise the Radeon VII gets that hashrate. It has 1TB/s of bandwidth, so it's capable of about 128mh theoretically. Cryptonight algos benefit from low latency.
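
A rough way to check that ceiling, as a Python sketch: each Ethash hash does 64 DAG reads of 128 bytes, so 8192 bytes of memory traffic per hash. Dividing bandwidth by that gives an upper bound that assumes perfect memory efficiency and no core bottleneck, which no real card reaches. The function name is mine.

```python
ACCESSES_PER_HASH = 64   # DAG lookups per Ethash hash
BYTES_PER_ACCESS = 128   # bytes fetched per lookup


def ethash_ceiling_mhs(bandwidth_bytes_per_s):
    """Upper bound on Ethash hashrate in MH/s for a given memory bandwidth."""
    bytes_per_hash = ACCESSES_PER_HASH * BYTES_PER_ACCESS  # 8192
    return bandwidth_bytes_per_s / bytes_per_hash / 1e6


print(f"{ethash_ceiling_mhs(1e12):.0f} MH/s")    # 122 MH/s for 1000 GB/s
print(f"{ethash_ceiling_mhs(1024e9):.0f} MH/s")  # 125 MH/s for 1024 GB/s
```

Depending on how you round the bandwidth and the 8 KB per hash, the estimate lands somewhere between 120 and 130 MH/s.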
newbie
Activity: 42
Merit: 0
you’re prob running 225-250w for that 2100h/s, or about 80-100w more than you would need for 2000, without timing mods (I got 2K @ 837mv.)  +50% power for +5% h/r is not worth it - even if your power is free!

Normally 1500MHz needs around 925mV. I don't get those sub 850 numbers. Do you have a watt meter on it? Vddci is also set to <837?

To do the math in my case: 975/925 = 1.054 and 1.054^2 = 1.11 (+11%), and considering the higher clock, that means power consumption is only ~15% higher than before.


@coinscrow:
Thank you mate, going to try that!

Even 1500MHz is high - ~1400Mhz effective is a pretty efficient spot from all my testing. That combined w/ keeping your SOC <= 1107 MHz, generally allows <= 850mv.  My 50% number was based on 1030 vs 837mv - even @ 975 it would be 35%+.  

My 56 is plugged into a PDU, but on a mixed rig currently, so hard to isolate.  Though I have benched every GPU i own on a meter in the past (w/ multiple algos,) and tend to see 1400/1107 @ 837 eating around 160-170w (64 bios).

As to the 'mem/vddci' setting - since it's really just a vddc floor, in Windows I make it a point to always just set it well below my active state, while in linux I just use ppts, so it's forced by nature of being a table ref.



Dear sir, can you share your PPT settings on Windows for the CN/R algo? I have some questions: ① Is it necessary to flash a 56 to the 64 BIOS? Can a card with Hynix memory be flashed to 64? ② Which driver do you use?

My Vega cannot get below 863mV with my settings on the stock BIOS (see the picture I uploaded). What should I do to lower its power consumption? Thanks.
http://www.kepfeltoltes.eu/images/2019/03/468_24494_20449_22270_292.png
jr. member
Activity: 144
Merit: 2
There is a tool for mem timings strap encode/decode. Google it.
But for Polaris the public straps are quite good. They are included in PBE. You can also see them in the source.

Anyone willing to share results of Radeon VII ? I'm curious.
If 110mh is possible with stock timings, then there is much potential in the memory. Might be core bottlenecked.
newbie
Activity: 8
Merit: 0
How can I use this tool? I tried
Code:
./amdmemtool --RFC 43 --ras2ras 176
but with no success (the options are from the help). How do I choose parameters for GDDR5? I have RX 580/570 cards. This is the --current output:
Code:
GPU 0:  Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] pci:0000:01:00.0
CAS
  CL: 22  W2R: 19  CCDS: 5  CCLD: 2  R2W: 26  NOPR: 0  NOPW: 0
RAS
  RC: 79  RRD: 6  RCDRA: 29  RCDR: 29  RCDWA: 23  RCDW: 23
MISC
  RFC: 219  TRP: 29  RP_RDA: 33  RP_WRA: 70
MISC2
  WDATATR: 0  T32AW: 7  CRCWL: 24  CRCRL: 3  FAW: 10  PA2WDATA: 0  PA2RDATA: 0
DRAM1
  RASMACTWR: 33  RASMACTRD: 30  ACTWR: 14  ACTRD: 18
DRAM2
  RAS2RAS: 18  RP: 14  WRPLUSRP: 30  BUS_TURN: 33
GPU 1:  Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] pci:0000:02:00.0
CAS
  CL: 22  W2R: 19  CCDS: 5  CCLD: 2  R2W: 26  NOPR: 0  NOPW: 0
RAS
  RC: 79  RRD: 6  RCDRA: 29  RCDR: 29  RCDWA: 23  RCDW: 23
MISC
  RFC: 219  TRP: 29  RP_RDA: 33  RP_WRA: 70
MISC2
  WDATATR: 0  T32AW: 7  CRCWL: 24  CRCRL: 3  FAW: 10  PA2WDATA: 0  PA2RDATA: 0
DRAM1
  RASMACTWR: 33  RASMACTRD: 30  ACTWR: 14  ACTRD: 18
DRAM2
  RAS2RAS: 18  RP: 14  WRPLUSRP: 30  BUS_TURN: 33
GPU 2:  Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] pci:0000:03:00.0
CAS
  CL: 22  W2R: 19  CCDS: 5  CCLD: 2  R2W: 26  NOPR: 0  NOPW: 0
RAS
  RC: 79  RRD: 6  RCDRA: 29  RCDR: 29  RCDWA: 23  RCDW: 23
MISC
  RFC: 219  TRP: 29  RP_RDA: 33  RP_WRA: 70
MISC2
  WDATATR: 0  T32AW: 7  CRCWL: 25  CRCRL: 3  FAW: 10  PA2WDATA: 0  PA2RDATA: 0
DRAM1
  RASMACTWR: 33  RASMACTRD: 30  ACTWR: 14  ACTRD: 18
DRAM2
  RAS2RAS: 18  RP: 14  WRPLUSRP: 30  BUS_TURN: 33
GPU 3:  Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] pci:0000:04:00.0
CAS
  CL: 22  W2R: 19  CCDS: 5  CCLD: 2  R2W: 26  NOPR: 0  NOPW: 0
RAS
  RC: 79  RRD: 6  RCDRA: 29  RCDR: 29  RCDWA: 23  RCDW: 23
MISC
  RFC: 219  TRP: 29  RP_RDA: 33  RP_WRA: 70
MISC2
  WDATATR: 0  T32AW: 7  CRCWL: 24  CRCRL: 3  FAW: 10  PA2WDATA: 0  PA2RDATA: 0
DRAM1
  RASMACTWR: 33  RASMACTRD: 30  ACTWR: 14  ACTRD: 18
DRAM2
  RAS2RAS: 18  RP: 14  WRPLUSRP: 30  BUS_TURN: 33
GPU 4:  Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] pci:0000:06:00.0
CAS
  CL: 22  W2R: 19  CCDS: 5  CCLD: 2  R2W: 26  NOPR: 0  NOPW: 0
RAS
  RC: 79  RRD: 6  RCDRA: 29  RCDR: 29  RCDWA: 23  RCDW: 23
MISC
  RFC: 219  TRP: 29  RP_RDA: 33  RP_WRA: 70
MISC2
  WDATATR: 0  T32AW: 7  CRCWL: 25  CRCRL: 3  FAW: 10  PA2WDATA: 0  PA2RDATA: 0
DRAM1
  RASMACTWR: 33  RASMACTRD: 30  ACTWR: 14  ACTRD: 18
DRAM2
  RAS2RAS: 18  RP: 14  WRPLUSRP: 30  BUS_TURN: 33
GPU 5:  Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] pci:0000:08:00.0
CAS
  CL: 22  W2R: 19  CCDS: 5  CCLD: 2  R2W: 26  NOPR: 0  NOPW: 0
RAS
  RC: 79  RRD: 6  RCDRA: 29  RCDR: 29  RCDWA: 23  RCDW: 23
MISC
  RFC: 219  TRP: 29  RP_RDA: 33  RP_WRA: 70
MISC2
  WDATATR: 0  T32AW: 7  CRCWL: 24  CRCRL: 3  FAW: 10  PA2WDATA: 0  PA2RDATA: 0
DRAM1
  RASMACTWR: 33  RASMACTRD: 30  ACTWR: 14  ACTRD: 18
DRAM2
  RAS2RAS: 18  RP: 14  WRPLUSRP: 30  BUS_TURN: 33
GPU 6:  Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] pci:0000:09:00.0
CAS
  CL: 22  W2R: 19  CCDS: 5  CCLD: 2  R2W: 26  NOPR: 0  NOPW: 0
RAS
  RC: 79  RRD: 6  RCDRA: 29  RCDR: 29  RCDWA: 23  RCDW: 23
MISC
  RFC: 219  TRP: 29  RP_RDA: 33  RP_WRA: 70
MISC2
  WDATATR: 0  T32AW: 7  CRCWL: 25  CRCRL: 3  FAW: 10  PA2WDATA: 0  PA2RDATA: 0
DRAM1
  RASMACTWR: 33  RASMACTRD: 30  ACTWR: 14  ACTRD: 18
DRAM2
  RAS2RAS: 18  RP: 14  WRPLUSRP: 30  BUS_TURN: 33
GPU 7:  Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] pci:0000:0a:00.0
CAS
  CL: 22  W2R: 19  CCDS: 5  CCLD: 2  R2W: 26  NOPR: 0  NOPW: 0
RAS
  RC: 79  RRD: 6  RCDRA: 29  RCDR: 29  RCDWA: 23  RCDW: 23
MISC
  RFC: 219  TRP: 29  RP_RDA: 33  RP_WRA: 70
MISC2
  WDATATR: 0  T32AW: 7  CRCWL: 24  CRCRL: 3  FAW: 10  PA2WDATA: 0  PA2RDATA: 0
DRAM1
  RASMACTWR: 33  RASMACTRD: 30  ACTWR: 14  ACTRD: 18
DRAM2
  RAS2RAS: 18  RP: 14  WRPLUSRP: 30  BUS_TURN: 33
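
With eight cards the dump above is hard to compare by eye. A minimal Python sketch that parses text in this layout and lists only the timings that differ between GPUs; the function names are hypothetical and the parser assumes the `GPU n:` / `NAME: value` layout shown in this post, nothing more.

```python
import re

SAMPLE = """\
GPU 0:  Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] pci:0000:01:00.0
MISC
  RFC: 219  TRP: 29  RP_RDA: 33  RP_WRA: 70
MISC2
  WDATATR: 0  T32AW: 7  CRCWL: 24  CRCRL: 3  FAW: 10  PA2WDATA: 0  PA2RDATA: 0
GPU 2:  Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] pci:0000:03:00.0
MISC
  RFC: 219  TRP: 29  RP_RDA: 33  RP_WRA: 70
MISC2
  WDATATR: 0  T32AW: 7  CRCWL: 25  CRCRL: 3  FAW: 10  PA2WDATA: 0  PA2RDATA: 0
"""


def parse_current(text):
    """Map GPU index -> {timing name: value} from a dump in the layout above."""
    gpus = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"GPU (\d+):", line)
        if header:
            current = int(header.group(1))
            gpus[current] = {}
            continue  # skip the rest of the header (card name, pci address)
        if current is None:
            continue
        for name, value in re.findall(r"(\w+):\s*(\d+)", line):
            gpus[current][name] = int(value)
    return gpus


def differing_fields(gpus):
    """Return only the timings whose value is not identical on every GPU."""
    names = set().union(*(g.keys() for g in gpus.values()))
    return {
        name: {idx: g.get(name) for idx, g in gpus.items()}
        for name in sorted(names)
        if len({g.get(name) for g in gpus.values()}) > 1
    }


print(differing_fields(parse_current(SAMPLE)))  # {'CRCWL': {0: 24, 2: 25}}
```

On the full output above the only difference is CRCWL, which is 25 on GPUs 2, 4 and 6 and 24 on the rest.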
jr. member
Activity: 144
Merit: 2
Today I tried Ethash, but it was a disappointment.
I can only reach 48.0 MH/s with Phoenix.
For Ethash the timings should be tuned towards high bandwidth, not low latency.

Update:
Started tuning all over again with a different approach and got 51+ MH/s :)
This timing seems stable so far, but it gives a lower CN hashrate than the previous one (~2050).

Update2:
51mh is verified by the pool side 6hr average and no invalid shares.



Hint:
Lowest stable RAS for me is 30. You can get the other numbers with the formula.

Any hint from the experts on what to do with all the mysterious params in the lower part of the table? :P
member
Activity: 340
Merit: 29
my results with Vega56 (samsung) on CN-R so far


What miner are you using to get those figures?
I'll be honest I'm a little disappointed. I get around 2050 per Vega56@64 with no memory timing tweaks using Teamredminer at 1458 core clock (~1410 effective), pretty much like pbfarmer.

...or is the miner slower on Linux?

That is XMRigCC-AMD

Teamredminer might be faster...

Which is the fastest CN miner for Linux?

I use SRB for win.

TRM is fastest in win from what I’ve seen, and performs the same under Linux + amdgpu-pro 18.50
jr. member
Activity: 144
Merit: 2
my results with Vega56 (samsung) on CN-R so far


What miner are you using to get those figures?
I'll be honest I'm a little disappointed. I get around 2050 per Vega56@64 with no memory timing tweaks using Teamredminer at 1458 core clock (~1410 effective), pretty much like pbfarmer.

...or is the miner slower on Linux?

That is XMRigCC-AMD

Teamredminer might be faster...

Which is the fastest CN miner for Linux?

I use SRB for win.
hero member
Activity: 1274
Merit: 556
my results with Vega56 (samsung) on CN-R so far


What miner are you using to get those figures?
I'll be honest I'm a little disappointed. I get around 2050 per Vega56@64 with no memory timing tweaks using Teamredminer at 1458 core clock (~1410 effective), pretty much like pbfarmer.

...or is the miner slower on Linux?
jr. member
Activity: 144
Merit: 2
A 1600MHz core clock is needed to fully drive tuned 1100MHz HBM2 on the 2 CN forks I tested so far.
Daggerhashimoto is different in this sense.
member
Activity: 340
Merit: 29
you’re prob running 225-250w for that 2100h/s, or about 80-100w more than you would need for 2000, without timing mods (I got 2K @ 837mv.)  +50% power for +5% h/r is not worth it - even if your power is free!

Normally 1500MHz needs around 925mV. I don't get those sub 850 numbers. Do you have a watt meter on it? Vddci is also set to <837?

To do the math in my case: 975/925 = 1.054 and 1.054^2 = 1.11 (+11%), and considering the higher clock, that means power consumption is only ~15% higher than before.


@coinscrow:
Thank you mate, going to try that!

Even 1500MHz is high - ~1400Mhz effective is a pretty efficient spot from all my testing. That combined w/ keeping your SOC <= 1107 MHz, generally allows <= 850mv.  My 50% number was based on 1030 vs 837mv - even @ 975 it would be 35%+. 

My 56 is plugged into a PDU, but on a mixed rig currently, so hard to isolate.  Though I have benched every GPU i own on a meter in the past (w/ multiple algos,) and tend to see 1400/1107 @ 837 eating around 160-170w (64 bios).

As to the 'mem/vddci' setting - since it's really just a vddc floor, in Windows I make it a point to always just set it well below my active state, while in linux I just use ppts, so it's forced by nature of being a table ref.
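
The voltage math in this exchange can be reproduced with a short Python sketch. It uses the usual rule of thumb that dynamic power scales with voltage squared times clock; that is an approximation (it ignores leakage and everything off-core), and the function name is hypothetical.

```python
def relative_power(v_new_mv, v_old_mv, f_new_mhz=1.0, f_old_mhz=1.0):
    """Approximate power ratio, assuming power scales with V^2 * f."""
    return (v_new_mv / v_old_mv) ** 2 * (f_new_mhz / f_old_mhz)


# 975 mV vs 925 mV at the same clock: the +11% from the quoted post.
print(f"{relative_power(975, 925):.2f}")   # 1.11

# 1030 mV vs 837 mV: where the +50% figure comes from.
print(f"{relative_power(1030, 837):.2f}")  # 1.51

# 975 mV vs 837 mV: the "35%+" case.
print(f"{relative_power(975, 837):.2f}")   # 1.36
```

Pass the two core clocks as the third and fourth arguments to fold in the frequency difference as well.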
