Topic: [OS] rxOC easy-to-use Linux AMD Mining v_stopgap - page 13. (Read 31370 times)

legendary
Activity: 1260
Merit: 1009
fullzero, thank you for your nvOC and rxOC. Can you please update rxOC? I want to switch from SimpleMining to something better with flexible configuration, but rxOC hasn't been updated for a long time and I don't know if you have stopped working on it.

I am planning on updating rxOC with a new version this week. First I am updating the 1bash for nvOC with all the requested coin changes; then I will update and release a new rxOC version.

Doing it this way is faster in the long run.
legendary
Activity: 1260
Merit: 1009
Need help guys. My rig keeps crashing and I don't know why. Running rxoc v0012

mobo is biostar tb250-btc
gpu 1 is a sapphire 580 4gb nitro+
gpu 2 is a xfx 480 4gb (with the stupid white leds)

If I run both gpus with unchanged clocks on onebash, I get pcie bus errors and it fills up the USB. I know it's not the risers because I tested them individually (after receiving replacements from the seller) on both gpus using smOS and rxoc individually and got no pcie errors.

I'm mining ZEC and it doesn't matter if I use optiminer or claymore, eventually the system hangs due to some error. I checked the error logs and this is a snippet of what I find

from the xorg.0.log: [  2402.679] (WW) AMDGPU(0): amdgpu_dri2_flip_event_handler: Pageflip completion event has impossible msc 143245 < target_msc 143246   (a crapload of these)

from the syslog.log (only the 580 gpu running):

Jul 14 21:14:30 m1-desktop systemd[1]: dev-disk-by\x2duuid-d06ff735\x2d6872\x2d4264\x2daa59\x2dd42811d47b35.swap: Job dev-disk-by\x2duuid-d06ff735\x2d6872\x2d4264\x2daa59\x2dd42811d47b35.swap/start failed with result 'dependency'.
Jul 14 21:14:30 m1-desktop systemd[1]: dev-disk-by\x2duuid-d06ff735\x2d6872\x2d4264\x2daa59\x2dd42811d47b35.device: Job dev-disk-by\x2duuid-d06ff735\x2d6872\x2d4264\x2daa59\x2dd42811d47b35.device/start failed with result 'timeout'.
Jul 14 21:17:01 m1-desktop CRON[4102]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)              

(the miner crashed and restarted at 21:15)


Jul 14 21:32:31 m1-desktop kernel: [ 2752.674236] gmc_v8_0_process_interrupt: 39 callbacks suppressed
Jul 14 21:32:31 m1-desktop kernel: [ 2752.674247] amdgpu 0000:01:00.0: GPU fault detected: 147 0x09020402
Jul 14 21:32:31 m1-desktop kernel: [ 2752.674253] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00018920
Jul 14 21:32:31 m1-desktop kernel: [ 2752.674257] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03004002
Jul 14 21:32:31 m1-desktop kernel: [ 2752.674263] amdgpu 0000:01:00.0: VM fault (0x02, vmid 1) at page 100640, write from 'TC1' (0x54433100) (4)

Jul 14 21:32:31 m1-desktop kernel: [ 2752.731128] amdgpu 0000:01:00.0: IH ring buffer overflow (0x00081AD0, 0x000016C0, 0x00001AE0)


(I had never seen these before by the way, just pcie bus errors)


from the terminal running the claymore miner and throwing a fit:


ZEC: 07/14/17-21:13:42 - New job from zec-us-west1.nanopool.org:6666
ZEC - Total Speed: 311.807 H/s, Total Shares: 9, Rejected: 0, Time: 00:13
ZEC: GPU0 311.807 H/s
GPU0 t=57C fan=66%
DevFee: ZEC: Stratum - connecting to 'zec.suprnova.cc' <46.105.114.185> port 2242
ZEC: 07/14/17-21:14:09 - New job from zec-us-west1.nanopool.org:6666
ZEC - Total Speed: 309.570 H/s, Total Shares: 9, Rejected: 0, Time: 00:14
ZEC: GPU0 309.570 H/s
GPU0 t=57C fan=66%
GPU0 t=57C fan=66%
ZEC: 07/14/17-21:15:10 - New job from zec-us-west1.nanopool.org:6666
ZEC - Total Speed: 310.868 H/s, Total Shares: 9, Rejected: 0, Time: 00:15
ZEC: GPU0 310.868 H/s
ZEC: 07/14/17-21:15:15 - SHARE FOUND - (GPU 0)
ZEC: Share accepted (166 ms)!
GPU0 t=57C fan=66%
GPU0 t=57C fan=66%
Miner thread hangs, need to restart miner!

What's going on? Bad gpus? Bad mobo? The risers have already been swapped out twice. I am a total Linux noob by the way.

First I would try setting:
Code:
OVERDRIVE="NO" 

and seeing if that is what is causing the problem.

Ensure your fan speed is set high enough to keep your GPUs cool. I would try 200:

Code:
FAN_SPEED=200


It also looks like there are some disk errors occurring. If disabling the overdrive and increasing the fan speed doesn't solve the problem, I would reimage the USB or use another USB.

Let me know how it goes.
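
To check whether a settings change actually reduces the kernel errors quoted above, the amdgpu fault lines in the log can be counted before and after. This is only a minimal sketch: the helper name count_gpu_faults is made up for illustration and is not part of rxOC, and the log path in the usage line is an assumption about where your syslog lives.

```shell
# Count amdgpu fault lines in a syslog-style text file.
# count_gpu_faults is a hypothetical helper, not an rxOC command.
count_gpu_faults() {
    # grep -c prints the number of matching lines (0 when none match);
    # "|| true" keeps a zero-match result from counting as a failure.
    grep -c -E 'amdgpu .*(GPU fault detected|VM fault|IH ring buffer overflow)' "$1" || true
}

# Usage (path is an assumption): count_gpu_faults /var/log/syslog
```

If the count keeps growing with OVERDRIVE="NO" and a higher fan speed, the overclock is probably not the cause.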

legendary
Activity: 1260
Merit: 1009
So I have been watching all the cool stuff going into the Nvidia build and wondering why the AMD build is getting no love, when it occurred to me: why not make ONE build for both?
Wouldn't it work to just have AMD AND Nvidia drivers in one build?

I am currently building a new mixed rig with 4 AMD and 1 Nvidia and plan on using rxOC build 12, adding the NV drivers manually and then just running a second screen.
That's sort of where this idea of a single build came into play.

Some sort of switch like IF GPU=NVIDIA THEN (DO SOMETHING HERE LIKE USE NVIDIA DRIVERS) ELSE (USE AMD STUFF HERE)

Just thinking out loud...
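
The IF GPU=NVIDIA ... ELSE idea above could start from something like the sketch below, which only classifies the cards from lspci-style text and does not install or load any drivers. The function name detect_gpu_vendor and the four result words are made up for illustration; nothing here comes from nvOC or rxOC.

```shell
# Classify GPU vendors from `lspci`-style text on stdin.
# Prints one of: nvidia, amd, mixed, none.
# detect_gpu_vendor is a hypothetical helper, not part of nvOC or rxOC.
detect_gpu_vendor() {
    has_nvidia=0
    has_amd=0
    while IFS= read -r line; do
        case "$line" in
            *"VGA compatible controller"*|*"3D controller"*|*"Display controller"*)
                # Only display-class devices are considered.
                case "$line" in
                    *NVIDIA*) has_nvidia=1 ;;
                    *AMD*|*ATI*) has_amd=1 ;;
                esac
                ;;
        esac
    done
    if [ "$has_nvidia" -eq 1 ] && [ "$has_amd" -eq 1 ]; then
        echo mixed
    elif [ "$has_nvidia" -eq 1 ]; then
        echo nvidia
    elif [ "$has_amd" -eq 1 ]; then
        echo amd
    else
        echo none
    fi
}

# Usage on a live rig: lspci | detect_gpu_vendor
```

Detecting a mixed rig is the easy part; getting both drivers to coexist under X is a separate problem.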


I will implement all the transferable changes from nvOC to rxOC.  Some of them are Nvidia api dependent and might not have an AMD version and thus are not transferable.


Yeah, I guess there must be a good reason why Fullzero hasn't already done it lol.

I am changing my idea some right now: burning the NV version and then adding the AMD drivers. That way I get all the new features.

OK, this won't work lol. Xorg issues all over the place...
Back to Windows for mixed rigs...

Newer versions of *nix don't like mixing Nvidia and AMD. I have gotten them to work together on specific rigs and in console mode (they don't like X together), but the methods I used are in no way portable. Also, only the Claymore Ethash client seems to support AMD and Nvidia on the same rig, so there is little incentive to merge them. This is, as QuintLeo has said before, something Windows does better.


Mixing AMD and NVidia in LINUX is at best a royal PITA, and often just flat out doesn't work.

 I spent about 3 WEEKS trying different things to get the AMD 15.12 drivers and a reasonably current NVidia driver working on one of my A10-7860K based rigs, never did get it all working so finally gave up on the idea.

 That's WHY I'm currently working on shifting cards around as I can, to get them away from mixed-GPU so I can drop Windows in favor of much-more-reliable LINUX for my dedicated mining rigs - it USED to work, but doesn't seem to do so any more on recent driver versions.
sr. member
Activity: 353
Merit: 251
fullzero wrote that there are some issues with the X11 configuration when mixing the drivers. I guess it is OK to set up manually for a particular installation/rig, but a bit harder to make it work for any rig configuration.
legendary
Activity: 1260
Merit: 1009
@Fullzero

Version 0013

Can you make a global switch to turn off all overclocking?

OVERCLOCK="OFF"

This will allow a single onebash to work on rigs that have all cards fully modded and have mixed R9 with RX.

Currently I just did this.... Get some errors but it works...

# select level: to see supported clocks scroll to the top of the mining process
#  __CORE_OVERCLOCK_LEVEL=7  # for ETH use lowest without decreasing the hashrate / Highest for ZEC
#  MEMORY_OVERCLOCK_LEVEL=1   # use highest level

Unless I am missing something, if the cards are already modded with timings, undervolt and clock speeds, then rocm-smi is unnecessary (right?), except maybe for fans.

Just an idea for more flexibility....


I will make a switch like this.

You probably want manual fans, so I will exclude that from the switch.
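
A rough sketch of how such a switch could behave: with OVERCLOCK="OFF" the clock levels are skipped and only the fan setting is still applied. OVERCLOCK is the name proposed in the request above, and the other values correspond to __CORE_OVERCLOCK_LEVEL, MEMORY_OVERCLOCK_LEVEL and FAN_SPEED. The function plan_gpu_tuning and its output wording are made up for illustration; it only prints what would be done and does not call rocm-smi, and it is not the actual onebash implementation.

```shell
# Print the tuning steps that would run for a given switch value.
# plan_gpu_tuning is a hypothetical helper; it describes the steps only
# and never touches the GPUs.
plan_gpu_tuning() {
    overclock=$1      # "ON" or "OFF" (the proposed OVERCLOCK switch)
    core_level=$2     # e.g. the value of __CORE_OVERCLOCK_LEVEL
    mem_level=$3      # e.g. the value of MEMORY_OVERCLOCK_LEVEL
    fan_speed=$4      # e.g. the value of FAN_SPEED

    if [ "$overclock" = "OFF" ]; then
        echo "clocks: leave at BIOS values"
    else
        echo "clocks: core level $core_level, memory level $mem_level"
    fi
    # Fans stay under manual control either way.
    echo "fan: $fan_speed"
}

# Usage: plan_gpu_tuning OFF 7 1 200
```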
sr. member
Activity: 301
Merit: 251
Why does GPU0 perform much worse than the others?
I tried with 2 to 6 cards (470 and 580).
With Nvidia cards there is no such problem; every card performs the same.


Is your monitor plugged into GPU0?

Have you modded your bios of GPU0 the same as the other cards?

Can you try another MoBo?


I found the problem. Changed BIOS settings from "GEN1" to "AUTO"

Excellent!
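
The "GEN1" BIOS setting forces the PCIe link to generation 1, which runs at 2.5 GT/s. One way to see what each card actually negotiated is to look at the LnkSta lines of `lspci -vv`. The helper below is a sketch under that assumption; the name link_speeds is made up for illustration, and it reads text on stdin, so it can be tried on saved output too.

```shell
# Print the negotiated PCIe link speed from `lspci -vv`-style text on stdin,
# one line per LnkSta entry. link_speeds is a hypothetical helper.
link_speeds() {
    # Keep only the "<number>GT/s" part of each LnkSta line.
    sed -n 's/.*LnkSta:[[:space:]]*Speed \([0-9.]*GT\/s\).*/\1/p'
}

# Usage on a live rig: sudo lspci -vv | link_speeds
```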
sr. member
Activity: 301
Merit: 251
Why does GPU0 perform much worse than the others?
I tried with 2 to 6 cards (470 and 580).
With Nvidia cards there is no such problem; every card performs the same.


Is your monitor plugged into GPU0?

Have you modded your bios of GPU0 the same as the other cards?

Can you try another MoBo?

I am running 40 cards total on this version, all modded with timings, undervolt and clock speeds, and I don't have that issue at all.

On all my AMD rigs, GPU0 performs the same as the other GPUs of the same model.
sr. member
Activity: 854
Merit: 277
How did you get linux to oc rx cards? would you mind sharing it?

I used ROC-smi:

https://github.com/RadeonOpenCompute/ROC-smi

then made some changes to it. I will make more. If you look at its code you will see the actual AMD api cmds.



Thank you! Your answer really made me want to test your distro. I hope it will help me, and if I succeed I will pm you to ask for your address as a thank you.

Thanks again, and I wish you a lot of success with your software!

edit: it would be nice to have a torrent link to download it.
legendary
Activity: 1260
Merit: 1009
Actually, I tried removing all the nvidia-smi commands and rebooting. Power consumption with the modded BIOS is still higher than on Windows.
That is why I asked anyone who was able to run an rx470 undervolted on Linux: please post your PBE screenshots and Linux settings.

It might not be possible with all GPUs and Linux without hex editing.

It seems like no one has succeeded with that, because no one has posted a success story.
I am OK with using Polaris BIOS Editor or doing manual hex patching. No matter what I tried, I was not able to fix the high power consumption under Linux with the rx470, so I asked anyone who WAS able to do that to share how.

If I remember correctly, Wolf has made special tools for that and offers some kind of modification service.
legendary
Activity: 1260
Merit: 1009

Maybe this will work:

Ensure the monitor is connected to the primary GPU (the one in the 16x slot closest to the CPU).

Disconnect the USB or SSD/HDD from the rig.

Fully power off everything, including the PSU.

Press the power button several times to clear any remaining power in the mobo.

Turn the PSU power switch back to | "on".

Power on (without the USB attached).

See if the bios posts; if you get nothing in 20 seconds, press ctrl + alt + del repeatedly until the system reboots.

Wait and see if the bios posts.

If the bios posts, attach the USB key to a USB 2.0 port and press ctrl + alt + del.


If it boots, stop the mining process before it starts mining.

Then go to the top left and click the ubuntu button,

type u

and click on software updater.

Run updates.

Reboot.

Let me know if this works.


I tried with a new motherboard (MSI Z270 Gaming Plus) with only one GPU directly connected to it, and a new PSU.
Same result.
I have extra pendrives but I am not able to copy the image onto them because they are 15.5 GB and the image needs 15.6 GB.
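
The size mismatch above (a 15.6 GB image and a 15.5 GB pendrive) can be checked before writing anything. A minimal sketch: image_fits is a made-up helper name, and the file and device paths in the usage comment are placeholders, not real rxOC names.

```shell
# Succeeds (exit status 0) when an image of image_bytes fits on a device
# of device_bytes. image_fits is a hypothetical helper.
image_fits() {
    image_bytes=$1
    device_bytes=$2
    [ "$image_bytes" -le "$device_bytes" ]
}

# On a live system the two sizes could be read with GNU stat and blockdev
# (both paths are placeholders):
#   image_fits "$(stat -c %s image.img)" "$(sudo blockdev --getsize64 /dev/sdX)"
```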


If the bios posts, you can access the grub loader menu by pressing

esc

repeatedly while booting (note: holding it down doesn't usually work), then select boot in recovery mode.

In recovery mode:

Enable networking

then install updates from the cmd prompt:
Code:
sudo apt-get update && sudo apt-get dist-upgrade --yes

and reboot.

This should ensure your build has all known system files for your system.



I think I've found the problem; it is not solved by updating:
I installed xubuntu on a new pendrive and did an apt dist-upgrade. After that I rebooted and installed the latest amdgpu-pro drivers, and then exactly the same thing happened.
I am able to press ctrl+f1 and work via the command prompt, but it is not solved by updating.
I am still trying to connect the pendrive to the USB 2.0 port, but it is not recognized. Why do you suggest doing it like this?

Thanks for your help!

There is something wrong with your USB image if it is not recognized when connected to a USB 2.0 port. USB 3 and 3.1 are fully compatible with 2.0.

There may be a TPM or secureboot option in the bios that is causing problems.

I am assuming you are connecting the monitor to the GPU, and not the motherboard (if you connect a monitor directly to the motherboard; rxOC will go into an infinite loop).