Author

Topic: [OS] nvOC easy-to-use Linux Nvidia Mining - page 317. (Read 418416 times)

hero member
Activity: 672
Merit: 500
I will hold off on integrating this for now then (and wait for your changes); in the meantime I will make a link to your repo on the OP.

I've committed an update that, if it pans out, rolls everything into one Python script...no auxiliary shell scripts.  I'm testing it right now to verify that it behaves the same as the previous version.  I suspect I'll know in the morning.

Edit: Just did some accelerated testing by manually switching to a less-profitable coin first...the script killed the miner and fired up the appropriate miner.  I think the most recent update is ready for wider testing:

https://gitlab.com/salfter/nvoc-nicehash-switcher


If I could make my nvOC stable I would like to test this.
I blame my Biostar Z170 board for all of my problems lol.
Maybe tomorrow I can give it a go.
newbie
Activity: 29
Merit: 0
If anyone is interested, my 7-card rig (all GTX 1070 FE) is finally stable; it has been running for the past 72 hours with the following clocks and power limits:

__CORE_OVERCLOCK_0=000
MEMORY_OVERCLOCK_0=1400

__CORE_OVERCLOCK_1=000
MEMORY_OVERCLOCK_1=1400

__CORE_OVERCLOCK_2=000
MEMORY_OVERCLOCK_2=1300

__CORE_OVERCLOCK_3=000
MEMORY_OVERCLOCK_3=1600  

__CORE_OVERCLOCK_4=000
MEMORY_OVERCLOCK_4=1400

__CORE_OVERCLOCK_5=000
MEMORY_OVERCLOCK_5=1300

__CORE_OVERCLOCK_6=000
MEMORY_OVERCLOCK_6=1200


INDIVIDUAL_POWERLIMIT_0=125

INDIVIDUAL_POWERLIMIT_1=125

INDIVIDUAL_POWERLIMIT_2=120

INDIVIDUAL_POWERLIMIT_3=130

INDIVIDUAL_POWERLIMIT_4=125

INDIVIDUAL_POWERLIMIT_5=120

INDIVIDUAL_POWERLIMIT_6=120

I determined the power limit for each card from its individual temperature and lowered/increased their clocks accordingly. If I give them all the same power limit, GPU 3 is 10 degrees C cooler than GPUs 2/5/6, while GPUs 0/1/4 are right in the middle. At the current power limits all the cards are pretty much the same temperature, plus or minus 1 degree.

I might be able to push GPUs 1-5 further. I suspect GPU 6 was causing my crashes: in the last crash, when it was clocked at 1300, it was the only GPU to lose its sensor connections. Since lowering it to 1200 everything has been stable, so there might be room for higher clocks on the rest of the cards.
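
The balancing approach described above can be pictured with the small sketch below. It is only an illustration, not something nvOC ships: the function name `suggest_power_limits`, the 5 W step and the 1 degree tolerance are assumptions, and the temperature readings are made up to match the "10 degrees cooler / right in the middle" description.

```python
from statistics import median

def suggest_power_limits(temps_c, limits_w, step_w=5, tolerance_c=1.0):
    """Nudge each GPU's power limit toward the rig's median temperature.

    temps_c and limits_w are lists indexed by GPU number.
    A card hotter than the median by more than tolerance_c gets its
    limit lowered by step_w; a cooler card gets it raised by step_w.
    """
    if len(temps_c) != len(limits_w):
        raise ValueError("need one temperature per power limit")
    mid = median(temps_c)
    suggested = []
    for temp, limit in zip(temps_c, limits_w):
        if temp > mid + tolerance_c:
            suggested.append(limit - step_w)   # running hot: back off
        elif temp < mid - tolerance_c:
            suggested.append(limit + step_w)   # running cool: headroom
        else:
            suggested.append(limit)            # close enough: leave alone
    return suggested

# Made-up readings: every card at 125 W, GPU 3 cool, GPUs 2/5/6 hot.
temps = [65, 65, 70, 60, 65, 70, 70]
limits = [125] * 7
for gpu, watts in enumerate(suggest_power_limits(temps, limits)):
    print(f"INDIVIDUAL_POWERLIMIT_{gpu}={watts}")
```

With those invented readings a single pass lands on the same 125/125/120/130/125/120/120 split as the settings posted above; on a real rig you would re-read the temperatures after each change and repeat.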

Wish there was a way to determine which cards have Samsung memory and which have Micron.
S9k
newbie
Activity: 26
Merit: 0
Hi,

Please help!
I am stuck on this problem.
My configuration:

-ASUS PRIME Z270-P - 2 (I tried both boards; the results are similar)
-EVGA GeForce GTX 1080 GAMING ACX 3.0 - 2
-MSI GeForce GTX 1080 Gaming X - 2
-Gigabyte 1200 W power supply


Three video cards work perfectly in any combination:

m1@m1-desktop:~$ nvidia-smi -L
GPU 0: GeForce GTX 1080 (UUID: GPU-43453088-0fca-9442-106d-7594d157ebf2)
GPU 1: GeForce GTX 1080 (UUID: GPU-d099b67e-f204-66fa-96dc-365a6b559a7e)
GPU 2: GeForce GTX 1080 (UUID: GPU-5aacd4db-f68b-917e-8ac2-84caf68d6cac)
m1@m1-desktop:~$


m1@m1-desktop:~$ lspci |grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b80 (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation Device 1b80 (rev a1)
05:00.0 VGA compatible controller: NVIDIA Corporation Device 1b80 (rev a1)
m1@m1-desktop:~$



but if I add the fourth (in this case the one with ID GPU-5aacd4db-f68b-917e-8ac2-84caf68d6cac), the system crashes. Here is what I see in dmesg:


[   98.722227] nvidia-modeset: Allocated GPU:0 (GPU-43453088-0fca-9442-106d-7594d157ebf2) @ PCI:0000:01:00.0
[   98.769072] ACPI Warning: \_SB_.PCI0.RP04.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   98.769117] ACPI Warning: \_SB_.PCI0.RP04.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   98.769144] ACPI Warning: \_SB_.PCI0.RP04.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   98.769169] ACPI Warning: \_SB_.PCI0.RP04.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   98.769193] ACPI Warning: \_SB_.PCI0.RP04.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   98.769217] ACPI Warning: \_SB_.PCI0.RP04.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   98.769241] ACPI Warning: \_SB_.PCI0.RP04.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   99.359255] nvidia-modeset: Allocated GPU:1 (GPU-5c9c8e29-a088-90a6-2a20-b2b2b971d1fb) @ PCI:0000:05:00.0
[   99.398991] ACPI Warning: \_SB_.PCI0.RP05.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   99.399035] ACPI Warning: \_SB_.PCI0.RP05.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   99.399063] ACPI Warning: \_SB_.PCI0.RP05.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   99.399087] ACPI Warning: \_SB_.PCI0.RP05.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   99.399112] ACPI Warning: \_SB_.PCI0.RP05.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   99.399136] ACPI Warning: \_SB_.PCI0.RP05.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   99.399160] ACPI Warning: \_SB_.PCI0.RP05.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[   99.984670] nvidia-modeset: Allocated GPU:2 (GPU-5aacd4db-f68b-917e-8ac2-84caf68d6cac) @ PCI:0000:06:00.0
[  100.619118] nvidia-modeset: Allocated GPU:3 (GPU-d099b67e-f204-66fa-96dc-365a6b559a7e) @ PCI:0000:03:00.0
[  100.743159] NVRM: GPU at PCI:0000:01:00: GPU-43453088-0fca-9442-106d-7594d157ebf2
[  100.743162] NVRM: GPU Board Serial Number:
[  100.743164] NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 000001e0 00000801 00000004 00000005
[  100.743649] NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 00000080 00000004 00000005 00000004

[  102.432593] r8169 0000:07:00.0 enp7s0: link up
[  102.432600] IPv6: ADDRCONF(NETDEV_CHANGE): enp7s0: link becomes ready
[  103.743306] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
[  103.773941] NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 00000080 00000000 00000005 00000004
[  105.501795] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[  105.501798] Bluetooth: BNEP filters: protocol multicast
[  105.501802] Bluetooth: BNEP socket layer initialized
[  105.613048] NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 00000080 00000000 00000005 00000004
[  105.613106] NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 00000080 00000000 00000005 00000004
[  105.704570] NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 00000080 00000000 00000005 00000004

[  105.704972] BUG: unable to handle kernel paging request at ffff88167153d830
[  105.704974] IP: [] _nv008171rm+0x620/0x780 [nvidia]
[  105.705052] PGD 220c067 PUD 0
[  105.705053] Oops: 0000 [#1] SMP

I have been trying to solve this problem for three days.
I have changed BIOS versions (0325, 0608, 0610) and risers, the 4G setting is enabled, and I have updated the NVIDIA drivers to 381.22 - nothing helps.
Does anybody have any ideas?
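
When a log like the one above gets long, it can help to tally the NVRM Xid lines per PCI address to see which card keeps reporting. The sketch below is one possible way to do that; the function name `count_xids` and the regular expression are my own assumptions about the line format, based only on the lines quoted in this post.

```python
import re
from collections import Counter

# Matches lines such as:
# NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 ...
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+),")

def count_xids(dmesg_text):
    """Return a Counter keyed by (pci_address, xid_code)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = XID_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts

# Three lines copied from the dmesg output in this post.
sample = """\
[  100.743164] NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 000001e0 00000801 00000004 00000005
[  100.743649] NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 00000080 00000004 00000005 00000004
[  102.432593] r8169 0000:07:00.0 enp7s0: link up
"""
for (pci, xid), n in count_xids(sample).items():
    print(f"{pci}: Xid {xid} seen {n} time(s)")
```

On a live rig the text could come from the output of `dmesg` instead of the pasted sample. In the excerpt here every Xid line points at PCI:0000:01:00, so that slot, riser or card would be the first thing to swap.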
newbie
Activity: 26
Merit: 0
Hi,

I'm trying out your OS, but the first thing I'm encountering is that when I change the pools to European ones, the miner can't connect anymore (even with nanopool).
Second, when I try to use ethermine as a pool (with the switch set to YES), it loops through "read response failed end of file", "cannot resolve hostname", "read response failed end of file" again, and so on. No, it's not an internet or DNS problem, I assure you; the connection works great.

Could you link a working onebash example with eu1.ethermine.org:4444 , so I can see what I did wrong?

are you using the Newest_oneBash linked on the OP?

I think you might be having a problem related to the way workers were being named when their host address is less than 100.

I fixed this in the Newest_oneBash.

also ensure

Code:
ETHERMINEdotORG="YES"

Let me know if this is the problem.


Hi,
I found the solution, i think it was because i tried to put a custom worker name - maybe you can include such a feature in the future?
Also, I couldn't find how I can see the current mining process. I did see the screen -r commands, but that implies killing the current process and restarting it. I'd like to be able to see, from SSH, the current mining process without killing it. Is this possible?


I believe you must kill the process first. When the mining process starts from boot up, you can SSH in and type

Code:
# find the PID of gnome-terminal-server
ps aux | grep gnome-terminal
# replace PID with the number found above
kill PID
export DISPLAY=:0
# rig1 can be named whatever you like
screen -dmS rig1
screen -x rig1
bash '/media/m1/1263-A96E/oneBash'
newbie
Activity: 2
Merit: 0
Hi,

I'm trying out your OS, but the first thing I'm encountering is that when I change the pools to European ones, the miner can't connect anymore (even with nanopool).
Second, when I try to use ethermine as a pool (with the switch set to YES), it loops through "read response failed end of file", "cannot resolve hostname", "read response failed end of file" again, and so on. No, it's not an internet or DNS problem, I assure you; the connection works great.

Could you link a working onebash example with eu1.ethermine.org:4444 , so I can see what I did wrong?

are you using the Newest_oneBash linked on the OP?

I think you might be having a problem related to the way workers were being named when their host address is less than 100.

I fixed this in the Newest_oneBash.

also ensure

Code:
ETHERMINEdotORG="YES"

Let me know if this is the problem.


Hi,
I found the solution, i think it was because i tried to put a custom worker name - maybe you can include such a feature in the future?
Also, I couldn't find how I can see the current mining process. I did see the screen -r commands, but that implies killing the current process and restarting it. I'd like to be able to see, from SSH, the current mining process without killing it. Is this possible?
newbie
Activity: 35
Merit: 0
Could anybody please explain the difference between MANUAL_FAN="YES" + FAN_SPEED=75 and MANUAL_FAN="NO"? Stability? Consumption? I have used both and did not see a difference in hashrate or consumption.

I cannot find an optimized setup for the 1070. Do you have a suggested setup, or is the newest oneBash the suggested one?

Thank you!

zec - pl 125, cc 125, mc 650 => 460 sols. Rig running very solid - 8 GPUs
eth - pl 85, cc 150, mc 1100 => 30.3 MH/s. Solid, no crashes.

You can try different settings and see how it behaves. I don't like to watch them all day for crashes.
hero member
Activity: 651
Merit: 501
My PGP Key: 92C7689C
Could anybody please explain the difference between MANUAL_FAN="YES" + FAN_SPEED=75 and MANUAL_FAN="NO"? Stability? Consumption? I have used both and did not see a difference in hashrate or consumption.

They won't make any difference in hashrate, but automatic speed control will probably let your GPUs run warmer than a relatively high manual speed setting would.  A manual setting lets you trade more fan noise for cooler operating temperatures.

Quote
I cannot find an optimized setup for the 1070. Do you have a suggested setup, or is the newest oneBash the suggested one?

I don't know that I'd consider it fully optimized, but you could use the settings in my auto-profitability switcher as a start (see the link a few posts ago).  The Equihash and DaggerHashimoto settings are probably the most tested at this point, as those have been the most profitable lately.
member
Activity: 97
Merit: 10
With this OS, is there no problem managing power on NVIDIA cards, like under Windows?

What tool can I use for monitoring or managing such rigs, with restarts etc. (something like TeamViewer)? Is it only SSH, or are there other options? Does anyone know of automatic monitoring/managing as well?
newbie
Activity: 16
Merit: 0
Could anybody please explain the difference between MANUAL_FAN="YES" + FAN_SPEED=75 and MANUAL_FAN="NO"? Stability? Consumption? I have used both and did not see a difference in hashrate or consumption.

I cannot find an optimized setup for the 1070. Do you have a suggested setup, or is the newest oneBash the suggested one?

Thank you!
hero member
Activity: 651
Merit: 501
My PGP Key: 92C7689C
I will hold off on integrating this for now then (and wait for your changes); in the meantime I will make a link to your repo on the OP.

I've committed an update that, if it pans out, rolls everything into one Python script...no auxiliary shell scripts.  I'm testing it right now to verify that it behaves the same as the previous version.  I suspect I'll know in the morning.

Edit: Just did some accelerated testing by manually switching to a less-profitable coin first...the script killed the miner and fired up the appropriate miner.  I think the most recent update is ready for wider testing:

https://gitlab.com/salfter/nvoc-nicehash-switcher
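
For readers who want a feel for what a profitability switcher decides, here is a minimal sketch. It is not taken from the repository above: `pick_algorithm`, the `min_gain` hysteresis and the numbers are all hypothetical, chosen only to show the "stop one miner, start the appropriate one" decision.

```python
def pick_algorithm(profitability, current=None, min_gain=0.05):
    """Return the algorithm to mine next.

    profitability maps algorithm name -> estimated earnings (any unit).
    Stay on `current` unless the best option beats it by min_gain
    (a fraction, e.g. 0.05 = 5%), to avoid flapping between miners.
    """
    best = max(profitability, key=profitability.get)
    if current is None or current not in profitability:
        return best
    if profitability[best] > profitability[current] * (1 + min_gain):
        return best
    return current

# Made-up numbers, only to exercise the logic.
rates = {"equihash": 1.00, "daggerhashimoto": 1.20}
running = "equihash"
target = pick_algorithm(rates, running)
if target != running:
    print(f"would stop the {running} miner and start the {target} miner")
```

Starting on the less profitable entry, as in the accelerated test described above, makes the function return the other algorithm on the first pass.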
newbie
Activity: 4
Merit: 0
Hi fullzero,

great work on this distro.
My next rig will be based on a biostar tb250 btc pro with 12 PCIe Interfaces. Do you know if i can use 12 similar nvidia cards or will there be any limitations of the driver like in Windows (max. 8 GPUs of a kind)?

Cheers
tiefschwarz
newbie
Activity: 12
Merit: 0
Love this OS.   

I've been using a new mobo:  Biostar Z270GT6; got it to work all day yesterday on 5 GPUs.  Tinkered a bit today, tried to boot the mobo, got past the BIOS screen, then got spammed with PCIE bus error severity=corrected.  It just spammed and spammed non-stop, and would probably never have stopped if I hadn't powered down.

A similar issue that is answered can be found here: https://askubuntu.com/questions/771899/pcie-bus-error-severity-corrected 

PS:  32GB USB still has some imaging issues, I've tried reformatting without the "quick" option, and tried both NTFS and FAT32. 

Did you change PCIe to Gen 2 in the BIOS?
newbie
Activity: 14
Merit: 0
Love this OS.   

I've been using a new mobo:  Biostar Z270GT6; got it to work all day yesterday on 5 GPUs.  Tinkered a bit today, tried to boot the mobo, got past the BIOS screen, then got spammed with PCIE bus error severity=corrected.  It just spammed and spammed non-stop, and would probably never have stopped if I hadn't powered down.

A similar issue that is answered can be found here: https://askubuntu.com/questions/771899/pcie-bus-error-severity-corrected 

PS:  32GB USB still has some imaging issues, I've tried reformatting without the "quick" option, and tried both NTFS and FAT32. 
newbie
Activity: 2
Merit: 0
Greetings,

Awesome work on this distro! I'm using it on a TB250-BTC with 6x EVGA 1070 SC and trying to tweak the OC to match my Windows 10 setup.

The issue I'm running into is that I cannot increase the power limit above 170 and it's limiting my overclock and reducing overall Sol/s by around 100-150.  Power is cheap where I live so the extra watts are not an issue. Using a 1000w and 1200w dual PSU setup so that is no concern either.

In Windows I can get 2850 Sol/s but only around 2700-2750 in Linux.

Do you know if the power limit cap is a limitation of the driver?

Any help is much appreciated.

Thanks,

Dragnan
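
One way to check where a ceiling like 170 comes from is to ask the driver what maximum power limit it reports for each board. The sketch below assumes the `power.limit` and `power.max_limit` fields of `nvidia-smi --query-gpu`; the helper names and the sample values are made up, and cards that report a non-numeric value for these fields would need extra handling.

```python
import csv
import io
import subprocess

QUERY = "index,name,power.limit,power.max_limit"

def parse_power_csv(text):
    """Parse `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` output."""
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not rec:
            continue  # skip blank lines
        index, name, limit, max_limit = rec
        rows.append({"index": int(index), "name": name,
                     "limit_w": float(limit), "max_limit_w": float(max_limit)})
    return rows

def query_power_limits():
    """Run nvidia-smi and return the parsed rows (needs the NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_power_csv(out)

# Made-up sample in the same CSV shape, so the parser can be tried without a GPU.
sample = "0, GeForce GTX 1070, 151.00, 170.00\n1, GeForce GTX 1070, 151.00, 170.00\n"
for gpu in parse_power_csv(sample):
    print(f"GPU {gpu['index']}: limit {gpu['limit_w']} W, "
          f"reported maximum {gpu['max_limit_w']} W")
```

On the rig itself, `query_power_limits()` would return the real values. If the reported maximum equals the value you cannot exceed, the cap is at least being enforced at that level rather than by oneBash.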
newbie
Activity: 2
Merit: 0
Hi,

I'm trying out your OS, but the first thing I'm encountering is that when I change the pools to European ones, the miner can't connect anymore (even with nanopool).
Second, when I try to use ethermine as a pool (with the switch set to YES), it loops through "read response failed end of file", "cannot resolve hostname", "read response failed end of file" again, and so on. No, it's not an internet or DNS problem, I assure you; the connection works great.

Could you link a working onebash example with eu1.ethermine.org:4444 , so I can see what I did wrong?
newbie
Activity: 2
Merit: 0
Sorry, I'm a Genoil newb.  I was using Claymore until nvOC 17, and now that I am using Genoil I am getting some crashes, possibly from overclocking.  Is there a switch or a watchdog or something to auto-restart Genoil like Claymore does?  I've lowered the OC a bit.  For now it could be down for hours before I realize Genoil crashed.  With Claymore I could just look back and see if it reset itself / was unstable etc.  Thanks a bunch!!

A 0 hash detector / restarter has already been requested and added to the list.

For now I recommend lowering your clocks / moving your powerlimit up or down (depending on what it is currently) each time you have an error.

I have stabilized all my rigs running genoil this way; and they are all outperforming claymore.

Thank you! I saw that before but didnt really put 2 and 2 together.
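
Until a 0 hash detector lands in nvOC, the idea can be sketched roughly as follows. This is not the planned nvOC feature, just an illustration: `needs_restart`, the three-sample window and the readings are assumptions, and how the hashrate is read from the miner and how the miner is restarted are left out on purpose.

```python
def needs_restart(hashrates, zero_samples=3, threshold=0.0):
    """True when the last `zero_samples` readings are all at or below threshold."""
    if len(hashrates) < zero_samples:
        return False  # not enough history yet
    return all(rate <= threshold for rate in hashrates[-zero_samples:])

# Made-up readings in MH/s, one per polling interval.
history = []
for reading in [30.1, 30.3, 0.0, 0.0, 0.0]:
    history.append(reading)
    if needs_restart(history):
        print("hashrate stuck at 0: this is where a watchdog would restart the miner")
        history.clear()
```

Requiring several consecutive zero readings keeps a single bad sample (for example during DAG generation or a brief pool reconnect) from triggering a restart.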
newbie
Activity: 27
Merit: 0
@Fullzero - thanks so much.  This is working so well for me.

A few requests, if I could be so selfish:

1) Can you confirm that this version of Genoil has the optimizations merged from this pull request?  Seems like people are getting 6%-10% performance improvement on GTX 1060s  (https://github.com/Genoil/cpp-ethereum/pull/228)

2) Any chance of adding in the Creep Miner for Burstcoin (proof of capacity) mining (https://github.com/Creepsky/creepMiner).  Then we'd have GPU, CPU, and Hard Drive mining in one!

3) Beyond cleaning out headers, are there other ways to get the image smaller so we can have more disk space, and/or offer a bigger image because most of us are using 32GB thumb drives?  I like to add a few personalizations but don't have the space to download all the dependencies and make the apps

Keep up the great work!

1 -

Yes, this is the new CUDA implementation I used when compiling Genoil: the hashrate changes DRAMATICALLY as you increase the memory clock; however, most of these clocks are currently unstable.

I suspect most of the individuals who have reported 10% gains did so before having a soft crash and realizing that, although the client is capable of significantly higher hashrates, it is not stable at most of them.

In my experience with this, I have found that running the client with fewer cards is more stable and can reach a higher OC (thus more gains).

2 -

I will add this to the list.


3 -

You can extend the primary partition on any key / SSD by connecting it to a computer with nvOC that has already booted, clicking the Ubuntu launcher at the top left and typing

gp

then clicking GParted.  Find the sdb drive and select the larger partition; if it is mounted, unmount it; then right-click, select resize and set the max size.

Click the green checkmark to execute the change, wait for completion, and it should be ~17gb larger.

I am planning on increasing the image to 32gb + add the cmds to enable Claymore / other clients to use 16gb VM in a later version.



Awesome!  You rock.

What's your address for sending hashes?
newbie
Activity: 17
Merit: 0
Sorry, I'm a Genoil newb.  I was using Claymore until nvOC 17, and now that I am using Genoil I am getting some crashes, possibly from overclocking.  Is there a switch or a watchdog or something to auto-restart Genoil like Claymore does?  I've lowered the OC a bit.  For now it could be down for hours before I realize Genoil crashed.  With Claymore I could just look back and see if it reset itself / was unstable etc.  Thanks a bunch!!

A 0 hash detector / restarter has already been requested and added to the list.

For now I recommend lowering your clocks / moving your powerlimit up or down (depending on what it is currently) each time you have an error.

I have stabilized all my rigs running genoil this way; and they are all outperforming claymore.

Sorry I missed that it had already been requested.  I had another hang less than 8 hours after the one this morning.  This time was different though.  All the previous Genoil issues gave me a memory error, so I attributed them to the OC.  This one was "Error CUDA mining: the launch timed out and was terminated. CUDA error in func 'search' at line 346: the launch timed out and was terminated. "

I have no power limits set.  I now have my clocks set to -100 core and +950 memory on GTX 1070s.  On Claymore I was running +100 and +1150 stable.  I'll see how it does now.  I do believe it is still beating Claymore, but I may do some 24hr tests back to back.

I will add that I am now running 8 cards in this rig via 2 M.2 adapters.  It does seem as though the stability issues rose not long after the last 2 cards however I had only been running V17 / Genoil a day or so before adding the 7th and 8th card.  Running 6 1070s, 1 1060, and 1 970 in the nvOC rig. 

Thanks again for all your hard work!!  Is the default address in your onebash files yours?  I would like to give you some hashes
hero member
Activity: 1260
Merit: 1009
@Fullzero - thanks so much.  This is working so well for me.

A few requests, if I could be so selfish:

1) Can you confirm that this version of Genoil has the optimizations merged from this pull request?  Seems like people are getting 6%-10% performance improvement on GTX 1060s  (https://github.com/Genoil/cpp-ethereum/pull/228)

2) Any chance of adding in the Creep Miner for Burstcoin (proof of capacity) mining (https://github.com/Creepsky/creepMiner).  Then we'd have GPU, CPU, and Hard Drive mining in one!

3) Beyond cleaning out headers, are there other ways to get the image smaller so we can have more disk space, and/or offer a bigger image because most of us are using 32GB thumb drives?  I like to add a few personalizations but don't have the space to download all the dependencies and make the apps

Keep up the great work!

1 -

Yes, this is the new CUDA implementation I used when compiling Genoil: the hashrate changes DRAMATICALLY as you increase the memory clock; however, most of these clocks are currently unstable.

I suspect most of the individuals who have reported 10% gains did so before having a soft crash and realizing that, although the client is capable of significantly higher hashrates, it is not stable at most of them.

In my experience with this, I have found that running the client with fewer cards is more stable and can reach a higher OC (thus more gains).

2 -

I will add this to the list.


3 -

You can extend the primary partition on any key / SSD by connecting it to a computer with nvOC that has already booted, clicking the Ubuntu launcher at the top left and typing

gp

then clicking GParted.  Find the sdb drive and select the larger partition; if it is mounted, unmount it; then right-click, select resize and set the max size.

Click the green checkmark to execute the change, wait for completion, and it should be ~17gb larger.

I am planning on increasing the image to 32gb + add the cmds to enable Claymore / other clients to use 16gb VM in a later version.

newbie
Activity: 27
Merit: 0
@Fullzero - thanks so much.  This is working so well for me.

A few requests, if I could be so selfish:

1) Can you confirm that this version of Genoil has the optimizations merged from this pull request?  Seems like people are getting 6%-10% performance improvement on GTX 1060s  (https://github.com/Genoil/cpp-ethereum/pull/228)

2) Any chance of adding in the Creep Miner for Burstcoin (proof of capacity) mining (https://github.com/Creepsky/creepMiner).  Then we'd have GPU, CPU, and Hard Drive mining in one!

3) Beyond cleaning out headers, are there other ways to get the image smaller so we can have more disk space, and/or offer a bigger image because most of us are using 32GB thumb drives?  I like to add a few personalizations but don't have the space to download all the dependencies and make the apps

Keep up the great work!