
Topic: [ mining os ] nvoc - page 164. (Read 418546 times)

newbie
Activity: 32
Merit: 0
October 29, 2017, 08:41:17 AM
What does the pool you're connected to show for that worker?

Friend of mine is running an Asus Mining Expert as well with 7x 1070's and he's not having that issue with ZM.

I am using http://zcash.flypool.org
Maybe the problem is the number of cards; I will try with fewer.
full member
Activity: 558
Merit: 194
October 29, 2017, 08:31:47 AM
What does the pool you're connected to show for that worker?

Friend of mine is running an Asus Mining Expert as well with 7x 1070's and he's not having that issue with ZM.
newbie
Activity: 32
Merit: 0
October 29, 2017, 08:30:29 AM
Hello

ZM_Miner gives 350 Sol/s but EWBF gives 560 Sol/s on my 1080.

This seems to happen only on my Asus B250 Mining Expert motherboard with 13 GPUs installed; my other boards (an Asus Prime Z270-A with 9 GPUs and an old Gigabyte with 5 GPUs) are OK.

Is it a bug in ZM?

Thank you
full member
Activity: 558
Merit: 194
October 29, 2017, 08:30:28 AM
Hi again, fullzero and all :) I've got a question for you. Some of my rigs now have a problem booting from a USB stick. I have many different sticks, but after a month or two of normal booting some of them started to fail during boot with I/O errors, etc. I understand that this is a quality problem in most cases, but I also think it's a use-case problem. So now I'm thinking of an nvOC/rxOC mod that could do network booting to rule out this kind of issue entirely while using these distros. The question is: has anyone tried to boot nvOC from the network? If yes, please share your experience with us!
Just get a $30 SSD

Agreed. I had the same issue using name-brand drives (SanDisk 32GB Cruzer Ultra Fit USB 3.0). Rigs would fail to reboot after running fine for over a month. I went with these $33 60GB SSDs and have never had a problem since, and the rigs boot much faster.

https://www.amazon.com/gp/product/B01M2UUACN/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1
full member
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
October 29, 2017, 07:56:18 AM
Hi again, fullzero and all :) I've got a question for you. Some of my rigs now have a problem booting from a USB stick. I have many different sticks, but after a month or two of normal booting some of them started to fail during boot with I/O errors, etc. I understand that this is a quality problem in most cases, but I also think it's a use-case problem. So now I'm thinking of an nvOC/rxOC mod that could do network booting to rule out this kind of issue entirely while using these distros. The question is: has anyone tried to boot nvOC from the network? If yes, please share your experience with us!
Just get a $30 SSD
full member
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
October 29, 2017, 07:52:38 AM
Quote
So you could lower the threshold (I am just throwing it out there) or, better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060s, although from different manufacturers and versions: 2 with dual fans and 3 with a single fan.
The errors occur randomly on any of the cards. Not sure if that is normal or not.

That doesn't sound normal. Unfortunately, I don't have any 1060s, so I have no idea what the OC settings should be. In the meantime, just to keep mining with it as it is, you may want to disable the watchdog [in 1bash], which will keep restarting your miner whenever it detects low GPU utilization. Hopefully somebody else can chime in and help you with the OC settings.

My 13 gpu rig 1060 3Gb OC (this runs on centos 7):
sudo nvidia-smi -pl 95
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a [gpu:8]/GPUMemoryTransferRateOffset[3]=1400
and I don't touch the Graphics Clock here.

My rig 3 1060 + 2 970. The 1060 settings:
sudo nvidia-smi -pl 75
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a /GPUGraphicsClockOffset[3]=-100

in 1bash would be:


__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500

__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500

__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300

for gpus 1,2,3

Mining ETC at around 24 MH/s per GPU. Temps run around 58-65 ºC.

Running time: 500 - 1000 hours

Hope this helps as a reference ;-)


Does the 1060 3GB have the same hash rate as the 6GB on both Equihash and Ethash?
Is it even possible to mine ETH with 3GB?
newbie
Activity: 15
Merit: 0
October 29, 2017, 07:00:42 AM
Hi again, fullzero and all :) I've got a question for you. Some of my rigs now have a problem booting from a USB stick. I have many different sticks, but after a month or two of normal booting some of them started to fail during boot with I/O errors, etc. I understand that this is a quality problem in most cases, but I also think it's a use-case problem. So now I'm thinking of an nvOC/rxOC mod that could do network booting to rule out this kind of issue entirely while using these distros. The question is: has anyone tried to boot nvOC from the network? If yes, please share your experience with us!
member
Activity: 117
Merit: 10
October 29, 2017, 04:36:41 AM
Quote
So you could lower the threshold (I am just throwing it out there) or, better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060s, although from different manufacturers and versions: 2 with dual fans and 3 with a single fan.
The errors occur randomly on any of the cards. Not sure if that is normal or not.

That doesn't sound normal. Unfortunately, I don't have any 1060s, so I have no idea what the OC settings should be. In the meantime, just to keep mining with it as it is, you may want to disable the watchdog [in 1bash], which will keep restarting your miner whenever it detects low GPU utilization. Hopefully somebody else can chime in and help you with the OC settings.

My 13 gpu rig 1060 3Gb OC (this runs on centos 7):
sudo nvidia-smi -pl 95
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a [gpu:8]/GPUMemoryTransferRateOffset[3]=1400
and I don't touch the Graphics Clock here.

My rig 3 1060 + 2 970. The 1060 settings:
sudo nvidia-smi -pl 75
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a /GPUGraphicsClockOffset[3]=-100

in 1bash would be:


__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500

__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500

__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300

for gpus 1,2,3

Mining ETC at around 24 MH/s per GPU. Temps run around 58-65 ºC.

Running time: 500 - 1000 hours

Hope this helps as a reference ;-)



It's quite interesting. Which driver version do you use?
My GPUs are all GTX 1060 6GB, and my settings in 1bash are:

POWERLIMIT_WATTS=80
__CORE_OVERCLOCK=100
MEMORY_OVERCLOCK=1050

I cannot get anywhere close to your settings. When the memory overclock goes over 1150, I start to get crashes every 15 seconds or so.
Unless it is a driver version problem, I cannot see what is wrong with my setup.
I just read that not all memory behaves the same way and that I might have non-Samsung memory on my GPUs.
Is there an easy way to check the memory manufacturer of the GPUs from Ubuntu?
Nope, only on Windows with GPU-Z

Driver Version: 384.59 and 384.90
my cards have Samsung memory
member
Activity: 224
Merit: 13
October 29, 2017, 04:07:04 AM
I just read that not all memory behaves the same way and that I might have non-Samsung memory on my GPUs.
Is there an easy way to check the memory manufacturer of the GPUs from Ubuntu?
Nope, only on Windows with GPU-Z

Here is confirmation of that:

https://devtalk.nvidia.com/default/topic/1018512/memory-brand-type/

... and a few other things that we won't see on Linux.
member
Activity: 126
Merit: 10
October 29, 2017, 04:01:56 AM
Quote
Nope, only on Windows with GPU-Z
Hi Papampi,
Thanks for the help.
It's time to add an external drive with a Windows partition.
At the price we are buying the GPU they'd better have Samsung memory :-D
full member
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
October 29, 2017, 03:24:20 AM
Quote
So you could lower the threshold (I am just throwing it out there) or, better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060s, although from different manufacturers and versions: 2 with dual fans and 3 with a single fan.
The errors occur randomly on any of the cards. Not sure if that is normal or not.

That doesn't sound normal. Unfortunately, I don't have any 1060s, so I have no idea what the OC settings should be. In the meantime, just to keep mining with it as it is, you may want to disable the watchdog [in 1bash], which will keep restarting your miner whenever it detects low GPU utilization. Hopefully somebody else can chime in and help you with the OC settings.

My 13 gpu rig 1060 3Gb OC (this runs on centos 7):
sudo nvidia-smi -pl 95
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a [gpu:8]/GPUMemoryTransferRateOffset[3]=1400
and I don't touch the Graphics Clock here.

My rig 3 1060 + 2 970. The 1060 settings:
sudo nvidia-smi -pl 75
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a /GPUGraphicsClockOffset[3]=-100

in 1bash would be:


__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500

__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500

__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300

for gpus 1,2,3

Mining ETC at around 24 MH/s per GPU. Temps run around 58-65 ºC.

Running time: 500 - 1000 hours

Hope this helps as a reference ;-)



It's quite interesting. Which driver version do you use?
My GPUs are all GTX 1060 6GB, and my settings in 1bash are:

POWERLIMIT_WATTS=80
__CORE_OVERCLOCK=100
MEMORY_OVERCLOCK=1050

I cannot get anywhere close to your settings. When the memory overclock goes over 1150, I start to get crashes every 15 seconds or so.
Unless it is a driver version problem, I cannot see what is wrong with my setup.
I just read that not all memory behaves the same way and that I might have non-Samsung memory on my GPUs.
Is there an easy way to check the memory manufacturer of the GPUs from Ubuntu?
Nope, only on Windows with GPU-Z
member
Activity: 126
Merit: 10
October 29, 2017, 03:11:17 AM
Quote
So you could lower the threshold (I am just throwing it out there) or, better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060s, although from different manufacturers and versions: 2 with dual fans and 3 with a single fan.
The errors occur randomly on any of the cards. Not sure if that is normal or not.

That doesn't sound normal. Unfortunately, I don't have any 1060s, so I have no idea what the OC settings should be. In the meantime, just to keep mining with it as it is, you may want to disable the watchdog [in 1bash], which will keep restarting your miner whenever it detects low GPU utilization. Hopefully somebody else can chime in and help you with the OC settings.

My 13 gpu rig 1060 3Gb OC (this runs on centos 7):
sudo nvidia-smi -pl 95
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a [gpu:8]/GPUMemoryTransferRateOffset[3]=1400
and I don't touch the Graphics Clock here.

My rig 3 1060 + 2 970. The 1060 settings:
sudo nvidia-smi -pl 75
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a /GPUGraphicsClockOffset[3]=-100

in 1bash would be:


__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500

__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500

__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300

for gpus 1,2,3

Mining ETC at around 24 MH/s per GPU. Temps run around 58-65 ºC.

Running time: 500 - 1000 hours

Hope this helps as a reference ;-)



It's quite interesting. Which driver version do you use?
My GPUs are all GTX 1060 6GB, and my settings in 1bash are:

POWERLIMIT_WATTS=80
__CORE_OVERCLOCK=100
MEMORY_OVERCLOCK=1050

I cannot get anywhere close to your settings. When the memory overclock goes over 1150, I start to get crashes every 15 seconds or so.
Unless it is a driver version problem, I cannot see what is wrong with my setup.
I just read that not all memory behaves the same way and that I might have non-Samsung memory on my GPUs.
Is there an easy way to check the memory manufacturer of the GPUs from Ubuntu?
member
Activity: 126
Merit: 10
October 28, 2017, 09:53:25 PM
Quote
So you could lower the threshold (I am just throwing it out there) or, better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060s, although from different manufacturers and versions: 2 with dual fans and 3 with a single fan.
The errors occur randomly on any of the cards. Not sure if that is normal or not.

That doesn't sound normal. Unfortunately, I don't have any 1060s, so I have no idea what the OC settings should be. In the meantime, just to keep mining with it as it is, you may want to disable the watchdog [in 1bash], which will keep restarting your miner whenever it detects low GPU utilization. Hopefully somebody else can chime in and help you with the OC settings.

My 13 gpu rig 1060 3Gb OC (this runs on centos 7):
sudo nvidia-smi -pl 95
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a [gpu:8]/GPUMemoryTransferRateOffset[3]=1400
and I don't touch the Graphics Clock here.

My rig 3 1060 + 2 970. The 1060 settings:
sudo nvidia-smi -pl 75
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a /GPUGraphicsClockOffset[3]=-100

in 1bash would be:


__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500

__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500

__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300

for gpus 1,2,3

Mining ETC at around 24 MH/s per GPU. Temps run around 58-65 ºC.

Running time: 500 - 1000 hours

Hope this helps as a reference ;-)



It's quite interesting. Which driver version do you use?
My GPUs are all GTX 1060 6GB, and my settings in 1bash are:

POWERLIMIT_WATTS=80
__CORE_OVERCLOCK=100
MEMORY_OVERCLOCK=1050

I cannot get anywhere close to your settings. When the memory overclock goes over 1150, I start to get crashes every 15 seconds or so.
Unless it is a driver version problem, I cannot see what is wrong with my setup.
member
Activity: 117
Merit: 10
October 28, 2017, 07:56:48 PM
Quote
So you could lower the threshold (I am just throwing it out there) or, better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060s, although from different manufacturers and versions: 2 with dual fans and 3 with a single fan.
The errors occur randomly on any of the cards. Not sure if that is normal or not.

That doesn't sound normal. Unfortunately, I don't have any 1060s, so I have no idea what the OC settings should be. In the meantime, just to keep mining with it as it is, you may want to disable the watchdog [in 1bash], which will keep restarting your miner whenever it detects low GPU utilization. Hopefully somebody else can chime in and help you with the OC settings.

My 13 gpu rig 1060 3Gb OC (this runs on centos 7):
sudo nvidia-smi -pl 95
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a [gpu:8]/GPUMemoryTransferRateOffset[3]=1400
and I don't touch the Graphics Clock here.

My rig 3 1060 + 2 970. The 1060 settings:
sudo nvidia-smi -pl 75
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a /GPUGraphicsClockOffset[3]=-100

in 1bash would be:


__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500

__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500

__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300

for gpus 1,2,3

Mining ETC at around 24 MH/s per GPU. Temps run around 58-65 ºC.

Running time: 500 - 1000 hours

Hope this helps as a reference ;-)
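
For anyone who wants to apply settings like these per card from a script, here is a minimal dry-run sketch. It is not part of nvOC or 1bash: `oc_commands` is a hypothetical helper name, the GPU index passed in is assumed to match the driver's own numbering (which may not be the same as the 1bash numbering), and the function only prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper (not from nvOC): prints the commands for one GPU
# instead of executing them.
# Arguments: GPU index, power limit in watts, core clock offset, memory offset.
oc_commands() {
  local gpu=$1 power=$2 core=$3 mem=$4
  echo "sudo nvidia-smi -i $gpu -pl $power"
  echo "sudo nvidia-settings -a [gpu:$gpu]/GPUGraphicsClockOffset[3]=$core"
  echo "sudo nvidia-settings -a [gpu:$gpu]/GPUMemoryTransferRateOffset[3]=$mem"
}

# Values from the post above for GPUs 1, 2 and 3 (75 W, core -100).
oc_commands 1 75 -100 1500
oc_commands 2 75 -100 1500
oc_commands 3 75 -100 1300
```

When running the printed commands by hand, it is safer to quote the `[gpu:N]/...` argument so the shell does not treat the brackets as a glob pattern.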

newbie
Activity: 36
Merit: 0
October 28, 2017, 06:32:41 PM
After a restart, one of my rigs has a few GPUs where the minimum fan speed doesn't get set - i.e. min fan speed = 40; however, 1-3 GPUs are spinning @ 32%.

Granted, they don't go over their temp limit, but they're constantly within +/- 1° of it without the fan speeding up (I have max temp diff set to 2). Which card(s) are affected seems totally random, so I don't think there's anything intrinsically wrong with any of the cards (MSI GTX 1070 x 7). Any idea why this could be happening? Not a big deal, I just wanted to make sure that this isn't an ominous sign of anything.
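
One way to see which cards are below the minimum after a restart is to compare the reported fan speeds against the configured value. The sketch below assumes the `index, fan.speed` CSV output of `nvidia-smi`; `fans_below_min` is a hypothetical helper name, not something nvOC provides, and the value 40 is simply the minimum mentioned in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "index, fan_percent" lines on stdin and reports
# every GPU whose fan speed is below the given minimum.
# Lines without a numeric fan value (for example "[N/A]") are skipped.
fans_below_min() {
  local min=$1
  awk -F', *' -v min="$min" \
    '$2 ~ /^[0-9]+$/ && $2 + 0 < min { print "GPU " $1 ": fan at " $2 "%" }'
}

# Usage on a rig (not executed here):
#   nvidia-smi --query-gpu=index,fan.speed --format=csv,noheader,nounits | fans_below_min 40
```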
newbie
Activity: 32
Merit: 0
October 28, 2017, 06:21:58 PM
Hi

Please check http://prntscr.com/h3azap
What does that mean? The rig does not restart. Could you please advise what to fix?

Thank you in advance
member
Activity: 224
Merit: 13
October 28, 2017, 04:30:57 PM
Quote
So you could lower the threshold (I am just throwing it out there) or, better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060s, although from different manufacturers and versions: 2 with dual fans and 3 with a single fan.
The errors occur randomly on any of the cards. Not sure if that is normal or not.

That doesn't sound normal. Unfortunately, I don't have any 1060s, so I have no idea what the OC settings should be. In the meantime, just to keep mining with it as it is, you may want to disable the watchdog [in 1bash], which will keep restarting your miner whenever it detects low GPU utilization. Hopefully somebody else can chime in and help you with the OC settings.
member
Activity: 126
Merit: 10
October 28, 2017, 04:13:36 PM
Quote
So you could lower the threshold (I am just throwing it out there) or, better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060s, although from different manufacturers and versions: 2 with dual fans and 3 with a single fan.
The errors occur randomly on any of the cards. Not sure if that is normal or not.
member
Activity: 224
Merit: 13
October 28, 2017, 03:44:30 PM
@papampi and @stubo
Thank you so much for your answers :-)
I can see the logs perfectly now.
 
Just another question
In 5_restartlog I get a "GPU under threshold found" error message every 3 to 5 seconds (see below).
Can I do something about this?
I am running the latest version of nvOC v0019-1.3
 
 
Code:
GPU UTILIZATION:  97 100 93 100 100
      GPU_COUNT:  5
GPU UTILIZATION:  99 97 100 97 100
      GPU_COUNT:  5
GPU UTILIZATION:  93 100 100 93 91
      GPU_COUNT:  5
GPU UTILIZATION:  96 99 100 97 82
Sat Oct 28 18:38:37 CEST 2017 - GPU under threshold found - GPU UTILIZATION:  96 99 100 97 82
      GPU_COUNT:  5
GPU UTILIZATION:  96 100 100 100 89
Sat Oct 28 18:38:47 CEST 2017 - GPU under threshold found - GPU UTILIZATION:  96 100 100 100 89
      GPU_COUNT:  5
GPU UTILIZATION:  100 96 98 100 98
      GPU_COUNT:  5
GPU UTILIZATION:  100 100 100 100 96


That comes from the watchdog (screen -r wdog) script 'IAmNotAJeep_and_Maxximus007_WATCHDOG'. On line 34, you will see the threshold set:

Code:
THRESHOLD=90

So you could lower the threshold (I am just throwing it out there) or, better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?
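
To make the effect of `THRESHOLD` concrete, here is a rough sketch of the kind of comparison involved. It is not the actual watchdog code: `gpus_under_threshold` is a hypothetical name, and it simply compares each utilization value from a log line against the threshold and prints the zero-based position of every value that falls below it.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the nvOC watchdog script.
# First argument: threshold. Remaining arguments: per-GPU utilization values.
# Prints the zero-based position of every GPU below the threshold.
gpus_under_threshold() {
  local threshold=$1
  shift
  local index=0
  local util
  for util in "$@"; do
    if [ "$util" -lt "$threshold" ]; then
      echo "$index"
    fi
    index=$((index + 1))
  done
}

# Values from the log line "GPU UTILIZATION:  96 99 100 97 82"
gpus_under_threshold 90 96 99 100 97 82   # prints 4
```

With a threshold of 80 the same values print nothing, which is why lowering the threshold silences the message without explaining why the last card dips.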
member
Activity: 126
Merit: 10
October 28, 2017, 03:19:05 PM
@papampi and @stubo
Thank you so much for your answers :-)
I can see the logs perfectly now.
 
Just another question
In 5_restartlog I get a "GPU under threshold found" error message every 3 to 5 seconds (see below).
Can I do something about this?
I am running the latest version of nvOC v0019-1.3
 
 
Code:
GPU UTILIZATION:  97 100 93 100 100
      GPU_COUNT:  5
GPU UTILIZATION:  99 97 100 97 100
      GPU_COUNT:  5
GPU UTILIZATION:  93 100 100 93 91
      GPU_COUNT:  5
GPU UTILIZATION:  96 99 100 97 82
Sat Oct 28 18:38:37 CEST 2017 - GPU under threshold found - GPU UTILIZATION:  96 99 100 97 82
      GPU_COUNT:  5
GPU UTILIZATION:  96 100 100 100 89
Sat Oct 28 18:38:47 CEST 2017 - GPU under threshold found - GPU UTILIZATION:  96 100 100 100 89
      GPU_COUNT:  5
GPU UTILIZATION:  100 96 98 100 98
      GPU_COUNT:  5
GPU UTILIZATION:  100 100 100 100 96
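
To get a feel for how often the message shows up, the matching lines can be counted. This is only a sketch: `count_threshold_events` is a hypothetical helper name, and the log location in the usage comment is an assumption, since the thread does not say where 5_restartlog is stored.

```shell
#!/usr/bin/env bash
# Hypothetical helper: counts "GPU under threshold found" lines read from stdin.
count_threshold_events() {
  awk '/GPU under threshold found/ { n++ } END { print n + 0 }'
}

# Usage (the path is an assumption, adjust it to wherever the log lives):
#   count_threshold_events < 5_restartlog
```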