[ mining os ] nvoc - page 164. | Bitcointalksearch.org

sergixc

newbie

Activity: 32

Merit: 0

Quote from: crazydane on October 29, 2017, 08:31:47 AM

What does the pool you're connected to show for that worker?

Friend of mine is running an Asus Mining Expert as well with 7x 1070's and he's not having that issue with ZM.

I am using http://zcash.flypool.org
Maybe the problem is the number of cards. I will try with fewer.

crazydane

full member

Activity: 558

Merit: 194

What does the pool you're connected to show for that worker?

Friend of mine is running an Asus Mining Expert as well with 7x 1070's and he's not having that issue with ZM.

sergixc

newbie

Activity: 32

Merit: 0

Hello

ZM_Miner gives 350 Sol/s but EWBF gives 560 Sol/s on my 1080.

That seems to happen only on my Asus B250 Mining Expert motherboard with 13 GPUs installed; my other boards (Asus Prime Z270-A with 9 GPUs and an old Gigabyte with 5 GPUs) are OK.

Is it a bug in ZM?

Thank you

crazydane

full member

Activity: 558

Merit: 194

Quote from: papampi on October 29, 2017, 07:56:18 AM

Quote from: woodl1 on October 29, 2017, 07:00:42 AM

Hi again, fullzero and all

I've got a question for you. Some of my rigs now have a problem booting from a USB stick. I have many different sticks, but after a month or two of normal booting some of them started to fail during the boot process with I/O errors etc. I understand that this is a quality problem in most cases, but I also think it's a use-case problem. So now I'm thinking of an nvOC/rxOC mod that could do network booting to completely rule out this kind of issue while using these distros. The question is: has anyone tried to boot nvOC from the network? If yes, please share your experience with us!

Just get a $30 SSD

Agreed. I had the same issue using name-brand drives (SanDisk 32GB Cruzer Ultra Fit USB 3.0). Rigs would fail to reboot after running fine for a month plus. I went with these $33 60GB SSDs and have never had a problem since, and the rigs are much faster to boot.

https://www.amazon.com/gp/product/B01M2UUACN/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1

papampi

full member

Activity: 686

Merit: 140

Linux FOREVER! Resistance is futile!!!

Quote from: woodl1 on October 29, 2017, 07:00:42 AM

Hi again, fullzero and all

I've got a question for you. Some of my rigs now have a problem booting from a USB stick. I have many different sticks, but after a month or two of normal booting some of them started to fail during the boot process with I/O errors etc. I understand that this is a quality problem in most cases, but I also think it's a use-case problem. So now I'm thinking of an nvOC/rxOC mod that could do network booting to completely rule out this kind of issue while using these distros. The question is: has anyone tried to boot nvOC from the network? If yes, please share your experience with us!

Just get a $30 SSD

papampi

full member

Activity: 686

Merit: 140

Linux FOREVER! Resistance is futile!!!

Quote from: kk003 on October 28, 2017, 07:56:48 PM

OC for my 13-GPU 1060 3GB rig (this one runs on CentOS 7):
sudo nvidia-smi -pl 95
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a [gpu:8]/GPUMemoryTransferRateOffset[3]=1400
and I don't touch the Graphics Clock here.

My rig with 3x 1060 + 2x 970. The 1060 settings:
sudo nvidia-smi -pl 75
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a /GPUGraphicsClockOffset[3]=-100

in 1bash would be:

__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500

__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500

__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300

for GPUs 1, 2, 3

mining ETC at around 24 MH/s per GPU. Temps run around 58-65 ºC.

Running time: 500 - 1000 hours

Hope this helps as a reference ;-)

Does the 1060 3GB have the same hash rate as the 6GB on both Equihash and Ethash?
Is it even possible to mine ETH with 3GB?

woodl1

newbie

Activity: 15

Merit: 0

Hi again, fullzero and all

I've got a question for you. Some of my rigs now have a problem booting from a USB stick. I have many different sticks, but after a month or two of normal booting some of them started to fail during the boot process with I/O errors etc. I understand that this is a quality problem in most cases, but I also think it's a use-case problem. So now I'm thinking of an nvOC/rxOC mod that could do network booting to completely rule out this kind of issue while using these distros. The question is: has anyone tried to boot nvOC from the network? If yes, please share your experience with us!

kk003

member

Activity: 117

Merit: 10

Quote from: papampi on October 29, 2017, 03:24:20 AM

Quote from: WaveFront on October 29, 2017, 03:11:17 AM

Quote from: WaveFront on October 28, 2017, 09:53:25 PM

It's quite interesting. Which driver version do you use?
My GPUs are all GTX 1060 6GB, and in 1bash the settings are:

POWERLIMIT_WATTS=80
__CORE_OVERCLOCK=100
MEMORY_OVERCLOCK=1050

I cannot get anywhere close to your settings. When the memory overclock gets over 1150 I start to have crashes every 15 seconds or so.
Unless it is a driver version problem, I cannot see what is wrong with my setup.

I just read that not all memory behaves the same way and that I might have non-Samsung memory on my GPUs.
Is there an easy way to check the memory manufacturer of the GPUs from Ubuntu?

Nope, only on Windows with GPU-Z

Driver versions: 384.59 and 384.90
My cards have Samsung memory.

Stubo

member

Activity: 224

Merit: 13

Quote from: papampi on October 29, 2017, 03:24:20 AM

Quote from: WaveFront on October 29, 2017, 03:11:17 AM

I just read that not all memory behaves the same way and that I might have non-Samsung memory on my GPUs.
Is there an easy way to check the memory manufacturer of the GPUs from Ubuntu?

Nope, only on Windows with GPU-Z

Here is confirmation of that:

https://devtalk.nvidia.com/default/topic/1018512/memory-brand-type/

... and a few other things that we won't see on Linux.

WaveFront

member

Activity: 126

Merit: 10

Quote

Nope, only on Windows with GPU-Z

Hi Papampi,
Thanks for the help.
It's time to add an external drive with a Windows partition.
At the price we are paying for these GPUs, they'd better have Samsung memory :-D

papampi

full member

Activity: 686

Merit: 140

Linux FOREVER! Resistance is futile!!!

Quote from: WaveFront on October 29, 2017, 03:11:17 AM

Quote from: WaveFront on October 28, 2017, 09:53:25 PM

It's quite interesting. Which driver version do you use?
My GPUs are all GTX 1060 6GB, and in 1bash the settings are:

POWERLIMIT_WATTS=80
__CORE_OVERCLOCK=100
MEMORY_OVERCLOCK=1050

I cannot get anywhere close to your settings. When the memory overclock gets over 1150 I start to have crashes every 15 seconds or so.
Unless it is a driver version problem, I cannot see what is wrong with my setup.

I just read that not all memory behaves the same way and that I might have non-Samsung memory on my GPUs.
Is there an easy way to check the memory manufacturer of the GPUs from Ubuntu?

Nope, only on Windows with GPU-Z

WaveFront

member

Activity: 126

Merit: 10

Quote from: WaveFront on October 28, 2017, 09:53:25 PM

It's quite interesting. Which driver version do you use?
My GPUs are all GTX 1060 6GB, and in 1bash the settings are:

POWERLIMIT_WATTS=80
__CORE_OVERCLOCK=100
MEMORY_OVERCLOCK=1050

I cannot get anywhere close to your settings. When the memory overclock gets over 1150 I start to have crashes every 15 seconds or so.
Unless it is a driver version problem, I cannot see what is wrong with my setup.

I just read that not all memory behaves the same way and that I might have non-Samsung memory on my GPUs.
Is there an easy way to check the memory manufacturer of the GPUs from Ubuntu?

WaveFront

member

Activity: 126

Merit: 10

Quote from: kk003 on October 28, 2017, 07:56:48 PM

OC for my 13-GPU 1060 3GB rig (this one runs on CentOS 7):
sudo nvidia-smi -pl 95
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a [gpu:8]/GPUMemoryTransferRateOffset[3]=1400
and I don't touch the Graphics Clock here.

My rig with 3x 1060 + 2x 970. The 1060 settings:
sudo nvidia-smi -pl 75
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a /GPUGraphicsClockOffset[3]=-100

in 1bash would be:

__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500

__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500

__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300

for GPUs 1, 2, 3

mining ETC at around 24 MH/s per GPU. Temps run around 58-65 ºC.

Running time: 500 - 1000 hours

Hope this helps as a reference ;-)

It's quite interesting. Which driver version do you use?
My GPUs are all GTX 1060 6GB, and in 1bash the settings are:

POWERLIMIT_WATTS=80
__CORE_OVERCLOCK=100
MEMORY_OVERCLOCK=1050

I cannot get anywhere close to your settings. When the memory overclock gets over 1150 I start to have crashes every 15 seconds or so.
Unless it is a driver version problem, I cannot see what is wrong with my setup.

kk003

member

Activity: 117

Merit: 10

Quote from: Stubo on October 28, 2017, 04:30:57 PM

Quote from: WaveFront on October 28, 2017, 04:13:36 PM

Quote

So you could lower the threshold (I am just throwing it out there) or better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060, although from different manufacturers and versions: 2 with double fans and 3 with a single fan.
The errors come up randomly on any of the cards. Not sure if that is normal or not.

That doesn't sound normal. Unfortunately, I don't have any 1060s so I have no idea what the OC settings should be. In the meantime, just to keep mining with it as it is, you may want to disable the watchdog [in 1bash], which otherwise keeps restarting your miner whenever it detects low GPU utilization. Hopefully somebody else can chime in and help you with the OC settings.

OC for my 13-GPU 1060 3GB rig (this one runs on CentOS 7):
sudo nvidia-smi -pl 95
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a [gpu:8]/GPUMemoryTransferRateOffset[3]=1400
and I don't touch the Graphics Clock here.

My rig with 3x 1060 + 2x 970. The 1060 settings:
sudo nvidia-smi -pl 75
sudo nvidia-settings -a /GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a /GPUGraphicsClockOffset[3]=-100

in 1bash would be:

__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500

__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500

__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300

for GPUs 1, 2, 3

mining ETC at around 24 MH/s per GPU. Temps run around 58-65 ºC.

Running time: 500 - 1000 hours

Hope this helps as a reference ;-)
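
For anyone who prefers to set this per GPU from a shell rather than through 1bash, here is a rough dry-run sketch. It only prints the commands so they can be reviewed first. The function name `print_oc_commands` is made up, and the assumption that the 1bash suffixes 1, 2, 3 correspond to `[gpu:1]` through `[gpu:3]` in nvidia-settings should be checked against your own rig before running anything.

```shell
#!/bin/sh
# Dry-run sketch: prints per-GPU commands instead of executing them.
# Assumption: the 1bash suffix N corresponds to [gpu:N] / nvidia-smi -i N.
POWER_LIMIT_WATTS=75

print_oc_commands() {
  # Arguments: GPU index, core clock offset, memory transfer rate offset.
  gpu=$1
  core=$2
  mem=$3
  printf '%s\n' "sudo nvidia-smi -i $gpu -pl $POWER_LIMIT_WATTS"
  printf '%s\n' "sudo nvidia-settings -a \"[gpu:$gpu]/GPUGraphicsClockOffset[3]=$core\""
  printf '%s\n' "sudo nvidia-settings -a \"[gpu:$gpu]/GPUMemoryTransferRateOffset[3]=$mem\""
}

# Values taken from the 1bash lines in the post above.
print_oc_commands 1 -100 1500
print_oc_commands 2 -100 1500
print_oc_commands 3 -100 1300
```

Quoting the attribute keeps the shell from treating the square brackets as a glob pattern.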

codereddew12

newbie

Activity: 36

Merit: 0

After a restart, one of my rigs has a few GPUs where the minimum fan speed doesn't get set - i.e. min fan speed = 40; however, 1-3 GPUs are spinning @ 32%.

Granted, they don't go over their temp limit, but they're constantly within +/- 1° without the fan speeding up (I have max temp diff set to 2). Which card(s) are affected is totally random, so I don't think there's anything intrinsically wrong with any of the cards (MSI GTX 1070 x 7). Any idea why this could be happening? Not a big deal, just wanted to make sure that this isn't an ominous sign of anything.
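
One way to see which cards are affected after a restart is to compare the reported fan speeds against the configured minimum. The sketch below is only an illustration and is not part of nvOC: `MIN_FAN_SPEED` and `fans_below_minimum` are made-up names, and it assumes input lines in the "index, fan percent" form that `nvidia-smi --query-gpu=index,fan.speed --format=csv,noheader,nounits` prints.

```shell
#!/bin/sh
# Hypothetical check, not part of nvOC. 40 is the minimum fan speed from the post.
MIN_FAN_SPEED=40

fans_below_minimum() {
  # Reads "index, fan_percent" lines on stdin and reports fans under the minimum.
  while IFS=', ' read -r gpu fan; do
    # Skip lines where the fan speed is not a plain number (e.g. not reported).
    case "$fan" in
      ''|*[!0-9]*) continue ;;
    esac
    if [ "$fan" -lt "$MIN_FAN_SPEED" ]; then
      printf 'GPU %s fan at %s%% (minimum %s%%)\n' "$gpu" "$fan" "$MIN_FAN_SPEED"
    fi
  done
}

# Sample input standing in for real nvidia-smi output.
printf '0, 45\n1, 32\n2, 40\n' | fans_below_minimum
```

On a rig, the sample `printf` line would be replaced by piping the nvidia-smi query above into `fans_below_minimum`.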

sergixc

newbie

Activity: 32

Merit: 0

Hi

Please check http://prntscr.com/h3azap
What does that mean? The rig does not restart. Could you please advise what to fix?

Thank you in advance

Stubo

member

Activity: 224

Merit: 13

Quote from: WaveFront on October 28, 2017, 04:13:36 PM

Quote

So you could lower the threshold (I am just throwing it out there) or better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060, although from different manufacturers and versions: 2 with double fans and 3 with a single fan.
The errors come up randomly on any of the cards. Not sure if that is normal or not.

That doesn't sound normal. Unfortunately, I don't have any 1060s so I have no idea what the OC settings should be. In the meantime, just to keep mining with it as it is, you may want to disable the watchdog [in 1bash], which otherwise keeps restarting your miner whenever it detects low GPU utilization. Hopefully somebody else can chime in and help you with the OC settings.

WaveFront

member

Activity: 126

Merit: 10

Quote

So you could lower the threshold (I am just throwing it out there) or better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?

They are all GTX 1060, although from different manufacturers and versions: 2 with double fans and 3 with a single fan.
The errors come up randomly on any of the cards. Not sure if that is normal or not.

Stubo

member

Activity: 224

Merit: 13

Quote from: WaveFront on October 28, 2017, 03:19:05 PM

@papampi and @stubo
Thank you so much for your answers :-)
I can see the logs perfectly now.

Just another question
In 5_restartlog I get a "GPU under threshold found" error message every 3 to 5 seconds (see below).
Can I do something about this?
I am running the latest version of nvOC v0019-1.3

Code:

GPU UTILIZATION:  97 100 93 100 100
      GPU_COUNT:  5
GPU UTILIZATION:  99 97 100 97 100
      GPU_COUNT:  5
GPU UTILIZATION:  93 100 100 93 91
      GPU_COUNT:  5
GPU UTILIZATION:  96 99 100 97 82
Sat Oct 28 18:38:37 CEST 2017 - GPU under threshold found - GPU UTILIZATION:  96 99 100 97 82
      GPU_COUNT:  5
GPU UTILIZATION:  96 100 100 100 89
Sat Oct 28 18:38:47 CEST 2017 - GPU under threshold found - GPU UTILIZATION:  96 100 100 100 89
      GPU_COUNT:  5
GPU UTILIZATION:  100 96 98 100 98
      GPU_COUNT:  5
GPU UTILIZATION:  100 100 100 100 96

That comes from the watchdog (screen -r wdog) script 'IAmNotAJeep_and_Maxximus007_WATCHDOG'. On line 34, you will see the threshold set:

Code:

THRESHOLD=90

So you could lower the threshold (I am just throwing it out there) or better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?
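
To make it clearer what a threshold check like this does, here is a minimal sketch. It is not the actual watchdog script: the function name `gpus_under_threshold` is made up, and it simply compares each utilization value from one log line against THRESHOLD.

```shell
#!/bin/sh
# Illustrative only; not taken from the nvOC watchdog script.
THRESHOLD=90

gpus_under_threshold() {
  # Arguments: one utilization percentage per GPU, in GPU order.
  index=0
  for util in "$@"; do
    if [ "$util" -lt "$THRESHOLD" ]; then
      echo "GPU $index at ${util}% is under ${THRESHOLD}%"
    fi
    index=$((index + 1))
  done
}

# Utilization values from the first flagged line in the log above.
gpus_under_threshold 96 99 100 97 82
```

With THRESHOLD=90 only the last card (82) is reported. Setting THRESHOLD to 80 would let that line pass, which is what lowering the threshold amounts to.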

WaveFront

member

Activity: 126

Merit: 10

@papampi and @stubo
Thank you so much for your answers :-)
I can see the logs perfectly now.

Just another question
In 5_restartlog I get a "GPU under threshold found" error message every 3 to 5 seconds (see below).
Can I do something about this?
I am running the latest version of nvOC v0019-1.3

Code:

GPU UTILIZATION:  97 100 93 100 100
      GPU_COUNT:  5
GPU UTILIZATION:  99 97 100 97 100
      GPU_COUNT:  5
GPU UTILIZATION:  93 100 100 93 91
      GPU_COUNT:  5
GPU UTILIZATION:  96 99 100 97 82
Sat Oct 28 18:38:37 CEST 2017 - GPU under threshold found - GPU UTILIZATION:  96 99 100 97 82
      GPU_COUNT:  5
GPU UTILIZATION:  96 100 100 100 89
Sat Oct 28 18:38:47 CEST 2017 - GPU under threshold found - GPU UTILIZATION:  96 100 100 100 89
      GPU_COUNT:  5
GPU UTILIZATION:  100 96 98 100 98
      GPU_COUNT:  5
GPU UTILIZATION:  100 100 100 100 96

Topic: [ mining os ] nvoc - page 164. (Read 418546 times)