
Topic: [ mining os ] nvoc - page 147. (Read 418546 times)

full member
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
November 09, 2017, 03:02:56 AM
I'm all for optimizing any of the code.
I think the best approach to optimizing the temp control or other scripts is that whoever writes, tests and optimizes a script sends it to fullzero and the other devs, so they can adapt the other scripts to it before a new version is released with it.

Quote
Agreed, this code seems redundant... I haven't fully analyzed 3main, but the only logic I can see behind it is to stop the code below it from executing if the condition was met.
We can use "if / elif / else / fi" to stop the code from evaluating the following conditions, right?
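
A minimal sketch of that idea, not taken from 3main: the function name pick_miner and the miner names are made up for illustration. Only the first branch whose condition is true runs, so none of the conditions after it are evaluated.

```shell
#!/usr/bin/env bash
# Hypothetical example: pick_miner and the miner names are placeholders,
# not code from 3main.
pick_miner() {
  coin="$1"
  if [ "$coin" = "ZEC" ]; then
    echo "equihash-miner"
  elif [ "$coin" = "ETH" ]; then
    # Only reached when the ZEC test above was false.
    echo "ethash-miner"
  else
    # Fallback when no earlier condition matched.
    echo "default-miner"
  fi
}

pick_miner "ZEC"
```

Quoting "$coin" keeps the test safe even if the variable is empty.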
full member
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
November 09, 2017, 02:25:13 AM
Are there plans to add dstm equihash miner?

It bumped my 1070 TI's up 50 sols making them hash at 4.7 sols per watt O_o

video for reference
https://youtu.be/jtp4plChU9Y


It has been included since 19-1.3; look for "zm or ewbf" in 1bash.
Here is an easy way to update it:

Update dstm zm miner easy way
full member
Activity: 200
Merit: 101
November 08, 2017, 11:11:49 PM
Thanks a lot mate.
It was always a question for me what those loops are for...
Tested the miner start script without the loops and all is good.

I think, as you said, it's legacy code from old nvOC and can be removed.

And as I said before, wtm-miner is just a copy of the miner start lines from 3main, so the WTM auto switch doesn't restart 3main, which takes so long, and just switches the miner.

Agreed, this code seems redundant... I haven't fully analyzed 3main, but the only logic I can see behind it is to stop the code below it from executing if the condition was met.

We need to clean up and optimize the current code before adding any more features, which would make the code even messier than it is now.

While we are on the programming topic... I really can't stand the misuse of logs to pass information/variables between the scripts. Not only do the logs slow everything down, they are killing the USB sticks and causing all kinds of problems, hangs, freezes, etc., and I see more and more people recommending the use of screenlog.

How about using named pipes (FIFOs) to pass info between the scripts? Nothing is saved/logged when using a named pipe, and it is much faster. Leave the logs for what they were really meant for: recording problems.

Example: we want to get the fan speeds and temps from the temp control so we can send them through Telegram. The easy solution is to write all those values from the temp control script into a log, then read from the log and send the Telegram message. The problem is that, in order to satisfy Telegram, the temp script has to write that info line by line for each GPU into a log, 24/7, so that the info is there when Telegram needs it.

If we rewrite the temp control to keep sending all that info to a named pipe, the info will be there when Telegram needs it without writing to a log over and over, 24/7, and only critical info/errors needed for troubleshooting get written to a log.

I started rewriting the temp control and I intend to incorporate the named pipes, but that will break compatibility with Telegram, web stats, and whatever else is fetching the info from logs. I just can't find enough free time to rewrite all the scripts in nvOC.

If you are interested in the development and optimization of nvOC, please google "bash named pipe", do some research, and share your thoughts. Maybe we can split the workload.
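
For anyone who wants to try the named pipe idea, here is a minimal sketch. It is not the rewritten temp control: the pipe path and the sample stats line are made up, and the writer and reader only stand in for the temp control and telegram scripts.

```shell
#!/usr/bin/env bash
# Sketch only: the pipe path and the sample line are made up for illustration.
fifo="$(mktemp -u)"   # an unused temp path; mkfifo creates the pipe there
mkfifo "$fifo"

# Writer (stands in for the temp control script).
# Opening the pipe blocks until a reader opens the other end, so it runs in the background.
echo "GPU0 temp=65 fan=70" > "$fifo" &

# Reader (stands in for the telegram script): gets the line without any log file.
read -r stats < "$fifo"
wait
rm -f "$fifo"

echo "$stats"
```

One thing to keep in mind: a FIFO stores nothing on disk and a writer waits until a reader attaches, so a real temp control loop would probably need to write from a background job (or otherwise avoid blocking) so that it does not stall while nobody is reading.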
sr. member
Activity: 1414
Merit: 487
YouTube.com/VoskCoin
November 08, 2017, 10:40:39 PM
Are there plans to add dstm equihash miner?

It bumped my 1070 TI's up 50 sols making them hash at 4.7 sols per watt O_o

video for reference
https://youtu.be/jtp4plChU9Y
full member
Activity: 200
Merit: 101
November 08, 2017, 10:22:47 PM

Yep, good call mate. I agree there are many other corner cases still missing. As @leenoox says, we need to go through most parts of our code and rewrite it. It isn't the best, but it does the job more or less, and it reached that point with the help of our early contributors. Everyone is occupied with so many things; even though we want to spend time on this, we only have 24 hours a day and we have lives too. It's not easy to find the time, but we can do it bit by bit, like 20-30 minutes once in a while (even weekly works), and improve it together, joining those pieces.

Commit for nothing, deliver something as something is always better than nothing.

I think it would be nice to gather all these points and improve things step by step. @leenoox, you also suggested a change for the code with the "bitcoin = the ground" stuff, but that has been lost somewhere in chat. For any such things, please PM me here or on Discord. I will put everything in a locked 'to_do' channel on Discord (not an open one because of the volume of messages!), or any other place; suggestions are welcome.

Thanks everyone.

Yup, I wish I had a little more time to dedicate to this too ;)

Btw, regarding the bitcoin=theground... I posted a solution for rewriting it on Discord right after you asked, in the same subchannel (I believe it was the oc channel), about two days ago...

If you can't find it, just replace all instances of this:

Code:
BITCOIN="theGROUND"
while [ $BITCOIN == "theGROUND" ]
do

with this:

Code:
while true
do

or, depending on your style, with a one-liner:

Code:
while true; do
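
A small runnable sketch of the replacement loop: `true` is a command that always exits with status 0, so no variable comparison is needed. The tick counter and the break are not part of the suggestion; they are only there so the demo terminates instead of sleeping forever.

```shell
#!/usr/bin/env bash
# Sketch: a bounded stand-in for the keep-alive loop so it can be run safely.
# The ticks counter and the break exist only to make the demo stop.
ticks=0
while true; do
  ticks=$((ticks + 1))
  if [ "$ticks" -ge 3 ]; then
    break
  fi
  sleep 1
done
echo "loop ran $ticks times"
```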
full member
Activity: 200
Merit: 101
November 08, 2017, 10:09:32 PM
Watchdog Improvement?

I have been reading over the watchdog script and see what I think is an opportunity for improvement. The current logic loops every 10 seconds and looks for GPU utilization below a threshold (90%). If it finds this, it begins incrementing one counter (JEEP) and decrementing another (COUNTER, initially set to 6*#GPU) for each occurrence [per GPU]. My issue is that the current logic has to wait 6*#GPU*10 seconds, or a minute per GPU, before it will attempt a miner restart.

My thinking is: why not just check for the existence of a miner process? If none is found, we can skip the rest of the countdown and get right to attempting to restart the miner. This can save many minutes of lost mining time. The downside is that it is so much faster than the existing logic that, if you do have a configuration issue, it will be launching miners and rebooting at a much faster rate. So, maybe this is more of an expert mod.

Anyway, I have been running a modified version of IAmNotAJeep_and_Maxximus007_WATCHDOG that has this additional logic and it works great for ME. In the existing 19-1.4 version of the script, starting at line 98, I added this:

Code:
  # Begin Stubo Mod
   # Look for no miner screen and get right to miner restart
   if [[ `screen -ls |grep miner |wc -l` -eq 0 ]]
   then
      COUNT=0
      echo "Found no miner, jumping to 3main restart" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
   fi  
   # End Stubo Mod

By setting COUNT=0 for the "no miner found" condition, I am effectively removing the delay in decrementing COUNTER all of the way down before the script takes action.

Thoughts?

I mentioned this problem a while ago... I incorporated a quick and dirty patch on my rigs to fix it and wanted to rewrite the watchdog... it is still on my TODO list...

Your solution is also not the best one, but it helps... That code is there to detect a semi-freeze, when some card is acting up. However, as you noticed, it is not well written and in some cases it takes hours before it reacts. On a few occasions I had one card freezing, pulling the whole rig down to a crawl... On a 13 GPU rig it took about 3-4 hours for the watchdog to realize that it was time to restart... sigh... A quick patch was to lower the counter as well... I'll jump on it once I finish rewriting the temp control.
full member
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
November 08, 2017, 06:38:24 PM
I hate quoting myself, but here goes. I did a simple test to see if the infinite loop is necessary given the screen command. Try this for yourself:

Code:
screen -dmS top top

will launch top in a screen. Then exit the ssh session and login again. You can then resume the top session with the usual screen -r top. So, I don't think that infinite loop in 3main is needed any longer. However, I have not exercised all parts of nvOC but I can tell you for certain that it is not needed after those miners who are launched with screen.

Hope this helps.


Thanks a lot mate.
It was always a question for me what those loops are for...
Tested the miner start script without the loops and all is good.

I think, as you said, it's legacy code from old nvOC and can be removed.

And as I said before, wtm-miner is just a copy of the miner start lines from 3main, so the WTM auto switch doesn't restart 3main, which takes so long, and just switches the miner.
member
Activity: 224
Merit: 13
November 08, 2017, 06:23:37 PM
Can you please have a look at the ~/z_papampi_versions/wtm-miner
and see why it won't exit when done?
I tried with done, exit, exit 0, .... and it stays running in the background

I ran it through beautify and read it. It is basically a stripped down version of 3main. I don't know what your expectations are as far as it exiting, but this code block (same as in 3main) is meant to just run forever:

Code:
   BITCOIN="theGROUND"
   
   while [ $BITCOIN == "theGROUND" ]
   do
      sleep 60
   done

So scripts like this and 3main are meant to not "fall out". I am not overly familiar with the screen command but it would appear that the parent process of them is not 3main or as in your case wtm-miner, but rather "/sbin/upstart --user". As such, I am not sure if this infinite loop is required any longer. It may be legacy code before screen was implemented. Give it a go without it in your script and see if the miner continues to run.

Hope this helps.


I hate quoting myself, but here goes. I did a simple test to see if the infinite loop is necessary given the screen command. Try this for yourself:

Code:
screen -dmS top top

will launch top in a screen. Then exit the ssh session and login again. You can then resume the top session with the usual screen -r top. So, I don't think that infinite loop in 3main is needed any longer. However, I have not exercised all parts of nvOC but I can tell you for certain that it is not needed after those miners who are launched with screen.

Hope this helps.
member
Activity: 224
Merit: 13
November 08, 2017, 06:01:13 PM
Can you please have a look at the ~/z_papampi_versions/wtm-miner
and see why it won't exit when done?
I tried with done, exit, exit 0, .... and it stays running in the background

I ran it through beautify and read it. It is basically a stripped down version of 3main. I don't know what your expectations are as far as it exiting, but this code block (same as in 3main) is meant to just run forever:

Code:
   BITCOIN="theGROUND"
   
   while [ $BITCOIN == "theGROUND" ]
   do
      sleep 60
   done

So scripts like this and 3main are meant to not "fall out". I am not overly familiar with the screen command but it would appear that the parent process of them is not 3main or as in your case wtm-miner, but rather "/sbin/upstart --user". As such, I am not sure if this infinite loop is required any longer. It may be legacy code before screen was implemented. Give it a go without it in your script and see if the miner continues to run.

Hope this helps.
full member
Activity: 378
Merit: 104
nvOC forever
November 08, 2017, 05:21:54 PM
Sorry, if you want a laugh, read on: many dumb questions will follow.
How do I get the NiceHash profit switching to work?
I think I am using an earlier version of nvOC. I'm not sure which one, but it won't update, and it's certainly not v0019-1.4.
This is the top part of 1bash:
***************
# XMR  SIGT  ZPOOL_SKUNK  UBQ  ONION
# DMD GRS  ZPOOL_LYRA2V2  ZPOOL_BLAKE2S  
# ZEC  ZCOIN  HUSH  ZEN  ZCL  
# NICE_ETHASH  ETH    MUSIC  ETC  EXP  DCR  PASC
# MONA  VTC    DGB  SIA  FTC  LBC
# DUAL_ETC_DCR    DUAL_ETC_PASC    DUAL_ETC_LBC    DUAL_ETC_SC
# DUAL_EXP_DCR    DUAL_EXP_PASC    DUAL_EXP_LBC    DUAL_EXP_SC
# DUAL_ETH_DCR    DUAL_ETH_PASC    DUAL_ETH_LBC    DUAL_ETH_SC
# DUAL_MUSIC_DCR  DUAL_MUSIC_PASC  DUAL_MUSIC_LBC  DUAL_MUSIC_SC
# SALFTER_NICEHASH_PROFIT_SWITCHING
# SALFTER_MPH_PROFIT_SWITCHING

COIN="FTC"

Maxximus007_AUTO_TEMPERATURE_CONTROL="YES"

IAmNotAJeep_and_Maxximus007_WATCHDOG="YES"


************************

I've downloaded v0019-1.4, but how do I go about installing it? Is it just a case of unpacking it in the home directory?

Set your coin to this
Quote
COIN="SALFTER_NICEHASH_PROFIT_SWITCHING"

Make sure you have added your BTC_ADDRESS in the necessary places. If you are not sure, search for 'BTC_ADDRESS' in 1bash; somewhere in the middle you will see this:
Quote
BTC_ADDRESS="replace_with_your_BTC_address"

Add your BTC address there, that should do the trick.

Regarding version 1.4: you can't update to 1.4 using 4update or other commands; you need to do a fresh install (new flash).

EDIT: About version 1.4

Extract the zip, take another memory stick/SSD, write the image to it using HDDRAW (or whatever you used to write the image before), then plug and play.
full member
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
November 08, 2017, 05:21:35 PM
My other suggestion for a faster miner restart is to separate the miner start lines from 3main, so the watchdog only starts the miner and not 3main.
full member
Activity: 378
Merit: 104
nvOC forever
November 08, 2017, 05:14:58 PM
Watchdog Improvement?

I have been reading over the watchdog script and see what I think is an opportunity for improvement. The current logic loops every 10 seconds and looks for GPU utilization below a threshold (90%). If it finds this, it begins incrementing one counter (JEEP) and decrementing another (COUNTER, initially set to 6*#GPU) for each occurrence [per GPU]. My issue is that the current logic has to wait 6*#GPU*10 seconds, or a minute per GPU, before it will attempt a miner restart.

My thinking is: why not just check for the existence of a miner process? If none is found, we can skip the rest of the countdown and get right to attempting to restart the miner. This can save many minutes of lost mining time. The downside is that it is so much faster than the existing logic that, if you do have a configuration issue, it will be launching miners and rebooting at a much faster rate. So, maybe this is more of an expert mod.

Anyway, I have been running a modified version of IAmNotAJeep_and_Maxximus007_WATCHDOG that has this additional logic and it works great for ME. In the existing 19-1.4 version of the script, starting at line 98, I added this:

Code:
  # Begin Stubo Mod
   # Look for no miner screen and get right to miner restart
   if [[ `screen -ls |grep miner |wc -l` -eq 0 ]]
   then
      COUNT=0
      echo "Found no miner, jumping to 3main restart" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
   fi  
   # End Stubo Mod

By setting COUNT=0 for the "no miner found" condition, I am effectively removing the delay in decrementing COUNTER all of the way down before the script takes action.

Thoughts?

Yep, good call mate. I agree there are many other corner cases still missing. As @leenoox says, we need to go through most parts of our code and rewrite it. It isn't the best, but it does the job more or less, and it reached that point with the help of our early contributors. Everyone is occupied with so many things; even though we want to spend time on this, we only have 24 hours a day and we have lives too. It's not easy to find the time, but we can do it bit by bit, like 20-30 minutes once in a while (even weekly works), and improve it together, joining those pieces.

Commit for nothing, deliver something as something is always better than nothing.

I think it would be nice to gather all these points and improve things step by step. @leenoox, you also suggested a change for the code with the "bitcoin = the ground" stuff, but that has been lost somewhere in chat. For any such things, please PM me here or on Discord. I will put everything in a locked 'to_do' channel on Discord (not an open one because of the volume of messages!), or any other place; suggestions are welcome.

Thanks everyone.
newbie
Activity: 4
Merit: 0
November 08, 2017, 05:13:56 PM
Sorry, if you want a laugh, read on: many dumb questions will follow.
How do I get the NiceHash profit switching to work?
I think I am using an earlier version of nvOC. I'm not sure which one, but it won't update, and it's certainly not v0019-1.4.
This is the top part of 1bash:
***************
# XMR  SIGT  ZPOOL_SKUNK  UBQ  ONION
# DMD GRS  ZPOOL_LYRA2V2  ZPOOL_BLAKE2S  
# ZEC  ZCOIN  HUSH  ZEN  ZCL  
# NICE_ETHASH  ETH    MUSIC  ETC  EXP  DCR  PASC
# MONA  VTC    DGB  SIA  FTC  LBC
# DUAL_ETC_DCR    DUAL_ETC_PASC    DUAL_ETC_LBC    DUAL_ETC_SC
# DUAL_EXP_DCR    DUAL_EXP_PASC    DUAL_EXP_LBC    DUAL_EXP_SC
# DUAL_ETH_DCR    DUAL_ETH_PASC    DUAL_ETH_LBC    DUAL_ETH_SC
# DUAL_MUSIC_DCR  DUAL_MUSIC_PASC  DUAL_MUSIC_LBC  DUAL_MUSIC_SC
# SALFTER_NICEHASH_PROFIT_SWITCHING
# SALFTER_MPH_PROFIT_SWITCHING

COIN="FTC"

Maxximus007_AUTO_TEMPERATURE_CONTROL="YES"

IAmNotAJeep_and_Maxximus007_WATCHDOG="YES"


************************

I've downloaded v0019-1.4, but how do I go about installing it? Is it just a case of unpacking it in the home directory?
full member
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
November 08, 2017, 05:11:12 PM
Can I ask what this line is for?

Code:
     echo "Debug: JEEP=$JEEP, COUNT=$COUNT, RESTART=$RESTART"


Update:
Got it after adding your edits.
Thanks a lot for the edit.
full member
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
November 08, 2017, 05:06:59 PM
papampi:

I don't have a wtm-miner. That is another reason that I beautify any scripts my miners run with beautify_bash.py. Mismatches like that are trivial to find with the proper indentation. As for the mod I did, here is more of it:

Code:

   #IAmNotAJeep MOD from V002
   if [ $JEEP -gt 0 ]
   then
      echo "Debug: JEEP=$JEEP, COUNT=$COUNT, RESTART=$RESTART"
      
      # Begin Stubo Mod
      # Look for no miner screen and get right to miner restart
      if [[ `screen -ls |grep miner |wc -l` -eq 0 ]]
      then
         COUNT=0
         echo "Found no miner, jumping to 3main restart" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
      fi  
      # End Stubo Mod

      if [ $COUNT -le 0 ]
      then
         INTERNET_IS_GO=0


With this in place, an 8 GPU rig miner restart by the watchdog is reduced from 8 minutes to ~ 10 seconds.

Thanks mate
Can you please have a look at the ~/z_papampi_versions/wtm-miner
and see why it won't exit when done?
I tried with done, exit, exit 0, .... and it stays running in the background
member
Activity: 224
Merit: 13
November 08, 2017, 04:58:31 PM
papampi:

I don't have a wtm-miner. That is another reason that I beautify any scripts my miners run with beautify_bash.py. Mismatches like that are trivial to find with the proper indentation. As for the mod I did, here is more of it:

Code:

   #IAmNotAJeep MOD from V002
   if [ $JEEP -gt 0 ]
   then
      echo "Debug: JEEP=$JEEP, COUNT=$COUNT, RESTART=$RESTART"
      
      # Begin Stubo Mod
      # Look for no miner screen and get right to miner restart
      if [[ `screen -ls |grep miner |wc -l` -eq 0 ]]
      then
         COUNT=0
         echo "Found no miner, jumping to 3main restart" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
      fi  
      # End Stubo Mod

      if [ $COUNT -le 0 ]
      then
         INTERNET_IS_GO=0


With this in place, an 8 GPU rig miner restart by the watchdog is reduced from 8 minutes to ~ 10 seconds.
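
To put numbers on that, here is a quick sketch of the arithmetic, taking the 6*#GPU*10 seconds figure from the Watchdog Improvement post at face value. The function name restart_delay_seconds is made up for illustration and is not part of the watchdog script.

```shell
#!/usr/bin/env bash
# Worst-case wait before a restart, using the figures quoted in the thread:
# the counter starts at 6 per GPU and the watchdog loops every 10 seconds.
# restart_delay_seconds is a hypothetical helper, not part of the watchdog.
restart_delay_seconds() {
  gpus="$1"
  echo $(( 6 * gpus * 10 ))
}

for gpus in 8 13; do
  secs=$(restart_delay_seconds "$gpus")
  echo "$gpus GPUs: ${secs}s (~$(( secs / 60 )) min)"
done
```

For 8 GPUs this gives 480 seconds, which matches the 8 minutes mentioned above.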
full member
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
November 08, 2017, 04:44:29 PM
Watchdog Improvement?

I have been reading over the watchdog script and see what I think is an opportunity for improvement. The current logic loops every 10 seconds and looks for GPU utilization below a threshold (90%). If it finds this, it begins incrementing one counter (JEEP) and decrementing another (COUNTER, initially set to 6*#GPU) for each occurrence [per GPU]. My issue is that the current logic has to wait 6*#GPU*10 seconds, or a minute per GPU, before it will attempt a miner restart.

My thinking is: why not just check for the existence of a miner process? If none is found, we can skip the rest of the countdown and get right to attempting to restart the miner. This can save many minutes of lost mining time. The downside is that it is so much faster than the existing logic that, if you do have a configuration issue, it will be launching miners and rebooting at a much faster rate. So, maybe this is more of an expert mod.

Anyway, I have been running a modified version of IAmNotAJeep_and_Maxximus007_WATCHDOG that has this additional logic and it works great for ME. In the existing 19-1.4 version of the script, starting at line 98, I added this:

Code:
   # Begin Stubo Mod
   # Look for no miner screen and get right to miner restart
   if [[ `screen -ls |grep miner |wc -l` -eq 0 ]]
   then
      COUNT=0
      echo "Found no miner, jumping to 3main restart" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
   fi  
   # End Stubo Mod

By setting COUNT=0 for the "no miner found" condition, I am effectively removing the delay in decrementing COUNTER all of the way down before the script takes action.

Thoughts?

Amazing idea.
What I have been testing is to restart just the miner instead of restarting 3main, which takes 2-3 minutes, by using a separate miner start file that I use for the WTM switcher.
You can check it in ~/z_papampi_versions/wtm-miner
It has the 3main miner startup lines (without salfter, zpool, mph), and I use that instead of restarting 3main.

instead of :
Code:
    echo "WARNING: $(date) - Utilization is too low: restart 3main" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
    # If miner runs in screen 'miner' kill the screen to be sure it's gone
    pkill -e miner
    bash '/home/m1/telegram'
    # Best to restart oneBash - settings might be adjusted already
    target=$(ps -ef | awk '$NF~"3main" {print $2}')
    kill $target #| tee -a ${LOG_FILE}
    echo "" #| tee -a ${LOG_FILE}
    RESTART=$(($RESTART + 1))
    REBOOTRESET=0
    COUNT=$GPU_COUNT

I use :

Code:
    echo "WARNING: $(date) - Utilization is too low: restart 3main" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
    # If miner runs in screen 'miner' kill the screen to be sure it's gone
    pkill -e miner
    sleep 1
    bash /home/m1/wtm-miner
    bash '/home/m1/telegram'
    RESTART=$(($RESTART + 1))
    REBOOTRESET=0
    COUNT=$GPU_COUNT

Can you please tell me where you added your edit?
I have made so many edits that line 98 doesn't look like the correct place.


Update: I think that wtm-miner included in 1.4 is missing a done or exit at the end.
member
Activity: 224
Merit: 13
November 08, 2017, 03:55:28 PM
Watchdog Improvement?

I have been reading over the watchdog script and see what I think is an opportunity for improvement. The current logic loops every 10 seconds and looks for GPU utilization below a threshold (90%). If it finds this, it begins incrementing one counter (JEEP) and decrementing another (COUNTER, initially set to 6*#GPU) for each occurrence [per GPU]. My issue is that the current logic has to wait 6*#GPU*10 seconds, or a minute per GPU, before it will attempt a miner restart.

My thinking is: why not just check for the existence of a miner process? If none is found, we can skip the rest of the countdown and get right to attempting to restart the miner. This can save many minutes of lost mining time. The downside is that it is so much faster than the existing logic that, if you do have a configuration issue, it will be launching miners and rebooting at a much faster rate. So, maybe this is more of an expert mod.

Anyway, I have been running a modified version of IAmNotAJeep_and_Maxximus007_WATCHDOG that has this additional logic and it works great for ME. In the existing 19-1.4 version of the script, starting at line 98, I added this:

Code:
   # Begin Stubo Mod
   # Look for no miner screen and get right to miner restart
   if [[ `screen -ls |grep miner |wc -l` -eq 0 ]]
   then
      COUNT=0
      echo "Found no miner, jumping to 3main restart" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
   fi  
   # End Stubo Mod

By setting COUNT=0 for the "no miner found" condition, I am effectively removing the delay in decrementing COUNTER all of the way down before the script takes action.

Thoughts?
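
For reference, a slightly stricter variant of the "is there a miner screen?" check can be sketched like this. It assumes the session is named exactly "miner"; the helper name count_miner_screens and the sample listing are made up for illustration, and on a rig the input would come from `screen -ls`.

```shell
#!/usr/bin/env bash
# Sketch: count screen sessions named "miner" from `screen -ls`-style text on stdin.
# The sample listing below is made up; on a rig you would pipe in `screen -ls`.
count_miner_screens() {
  # grep -c prints the number of matching lines; it exits non-zero when that
  # number is 0, so "|| true" keeps the function from reporting a failure.
  grep -c '\.miner[[:space:]]' || true
}

sample=$(printf 'There is a screen on:\n\t1234.miner\t(Detached)\n1 Socket in /var/run/screen/S-m1.\n')
echo "$sample" | count_miner_screens
```

Matching on ".miner" followed by whitespace avoids counting other lines that merely contain the word "miner".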
full member
Activity: 378
Merit: 104
nvOC forever
November 08, 2017, 03:49:17 PM
Guys, I just found this new (??) ethash miner. I'm not sure whether it dual mines, but I would like someone to test it and see whether it earns a place in nvOC.

https://github.com/ethash/eminer-release/releases/


It comes with some gadgets to see the hashrate and stuff; please have a look and share your opinion. I would suggest running it on a test rig (I don't have one, sorry :( ).


You can also join the discussion here on discord https://discord.gg/trw4c3c

It seems this miner is using OpenCL, not CUDA. Should be ok for AMD cards but not so much for Nvidia.

How on earth did I miss that point???

I only saw NVIDIA from this line; "Fully support AMD and NVIDIA OpenCL devices".

Thanks @leenoox
member
Activity: 117
Merit: 10
November 08, 2017, 03:25:15 PM
@damNmad:

I found a small bug in your telegram configuration. The command used to get the GPU count gives a wrong result when the rig has more than 10 cards installed (that is, once a GPU index goes above 9)...

Your command:
Code:
GPU_COUNT=$(nvidia-smi -L | tail -n 1| cut -c 5 |awk '{ SUM += $1+1} ; { print SUM }')
and result:
Code:
m1@rig-bafomet:~$ nvidia-smi -L | tail -n 1| cut -c 5 |awk '{ SUM += $1+1} ; { print SUM }'
2

There is no reason to use awk here. The easier way is to use only wc.

So, fix:
Code:
GPU_COUNT=$(nvidia-smi -L | wc -l)
and result:
Code:
m1@rig-bafomet:~$ nvidia-smi -L | wc -l
11
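
To see why the old command breaks, here is a small sketch that runs both pipelines on made-up `nvidia-smi -L`-style lines for an 11-card rig. The model name is a placeholder and the UUID part of the real output is left out; the key point is that `cut -c 5` only picks up the first digit of "10".

```shell
#!/usr/bin/env bash
# Sketch with made-up `nvidia-smi -L`-style lines for an 11-card rig (GPU 0..10).
# The model name is a placeholder and the UUID part is omitted.
listing=$(for i in $(seq 0 10); do echo "GPU $i: GeForce GTX 1070"; done)

# Old approach: character 5 of the last line is only the first digit of "10".
old_count=$(echo "$listing" | tail -n 1 | cut -c 5 | awk '{ SUM += $1+1 } ; { print SUM }')

# Fixed approach: one line per GPU, so just count the lines.
new_count=$(echo "$listing" | wc -l)

echo "old=$old_count new=$new_count"
```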

I used to use the same command; I fixed it with this one for 1.4:

Code:
nvidia-smi --query-gpu=count --format=csv,noheader,nounits | tail -1



Here is an even more optimized version: fewer cycles, no need for a pipe:

Code:
nvidia-smi -i 0 --query-gpu=count --format=csv,noheader,nounits

Everyone has GPU0 plugged in, so we can query only one GPU to get the total number of GPUs instead of querying all GPUs to return the same number and then piping it through tail ;)



Totally correct and thanks for the help 👍👍👍

I've never understood why, in this query, every GPU shows the total number of GPUs instead of the count being reported just once!