
Topic: CGWatcher 1.4.0, the GUI/monitor for CGMiner and BFGMiner to prevent downtime - page 26. (Read 180485 times)

sr. member
Activity: 434
Merit: 251
CGWatcher & CGRemote
Thanks for the feedback. A few things that have been changed in 1.1.4 that are related to each of your issues:

- Instead of watching for total share counts to stop increasing, it now only watches accepted share counts. Those are the only shares that matter so if those stop increasing, a restart is probably a good idea. Users will need to modify this setting as necessary (increase the number of minutes), because accepted share counts may not increase as often as all share counts combined were increasing. The Utility value on the Status tab (as well as the Report tab) shows accepted shares/min so this should help you determine a good estimate for how often accepted share count increases.

- Someone else has reported that the miner sometimes fails to close completely, or hangs. I had never seen this but am learning more about the quirks of the miners every day. I've corrected this: CGWatcher now watches the miner after sending the quit command and, if necessary, kills the process. When killing the process, it will attempt to kill it every second for up to 10 seconds before giving up. Hopefully this is enough to remedy this situation.

- If you can provide more information on the HW errors, I will see if there is anything I can do to fix this. Right now there is not much to reporting this, it is a number that comes straight from the miner and is shown in CGWatcher without modification.
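
The quit-then-kill sequence described above could be sketched roughly as follows. This is not CGWatcher's actual code (CGWatcher is a .NET application); it is a minimal Python illustration. The `send_quit` callable and the `grace_seconds` wait are assumptions made for the example; only the "kill once per second for up to 10 seconds" behaviour comes from the post.

```python
import subprocess
import sys
import time


def stop_miner(proc, send_quit, grace_seconds=5.0, kill_attempts=10):
    """Ask the miner to quit, then kill it once per second if it hangs.

    proc          -- subprocess.Popen handle for the miner process
    send_quit     -- hypothetical callable that sends the miner's quit command
    grace_seconds -- assumed wait for a clean exit before killing
    Returns True once the process has exited, False if every attempt failed.
    """
    send_quit()
    try:
        proc.wait(timeout=grace_seconds)
        return True
    except subprocess.TimeoutExpired:
        pass  # the miner ignored the quit command or is hung

    for _ in range(kill_attempts):
        proc.kill()
        try:
            # Give the OS up to one second to reap the process.
            proc.wait(timeout=1.0)
            return True
        except subprocess.TimeoutExpired:
            continue
    return False
```

Returning False instead of raising lets the caller decide what to do when the process cannot be killed, for example logging the failure or rebooting the machine.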

The idea you mentioned would definitely indicate a problem, but I believe when temperatures drop suddenly and significantly, the miner will soon report the GPUs as sick or dead. This usually takes 5 - 10 seconds or so. If you see this happen and the GPUs do not become sick or dead (causing a restart by CGWatcher), please let me know. Also, the accepted share count not increasing check should catch this as well, as this number should stop increasing if the miner is not working correctly.
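
As a rough guide for choosing the number of minutes for the accepted-share check, the Utility value (accepted shares/min) can be turned into a timeout. The sketch below assumes accepted shares arrive as a Poisson process, which is an assumption of this example and not something CGWatcher is known to compute; the function name and the default false-alarm probability are hypothetical.

```python
import math


def stall_timeout_minutes(utility, false_alarm_prob=0.001):
    """Minutes to wait without an accepted share before restarting.

    utility          -- accepted shares per minute (the Utility value)
    false_alarm_prob -- acceptable chance that a healthy miner goes this
                        long without a share (assumes Poisson arrivals)
    """
    if utility <= 0:
        raise ValueError("utility must be positive")
    if not 0 < false_alarm_prob < 1:
        raise ValueError("false_alarm_prob must be between 0 and 1")
    # P(no share in t minutes) = exp(-utility * t); solve for t.
    return -math.log(false_alarm_prob) / utility
```

Under that assumption, a miner with a Utility of 0.5 would need a timeout of about 14 minutes to keep false restarts to roughly one in a thousand checks.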

I've re-written most of the monitoring code and have made a lot of improvements and fixes, with a lot of testing. 1.1.4 will also introduce profiles and some other new features. It has a new option to ensure the miner stays running, meaning if CGWatcher is open with this turned on... it will start the miner if it is not running. I had to make this an option because it is impossible to determine if the miner was closed outside of CGWatcher by the user, or if the miner closed because of a problem. If the user happened to close the miner by clicking the X for example, I didn't want CGWatcher immediately re-launching it when the user didn't want it to. Trying to anticipate what the user is going to do and why is the challenging part of this project, because there is a very wide range of experience levels and I think that software should adapt to how the user works instead of the user adapting to how the software works... as much as possible anyway.

I am hoping to have 1.1.4 ready by this weekend.
newbie
Activity: 31
Merit: 0
Quick feedback after a few days testing on Win7 64bit:

Works great :)

Small issues:

- When restarting the system, the message "do you really want to close CGwatcher..." prevents successful restart. You have to click it away, else it won't restart.

- Once a miner was out of action for a few hours. I have to look at the logs to see what exactly happened, but basically CGWatcher was trying to restart CGMiner again and again because the total shares didn't increase. It kept restarting, and the restarted CGMiner would just hang. The reason was that a CGMiner window was still open in the background, but stuck/hung.

- HW errors seemingly were not reported in CGWatcher, even though CGMiner showed a lot (I can try to duplicate this if it helps)

And one idea:

- Restart the card and/or cgminer if temperatures suddenly drop significantly. I am not sure if this is maybe already covered by the "total shares not increasing" check, but I had a situation right now where everything looked fine, just the temperatures on all cards in a mining rig dropped - as if the cards suddenly were not really working much any more.
newbie
Activity: 51
Merit: 0
Just sent you 0.2 BTC for:
"- CGRemote (in development)..." :)
legendary
Activity: 2702
Merit: 1468
af_newbie: I'd say congrats to you, but your comment seems to already do that. I hadn't heard of it. If I had, I might not have gone through the trouble of creating one. The now ~100 downloads/day I am getting indicates that at least a small group of miners haven't heard of it either, or that there is enough room for both. I don't think the fact that you created something similar means it should be the only one, and with over a year of development (vs. a month for me) I'm sure it does things mine doesn't - including some things on my to-do list. I'm in the process of open-sourcing as well so others can contribute or customize it to their needs. When I no longer mine or have users, then I'll stop development.

As to "been there, done that", I never claimed I had invented the wheel or created something never done before. It was something I wrote for myself and decided to share with others.

Sorry, I did not want to discourage you from writing your own thing.  Just review the features of akbash.  You'll find that some hung scenarios CANNOT be detected by just looking at cgminer process or its APIs.  You have to poke the cards directly through ADL, look at the pools, network card, memory, handles among other things to determine whether to restart the process or reboot the machine.  Be careful though, too many false positives and you'll be restarting miner process where you shouldn't be.  When you get WER popups, you have to dig down and find the process IDs that belong to cgminer process that just died and close them all.

cgminer APIs are all good when everything is running smoothly, but when things go haywire they cannot be trusted.

Sometimes, you'll find that cgminer APIs will be reporting everything is fine, but some GPUs will be unresponsive or APIs will not be responding but cgminer will be mining ok.  Make sure you cover those cases.  Otherwise, your watchdog will be just displaying API stats and checking the miner status. There are better apps for that: ANUBIS is a good example.  No need to re-invent the wheel.

If you want a real watchdog, at least implement the features I put into akbash.
newbie
Activity: 13
Merit: 0
Been there, done that.

https://bitcointalksearch.org/topic/akbash-1012-open-source-cgminer-watchdog-remote-monitoring-emails-http-76208

My watchdog monitors ADL, GPU/FPGA rates, slowdowns, pool performance, crashes in AMD drivers, handles WER errors, sends email notifications, has HTTP interface to control the miner and OS, smart metering etc.
No limit on number of devices, written in C, blah, blah...

I like your logo though.

I've looked at Akbash and it looks like an in-depth application that provides a lot of functionality. It sounds like you've put a lot of technical work into ensuring miners are only restarted when absolutely necessary. I'm not above editing a config file or working in the command line, but there is a lot to be said for a nice GUI and an app that just works. I dropped CGWatcher in with my cgminer folder, it found my cgminer config file and just worked. It's been reliable, restarted my GPU and cgminer when needed, and has worked really well with a nice informative interface that makes editing miner settings a snap. It also starts cgminer hidden and minimizes to the tray, which keeps it out of the way and helps keep anyone from accidentally shutting down the miner. Just my two bitcents on why I prefer CGWatcher over Akbash.

PB
newbie
Activity: 31
Merit: 0
af_newbie, thanks a lot for pointing out the other tools. I would have been surprised if anything like this hadn't been tried a few times before. Knowing other tools seems good both for reference and for people looking for other functions/priorities. Both tools you mentioned look very useful, and Anubis even seems open source by virtue of being a PHP script (unless it's obfuscated or something like that, didn't check yet).

I guess CGWatcher still would have its place in this world, though, even if it was 100% redundant in functionality to those two tools: With a GUI (no frightening :D console app) and no messing too much with config files, PHP etc., it seems especially Windows newbie friendly. Which is not to say that to many, including me, a non-GUI tool is absolutely fine, if not preferable.

Thanks also for pointing out some of the general but very critical issues you encountered while writing your tool. Since milone admitted being very new to this topic, hints like that might save him lots of grey hair hehe. At least it's good to know about possible limitations and previously found issues as well as potential work ahead... in case milone wants to go the whole nine yards.
legendary
Activity: 2702
Merit: 1468
Been there, done that.

https://bitcointalksearch.org/topic/akbash-1012-open-source-cgminer-watchdog-remote-monitoring-emails-http-76208

My watchdog monitors ADL, GPU/FPGA rates, slowdowns, pool performance, crashes in AMD drivers, handles WER errors, sends email notifications, has HTTP interface to control the miner and OS, smart metering etc.
No limit on number of devices, written in C, blah, blah...

I like your logo though.
sr. member
Activity: 434
Merit: 251
CGWatcher & CGRemote
grottenolm: Thanks, it is greatly appreciated. Even small donations can provide a boost to motivation. This is my first experience with writing free software - and soon to be my first open-source project - and the number of requests vs. donations can be discouraging. My goal wasn't to make money from it, but it is impossible to keep it a priority and add the features that everyone is asking for when I still have to take projects to make a living. This experience has definitely caused me to give even more to developers of free software I use.

The information on the Status tab is what I check most often, particularly hashrate, temperatures, and accepted/rejected/etc. share totals. I want to display that info as least intrusively as possible so may end up creating a small window or overlay you can switch to that sits in the corner or stays on top of other windows. I'll also work on increasing the chart height in the Status tab to display more GPU temps without making the text too small.

I'm making good progress on CGRemote, and it will allow remote monitoring of miners with or without CGWatcher. Having CGWatcher on the mining computer is preferable and will still focus on keeping the miner working properly, but won't be necessary. If the mining computer doesn't have CGWatcher running, CGRemote will communicate with the miner directly. However, in these cases, if CGRemote loses API access or the miner unexpectedly closes after you send it a restart command, you would have to relaunch the miner yourself because CGWatcher won't be there to do it.


af_newbie: I'd say congrats to you, but your comment seems to already do that. I hadn't heard of it. If I had, I might not have gone through the trouble of creating one. The now ~100 downloads/day I am getting indicates that at least a small group of miners haven't heard of it either, or that there is enough room for both. I don't think the fact that you created something similar means it should be the only one, and with over a year of development (vs. a month for me) I'm sure it does things mine doesn't - including some things on my to-do list. I'm in the process of open-sourcing as well so others can contribute or customize it to their needs. When I no longer mine or have users, then I'll stop development.

As to "been there, done that", I never claimed I had invented the wheel or created something never done before. It was something I wrote for myself and decided to share with others.
newbie
Activity: 31
Merit: 0
May I suggest including all your donation addresses also here in your signature, not only on the website?

Quote
BTC: 19msnBddmcaHnbTTQgFgzPDuy6PqfBgFJh
LTC: LM6Un6hZvPzLBggJWiAVG6E6w2GfaHukXY
NMC: NJjD4rP5xy2mgSK8gXXsZwFkdknbvtvy3q

Anyway, sent you BTC0.1 for this great tool. Thanks a lot, this is indeed helping a lot to earn more money by mining and spending less time worrying - increasing quality of life :D

And it will become even more awesome once the remote part is done. Fortunately donating with BTC is so easy. I hope many people honor your great work, for which probably many people would charge.

PS: one very minor issue I found is that when using 6 GPUs not all temperatures are shown. Sometimes the first 5 are shown, with "..." below them, sometimes only the "...". Just for info, not bothering me at all :)
newbie
Activity: 31
Merit: 0
Yes, discarded and stale are two different things.

From the CGMiner 2.9 ReadMe...

SS is stale shares discarded (detected and not submitted so don't count as rejects)

DW is discarded work items (work from block no longer valid to work on)

[...] stratum may result in more discarded work but reduces rejected or stale shares so ultimately it is better.

Aaaah, thanks a lot! So my assumption would be that "discarded work items" have not actually been worked on, so they are nothing to worry about much - it's not lost computing time.

I hope someone can chime in and confirm whether or not that is correct.
full member
Activity: 134
Merit: 100
I would just like to say thank you, and well done on a well designed piece of software.

I was initially looking for a way to reboot a SICK card and stumbled across this, and it has that and all the functionality I could ever want.
I am really looking forward to the remote app, as I just have all my rigs open in different teamviewer sessions on my main PC.

Jama
sr. member
Activity: 434
Merit: 251
CGWatcher & CGRemote
Yes, discarded and stale are two different things.

From the CGMiner 2.9 ReadMe...

SS is stale shares discarded (detected and not submitted so don't count as rejects)

DW is discarded work items (work from block no longer valid to work on)

DW is no longer displayed in newer versions of CGMiner and therefore is not mentioned in the latest ReadMe.

I've seen an increased discarded rate on my miners and believe it is either a result of using the stratum protocol or using a pool that does merged mining. I haven't been able to find much information on it, but I want to say that I read somewhere stratum may result in more discarded work but reduces rejected or stale shares so ultimately it is better. Hopefully someone more knowledgeable can correct me if I'm wrong or provide more information. My discarded rate is usually around 20% with BitMinter, but rejected and stale are always 0 or at worst < 1%.

I will put the codes used in CGMiner into the info boxes (tooltips) that appear when you put your mouse over a value in CGWatcher, though as I mentioned some may no longer be displayed in new versions of CGMiner.

Admittedly I have quite a bit to learn about how mining software works so there may be a better monitoring option than checking to see if accepted, rejected, stale, or discarded don't increase over x minutes (all four have to not increase to trigger the restart.) My thinking was a problem with the pool would cause these numbers to not change, and I've remoted into my own miners and have seen them stuck because the pool went down/had issues and CGMiner did not switch to the backup pool correctly. So as I find problems with my own miners, I try to add options to catch them. As I learn of better ways to check for problems, these options may change in future versions.
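
The share-count check described above (restart only when none of the four counters change for x minutes) might look something like this. It is an illustrative Python sketch, not CGWatcher's implementation; the class name, method name and injectable `clock` are hypothetical and exist only to make the example self-contained and testable.

```python
import time


class ShareStallWatch:
    """Flags a restart when no share counter changes for a given time."""

    def __init__(self, timeout_minutes, clock=time.monotonic):
        self.timeout = timeout_minutes * 60.0
        self.clock = clock
        self.last_counts = None
        self.last_change = clock()

    def update(self, accepted, rejected, stale, discarded):
        """Record the latest counters; return True if a restart is due."""
        counts = (accepted, rejected, stale, discarded)
        now = self.clock()
        # Any change in any counter (including a reset after a miner
        # restart) counts as activity and resets the timer.
        if counts != self.last_counts:
            self.last_change = now
        self.last_counts = counts
        return now - self.last_change >= self.timeout
```

Watching only accepted shares, as later versions do, would amount to passing a constant for the other three counters.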
newbie
Activity: 31
Merit: 0
First of all thanks a lot for creating this tool. Especially for LTC mining, where I often get sick/dead GPUs, this might turn out handy :) If I end up using this tool productively, for sure some donation will come your way...


■ Restart the miner if shares (accepted, rejected, discarded, and stale) do not increase for X number of minutes.

Maybe a newbie question, but what do those "discarded" shares that are also shown on the status page refer to?

I was not aware that CGMINER shows any discarded shares. Except for the ill-named "SS" of course, which according to the documentation refers to "stale shares detected and discarded before submitting". But in my case the "SS" value in CGMINER shows 0, while the discarded value in CGWatcher shows something like 15-25%... so it can't be the same I guess.

Any hint how this value relates to the info available in the CGMINER text interface?

newbie
Activity: 13
Merit: 0
I just downloaded this and it's really sharp. I've been looking for something like this. I'm currently using Anubis but it can be kind of clunky for changing settings for the miner. Good work.
sr. member
Activity: 434
Merit: 251
CGWatcher & CGRemote
You'd have to be running CGMiner with no pools or with no GPUs to get this error. Up until now I assumed at least one of each for anyone who used the program. It does not yet support FPGA or ASIC. I have fixed this error so it no longer expects at least one GPU and one pool, although you may not see hashrates or other data in that case. You can download the update using the same download link; I did not change version numbers.

The update also includes GPU temps on the Status tab and displays them in the same graph as the hashrate, since that is something I had already added. Anyone who wants this feature now can re-download using the same link as well.
legendary
Activity: 1134
Merit: 1005
I am getting errors
Code:
[4/24/2013 11:58:40 PM]   Begin Process--------------------------------------------------------
[4/24/2013 11:58:41 PM]   CGMiner instance found.
[4/24/2013 11:58:41 PM]   Exception occurred during Refresh: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
sr. member
Activity: 434
Merit: 251
CGWatcher & CGRemote


CGWatcher - a GUI/monitor for CGMiner and BFGMiner

Latest version: 1.4.0
Latest version release date: June 29, 2014


New in this version:
  • Improved support for SGMiner 4.2.x.
  • New setting to prevent CGWatcher from modifying config file to enable API in case this causes problems with new miners.
  • SGMiner pool property settings better handled, able to use or not use "pool-" prefix depending on what names are used in the config file. Also a setting has been added to specify whether pool settings should begin with "pool-" by default.
  • "Disable temporary config file creation" setting not being saved correctly fixed.
  • Other minor fixes and improvements.

Description
CGWatcher is a GUI for bitcoin miners CGMiner and BFGMiner. Along with giving a graphical interface to the miner, it has several options to monitor the miner and correct problems when they are detected. It helps to minimize downtime while providing something a little easier to look at.

It works via the miner's API, which was created for this purpose - to allow other software to communicate with the miner. While there are several web applications to allow remote monitoring of these miners, that is not the purpose of CGWatcher. It is designed to run on the same computer as the miner, and will watch for the conditions you set to determine if the miner is working properly. If it is not, CGWatcher takes the appropriate actions to correct the problem (usually restarting the miner.) The idea is to create a program that does the monitoring for you, so you don't have to use those web applications to constantly check on your miners.
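
For readers curious what talking to the miner's API involves, here is a minimal Python sketch of a query. It assumes the miner was started with its API enabled and listening on the commonly used default port 4028, and that it accepts JSON requests of the form {"command": "summary"} with a NUL-terminated JSON reply. The helper names are hypothetical, and this is not how CGWatcher itself is written.

```python
import json
import socket


def build_request(command, parameter=None):
    """Encode an API request as JSON bytes."""
    request = {"command": command}
    if parameter is not None:
        request["parameter"] = parameter
    return json.dumps(request).encode("ascii")


def parse_response(raw):
    """Decode a reply, stripping the trailing NUL byte if present."""
    return json.loads(raw.rstrip(b"\x00").decode("utf-8", errors="replace"))


def query_miner(command, host="127.0.0.1", port=4028, timeout=5.0):
    """Send one command and read until the miner closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(command))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return parse_response(b"".join(chunks))
```

With a miner running locally, `query_miner("summary")` should return a dictionary of summary statistics; the exact field names depend on the miner and its version.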

CGWatcher is a small and portable .NET application. It will run as a 32-bit application in 32-bit Windows, or a 64-bit application in 64-bit Windows so it can work with 64-bit miners. It can be run inside sandbox environments like Sandboxie if you don't trust it (although settings may not be saved after closing.) Included in the archive is the ReadMe text file, and libraries (links to library information are on the CGWatcher download page.) The program creates its own config file (CGWatcher.exe.ini), log (cgwatcher.log) and a couple data files once started to store profile and config file data. It also creates miner.log that records some mining-specific information like pool changes, hardware errors, GPU or pool status changes, etc. You can start CGWatcher while the miner is running, or use it to launch the miner (preferred).


                   

               

Screenshots of each tab in CGWatcher as of version 1.2.0

Profiles
CGWatcher allows you to create mining profiles using different miners, config files, and/or arguments. When you first use it, it will create a default profile and try to locate a miner if one is running or one is located in the same directory or subdirectories of CGWatcher. If it cannot find a miner, you will have to manually specify where it is located and (optionally) a config file and/or arguments you want to use. You can do this by clicking the 'Manage Profiles...' button in the Settings tab. You can create as many profiles as you'd like for the different crypto-currencies you mine. You can also rename the default profile if you'd like, it just names the first one Default because I had to name it something. When you switch to a new profile ("activate" a profile), CGWatcher will use that profile's settings any time it starts or restarts the miner. However, if you switch profiles while a miner is running, you will obviously need to restart the miner in order for the new profile to be used. You can see which profile a currently running miner is using on the Status tab. Ideally it would always be the same as the active profile you've set... but if you changed profiles while mining and chose not to restart the miner when prompted, keep in mind that the miner will still be running on the previous profile until it is restarted (or stopped and started).


Monitor
The main purpose of CGWatcher is to keep the miner running properly. To do this, the monitor must be enabled (default). You can enable it by checking the first option on the Monitor tab, and set the interval (seconds) for how often it checks the miner's status and refreshes information. Monitor options include:
  • Restart the miner if the total hashrate falls below X for a specified number of seconds.
  • Restart the miner after X hours of continuous mining to cover any problems that other checks may have missed. That ensures that should all other checks fail to detect a temporary problem, at worst the downtime should be limited to the number of hours you set here.
  • Restart the miner if accepted shares or total shares do not increase for X number of minutes.
  • Restart the miner or computer when a sick or dead GPU is detected, since sometimes the miner is unable to restart the GPU itself.
  • Restart the miner if it had full API access but now only has read-only (in the same miner process), as I've learned this usually indicates a problem. It will also restart if it had any API access to the miner process but now it has none.
  • Ensure the miner stays running unless you pause or stop it inside CGWatcher. If this option is enabled and the miner is closed for any reason outside of CGWatcher (including you closing the miner window), it will be restarted.
  • Scheduled mining - Scheduled actions give you complete control over what your miner does and when. Actions include start mining, stop mining, restart mining, restart computer, change intensity, switch profile, etc. Along with creating actions to run at specified times, you can create actions that run at set intervals.
    You can create profiles for each coin you mine, then set CGWatcher to switch to whatever is most profitable at the times or intervals you specify.
    You can also set CGWatcher to increase GPU intensities when the computer is idle or at certain times or intervals, and have it return them to their original values once you start using the computer again. You set the intensity, you set how long the computer must be idle before intensities are changed.
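
To illustrate the first monitor option, the "hashrate below X for a specified number of seconds" check could be sketched like this in Python. The class and parameter names are hypothetical; the point is that a brief dip should not trigger a restart, only a sustained one.

```python
import time


class LowHashrateWatch:
    """Flags a restart when hashrate stays below a floor for too long."""

    def __init__(self, min_hashrate, grace_seconds, clock=time.monotonic):
        self.min_hashrate = min_hashrate
        self.grace_seconds = grace_seconds
        self.clock = clock
        self.below_since = None  # time the current dip started, if any

    def update(self, hashrate):
        """Feed the latest total hashrate; return True if a restart is due."""
        now = self.clock()
        if hashrate >= self.min_hashrate:
            self.below_since = None  # recovered, so forget the dip
            return False
        if self.below_since is None:
            self.below_since = now
        return now - self.below_since >= self.grace_seconds
```

The monitor would call `update()` once per refresh interval with the hashrate it just read from the miner.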


Statistics
These miners provide a lot of information. CGWatcher attempts to present it in an easier-to-read interface, using tabs to separate information. Ultimately I'd like it to record some data so you can see statistics over a given time period.

Control
You can easily change miner settings while it is running. Change GPU core, memory, voltage, or intensity. Re-prioritize and enable/disable pools. A large Pause/Resume button allows you to easily stop and resume mining (using "exit" command so GPUs and fans are returned to normal values.) Changing miner settings while it is running is temporary, as the changes will be lost when the miner closes. If you want to make the changes permanent, you can change the profile's settings or use the Config File Editor (or Notepad) to edit the profile's config file.

Overheat Protection
CGMiner provides overheat protection for AMD cards. Using the temp-target, temp-overheat, and temp-cutoff settings, it can adjust fan and clock speeds to maintain a target temperature and disable devices that get too hot (if auto-gpu is enabled.) CGWatcher now also provides similar protection for cards not protected by the miner (including Nvidia cards) by adjusting intensity to maintain the target temperature and disabling GPUs that get too hot. It will enable and/or slowly raise intensity back to their original values once temperatures cool down back into the target range. I'm not sure if anyone mines with Intel HD integrated graphics since modern CPUs have better OpenCL support. Currently CGWatcher does not support overheat protection for Intel devices, but I will be doing some tests to see whether the CPU temperatures it is now capable of getting are enough to provide similar support for these devices. You can see if the miner or CGWatcher is providing overheat protection for a GPU in the GPU tab next to the temperature.

For GPUs that the miner is providing overheat protection for (AMD), CGWatcher takes a hands-off approach except for when the miner disables them for exceeding temp-cutoff. Although the miner tries to re-enable them once they return to target temperatures, this usually isn't successful so CGWatcher will restart the GPU once it has returned to temp-target temperature.

You can disable CGWatcher's overheat protection in the Monitor tab if you don't want it to perform any of these actions.
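
The intensity-based protection described in this section can be pictured as a small decision function that runs on each monitor refresh. The Python sketch below is only an illustration: the one-step adjustments, the `band` below the target, and the `min_intensity` floor are all assumptions made for the example, not CGWatcher's actual values or logic.

```python
def next_intensity(temp, current, original, target, cutoff,
                   band=3.0, min_intensity=8):
    """Return (intensity, enabled) for one GPU on one monitor refresh.

    temp     -- current GPU temperature
    current  -- intensity the GPU is running at now
    original -- intensity configured before any throttling
    target   -- temperature to maintain (like temp-target)
    cutoff   -- temperature at which the GPU is disabled (like temp-cutoff)
    band     -- assumed margin below target before raising intensity again
    """
    if temp >= cutoff:
        return current, False                       # too hot: disable the GPU
    if temp > target:
        return max(min_intensity, current - 1), True  # throttle one step
    if temp <= target - band and current < original:
        return current + 1, True                    # cooled down: step back up
    return current, True                            # inside the target range
```

Stepping by one intensity level at a time matches the "slowly raise" behaviour described above and avoids bouncing between extremes.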

Config File Editor
The Config File Editor attempts to make editing your miner's configuration easier. To start, it displays the config file in a grid allowing you to see all available settings and a description of each. Settings that can only be enabled or disabled will have a true or false option. Settings that allow numbers only (not including lists of numbers) will only allow numbers. The 'Validate' button attempts to check your settings for errors that may prevent the miner from starting or working correctly. Some things to know about the Config File Editor:


  • Settings that are set to default values are not written to the config file upon saving. They are also not converted to arguments, because they are set to default values and don't need to be explicitly set.
  • To add, edit, or remove pools, locate Pools in the config file grid. There may be a Pools category heading as well in Category view mode, but you want the Pools entry that says '(Collection)' in the cell next to it. Click on the word '(Collection)' and a small [...] button will appear in the cell. Click on this [...] button to open the pool window. If you've ever used a property grid in Microsoft or similar software, you will recognize this type of grid and the accompanying collection editor.
  • When editing pools, you can create names for them as well so they are more easily identifiable when editing them later on. Pool names are saved inside the config file, but will not cause a problem with the miner. To change pool priorities, use the up and down arrows in the pools window to move pools up and down the list. The top of the list is the first priority, the bottom of the list is last priority.
  • 'Name #' textbox - You can name your config files so when you're using them in profiles they will be easier to access. Enter a name for the config file in the Name textbox. Then when managing your profiles, you can select a Named config file from the Config File textbox drop-down instead of needing to browse your computer for it. After clicking out of the Config File textbox, it will be converted to the config file path automatically.
  • 'Validate before saving' : By default, the Config File Editor will check most settings to make sure the values are valid and in the correct format. If you experience problems with validation failing due to your operating system's culture settings and are sure the values are correct, you can un-check this box to bypass validation.
  • 'Ensure API is enabled upon saving' : If enabled, the API access needed by CGWatcher will always be enabled when saving the config file, regardless if these settings were enabled in the grid. It will not affect other groups/IP address in the api-allow setting, it only makes sure api-listen is enabled and that 127.0.0.1 is included in the W: group of api-allow.
  • The Config File Editor Menu
          ■ File -> New - Create a new config file.
          ■ File -> Open - Open an existing config file.
          ■ File -> Save (As) - Save the current config file.
          ■ File -> Close - Close the Config File Editor.
          ■ Tools -> Import Settings -> From Config File... - select an existing config file to import settings from. The current settings will be overwritten, but will not be permanent until you save the config file.
          ■ Tools -> Import Settings -> From Named Config File ->