Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 623. (Read 5805874 times)

sr. member
Activity: 349
Merit: 250
Has anyone written a script that checks if any GPUs are DEAD/OFF via the API?  I'd like to create something that checks periodically and restarts cgminer if that happens (since a restart usually fixes the issue).  If someone already has a simple script to talk to cgminer, I'd love to use it as a starting point.
https://bitcointalksearch.org/topic/bash-scripts-for-rpc-interface-to-cgminer-suggestions-welcome-66779

An example would be line 60; however, all it does is pop up a non-modal window.  You can modify it to email you with

echo "$node gpu$i reports $STATUS" | mail -s "Device Failure" [email protected]

or whatever else you want.
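If you want to go a step further and automate the restart, here is a minimal watchdog sketch along those lines - assuming the API is enabled with --api-listen on the default 127.0.0.1:4028, and that a hypothetical start-cgminer.sh relaunches cgminer with your usual options:

Code:
#!/bin/bash
# Rough sketch: poll the cgminer API once a minute and restart cgminer if any
# GPU reports Dead or Sick.  Host, port and the relaunch script are assumptions.
HOST=127.0.0.1
PORT=4028

while sleep 60; do
    DEVS=$(echo -n "devs" | nc "$HOST" "$PORT")
    if echo "$DEVS" | grep -Eq 'Status=(Dead|Sick)'; then
        echo "$(date): found a Dead/Sick GPU, restarting cgminer"
        killall cgminer
        sleep 5
        ./start-cgminer.sh &    # hypothetical script holding your cgminer command line
    fi
done

(If I recall the API output correctly, a GPU shown as OFF on screen comes back as Enabled=N rather than a Status value, so grep for that as well if you want to catch it.)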

hero member
Activity: 737
Merit: 500
Has anyone written a script that checks if any GPUs are DEAD/OFF via the API?  I'd like to create something that checks periodically and restarts cgminer if that happens (since a restart usually fixes the issue).  If someone already has a simple script to talk to cgminer, I'd love to use it as a starting point.
donator
Activity: 1218
Merit: 1079
Gerald Davis
The way I identified the display GPU was to start the miner with 1120/1000 for gpu0, 1121/1000 for gpu1, 1122/1000 for gpu2 and then
opened AMD Control Center to see which display adapter it was using in the "Performance Center".  The core clock of 1120 set by cgminer was the one displayed in the Performance Center, so I figured that must be the display GPU.  Only one is shown; the others are displayed as "disabled" on the Information/Hardware screen.  (Control Panel shows all of them enabled, of course.)

That is weird.  It makes me think something larger is going on with your system.  AMD drivers can be weird sometimes, but I don't think any GPU should show "disabled" while under load.
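For reference, the clock-offset trick quoted above can be set straight from the cgminer command line; a quick sketch, assuming three GPUs and a placeholder pool (--gpu-engine and --gpu-memclock both take comma-separated per-GPU values):

Code:
# Give each GPU a slightly different engine clock, so whichever value shows up
# in Catalyst's Performance Center identifies the display GPU.
cgminer -o http://pool.example.com:8332 -u worker -p pass \
        --gpu-engine 1120,1121,1122 --gpu-memclock 1000,1000,1000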
legendary
Activity: 1540
Merit: 1002
... and on a different subject ... ZTEX

I've wasted an hour or so on it so far and worked out what appear to be the commands
(and changed all the cgminer code that needs changing, and created a template ztex.c)

So if anyone has one, it would be good if they could drop by #cgminer on FreeNode; I'll ask you to run a few commands so I can work out what the commands really are.

I asked the ZTEX people and got 2 useless replies in one email:
1) Read the Java code, the mining commands for the bitstreams aren't documented
(yeah, I had already done that - that's why I asked, to confirm what I had worked out)
2) "You cannot do this within a day."
lol - well, I've only spent a couple of hours on it so far ...
(and then decided after that comment ... stuff it, I'll do it tomorrow ... or whenever ...)

You may want to read my code instead, as Python is much more readable than Java and you have everything I needed for mpbm in one single source.

https://github.com/nelisky/Modular-Python-Bitcoin-Miner/blob/ztex/worker/fpgamining/ztexdev.py

I will try to find you on irc once I get some work things out of the way.
hero member
Activity: 807
Merit: 500
Actually he said:
Quote
When I enable dynamic intensity, other gpus get most of the action, and the harder they go, the slower the dynamic (monitor gpu) gets.
Which sounds like it could be that the wrong GPU is set to 'd'
i.e. as you increase the intensity on the display GPU, it should of course get less responsive
I took that to mean that the MH/s gets slower on the GPU with 'd', but you could be right.  Regardless, the advice from DAT after your last post (and just before this post) is exactly what Windows miners need.  (And there's still no thumbs-up emoticon.)
donator
Activity: 1218
Merit: 1079
Gerald Davis
The #0 GPU is rarely the one driving the display.  It is down to Windows' crazy device-numbering scheme.  On my only Windows rig (3x watercooled 5970s) the "display GPU" was #4.  When one 5970 died and I RMAed it (4 GPUs installed), the "display GPU" became #1.  When I got the RMA back (6 GPUs), the "display GPU" after a reinstall became #2.  So it varied, and it was never #0 in this limited sample set.

So Windows users should identify the display GPU.  If you have multiple physical displays, then as long as they are all connected to a single card, one GPU will handle them all, even on dual-GPU cards like 5970s.  If you have multiple displays connected to multiple cards, well, change that: it is sub-optimal.

To identify the display GPU:
1) Stop mining.
2) Open the GPU-Z sensors tab (be sure "record data when not shown" is checked), AB, or CCC.
3) All GPUs should be at ~0% GPU load.
4) Do something which requires GPU acceleration: Flash-accelerated video (although you should turn that back off when mining), the Afterburner test tool, or a 3D game (windowed mode would be best).
5) Watch which GPU's load increases.  Record this GPU number.

Now you think you might be done, but you aren't (because Windows and the ADL library used by cgminer sometimes order cards differently).
Leave whatever tool you are using open.
6) Start cgminer (you can use low intensity and stock clocks for stability).
7) You should see all GPUs go to 99% load.
8) Disable each GPU one at a time until you find the one which causes the load on the GPU identified in step 5 (in GPU-Z / AB) to drop to zero (a scripted way to do this over the API is sketched after this post).
Wait a few seconds between each GPU disable, as it sometimes takes a couple of seconds for the tools to show a drop in GPU load.

That is the GPU in cgminer that is driving the display.  Yes, Windows is asinine.  You likely want to write it down.

On edit: clarified a few steps.
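To save some keyboard juggling in steps 6-8, the disable/enable cycling can also be driven over the cgminer API while you watch GPU-Z; a rough sketch, assuming a three-GPU rig and that the API is listening on the default 127.0.0.1:4028 with privileged access allowed (see --api-listen / --api-allow in the README):

Code:
# Disable each GPU in turn via the API, pause so GPU-Z can catch up, then re-enable it.
# The GPU whose disable makes the load on the display GPU drop to zero is the one you want.
for N in 0 1 2; do
    echo "GPU $N disabled - check which load dropped in GPU-Z"
    echo -n "gpudisable|$N" | nc 127.0.0.1 4028 > /dev/null
    sleep 10
    echo -n "gpuenable|$N"  | nc 127.0.0.1 4028 > /dev/null
    sleep 5
done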
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
...
EDIT: Even assuming there's no good way to determine which is which short of such testing, wouldn't display performance have been noticeably worse in the previously attempted
Code:
-I d,9,9,9
if it wasn't making the display GPU dynamic?
Actually he said:
Quote
When I enable dynamic intensity, other gpus get most of the action, and the harder they go, the slower the dynamic (monitor gpu) gets.
Which sounds like it could be that the wrong GPU is set to 'd'
i.e. as you increase the intensity on the display GPU, it should of course get less responsive
hero member
Activity: 807
Merit: 500
Actually - that's the smartest comment yet on the subject ...
Edit: though I should point out ... :) The 7970 is a single GPU :)
Being smart without actually being smart is awesome...  I forgot about the GPU ordering issue, so basically each of these needs to be tried if there's not an easier way to work out which is the display (since board order and driver order don't match and can even differ between OSes):
Code:
-I 9,7,9,9
-I 9,9,7,9
-I 9,9,9,7
Good thing it isn't dual-GPU, that would make it a real PITA.

EDIT: Even assuming there's no good way to determine which is which short of such testing, wouldn't display performance have been noticeably worse in the previously attempted
Code:
-I d,9,9,9
if it wasn't making the display GPU dynamic?
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
... and on a different subject ... ZTEX

I wasted an hour or so on it so far and worked out what appears to be the commands
(and changed all the cgminer code that needs changing and created a template ztex.c)

So if anyone has one it would be good if they could drop by #cgminer on FreeNode and I'll ask you to run a few commands so I can work out what the commands really are.

I asked the ZTEX people and got 2 useless replies in one email:
1) Read the java code, the mining commands for the bitstreams aren't documented
(yeah I had already done that that's why I asked to confirm what I had worked out)
2) "You cannot do this within a day."
lol - well I've only spend a couple of hours on it so far ...
(and then decided after than comment ... stuff it I'll do it tomorrow ... or whenever ...)
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
Isn't the 7970 a dual-gpu card?  Maybe having the display on the first card means both of those GPUs affect Windows because of this?  IOW, instead of
Code:
-I 7,9,9,9,9,9,9,9
maybe
Code:
-I 7,7,9,9,9,9,9,9
or with 1/3rd the 9's if you only have 2 cards.
Actually - that's the smartest comment yet on the subject ...
Because it could be that GPU0 isn't the display ...
i.e. you'd have to work out which GPUs are driving the display and set those lower.
Then I'm almost certain the rest could be higher ...

Edit: though I should point out ... :) The 7970 is a single GPU :)
hero member
Activity: 807
Merit: 500
Isn't the 7970 a dual-gpu card?  Maybe having the display on the first card means both of those GPUs affect Windows because of this?  IOW, instead of
Code:
-I 7,9,9,9,9,9,9,9
maybe
Code:
-I 7,7,9,9,9,9,9,9
or with 1/3rd the 9's if you only have 2 cards.
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
Well, I have actually discussed recently what intensity actually does.

It simply divides up the work: a higher intensity means a longer block of work for the GPU to do non-stop.

... now where was that post ...
https://bitcointalksearch.org/topic/m.771425

So each step up in intensity makes the GPU process for twice as long between very short 'rests'
(though they are only cgminer GPU rests while the CPU readies the next GPU request, not OS rests)
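Purely to illustrate that doubling (the actual work size per kernel call is a cgminer internal, so treat the numbers as relative, not as hash counts), a little sketch:

Code:
# Each +1 in intensity roughly doubles how long the GPU crunches before it
# checks back in with cgminer; values shown are relative to intensity 0.
for I in 7 8 9 10 11; do
    echo "intensity $I -> $(( 1 << I ))x work per GPU pass"
done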
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
...
Try it before you laugh.
The point he has already stated is that if you are using more than one GPU in your computer, you are wasting the others.

The stability issue is the display GPU.
Set the other GPUs to a higher intensity.

You are reducing the performance of the other GPUs by using the same settings on all of them that you have determined are needed to keep the display GPU stable.

Kano,

You are not hearing me.  Here are some visuals:

I: d,9,9 E: 1130 M: 1000
http://www.fileswap.com/dl/YWZrbwsLyU/d-9-9-1130-1000.png.html

I: d,11,11 E: 1130 M: 1000
http://www.fileswap.com/dl/d6Akac5gdI/d-11-11-1130-1000.png.html

I: 7 E: 1130 M: 1000
http://www.fileswap.com/dl/URJ9zQ5EDs/7-7-7-1130-1000.png.html

I'm just reporting what your software does on Windows with three 7970s, not what it should do.

The best RATE and stability is achieved with intensity 7 for ALL GPUs.  Maybe it is a bug, but that is what is happening.

Try it for yourself.

BTW, your "gpu" command is working great.  I get the status of each GPU by parsing the reply JSON string.
I wrote a little watchdog to monitor and restart cgminer when it crashes or the status goes to Sick or Dead.

Heh :) I only wrote the API (and a few patches here and there)

But yeah, I didn't see any comments about using multiple intensities - that's why I said that.

Have you tried anything like 7,X,X for any of X = 8, 9, 10 or 11?
(since d has other effects also)

It's very strange that the other cards, doing nothing in the system except mining, would require a setting of 7.

Is there anything unusual about your setup?
(obviously not this, but what I mean is something like having one display on each card, or some other unusual setup)

Edit: I'll also add that a test of a few hundred shares is pretty small ... and could be unreliable.
Give it an hour at least to be sure of the value.
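For anyone wanting to try that suggestion, -I accepts a comma-separated per-GPU list, so the display GPU can sit at 7 (or d) while the others run higher; a sketch with a placeholder pool and three 7970s, assuming GPU 0 drives the display:

Code:
# Display GPU (assumed here to be GPU 0) capped at intensity 7, the other two at 9.
cgminer -o http://pool.example.com:8332 -u worker -p pass -I 7,9,9
# Or use dynamic on the display GPU, which also backs off when the desktop needs it:
cgminer -o http://pool.example.com:8332 -u worker -p pass -I d,9,9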
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
ckvolias:

forgive me if this has already been reported, but I have been super busy lately and not able to follow the thread

using the latest cgminer 2.3.1-2, Win7 x64

after a while I lose the temps and fan speed displays, both on screen and from the API - a cgminer restart fixes the issue

screenshot....

http://btcwebhost.com/images/anubistempsgone.png
Yep, it is known.
It "appears" to be that the ATI ADL library gives up after a long time, so it's been resigned to the can't-fix basket.
Of course it could be some obscure, hard-to-find cgminer bug, but if it is, then since it takes such a long time to happen, it's not likely to be found through any testing looking for the cause of it.
hero member
Activity: 896
Merit: 1000
ckvolias:

forgive me if this has already been reported, but I have been super busy lately and not able to follow the thread

using the latest cgminer 2.3.1-2, Win7 x64

after a while I lose the temps and fan speed displays, both on screen and from the API - a cgminer restart fixes the issue

screenshot....

http://btcwebhost.com/images/anubistempsgone.png

newbie
Activity: 22
Merit: 0
Not sure if this has been covered in previous posts, but MAN, 227 pages of posts to read, and searches don't turn up anything, so what the hey.

Just went from 2.2.6 to 2.3.1, and each 7970 went from the mid 500s down to... 25.

At first I assumed it had to do with the new GPU interval setting it was nagging me about, but after wasting an hour playing with that to no avail I started looking for other causes. The first thing I noticed was that it seemed to be taking 99% of each GPU to turn out 25 MHash/s. I tried phatk and went to 150 per GPU. POCLBM came in first with 552 at intensity 12 (I don't want the fans offensively loud).

So, not sure what's changed. I have not tried other kernels in the past, so I can't tell if phatk has always sucked or, like Diablo, has just started sucking.

Anyway, I hope this helps anyone else with a 7970 confused by a sudden performance hit after an update!

outsidefactor
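For anyone else hitting this, the kernel can be pinned explicitly rather than left to cgminer's auto-selection; a sketch with a placeholder pool, using the -k/--kernel option and the poclbm kernel that worked best above:

Code:
# Force the poclbm kernel on a 7970 where the auto-chosen kernel underperforms;
# adjust -I to whatever keeps the fans at a tolerable level.
cgminer -o http://pool.example.com:8332 -u worker -p pass -k poclbm -I 12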
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
...
Try it before you laugh.
The point he has already stated is that if you are using more than one GPU in your computer, you are wasting the others.

The stability issue is the display GPU.
Set the other GPUs to a higher intensity.

You are reducing the performance of the other GPUs by using the same settings on all of them that you have determined are needed to keep the display GPU stable.
donator
Activity: 1218
Merit: 1079
Gerald Davis
So you go and try it on WINDOWS with multiple 7970s and post your findings.

Why would I buy 7970s just to test out your flawed theory?

Try setting the SINGLE GPU connected to a display to dynamic, as suggested in the README, and you can increase the intensity to where it should be without issue.  As far as 5 threads being optimal?  LOLZ.
Vbs
hero member
Activity: 504
Merit: 500
Is the SDK 2.5 in 11.11 different/better than what shipped with 11.9?

They are all different between SDK packs or Catalyst versions. As for "better", that's more difficult to answer... :P

A while ago I checked some of them:
Code:
Package - Version Number
------------------------
SDK2.4  - SDK 2.4.595.10
11.6    - SDK 2.4.650.9   <- Newest 2.4
11.7    - SDK 2.5.684.213
SDK2.5  - SDK 2.5.684.213
11.8    - SDK 2.5.709.2
11.9    - SDK 2.5.732.1
11.10   - SDK 2.5.775.2
11.11   - SDK 2.5.793.1   <- Newest 2.5
SDK2.6  - SDK 2.6.831.4
11.12   - SDK 2.6 (10.0.831.4)
12.1    - SDK 2.6 (10.0.851.4)

member
Activity: 88
Merit: 10
Gliding...
What is your intensity setting? I bet it is more than 7.

Intensity is the most critical setting for stability. Set the intensity low, then play with the rest.

No reason intensity needs to be less than 7 on a 7970.  I run 8 on a 5970, and that is only because I am using p2pool.  With a conventional pool, intensity 9 is more appropriate.  conman, I believe, found the optimal intensity to be 10 or 11 on a 7970.

That was Linux.  My Windows 7/GD70 experience is different.  I was running 9, 10, 11 and tried various other settings, but cgminer was constantly crashing.  I thought it was my core/mem/vddc settings, but by accident I found out that on WINDOWS, intensity is the most critical setting.

7 is optimal on Windows with multiple overclocked cards.  Less than 7 runs well, but the hash rate suffers.
More than 7, CPU and GPU overloads are likely, depending on what else is installed and running.

On Windows 7, overclocked cards with 9+ sooner or later get "idle for 60 seconds", "too busy" GPU event-log errors, etc.  That is on a clean, mean
and well-tuned system.  It might take an hour, it might take a few hours, but within one day cgminer crashes when run with -I 9/10/11 and an overclocked core/memory.

Now, with -I 7 and 5 threads, my GPU loads are constant at 99%.

BTW, cgminer gets a C5 exception (werfault.exe) when a GPU gets overloaded and restarted by the BIOS.

You might be lucky with your "-I 9" for a while, until the system crashes in the middle of the night and you lose 10 hours of run time :-(
Running with more than 7 creates more problems than it solves.  The hash rate does not improve by much, as the GPUs are at 99% already, so what is the point?  To crash cgminer?


Thanks a lot, I'll use your experience right away :-) Works great (Win 7, 1170/1350, -I 7).
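Putting that recommendation into a command line, a sketch with a placeholder pool (1170/1350 core/memory and -I 7 are simply the settings reported above; --gpu-engine/--gpu-memclock apply them from cgminer rather than from Afterburner):

Code:
# The Windows settings reported working above: core 1170, memory 1350, intensity 7 on all GPUs.
cgminer -o http://pool.example.com:8332 -u worker -p pass \
        --gpu-engine 1170 --gpu-memclock 1350 -I 7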