OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 614.

jake262144

full member

Activity: 210

Merit: 100

Quote from: ?? on ??

Hi, I'm sure I am doing something really stupid but I can't get my 3x 7970s over 200 MHash/S each.
Running 12.2 driver and the newest version of cgminer.
Is there anything extra I need to do for 7970? I used cgminer for all my other cards. Thanks.

Wow, with that much information I can only suggest that you change ANYTHING, starting with cgminer kernel configuration and ending with the whole OS Roll Eyes

EDIT: In case you missed it, a lot of GCN-related info can be found here and here.

phorensic

hero member

Activity: 630

Merit: 500

Alright, my previous attempt at giving info on new drivers in this thread was squashed...let's try again. Wink

AMD Catalyst 12.4 OpenCL 1.2 (8.960.0 March 15) AMD Official BETA

http://forums.guru3d.com/showthread.php?t=360362
http://developer.amd.com/Downloads/OpenCL1.2betadriversWindows.exe

Edit: amdocl(64).dll is version 10.0.923.1

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: boozer on March 21, 2012, 05:16:50 PM

As a follow up... I decided to try a different pool aside from gpumax. I've run stable with overclocks for about 5 hours now. Seems like some problem with cgminer and gpumax, but only on my 5970 rigs. I have the same version of cgminer running on the same linux version on my two 7970 rigs and it seems to work fine.... so it appears to be isolated to my dual gpu 5970 rigs...

Auto fan control was known to be broken for the 5970 in the last release of cgminer, and this was causing spontaneous restarts for people on 5970s due to unexpected overheats. I don't think I have put out a newer release with the fix. If you download the latest git tarball and build from that, you get the fix. Alternatively, disabling auto fan control should have the same effect.

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

Quote from: boozer on March 21, 2012, 06:21:25 PM

I am supposed to put some switch before the number 4?

Code:

# echo -n "config" | nc -4 127.0.0.1 4028 ; echo
nc: invalid option -- '4'
nc -h for help

Well you can ignore the -4 option (remove it) - but I guess you must have a really old version of nc?
What does "uname -r" say on that computer? (and what OS version is it?)

boozer

sr. member

Activity: 309

Merit: 250

I am supposed to put some switch before the number 4?

Code:

# echo -n "config" | nc -4 127.0.0.1 4028 ; echo
nc: invalid option -- '4'
nc -h for help

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

Then also the simple check about ADL being enabled - you do see GPU Temp/RPM as you said before?

If you don't have network API access enabled, but do have just --api-listen (like your command line shows) then on each machine:

Code:

echo -n "config" | nc -4 127.0.0.1 4028 ; echo

... and compare all 3 computers.

I'm actually only asking this coz it seems strange that the GPUs are running amok and it really does sound similar to ADL not working.
But of course you have already said it is working but ... well ... that command is another way to verify it.

I guess the other possibility is that when gpumax does let the cards go idle from a getwork perspective of having nothing to do, something in cgminer could be getting confused about the cards' status ... but that is just a guess since (as I've mentioned before) I've not looked closely at the internal Driver/ADL code that handles the card status/problems (and I got that idle bit wrong before as ckolivas pointed out).
If I get a chance today to add storing some of that info and making it available in the API I may know a bit more about it then
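For comparing the three machines, it can help to break the API reply into one field per line. The following is only a sketch: `api_fields` and `api_config` are names made up here, the sample reply is hypothetical (including its field names), and `-4` is left out because, as noted elsewhere in the thread, older `nc` builds reject it.

```shell
#!/bin/sh
# Split a cgminer API reply into one field per line.
# Replies separate sections with '|' and fields with ','.
api_fields() { tr '|,' '\n\n'; }

# Ask a miner for its config. Host and port default to the values
# used in the command above; this needs a running cgminer with --api-listen.
api_config() {
    printf 'config' | nc "${1:-127.0.0.1}" "${2:-4028}" | api_fields
}

# Hypothetical reply, for illustration only (not captured from a real miner).
sample='STATUS=S,Code=33,Msg=CGMiner config|CONFIG,GPU Count=6,ADL=Y,ADL in use=Y|'
printf '%s' "$sample" | api_fields | grep '^ADL'
```

Running `api_config > config.$(hostname).txt` on each machine and diffing the files is one way to do the comparison.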

boozer

sr. member

Activity: 309

Merit: 250

As a follow up... I decided to try a different pool aside from gpumax. I've run stable with overclocks for about 5 hours now. Seems like some problem with cgminer and gpumax, but only on my 5970 rigs. I have the same version of cgminer running on the same linux version on my two 7970 rigs and it seems to work fine.... so it appears to be isolated to my dual gpu 5970 rigs...

boozer

sr. member

Activity: 309

Merit: 250

I ran at stock gpu clocks for 14 minutes and had restarts on half (3) of my gpus during this short time, increasing ram used from 175 meg to 260 meg.

Before I stopped it, I checked GPU status and only GPU 5 had its "Last Initialized" time updated.... From this screen, all other gpus appeared to never have restarted. However, going through the log, I can see that gpu 0, 2, and 5 restarted. The entire log file is about 1 meg... 100k zipped. I can email it or post it somewhere if you want to look at the whole thing.

Code:

root@skynet:~# cat run.20120321105206.19993.log | grep idle
[2012-03-21 10:55:22] Device 0 idle for more than 60 seconds, GPU 0 declared SICK!
[2012-03-21 10:56:52] Device 2 idle for more than 60 seconds, GPU 2 declared SICK!
[2012-03-21 10:57:11] Device 0 idle for more than 60 seconds, GPU 0 declared SICK!
[2012-03-21 11:03:52] Device 5 idle for more than 60 seconds, GPU 5 declared SICK!
root@skynet:~# cat run.20120321105206.19993.log | grep restart
[2012-03-21 10:55:22] Attempting to restart GPU
[2012-03-21 10:55:23] Thread 0 restarted
[2012-03-21 10:55:24] Thread 1 restarted
[2012-03-21 10:56:52] Attempting to restart GPU
[2012-03-21 10:56:53] Thread 4 restarted
[2012-03-21 10:56:54] Thread 5 restarted
[2012-03-21 10:57:11] Attempting to restart GPU
[2012-03-21 10:57:12] Thread 0 restarted
[2012-03-21 10:57:13] Thread 1 restarted
[2012-03-21 11:03:52] Attempting to restart GPU
[2012-03-21 11:03:53] Thread 10 restarted
[2012-03-21 11:03:54] Thread 11 restarted

GPU 0: 331.0 / 283.9 Mh/s | A:54  R:0  HW:0  U:3.90/m  I:8
73.5 C  F: 60% (3630 RPM)  E: 725 MHz  M: 240 Mhz  V: 1.050V  A: 99% P: 0%
Last initialised: [2012-03-21 10:52:09]
Intensity: 8
Thread 0: 165.6 Mh/s Enabled ALIVE
Thread 1: 165.0 Mh/s Enabled ALIVE

GPU 1: 329.8 / 335.7 Mh/s | A:62  R:0  HW:0  U:4.48/m  I:8
71.5 C  F: 60% (3630 RPM)  E: 725 MHz  M: 240 Mhz  V: 1.050V  A: 99% P: 0%
Last initialised: [2012-03-21 10:52:12]
Intensity: 8
Thread 2: 164.4 Mh/s Enabled ALIVE
Thread 3: 165.3 Mh/s Enabled ALIVE

GPU 2: 328.8 / 307.9 Mh/s | A:60  R:0  HW:0  U:4.34/m  I:8
73.0 C  F: 54% (3329 RPM)  E: 725 MHz  M: 240 Mhz  V: 1.050V  A: 99% P: 0%
Last initialised: [2012-03-21 10:52:14]
Intensity: 8
Thread 4: 163.9 Mh/s Enabled ALIVE
Thread 5: 165.3 Mh/s Enabled ALIVE

GPU 3: 329.7 / 334.3 Mh/s | A:60  R:0  HW:0  U:4.34/m  I:8
74.0 C  F: 54% (3326 RPM)  E: 725 MHz  M: 240 Mhz  V: 1.050V  A: 99% P: 0%
Last initialised: [2012-03-21 10:52:16]
Intensity: 8
Thread 6: 164.9 Mh/s Enabled ALIVE
Thread 7: 164.9 Mh/s Enabled ALIVE

GPU 4: 335.3 / 336.7 Mh/s | A:61  R:0  HW:0  U:4.41/m  I:8
73.0 C  F: 79% (4378 RPM)  E: 735 MHz  M: 240 Mhz  V: 1.050V  A: 99% P: 0%
Last initialised: [2012-03-21 10:52:19]
Intensity: 8
Thread 8: 168.0 Mh/s Enabled ALIVE
Thread 9: 165.6 Mh/s Enabled ALIVE

GPU 5: 334.1 / 309.8 Mh/s | A:61  R:0  HW:0  U:4.41/m  I:8
74.5 C  F: 79% (4378 RPM)  E: 735 MHz  M: 240 Mhz  V: 1.050V  A: 99% P: 0%
Last initialised: [2012-03-21 11:03:54]
Intensity: 8
Thread 10: 177.9 Mh/s Enabled ALIVE
Thread 11: 158.3 Mh/s Enabled ALIVE
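Since the status screen only shows the most recent initialisation time, a per-GPU tally taken from the log is a more reliable way to see which cards went SICK and how often. A minimal sketch, assuming the log line format shown above; `count_sick` is a name invented for this example.

```shell
#!/bin/sh
# Count "declared SICK!" events per GPU in the cgminer log given as $1.
count_sick() {
    awk '/declared SICK!/ {
             # The GPU number is the field right after the word "GPU".
             for (i = 1; i < NF; i++) if ($i == "GPU") n[$(i + 1)]++
         }
         END { for (g in n) print "GPU " g ": " n[g] }' "$1" | sort -k2,2n
}

# Sample input: the four "idle" lines from the grep output above.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
[2012-03-21 10:55:22] Device 0 idle for more than 60 seconds, GPU 0 declared SICK!
[2012-03-21 10:56:52] Device 2 idle for more than 60 seconds, GPU 2 declared SICK!
[2012-03-21 10:57:11] Device 0 idle for more than 60 seconds, GPU 0 declared SICK!
[2012-03-21 11:03:52] Device 5 idle for more than 60 seconds, GPU 5 declared SICK!
EOF
count_sick "$sample_log"
```

On the sample this reports GPU 0 twice and GPUs 2 and 5 once each, matching the restarts read off the log by hand.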

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

Quote from: boozer on March 21, 2012, 07:41:49 AM

Quote from: jake262144 on March 21, 2012, 04:43:10 AM

Obviously, something is wrong since 2> and 2>> are the official ways to log cgminer activity.
Do you know what the mistake is?
You're galloping headfirst, launching screen and cgminer in a single step. Care to take a guess which of the two subroutines you're logging the output of?
Gotta slow down a bit there, cowboy Cheesy

Thanks Jake! Surprisingly, that started logging Tongue

That is in a file for a startup script... How do I start cgminer attached to screen x in two lines?

My cgminer.sh looks like this:

Code:

#!/bin/sh
#
now="`date +%Y%m%d%H%M%S`"
#
./cgminer-231j -S /dev/ttyUSB0 -S /dev/ttyUSB1 -Q 4 --api-port 4028 --api-listen --api-allow W:127.0.0.1,W:192.168.7.0/24 --api-description Subaru -I 9 --submit-stale --auto-fan --auto-gpu --gpu-engine 900 --gpu-memclock 775 --gpu-memdiff -125 --temp-target 70 "$@" 2> run.$now.$$.log

Just in case you wanted a hint

You must of course also

Code:

chmod +x cgminer.sh

The "$@" means I can add more arguments - like pool configurations files e.g. "./cgminer.sh -c pool.json"

Edit: so in case it wasn't obvious

Code:

/usr/bin/screen -dmS cgminer /opt/miners/cgminer/cgminer.sh

and of course cgminer.sh would have all your normal options before the "$@"
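The reason the one-line version produced empty logs can be shown without screen or cgminer at all. In the sketch below, `fake_miner` and `detach` are hypothetical stand-ins: `fake_miner` writes one line to stderr like a miner would, and `detach` mimics the relevant part of `screen -dmS` by giving the child its own output streams (screen gives it a pty).

```shell
#!/bin/sh
demo_dir=$(mktemp -d)

# Stand-in for cgminer: writes one line to stderr.
cat > "$demo_dir/fake_miner" <<'EOF'
#!/bin/sh
echo "miner message" >&2
EOF

# Wrapper in the style of cgminer.sh: the redirect lives inside the script,
# so it applies to the miner itself.
cat > "$demo_dir/wrapper.sh" <<EOF
#!/bin/sh
sh "$demo_dir/fake_miner" "\$@" 2> "$demo_dir/inner.log"
EOF

# Stand-in for screen: whatever the child prints never reaches
# the launcher's own stderr.
detach() { "$@" >/dev/null 2>&1; }

# One step: the shell attaches 2> to the launcher, not to the miner.
detach sh "$demo_dir/fake_miner" 2> "$demo_dir/outer.log"

# Two steps: the launcher starts the wrapper, the wrapper redirects the miner.
detach sh "$demo_dir/wrapper.sh"

wc -c "$demo_dir/outer.log" "$demo_dir/inner.log"
```

The same reasoning applies to `$$` in the one-line command: it expands in the shell that launches screen, so the number in the file name is not the PID of cgminer.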

boozer

sr. member

Activity: 309

Merit: 250

Quote from: jake262144 on March 21, 2012, 04:43:10 AM

Obviously, something is wrong since 2> and 2>> are the official ways to log cgminer activity.
Do you know what the mistake is?
You're galloping headfirst, launching screen and cgminer in a single step. Care to take a guess which of the two subroutines you're logging the output of?
Gotta slow down a bit there, cowboy Cheesy

Thanks Jake! Surprisingly, that started logging Tongue

That is in a file for a startup script... How do I start cgminer attached to screen x in two lines?

jake262144

full member

Activity: 210

Merit: 100

Quote from: boozer on March 20, 2012, 11:19:34 PM

...

Code:

/usr/bin/screen -dmS cgminer /opt/miners/cgminer/cgminer -D --verbose -Q 2 --api-listen --auto-fan --temp-target 75 -I 8 -o http://x http://x 2> "/root/run.`date +%Y%m%d%H%M%S`.$$.log"

...
The logfile was created, but is empty...

Quote

-rw-r--r-- 1 root root 0 Mar 20 16:20 run.20120320162057.3911.log
-rw-r--r-- 1 root root 0 Mar 20 16:33 run.20120320163321.3713.log
-rw-r--r-- 1 root root 0 Mar 20 23:04 run.20120320230402.8150.log

Obviously, something is wrong since 2> and 2>> are the official ways to log cgminer activity.
Do you know what the mistake is?
You're galloping headfirst, launching screen and cgminer in a single step. Care to take a guess which of the two subroutines you're logging the output of?
Gotta slow down a bit there, cowboy Cheesy

cuz0882

sr. member

Activity: 392

Merit: 250

Is it efficient to use balance or rotation mode? I would like to get text alerts from a pool but I don't like putting on my hashing power there. Balance seems like a good option if LP still works correctly. Does it matter how many pools are used?

boozer

sr. member

Activity: 309

Merit: 250

Quote from: kano on March 19, 2012, 08:09:11 PM

When it comes to the GPU Driver/ADL I'm not too helpful.

As I mentioned before I'd guess that turning on debug (-D) and verbose (--verbose) and logging the output might help shed some light

As I sorta suggested before I'll look into making a change to get the actual numbers of bad events recorded against each device and probably also the details of the last event for each device if that is possible.
Then make this available through the API with a new command (e.g. something like 'events')
This may be of help to more easily see problems like this (though it may not help resolve them)

Edit: Though - it will of course have to be on my git until ckolivas is back in action (and give me a couple of days also)

I turned on -D and --verbose using this string and running at stock:

Code:

/usr/bin/screen -dmS cgminer /opt/miners/cgminer/cgminer -D --verbose -Q 2 --api-listen --auto-fan --temp-target 75 -I 8 -o http://x http://x 2> "/root/run.`date +%Y%m%d%H%M%S`.$$.log"

cgminer started at 175 meg resident and 415 meg virtual ram utilized. After 4 hours, it's at 284 meg resident and 607 meg virtual... gpu5 threads were the only ones restarted according to the timestamp.

The logfile was created, but is empty...

Quote

-rw-r--r-- 1 root root 0 Mar 20 16:20 run.20120320162057.3911.log
-rw-r--r-- 1 root root 0 Mar 20 16:33 run.20120320163321.3713.log
-rw-r--r-- 1 root root 0 Mar 20 23:04 run.20120320230402.8150.log

is a stderr message supposed to be generated when the thread is restarted?
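To put numbers on the memory growth rather than checking by hand, the resident size can be sampled at intervals and lined up against the restart timestamps in the log. This is a rough sketch for Linux only (it reads `/proc`); `rss_kb` and `watch_rss` are names invented here, and the 60 second default interval is arbitrary.

```shell
#!/bin/sh
# Print the resident set size in kB of the process with PID $1.
rss_kb() { awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"; }

# Append "timestamp rss_kb" for PID $1 every $2 seconds to file $3
# until the process exits.
watch_rss() {
    pid=$1; interval=${2:-60}; out=${3:-rss.log}
    while [ -d "/proc/$pid" ]; do
        echo "$(date '+%Y-%m-%d %H:%M:%S') $(rss_kb "$pid")" >> "$out"
        sleep "$interval"
    done
}

# Example: sample the current shell once.
rss_kb $$
```

For the miner it would be started in the background with something like `watch_rss <cgminer pid> 60 rss.log &`, with the PID looked up however is convenient on that machine.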

vapourminer

legendary

Activity: 4354

Merit: 3614

what is this "brake pedal" you speak of?

Quote from: kano on March 20, 2012, 07:33:18 PM

Well the API 10004 error is interesting ...

According to Microsoft it can't happen coz the function that causes it has been removed (and I don't call it anyway) Tongue

Though, it looks exactly like a network problem with the computer?

they are all on gigabit, hardwired. all can see each other fine, matter of fact my HTPC (with the 6770) has a lot of media on the 6870 computers drives as the HTPC is low on drive space. network (seems to anyway) run flawlessly. terracopy (which verifies transfers with MD5) copies multi gig files across the network perfectly all the time.

and, no transfers were in progress, so network saturation shouldnt be it. this latest crash was while we were at work, and previously its crashed in the middle of the night while nothing was using network except the miners. not that multigig transfers have ever bothered the miners before.

router and the 6770/6870 rigs are on UPSs too.

damn thing is the 5830 is on a POS computer, no UPS, longest ethernet run with the lowest quality cable. and it runs perfect. but with 2.2.7

go figure, eh?

switched the 6870 back to 2.2.7. left the 6770 on 2.3.1. Ill see what happens tomorrow

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

Quote from: vapourminer on March 20, 2012, 06:47:05 PM

been getting this error on 2 of my miners lately. they run about 6 hours every time too. been crashing like clockwork for a few days now

[regular log stuff, then suddenly this]
[2012-03-20 14:04:09] Failed to create submit_work_thread
[lots of statistics]
[2012-03-20 14:04:09] API failed (Socket Error: (10004) Interrupted system call) - API will not be available
[2012-03-20 14:04:09] longpoll failed for http://api.bitcoin.cz:8408, sleeping for 30s
and thats the end, they are then sitting at "press any key to continue . . ."

same exact error on both

they are both running 2.3.1, no special flags aside from autoclock and autofan settings.

my 6870 is win7 64 bit, 11.11 driver, 2.5 sdk, and the 6770 is vista 32, 12.1, 2.3 sdk. clean installs, I never reuse bins, and delete bins when upgrading drivers and/or sdks. cgminer is always installed fresh, never over itself.

my 5830 on 11.4, 2.1 and XP with cgminer 2.2.7 runs perfect. same pool settings as the 6870 and 6770, so if its the longpoll fail thats killing the 2.3.1 miners the 2.2.7 version is OK with whatever it is.. is longpoll handled differently between 2.2.7 and 2.3.1?

any ideas?

Well the API 10004 error is interesting ...
According to Microsoft it can't happen coz the function that causes it has been removed (and I don't call it anyway) Tongue

Though, it looks exactly like a network problem with the computer?

vapourminer

legendary

Activity: 4354

Merit: 3614

what is this "brake pedal" you speak of?

been getting this error on 2 of my miners lately. they run about 6 hours every time too. been crashing like clockwork for a few days now

[regular log stuff, then suddenly this]
[2012-03-20 14:04:09] Failed to create submit_work_thread
[lots of statistics]
[2012-03-20 14:04:09] API failed (Socket Error: (10004) Interrupted system call) - API will not be available
[2012-03-20 14:04:09] longpoll failed for http://api.bitcoin.cz:8408, sleeping for 30s
and thats the end, they are then sitting at "press any key to continue . . ."

same exact error on both

they are both running 2.3.1, no special flags aside from autoclock and autofan settings.

my 6870 is win7 64 bit, 11.11 driver, 2.5 sdk, and the 6770 is vista 32, 12.1, 2.3 sdk. clean installs, I never reuse bins, and delete bins when upgrading drivers and/or sdks. cgminer is always installed fresh, never over itself.

my 5830 on 11.4, 2.1 and XP with cgminer 2.2.7 runs perfect. same pool settings as the 6870 and 6770, so if its the longpoll fail thats killing the 2.3.1 miners the 2.2.7 version is OK with whatever it is.. is longpoll handled differently between 2.2.7 and 2.3.1?

any ideas?

boozer

sr. member

Activity: 309

Merit: 250

Quote from: kano on March 19, 2012, 08:09:11 PM

When it comes to the GPU Driver/ADL I'm not too helpful.

As I mentioned before I'd guess that turning on debug (-D) and verbose (--verbose) and logging the output might help shed some light

As I sorta suggested before I'll look into making a change to get the actual numbers of bad events recorded against each device and probably also the details of the last event for each device if that is possible.
Then make this available through the API with a new command (e.g. something like 'events')
This may be of help to more easily see problems like this (though it may not help resolve them)

Edit: Though - it will of course have to be on my git until ckolivas is back in action (and give me a couple of days also)

Thanks. I'm starting it with the debugging going. I planned to run it at stock... and i didnt throw overclock values in the start command, but somehow it is running with my old overclock settings.... these are the commands I used to start
Any idea how my old overclock values are getting set when I am not specifying them in the CLI?

EDIT: Nevermind.... I had switched to phoenix while i was working out the issue with cgminer and had set the overclocks there.... rebooted and everything is running stock now.

tucenaber

sr. member

Activity: 337

Merit: 252

For perhaps two weeks now I've had a problem with the aggregate cgminer stats when starting up. What happens is basically this: after 50-70 accepted shares, the total accepted shares ("A") counter stops incrementing. The "A" counter for each individual GPU keeps incrementing, but on the other hand the uploaded/min "U" just increases without limit. If I restart a few times it eventually works beyond 70 shares and after that it keeps working.

When it happens only the stats seem to be wrong, the actual hashing keeps working.

Anybody else seeing this problem?

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

Quote from: boozer on March 18, 2012, 11:03:49 PM

Quote from: kano on March 17, 2012, 10:33:50 PM

The GPU's are showing Temp/RPM as you can see - so yep that means ADL is working.
That's all I was thinking there that maybe ADL wasn't able to get the Temp/RPM for some reason (even if the DISPLAY was set) so the cards were failing all the time.

There's no device failure statistics in cgminer other than HW which is 0 in your case.
Send ckolivas some BTC and ask him to implement it (or if he's not expecting to be able to do that soon - send it to me)
Then I could add a device status command in the API to return those numbers.

Edit: hmm the thread does have the timestamp it was last sick though ...

So... I still have this problem with threads being restarted and eating up memory. Even when I run at stock speeds for all gpus. The thread for GPU5 keeps getting restarted on both my 5970 rigs... i even tried underclocking gpu 5 on both rigs and it still occurred. Phoenix seems to work fine.... should i try a different cgminer build? I assume I'm the only one with this issue which makes it likely some configuration/hardware problem on my side, but I'm not sure how to isolate it.

When it comes to the GPU Driver/ADL I'm not too helpful.

As I mentioned before I'd guess that turning on debug (-D) and verbose (--verbose) and logging the output might help shed some light

As I sorta suggested before I'll look into making a change to get the actual numbers of bad events recorded against each device and probably also the details of the last event for each device if that is possible.
Then make this available through the API with a new command (e.g. something like 'events')
This may be of help to more easily see problems like this (though it may not help resolve them)

Edit: Though - it will of course have to be on my git until ckolivas is back in action (and give me a couple of days also)

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

Quote from: ?? on ??

Question about running cgminer with icarus boards:

Can I have 5 icarus boards hooked up to a USB hub, hub hooked up to a PC, USB-COM driver maps it to COM5,
and run cgminer with "-S COM5"

How would cgminer communicate with all the icarus devices. It opens COM5, but is there a device ID that it uses?

All I see in the cgpu_info is device_id (icarus.c), which is really a handle to opened com port.

I'm not clear how this distribution of work is done.

By looking at the code, it seems like I'd need 5 COM ports for my 5 icarus cards.
Would the USB-COM driver "create 5 virtual COM ports" if I hookup 5 icarus cards through USB hub?

My guess is that this relates to the fact that there are USB driver problems on windows (nothing to do with cgminer)
If you go visit the Icarus thread (in my sig) there are comments about that.

I guess it means you need to update or get a new USB driver for windows - (for the USB interface that's on the Icarus card)

You should see multiple COMn on windows - one for each Icarus.
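With one port per Icarus, the command line then carries one `-S` option per port, in the same way the cgminer.sh shown elsewhere in this thread passes `-S /dev/ttyUSB0 -S /dev/ttyUSB1`. A small sketch of building that option list; `serial_opts` is a made-up helper and the COM numbers are placeholders, not values taken from anyone's machine.

```shell
#!/bin/sh
# Print one "-S <port>" pair for every port name given as an argument.
serial_opts() {
    for dev in "$@"; do
        printf '%s %s ' -S "$dev"
    done
}

# Placeholder port names; use whatever COMn (or /dev/ttyUSBn) the system assigns.
serial_opts COM5 COM6 COM7
echo
```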
