
Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 611. (Read 5806103 times)

legendary
Activity: 4354
Merit: 3614
what is this "brake pedal" you speak of?

BTW how'd you get your 6770 cranked up to 1005?  I'm currently running 960/300 and  I've been unable to run higher reliably.


Luck of the draw, I guess. Very good cooling in the HTPC case; probably 80% of the case air intake is right at the card (horizontal case). cgminer keeps it @ 70C max, 60% fan max.

It's @ 1000 now though; it blew up while mining and watching an .MKV Blu-ray rip a while back. It generally runs 227 MH/s @ -I 7, but only 75 MH/s while watching 1080p movies at -I D, heh.

I use MSI Afterburner to set it to 1000/300 on boot, then fire up cgminer.
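FWIW, cgminer can also set the clocks itself at startup via ADL instead of Afterburner; a minimal sketch with this post's values (pool/user/pass are placeholders, and it assumes ADL can drive your card):

Code:
cgminer -o http://pool:port -u user -p pass -I 7 --gpu-engine 1000 --gpu-memclock 300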
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Using intensity of 13 for the 7970 and 9 for the 6950, the machine becomes very unresponsive (exported GPU_USE_SYNC_OBJECTS=1; CPU is used 50% by compiz/X and 50% by cgminer). Not sure if the CPU is the bottleneck now or if there are better settings for this setup.

Don't go over 11 for 7970 on linux with sync objects or 9 on windows.
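For reference, that looks like this on linux (a sketch; pool/user/pass are placeholders):

Code:
export GPU_USE_SYNC_OBJECTS=1
export DISPLAY=:0
./cgminer -o http://pool:port -u user -p pass -I 11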
donator
Activity: 919
Merit: 1000
zefir,

Are you running a 32 or a 64 bit kernel?
Hi,

Using a 32-bit kernel. After trying every possible PCIe-slot / driver combination, it finally works.

The XFX Black gets hot already at stock clock (82°C @ 1 GHz); cgminer doesn't even try to OC it, and it ends up at ~570 MH/s. The Gigabyte is pushed to its max locked frequency of 1.2 GHz, pulling a highly fluctuating 660-740 MH/s and staying at ~72°C.

I left my old 6950 in the rig for reference; it pulls what it did before: 360 MH/s @ 890 MHz @ 75°C.

Using intensity of 13 for the 7970 and 9 for the 6950, the machine becomes very unresponsive (exported GPU_USE_SYNC_OBJECTS=1; CPU is used 50% by compiz/X and 50% by cgminer). Not sure if the CPU is the bottleneck now or if there are better settings for this setup.
legendary
Activity: 916
Merit: 1003
Yup. I'm currently running 2.3.1-2 on my 6770 (using the diablo kernel) very happily and stably. I'll need a really good reason to upgrade again, but I don't foresee ever needing to.

BTW how'd you get your 6770 cranked up to 1005?  I'm currently running 960/300 and  I've been unable to run higher reliably.

legendary
Activity: 4354
Merit: 3614
what is this "brake pedal" you speak of?

[2012-03-20 14:04:09] Failed to create submit_work_thread

Failing to create  submit_work_thread is the key here. It has nothing to do with how it fails after that. Inability to create the thread suggests a system resource problem, like running out of memory or too many threads starting up for some reason.
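To illustrate the failure mode (a generic sketch, not cgminer's actual code): pthread_create returns an errno-style value such as EAGAIN when the system can't spare the memory or thread slots, which is exactly the situation described above.

Code:
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static void *submit_work_thread(void *arg)
{
	(void)arg;	/* work submission would happen here */
	return NULL;
}

int main(void)
{
	pthread_t thr;
	/* pthread_create returns the error code directly (it does not set errno) */
	int rc = pthread_create(&thr, NULL, submit_work_thread, NULL);
	if (rc != 0) {
		/* EAGAIN here means insufficient resources or a thread limit was hit */
		fprintf(stderr, "Failed to create submit_work_thread: %s\n", strerror(rc));
		return 1;
	}
	pthread_join(thr, NULL);
	return 0;
}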

Been waiting to update this.

I updated cgminer on the 6770 and 6870 systems to 2.3.1-2 (was straight 2.3.1) and have had no problems since. The updated version may or may not have anything to do with it, as I kinda glazed over the differences. Need to pay more attention.

It just seems odd that 2 completely different systems (with different Windows OSes, SDKs and driver versions) gave the exact same error multiple times in the same time frame. The only thing in common was cgminer 2.3.1.

I wonder if it was an MS "Patch Tuesday" thing, as I recall a pile of patches came down around that time.

Anyway, all is well now.

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
The only one that helped was setting GPU threads to 1.

Since I have done that, I have been stable for 8 hours with no restarts, so it looks promising, but we'll see if it lasts... Any disadvantage to only running 1 thread per GPU? Hash rates seem about the same.
That's interesting and unusual. I have certainly heard that particular driver and SDK combos would make 5970s unstable, but I don't know if that's true. If you're getting good hashrates, it doesn't matter which settings you chose, now does it? I found 2 threads simply smoothed out the hashrate rather than improving it substantially, but some cards had slightly higher hashrates with 2 instead of 1.
sr. member
Activity: 309
Merit: 250
Did you read my advice about fan control and 5970s?

Yea, I tried your latest git, which you said had the fix for 5970 autofan. I then tried kano's latest with the extra monitoring... it just said the threads were idle. Then I tried manually setting the fans, which still had no effect. I tried another pool and only had 3 restarts in an hour instead of 12 like on gpumax... that seemed more reasonable, and I adjusted the overclock accordingly, but I didn't run it long enough to see if that stabilized things completely. Finally I started messing around with different options while on gpumax. The only one that helped was setting GPU threads to 1.

Since I have done that, I have been stable for 8 hours with no restarts, so it looks promising, but we'll see if it lasts... Any disadvantage to only running 1 thread per GPU? Hash rates seem about the same.
hero member
Activity: 642
Merit: 500
zefir,

Are you running a 32 or a 64 bit kernel?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
#define ADL_ERR_DISABLED_ADAPTER      -10

Looks like something is wrong in your setup. I assume you have X running and have permission to access X with the user you're logged in as?
Nice to see you're back, Con.

X is up and running on the machine, with me logged in. Before I started testing the 7970, it worked perfectly with 3*6950. Confused...

Code:
zefir@miner0:~/work/cgminer$ aticonfig --list-adapters
* 0. 07:00.0 AMD Radeon HD 7900 Series
  1. 04:00.0 AMD Radeon HD 7900 Series

* - Default adapter

Any way to explicitly enable the second card if it really is disabled?

When you started experimenting with 7970 you wrote that it helped to swap PCIe-slots, right?

I'll try that and test it with 7970+6950 meanwhile. Thanks.

I'm not back. I drop in once a week to answer the accumulated questions. I'll be back once my PSU is replaced. Developing without hardware that can run the software leads to issues, which leads to more unnecessary development, which leads to more issues, and so on...

I had to put the 7970 into the first slot and the 6970s in the rest of the slots. It doesn't look like your first card is even working properly, since it detects only one device and then says it can't use it. Your xorg setup looks fine. The usual thing to point the finger at after doing the obvious things is the ATI driver/SDK combo; try an older or different one. The first driver they released for the 7970 is, I hear, the least worst one so far, so try downgrading to that.
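If the adapter really is flagged disabled, one thing worth trying first (an assumption on my part, untested on 7970s) is regenerating xorg.conf for all detected adapters and restarting X, since fglrx only exposes adapters that are bound to an X screen:

Code:
sudo aticonfig --adapter=all --initial -f
# restart X, then re-check:
aticonfig --list-adapters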
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
I've put a few commits in my git:
 https://github.com/kanoi/cgminer/
that add a simple device history that is accessible via the new API command 'notify'

A binary compiled from my git reports itself as 2.3.1k

You can see it with
Code:
echo -n notify | nc 127.0.0.1 4028 ; echo

The base code change adds a few extra fields and counters to the device structure (that are all reported by the API)
Including, per device: last well time, last not well time, last not well reason, and counters for how many times each reason has happened (e.g. Device Over Heat count and Device Thermal Cutoff count, among others)

I ran for 30 minutes at stock GPU clocks and several GPU threads restarted... again, the GPU Management screen only showed that GPU 5 had been re-initialized, according to the timestamps. Seems to be random cards... the first time I looked it was 2, 3 and 5. I restarted, and this time it's 1, 3, 4 and 5... after 30 minutes at stock GPU clock.

Here's the output of your command:
Code:
STATUS=S,When=1332519326,Code=60,Msg=Notify,Description=cgminer 2.3.1k|NOTIFY=0,Name=GPU,ID=0,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=1,Name=GPU,ID=1,Last Well=1332519326,Last Not Well=1332518925,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=2,Name=GPU,ID=2,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=3,Name=GPU,ID=3,Last Well=1332519325,Last Not Well=1332518862,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=4,Name=GPU,ID=4,Last Well=1332519326,Last Not Well=1332517934,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=2,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=5,Name=GPU,ID=5,Last Well=1332519326,Last Not Well=1332518716,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|

I'll try running on a different pool other than gpumax again and see what happens.
Did you read my advice about fan control and 5970s?
... and in case you weren't sure, boozer: my git changes were added on top of the changes ckolivas made after the version was released, so the version you compiled will have the changes ckolivas mentioned, and you can do as he suggested with that version.

... and yeah, you can clearly see which GPUs are getting the 60s idle sick problem (1, 3, 4 & 5), and GPU 4 has had it twice.

I've updated miner.php in my git to show the notify command (it shows the times as H:M:S, and any warnings/errors are shown in orange/red)

The change is on by default; you can switch it off - see the comment near the top of the code, i.e. set '$notify = false;'.
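If you'd rather poll the API from your own code than use miner.php, here's a minimal sketch of a client in C (it assumes the default 127.0.0.1:4028 listener used above, and skips most error handling):

Code:
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	struct sockaddr_in addr;
	char buf[8192];
	int n, sock = socket(AF_INET, SOCK_STREAM, 0);

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(4028);		/* default cgminer API port */
	inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

	if (sock < 0 || connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		return 1;
	write(sock, "notify", 6);		/* same as: echo -n notify | nc */
	while ((n = read(sock, buf, sizeof(buf) - 1)) > 0) {
		buf[n] = '\0';
		fputs(buf, stdout);		/* '|'-separated records, ','-separated fields */
	}
	close(sock);
	putchar('\n');
	return 0;
}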
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
Kano, just tried to compile your latest, getting this:

Code:
bitforce.c: In function ‘bitforce_scanhash’:
bitforce.c:310: error: ‘REASON_THERMAL_CUTOFF’ undeclared (first use in this function)
bitforce.c:310: error: (Each undeclared identifier is reported only once
bitforce.c:310: error: for each function it appears in.)

Haven't looked into it myself yet, but thought I'd post it straight away...
Oops - forgot to do another ./configure to catch that.
Fixed it now.
donator
Activity: 919
Merit: 1000
#define ADL_ERR_DISABLED_ADAPTER      -10

Looks like something is wrong in your setup. I assume you have X running and have permission to access X with the user you're logged in as?
Nice to see you're back, Con.

X is up and running on the machine, with me logged in. Before I started testing the 7970, it worked perfectly with 3*6950. Confused...

Code:
zefir@miner0:~/work/cgminer$ aticonfig --list-adapters
* 0. 07:00.0 AMD Radeon HD 7900 Series
  1. 04:00.0 AMD Radeon HD 7900 Series

* - Default adapter

Any way to explicitly enable the second card if it really is disabled?

When you started experimenting with 7970 you wrote that it helped to swap PCIe-slots, right?

I'll try that and test it with 7970+6950 meanwhile. Thanks.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Folks, anything that needs to be considered for running cgminer on a multi-7970 setup?

I set up a rig with a XFX-7970 Black Edition and a Gigabyte GV7970-OC, installed the latest APP and drivers, but can't get them running.

Code:
zefir@miner0:~/work/cgminer$ export DISPLAY=:0 && ./cgminer -n -D
[2012-03-23 23:32:22] CL Platform 0 vendor: Advanced Micro Devices, Inc.
[2012-03-23 23:32:22] CL Platform 0 name: AMD Accelerated Parallel Processing
[2012-03-23 23:32:22] CL Platform 0 version: OpenCL 1.2 AMD-APP (923.1)
[2012-03-23 23:32:22] Platform 0 devices: 1
[2012-03-23 23:32:22] Failed to ADL_Adapter_ID_Get. Error -10
[2012-03-23 23:32:22] Failed to ADL_Adapter_ID_Get. Error -10
[2012-03-23 23:32:22] GPU 0 AMD Radeon HD 7900 Series hardware monitoring enabled
[2012-03-23 23:32:22] 1 GPU devices max detected


I've read of people driving up to 4*7970 setups with cgminer under Linux, so I must be missing something really obvious.

Any ideas?
#define ADL_ERR_DISABLED_ADAPTER      -10

Looks like something is wrong in your setup. I assume you have X running and have permission to access X with the user you're logged in as?
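A quick way to verify both, assuming a local X session on display :0 (the aticonfig probe is just my suggestion for a convenient ADL check):

Code:
export DISPLAY=:0
xhost +local:                    # allow local clients access to X
aticonfig --odgt --adapter=all   # should print a temperature per adapter if ADL works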
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
So I was testing overheat protection on my water-cooled farm ...

Code:

 cgminer version 2.3.1 - Started: [2012-03-23 15:45:20]
--------------------------------------------------------------------------------
 (10s):2507.1 (avg):2573.0 Mh/s | Q:3286  A:1932  R:26  HW:0  E:59%  U:35.30/m
 TQ: 8  ST: 9  SS: 21  DW: 1506  NB: 7  LW: 3806  GF: 0  RF: 0
 Connected to http://192.168.0.189:9332 with LP as user user/1000+1
 Block: 000006e1c8f6fcf1aa1e1f358d344831...  Started: [16:36:55]
--------------------------------------------------------------------------------
 [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  58.0C  960RPM | 324.2/335.3Mh/s | A:263 R:2 HW:0 U:   4.81/m I: 8
 GPU 1:  59.0C  960RPM | REST  /333.1Mh/s | A:261 R:6 HW:0 U:   4.77/m I: 8
 GPU 2:  60.0C  960RPM | REST  /225.7Mh/s | A:188 R:4 HW:0 U:   3.44/m I: 8
 GPU 3:  60.5C  960RPM | REST  /328.6Mh/s | A:246 R:5 HW:0 U:   4.49/m I: 8
 GPU 4:  56.0C  960RPM | 324.5/358.6Mh/s | A:217 R:3 HW:0 U:   3.96/m I: 8
 GPU 5:  59.5C  960RPM | REST  /330.7Mh/s | A:239 R:0 HW:0 U:   4.37/m I: 8
 GPU 6:  59.5C  960RPM | REST  /330.4Mh/s | A:261 R:3 HW:0 U:   4.77/m I: 8
 GPU 7:  58.5C  960RPM | REST  /333.3Mh/s | A:262 R:3 HW:0 U:   4.79/m I: 8

Notice the 10s average hashrate is inaccurate. It looks like when a card goes idle due to overheating, its last hashrate is still added to the global average.
It's just the asynchronous way the counter is updated. It starts reading wrong when the card is not reporting in, and only rights itself once it knows what's going on with the unresponsive card. I can make the counter more accurate, but I'm loath to make it use more CPU time. I went out of my way to make all these counters use hardly any CPU, and since multiple threads report to the global counter, a lot of locking is required to update it, which is a potential source of overhead if you update it often.
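To illustrate the trade-off (a generic sketch, not cgminer's actual code): a single shared counter pays for a lock on every update, while per-thread counters are lock-free to update but go stale when one thread stops reporting in - which is exactly the inaccuracy seen above.

Code:
#include <pthread.h>
#include <stdint.h>

#define MAX_THREADS 16

/* Option 1: one shared counter - every update contends on the lock */
static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t total_hashes;

static void add_hashes_locked(uint64_t n)
{
	pthread_mutex_lock(&hash_lock);
	total_hashes += n;
	pthread_mutex_unlock(&hash_lock);
}

/* Option 2: per-thread counters - updates are lock-free and the display
 * thread sums them, but a stalled card simply stops advancing its slot,
 * so its last contribution lingers until the code notices it is idle. */
static uint64_t thread_hashes[MAX_THREADS];

static void add_hashes(int id, uint64_t n)
{
	thread_hashes[id] += n;		/* only ever written by thread 'id' */
}

static uint64_t read_total(void)
{
	uint64_t sum = 0;
	for (int i = 0; i < MAX_THREADS; i++)
		sum += thread_hashes[i];
	return sum;
}

int main(void)
{
	add_hashes_locked(1000);
	add_hashes(0, 1000);
	return (int)(total_hashes - read_total());	/* 0 when both agree */
}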
donator
Activity: 919
Merit: 1000
Folks, anything that needs to be considered for running cgminer on a multi-7970 setup?

I set up a rig with a XFX-7970 Black Edition and a Gigabyte GV7970-OC, installed the latest APP and drivers, but can't get them running.

Code:
zefir@miner0:~/work/cgminer$ lspci | grep VGA
04:00.0 VGA compatible controller: ATI Technologies Inc Device 6798
07:00.0 VGA compatible controller: ATI Technologies Inc Device 6798

Code:
zefir@miner0:~/work/cgminer$ cat /etc/X11/xorg.conf
Section "Monitor"
Identifier   "aticonfig-Monitor[0]-0"
Option     "VendorName" "ATI Proprietary Driver"
Option     "ModelName" "Generic Autodetecting Monitor"
Option     "DPMS" "true"
EndSection

Section "Monitor"
Identifier   "aticonfig-Monitor[1]-0"
Option     "VendorName" "ATI Proprietary Driver"
Option     "ModelName" "Generic Autodetecting Monitor"
Option     "DPMS" "true"
EndSection

Section "Device"
Identifier  "aticonfig-Device[0]-0"
Driver      "fglrx"
BusID       "PCI:4:0:0"
EndSection

Section "Device"
Identifier  "aticonfig-Device[1]-0"
Driver      "fglrx"
BusID       "PCI:7:0:0"
EndSection

Section "Screen"
Identifier "aticonfig-Screen[0]-0"
Device     "aticonfig-Device[0]-0"
Monitor    "aticonfig-Monitor[0]-0"
DefaultDepth     24
SubSection "Display"
Viewport   0 0
Depth     24
EndSubSection
EndSection

Section "Screen"
Identifier "aticonfig-Screen[1]-0"
Device     "aticonfig-Device[1]-0"
Monitor    "aticonfig-Monitor[1]-0"
DefaultDepth     24
SubSection "Display"
Viewport   0 0
Depth     24
EndSubSection
EndSection

Code:
zefir@miner0:~/work/cgminer$ export DISPLAY=:0 && ./cgminer -n -D
[2012-03-23 23:32:22] CL Platform 0 vendor: Advanced Micro Devices, Inc.
[2012-03-23 23:32:22] CL Platform 0 name: AMD Accelerated Parallel Processing
[2012-03-23 23:32:22] CL Platform 0 version: OpenCL 1.2 AMD-APP (923.1)
[2012-03-23 23:32:22] Platform 0 devices: 1
[2012-03-23 23:32:22] Failed to ADL_Adapter_ID_Get. Error -10
[2012-03-23 23:32:22] Failed to ADL_Adapter_ID_Get. Error -10
[2012-03-23 23:32:22] GPU 0 AMD Radeon HD 7900 Series hardware monitoring enabled
[2012-03-23 23:32:22] 1 GPU devices max detected


I've read of people driving up to 4*7970 setups with cgminer under Linux, so I must be missing something really obvious.

Any ideas?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I've put a few commits in my git:
 https://github.com/kanoi/cgminer/
that add a simple device history that is accessible via the new API command 'notify'

A binary compiled from my git reports itself as 2.3.1k

You can see it with
Code:
echo -n notify | nc 127.0.0.1 4028 ; echo

The base code change adds a few extra fields and counters to the device structure (that are all reported by the API)
Including, per device: last well time, last not well time, last not well reason, and counters for how many times each reason has happened (e.g. Device Over Heat count and Device Thermal Cutoff count, among others)

I ran for 30 minutes at stock GPU clocks and several GPU threads restarted... again, the GPU Management screen only showed that GPU 5 had been re-initialized, according to the timestamps. Seems to be random cards... the first time I looked it was 2, 3 and 5. I restarted, and this time it's 1, 3, 4 and 5... after 30 minutes at stock GPU clock.

Here's the output of your command:
Code:
STATUS=S,When=1332519326,Code=60,Msg=Notify,Description=cgminer 2.3.1k|NOTIFY=0,Name=GPU,ID=0,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=1,Name=GPU,ID=1,Last Well=1332519326,Last Not Well=1332518925,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=2,Name=GPU,ID=2,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=3,Name=GPU,ID=3,Last Well=1332519325,Last Not Well=1332518862,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=4,Name=GPU,ID=4,Last Well=1332519326,Last Not Well=1332517934,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=2,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=5,Name=GPU,ID=5,Last Well=1332519326,Last Not Well=1332518716,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|

I'll try running on a different pool other than gpumax again and see what happens.
Did you read my advice about fan control and 5970s?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Isn't there a way in OpenCL to check what version of the SDK the .bin was compiled with, and re-compile only if it doesn't match the currently installed SDK? Or encode the version into the .bin filename. Sorry for nitpicking.
Doing this would defeat the trick people use of keeping a .bin generated by an older SDK like 2.1 while a newer SDK is installed, indefinitely getting the older SDK's performance - it lets them install newer drivers without losing their hashrate.
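For what it's worth, the platform version string is cheap to query; a generic OpenCL sketch (not cgminer's actual code) of fetching it so it could be encoded into the .bin filename:

Code:
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
	cl_platform_id platform;
	cl_uint num;
	char version[256];

	if (clGetPlatformIDs(1, &platform, &num) != CL_SUCCESS || num == 0)
		return 1;
	/* e.g. "OpenCL 1.2 AMD-APP (923.1)" - sanitized, this could go into the
	 * kernel .bin filename so a mismatched SDK forces a rebuild */
	clGetPlatformInfo(platform, CL_PLATFORM_VERSION, sizeof(version), version, NULL);
	printf("%s\n", version);
	return 0;
}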
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Code:
[2012-03-22 22:46:43] Started cgminer 2.3.1

[2012-03-22 22:46:43] Started cgminer 2.3.1
[2012-03-22 22:46:43] Probing for an alive pool
[2012-03-22 22:46:44] Long-polling activated for http://mining.eligius.st:8337/LP
Segmentation fault
root@ds-r:~/cgminer-2.3.1-2#

Running on debian squeeze, sdk 2.4, newer/ish fglrx

I get the same error from both a self built and the pre-built ubuntu binary.
That's a partial install of 2 different SDKs.
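A quick way to check for SDK leftovers on linux (assuming the usual library and ICD locations):

Code:
ldconfig -p | grep -i libOpenCL   # more than one distinct path is suspect
ls /etc/OpenCL/vendors            # one .icd file per installed SDK/driver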
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/

I would assume this is a false positive, but I guess it doesn't hurt to ask for official advice. MS Security Essentials didn't pick up on anything, but AVG did.


Read the increasingly unread FAQ in the readme included in the zip file about cgminer being a virus.