
Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 605. (Read 5805746 times)

hero member
Activity: 868
Merit: 1000
Hi ckolivas,

A few weeks back I ran into communication failures and a high number of rejects at BTCGuild after an LP change.

I escalated the problem to Eleuthria (https://bitcointalksearch.org/topic/m.839599), and he said he would contact you about it.

Apparently the problem doesn't affect everyone, just low-latency miners on pools running PoolServerJ.

I'm running into the same problem with Clipse's pool, but have NO problems when mining on Deepbit, for instance (I tried a lot of different pools in the last few days).

I submitted cgminer output from BTCGuild here: https://bitcointalksearch.org/topic/m.839444

And this is the output from Clipse's pool:

Code:
cgminer version 2.3.1 - Started: [2012-04-14 04:56:26]
--------------------------------------------------------------------------------
 (5s):583.9 (avg):321.3 Mh/s | Q:2867  A:2263  R:105  HW:4  E:79%  U:4.18/m
 TQ: 2  ST: 2  SS: 6  DW: 267  NB: 66  LW: 5131  GF: 12  RF: 21
 Connected to http://pool.bonuspool.co.cc:80 with LP as user XXXXX
 Block: 000007211c0c28968da77075f714da8e...  Started: [13:51:09]
--------------------------------------------------------------------------------
 [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  76.0C 2574RPM | 324.4/321.3Mh/s | A:2263 R:105 HW:4 U:4.18/m I: 4
--------------------------------------------------------------------------------

[2012-04-14 13:56:21] Accepted 00000000.77b43c68.4d7bec56 GPU 0 thread 0 pool 0
[2012-04-14 13:56:30] Accepted 00000000.c2bb7890.abfddea3 GPU 0 thread 0 pool 0
[2012-04-14 13:56:31] Accepted 00000000.80051067.164ef42f GPU 0 thread 0 pool 0
[2012-04-14 13:56:49] Accepted 00000000.4c40ecfa.9643c5b7 GPU 0 thread 0 pool 0
[2012-04-14 13:56:57] longpoll failed for http://pool.bonuspool.co.cc:80/LP, sleeping for 30s
[2012-04-14 13:57:02] Pool 0 communication failure, caching submissions
[2012-04-14 13:57:07] Pool 0 communication resumed, submitting work
[2012-04-14 13:57:07] Rejected 00000000.a53bea12.3d3ea9c4 GPU 0 thread 0 pool 0
[2012-04-14 13:57:38] Rejected 00000000.daff5e55.77bcdd04 GPU 0 thread 0 pool 0
[2012-04-14 13:57:39] longpoll failed for http://pool.bonuspool.co.cc:80/LP, sleeping for 30s
[2012-04-14 13:57:41] Pool 0 communication failure, caching submissions

I hope you can make sense of it and I am curious if others are having the same problems....
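
For anyone reading the status line in that output, here is a rough sketch of how the derived figures relate to the raw counters. It assumes E is accepted shares per queued work item and U is accepted shares per minute, which is how the numbers above work out. The function names are made up for illustration, and using the last log timestamp as the snapshot time is also an assumption.

```python
import re
from datetime import datetime

# Status line copied from the output above.
STATUS = "(5s):583.9 (avg):321.3 Mh/s | Q:2867  A:2263  R:105  HW:4  E:79%  U:4.18/m"


def parse_counters(line):
    """Return the integer counters (Q, A, R, HW) found in a status line."""
    return {k: int(v) for k, v in re.findall(r"\b(Q|A|R|HW):(\d+)", line)}


def derived_stats(counters, elapsed_minutes):
    """Recompute efficiency, utility and reject rate from the raw counters."""
    q, a, r = counters["Q"], counters["A"], counters["R"]
    return {
        "efficiency_pct": 100.0 * a / q,       # assumed: accepted / queued
        "utility_per_min": a / elapsed_minutes,  # assumed: accepted per minute
        "reject_pct": 100.0 * r / (a + r),     # rejected as a share of submitted
    }


# Start time from the header; snapshot time assumed to be the last log line.
started = datetime(2012, 4, 14, 4, 56, 26)
snapshot = datetime(2012, 4, 14, 13, 57, 41)
elapsed = (snapshot - started).total_seconds() / 60.0

stats = derived_stats(parse_counters(STATUS), elapsed)
print(stats)
```

Under those assumptions the recomputed efficiency and utility line up with the E and U shown on screen, and the reject rate comes out at a bit over 4% of submitted shares.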
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
There's another issue, though: I can't change the voltage for the 6870 in cgminer, while for the 5770 it's working fine. This is the only issue that stops me from moving to Linux.
Btw, is it possible to integrate an overclocking system similar to the one in ATI Tray Tools, so that memory can be underclocked to 300 MHz and lower with cgminer on the 6xxx series? Maybe the ATT dev is using some simple trick for it.
No, they're the limits of software manipulation without device-specific hacks that are only available to certain vendors (i.e. not me).
Any other Linux apps, then, that can undervolt it?

Maybe just ask Ray Adams (the ATT developer) how he did it? It's actually pretty funny that ATT is the only app that can underclock the memory of the 6xxx series to 300 MHz or lower. Something is plain wrong here. Eh, maybe because he's Russian?
WTF is ATT, and is it cross-platform software with source?
hero member
Activity: 535
Merit: 500
There's another issue, though: I can't change the voltage for the 6870 in cgminer, while for the 5770 it's working fine. This is the only issue that stops me from moving to Linux.
Btw, is it possible to integrate an overclocking system similar to the one in ATI Tray Tools, so that memory can be underclocked to 300 MHz and lower with cgminer on the 6xxx series? Maybe the ATT dev is using some simple trick for it.
No, they're the limits of software manipulation without device-specific hacks that are only available to certain vendors (i.e. not me).
Any other Linux apps, then, that can undervolt it?

Maybe just ask Ray Adams (the ATT developer) how he did it? It's actually pretty funny that ATT is the only app that can underclock the memory of the 6xxx series to 300 MHz or lower. Something is plain wrong here. Eh, maybe because he's Russian?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
There's another issue, though: I can't change the voltage for the 6870 in cgminer, while for the 5770 it's working fine. This is the only issue that stops me from moving to Linux.
Btw, is it possible to integrate an overclocking system similar to the one in ATI Tray Tools, so that memory can be underclocked to 300 MHz and lower with cgminer on the 6xxx series? Maybe the ATT dev is using some simple trick for it.
No, they're the limits of software manipulation without device-specific hacks that are only available to certain vendors (i.e. not me).
hero member
Activity: 535
Merit: 500
I must have broken it when I instituted the REST followed by restart if it detected overheat. Unless of course it overheated, cooled enough and then restarted over and over again in short bursts? Was it submitting shares at the same rate?
No, it never stops; shares are submitted at the same speed at the same temp. I've tested an identical config with 2.2.1 and it worked without any issues.
Okay thanks. I have reviewed the code in question and indeed it would not cut out. I have fixed this bug in the current git tree.
Thanks! Working now. There's another issue, though: I can't change the voltage for the 6870 in cgminer, while for the 5770 it's working fine. This is the only issue that stops me from moving to Linux.
Btw, is it possible to integrate an overclocking system similar to the one in ATI Tray Tools, so that memory can be underclocked to 300 MHz and lower with cgminer on the 6xxx series? Maybe the ATT dev is using some simple trick for it.
hero member
Activity: 896
Merit: 1000
Buy this account on March-2019. New Owner here!!
Version 2.3.3 - April 15, 2012

Human readable summary:
- Over temperature GPUs that should have had mining suspended but did not, should now be fixed.
- Windows lusers that had the ATI Display Library fail and stop reporting fan speed, which would then cause cgminer to just abruptly stop, should now have cgminer spontaneously restart from scratch if it detects this mode of AMD failure. It is a gamble, but should work based on feedback from people that had this problem.
- There is now a restart cgminer option within cgminer.
- When mining with more than 8 devices, the display will only show a summary instead of corruption.

Full changelog:
- Don't even display that cpumining is disabled on ./configure to discourage
people from enabling it.
- Do a complete cgminer restart if the ATI Display Library fails, as it does on
windows after running for some time, when fanspeed reporting fails.
- Cache the initial arguments passed to cgminer and implement an attempted
restart option from the settings menu.
- Disable per-device status lines when there are more than 8 devices since
screen output will be corrupted, enumerating them to the log output instead at
startup.
- Reuse Vals[] array more than W[] till they're re-initialised on the second
sha256 cycle in poclbm kernel.
- Minor variable alignment in poclbm kernel.
- Make sure to disable devices with any status not being DEV_ENABLED to ensure
that thermal cutoff code works as it was setting the status to DEV_RECOVER.
- Re-initialising ADL simply made the driver fail since it is corruption over
time within the windows driver that's responsible. Revert "Attempt to
re-initialise ADL should a device that previously reported fanspeed stops
reporting it."
- Microoptimise poclbm kernel by ordering Val variables according to usage
frequency.

Quote
Windows lusers

my only question is did you intend to call them Windows Lusers? hahaha

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Installed 2.3.3 and it still shows "cgminer version 2.3.2" for the Ubuntu binary. Also been getting some weird screen lag on my display GPU, but it's still set at dynamic intensity like it's always been. The display GPU is getting about half the hashrate that it usually gets too.

EDIT: It looks like this was triggered by my trying out a new pool and changing from d,7 intensities and -g 1 to d,8 and default threads. Switching back to a p2pool node with d,7 and -g 1 again fixed it.
The version number was the only thing missing from the binary. I've reuploaded it (it's the same but just shows 2.3.3).
hero member
Activity: 591
Merit: 500
Installed 2.3.3 and it still shows "cgminer version 2.3.2" for the Ubuntu binary. Also been getting some weird screen lag on my display GPU, but it's still set at dynamic intensity like it's always been. The display GPU is getting about half the hashrate that it usually gets too.

EDIT: It looks like this was triggered by my trying out a new pool and changing from d,7 intensities and -g 1 to d,8 and default threads. Switching back to a p2pool node with d,7 and -g 1 again fixed it.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Version 2.3.3 - April 15, 2012

Human readable summary:
- Over temperature GPUs that should have had mining suspended but did not, should now be fixed.
- Windows lusers that had the ATI Display Library fail and stop reporting fan speed, which would then cause cgminer to just abruptly stop, should now have cgminer spontaneously restart from scratch if it detects this mode of AMD failure. It is a gamble, but should work based on feedback from people that had this problem.
- There is now a restart cgminer option within cgminer.
- When mining with more than 8 devices, the display will only show a summary instead of corruption.

Full changelog:
- Don't even display that cpumining is disabled on ./configure to discourage
people from enabling it.
- Do a complete cgminer restart if the ATI Display Library fails, as it does on
windows after running for some time, when fanspeed reporting fails.
- Cache the initial arguments passed to cgminer and implement an attempted
restart option from the settings menu.
- Disable per-device status lines when there are more than 8 devices since
screen output will be corrupted, enumerating them to the log output instead at
startup.
- Reuse Vals[] array more than W[] till they're re-initialised on the second
sha256 cycle in poclbm kernel.
- Minor variable alignment in poclbm kernel.
- Make sure to disable devices with any status not being DEV_ENABLED to ensure
that thermal cutoff code works as it was setting the status to DEV_RECOVER.
- Re-initialising ADL simply made the driver fail since it is corruption over
time within the windows driver that's responsible. Revert "Attempt to
re-initialise ADL should a device that previously reported fanspeed stops
reporting it."
- Microoptimise poclbm kernel by ordering Val variables according to usage
frequency.
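
To make the thermal cutoff entry in the changelog concrete, here is a small sketch of the kind of check it describes. DEV_ENABLED and DEV_RECOVER are the status names from the changelog; DEV_DISABLED and both function names are assumptions for illustration, and the "before" check is only a guess at the shape of the bug, not the actual cgminer code.

```python
from enum import Enum, auto


class DevStatus(Enum):
    DEV_ENABLED = auto()
    DEV_DISABLED = auto()  # assumed name, not mentioned in the changelog
    DEV_RECOVER = auto()   # set by the thermal cutoff code


def should_pause_before_fix(status):
    """Hypothetical pre-fix check: only an explicit disable pauses mining."""
    return status is DevStatus.DEV_DISABLED


def should_pause_after_fix(status):
    """Check as described in the changelog: anything but DEV_ENABLED pauses."""
    return status is not DevStatus.DEV_ENABLED


# An overheated device ends up in DEV_RECOVER.
overheated = DevStatus.DEV_RECOVER
print(should_pause_before_fix(overheated))  # the device would keep mining
print(should_pause_after_fix(overheated))   # the device is paused
```

The point is that testing for "not enabled" covers every status that should stop work, including ones added later, whereas testing for one specific status silently lets the others through.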
newbie
Activity: 73
Merit: 0
I updated cgminer from version 2.3.1 to 2.3.2 on a few of my Windows 7 machines recently, and let it run for over 8 days straight (which is around when ADL normally fails). Now instead of just ADL failing, cgminer completely stops hashing and refuses to respond to any keyboard input, but the API still works.

It's true. So I just restart cgminer every seven days, usually on Saturdays lol
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I updated cgminer from version 2.3.1 to 2.3.2 on a few of my Windows 7 machines recently, and let it run for over 8 days straight (which is around when ADL normally fails). Now instead of just ADL failing, cgminer completely stops hashing and refuses to respond to any keyboard input, but the API still works.
Bug's been mentioned. Working on it now.
full member
Activity: 174
Merit: 100
I updated cgminer from version 2.3.1 to 2.3.2 on a few of my Windows 7 machines recently, and let it run for over 8 days straight (which is around when ADL normally fails). Now instead of just ADL failing, cgminer completely stops hashing and refuses to respond to any keyboard input, but the API still works.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I get the same error on 12.1, 12.2, and 12.3. I even thought maybe I didn't unzip my files in root or something, and redid all the SDKs, but no luck.

2.4 looks like it will almost work, then segfaults just as it looks like it's going to LP. Is there more debugging info I can provide somehow? -n just goes right to a segfault with 2.5, and 2.4 actually looks like it works with -n but doesn't once you start it up.

Sigh..

Basically the issue is you have a corrupt sdk installation, usually a mixture of files from multiple SDKs and not a complete installation of any one. You've gotta clear it all out somehow...
sr. member
Activity: 327
Merit: 250
I get the same error on 12.1, 12.2, and 12.3. I even thought maybe I didn't unzip my files in root or something, and redid all the SDKs, but no luck.

2.4 looks like it will almost work, then segfaults just as it looks like it's going to LP. Is there more debugging info I can provide somehow? -n just goes right to a segfault with 2.5, and 2.4 actually looks like it works with -n but doesn't once you start it up.

Sigh..

I have a segfault issue I'd like to see if anyone can help me fix. The segfault only happens when I try to use the 2.4 or 2.5 SDK with the 12.3 drivers on Debian/unstable. What is odd is that even when I recompile with the 2.6 SDK it still doesn't work; I have to reinstall the drivers to get it to stop segfaulting. Which leads me to believe I'm doing something wrong with the SDKs, although I switch between 2.4 and 2.5 without an issue using a stable build of Debian with older ATI drivers, so I'm basically at a loss.

I'm using a single 5850 on this machine, and cgminer compiles fine and even shows the correct SDK loaded when I compile for 2.4. The 2.5 just flat out segfaults, even with -n.

I'd really like to get the 2.4/2.5 SDK working with the 12.3 drivers if possible.

Thanks

Doff
12.3 is your problem. It's a stinker. Drop down to 12.1 or 12.2 if you need 79x0 support.
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
Kano,

In the devs list you've got 'ID' and 'BFL'/'ICA' etc., each giving us some number. What's the difference between these numbers? Are they equivalent, or might you get something like:
ID 0, BFL 0
ID 1, BFL 1
ID 2, ICA 0
...

And when you request data with gpu|N or PGA|N, is that N 'ID' or 'BFL'/'ICA' etc?

Thanks
Not quite. It's:
PGA, Name, ID

The PGA number starts at 0 and goes up to the number of PGA devices minus 1.
The ID matches the order on the screen, and it's the cgminer internal sequential device_id.

The point of PGA is that you send PGA-only commands, and they refer to the PGA devices in device_id order but without skipping numbers.
Thus it's always just a simple number range starting at 0.

So on my rig (2x6950 + 2xIcarus):
GPU=0,...|
GPU=1,...|
PGA=0,Name=ICA,ID=2,...|
PGA=1,Name=ICA,ID=3,...|
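
As a rough illustration of the numbering described above, the following sketch parses a devs-style reply and maps each PGA number to the internal device ID. The sample string is trimmed down to the fields shown in the example output (the fields elided with "..." are simply left out), and the helper names are made up for this sketch rather than taken from the API.

```python
def parse_devs(reply):
    """Split a '|'-separated reply into one dict of KEY=VALUE fields per device."""
    devices = []
    for section in reply.strip("|").split("|"):
        fields = dict(
            item.split("=", 1) for item in section.split(",") if "=" in item
        )
        if fields:
            devices.append(fields)
    return devices


def pga_to_device_id(devices):
    """Map the PGA number (0..n-1) to the sequential device ID."""
    return {int(d["PGA"]): int(d["ID"]) for d in devices if "PGA" in d}


# Trimmed sample modelled on the 2x6950 + 2xIcarus rig above.
SAMPLE = "GPU=0|GPU=1|PGA=0,Name=ICA,ID=2|PGA=1,Name=ICA,ID=3|"

devices = parse_devs(SAMPLE)
print(pga_to_device_id(devices))
```

So a PGA-only command takes the PGA number (0 or 1 on this rig), while ID 2 and 3 are where those same devices sit in the overall device list.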
legendary
Activity: 1795
Merit: 1208
This is not OK.
Kano,

In the devs list you've got 'ID' and 'BFL'/'ICA' etc., each giving us some number. What's the difference between these numbers? Are they equivalent, or might you get something like:
ID 0, BFL 0
ID 1, BFL 1
ID 2, ICA 0
...

And when you request data with gpu|N or PGA|N, is that N 'ID' or 'BFL'/'ICA' etc?

Thanks
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I must have broken it when I instituted the REST followed by restart if it detected overheat. Unless of course it overheated, cooled enough and then restarted over and over again in short bursts? Was it submitting shares at the same rate?
No, it never stops; shares are submitted at the same speed at the same temp. I've tested an identical config with 2.2.1 and it worked without any issues.
Okay thanks. I have reviewed the code in question and indeed it would not cut out. I have fixed this bug in the current git tree.
hero member
Activity: 535
Merit: 500
Huge issue: cgminer can't shut down a GPU if it overheats. Cgminer keeps trying to disable the GPU over and over again, but it continues to mine!

I must have broken it when I instituted the REST followed by restart if it detected overheat. Unless of course it overheated, cooled enough and then restarted over and over again in short bursts? Was it submitting shares at the same rate?
No, it never stops; shares are submitted at the same speed at the same temp. I've tested an identical config with 2.2.1 and it worked without any issues.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
blabla
Well that settles it, I cannot successfully re-initialise ADL. I haven't said it for a while since I've been away for a week, but thanks AMD.

I guess the other solution is for cgminer to completely restart with all its original settings. Would people like cgminer to attempt to do this? The problem with doing this unconditionally is that if a GPU has hung, usually the other GPUs can keep mining, but if you try to stop cgminer, they all stop mining. So I would need to make it try to restart itself from scratch only if it hasn't got a dead GPU. Comments?

Yes, I would like to see this ;-) Also, restart with all stats still available if possible ;-)

I will donate 10 BTC for this feature ;-)
Restart with all stats carried over might be more than a little messy, and much more prone to failure...
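
The restart idea discussed above (cache the original arguments, and only restart from scratch when no GPU is dead) could be sketched roughly like this. It is an illustration in Python rather than cgminer's C, all names are hypothetical, and the exec call is passed in as a parameter so the decision logic can be tried without actually replacing the process.

```python
import os
import sys

# Cache the original command line once at startup, before anything edits it.
INITIAL_ARGV = list(sys.argv)


def can_restart(gpu_alive):
    """Restart only when every GPU is still responding.

    If one GPU has hung, the others can usually keep mining, and stopping
    the whole process would take them down as well.
    """
    return all(gpu_alive)


def maybe_restart(adl_failed, gpu_alive, exec_fn=os.execv):
    """Re-exec with the cached arguments if ADL failed and no GPU is dead."""
    if adl_failed and can_restart(gpu_alive):
        # Replaces the running process; stats are not carried over.
        exec_fn(sys.executable, [sys.executable] + INITIAL_ARGV)
        return True
    return False


# Dry run with a stand-in for os.execv so nothing is really restarted.
calls = []
record = lambda path, argv: calls.append((path, argv))
print(maybe_restart(True, [True, True], exec_fn=record))   # all GPUs alive
print(maybe_restart(True, [True, False], exec_fn=record))  # one GPU dead
```

Re-executing with the cached arguments keeps the restart simple, which fits the concern above that carrying stats across a restart would be messier and more prone to failure.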
sr. member
Activity: 349
Merit: 250
blabla
Well that settles it, I cannot successfully re-initialise ADL. I haven't said it for a while since I've been away for a week, but thanks AMD.

I guess the other solution is for cgminer to completely restart with all its original settings. Would people like cgminer to attempt to do this? The problem with doing this unconditionally is that if a GPU has hung, usually the other GPUs can keep mining, but if you try to stop cgminer, they all stop mining. So I would need to make it try to restart itself from scratch only if it hasn't got a dead GPU. Comments?

Yes, I would like to see this ;-) Also, restart with all stats still available if possible ;-)

I will donate 10 BTC for this feature ;-)