Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 469. (Read 5805546 times)

legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
After messing with the MMQ initialisation code for a while I've rewritten it and solved the hang I was getting.
It's not in any git yet.

Is anyone using (working) cgminer on an MMQ on linux at the moment?

If so - what do all of
Code:
uname -r
cat /etc/*release
ls -las /dev/tty[AU]*
return?

As I've mentioned before, the current git code just hangs during initialisation for me on Xubuntu 11.04 and Fedora 17, because termios doesn't handle the ACM device as a tty on either of my Linux versions.

I'm going to redo the old Frequency management code (cgminer doesn't currently contain the newer Frequency code in BarbieMiner)
then put up a pull with just those 2 changes so people can try it out and let me know if there are any problems with them
The initialisation change is only for linux

Redoing the threading will be done next after that
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Yes that is interesting. I'm guessing you have underclocked your memory exceptionally low, as that was found to be an issue with use of atomic ops. Some people found a bump of 15 in memory was enough to correct it. Lack of atomic functions there could lead to HW errors and loss of shares. It's a tradeoff either way. The change was put in there to make sure no shares were lost, which can happen with the old opencl code (though it's only a very small number that would be lost).

Ah, ok! Thanks for the info. Yep, I'm at 150MHz mem clock. It's to prevent simultaneous nonce finds on different vectors from overwriting the result at the same address, right?

I prefer the tradeoff tbh, I did the math a while ago on the probability of that happening (P=1/(2^32)*1/(2^32)=1/(2^64). On a 1GH/s card, that will happen on average once every ~585 years)

I've been using that optimization tradeoff I posted for more than a year now! ;D
Code:
#elif defined VECTORS2
    uint result = W[117].x ? 0u : W[3].x;
    result = W[117].y ? result : W[3].y;
    if (result)
        SETFOUND(result);
No, you're not quite right there btw. There are a few issues that made me use the atomic ops instead.

There is no way to return a nonce value of 0.
Bitmasked nonce values can also be zero meaning they get lost.
It is not just vectors that find nonces at the same time, it's a whole wave front of threads finding nonces at the same time and corrupting both values.
Bitmasked nonce values from results found in the same global worksize can come out the same value and overwrite each other.
It's to consolidate the return values from different kernels and decrease the CPU usage of the return code that checks the nonce values.

Again, very small but far from 2^64. Since bitcoin mining is a game of odds, I didn't see the point of losing that - provided you don't drop the hashrate of course. It's unusual that some devices need higher memory speed just for one atomic op but clearly it's a massively memory intensive operation that affects the whole wave front. Considering increasing ram speed by 15 or 20 would not even register in terms of extra power usage and temperature generated, to me at least it seems a better option.
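
To make the wavefront point above concrete, here is a minimal sketch in Python (not the actual kernel code) that models one possible interleaving: every work-item reads the counter before any of them writes it back. The buffer layout, the `FOUND` index and both function names are assumptions made up for illustration; on real hardware the interleaving of the non-atomic version is simply undefined.

```python
FOUND = 4  # hypothetical index of the counter slot; nonces go in slots 0..3


def plain_increment(nonces):
    """Model output[output[FOUND]++] = nonce run in lockstep by several work-items."""
    output = [0] * (FOUND + 1)
    # Phase 1: every work-item reads the counter before anyone has written it back.
    slots = [output[FOUND] for _ in nonces]
    # Phase 2: every work-item stores its nonce and its own idea of the new count.
    for slot, nonce in zip(slots, nonces):
        output[slot] = nonce
        output[FOUND] = slot + 1
    return output


def atomic_increment(nonces):
    """Model output[atomic_add(&output[FOUND], 1)] = nonce."""
    output = [0] * (FOUND + 1)
    for nonce in nonces:
        slot = output[FOUND]      # atomic_add hands back the old count...
        output[FOUND] = slot + 1  # ...and bumps it as one indivisible step
        output[slot] = nonce
    return output


found = [0xAAAA, 0xBBBB, 0xCCCC]  # three nonces found in the same wavefront
print(plain_increment(found))   # one slot used, two nonces overwritten
print(atomic_increment(found))  # three slots used, nothing lost
```

In this model the plain version ends up with a count of 1 and only the last nonce stored, while the atomic version keeps all three. It only illustrates the failure mode, it says nothing about how often it happens in practice.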

But the beauty of free software is you can do whatever you like to the code if you don't like the way I do it ;)
Vbs
hero member
Activity: 504
Merit: 500
Yes that is interesting. I'm guessing you have underclocked your memory exceptionally low, as that was found to be an issue with use of atomic ops. Some people found a bump of 15 in memory was enough to correct it. Lack of atomic functions there could lead to HW errors and loss of shares. It's a tradeoff either way. The change was put in there to make sure no shares were lost, which can happen with the old opencl code (though it's only a very small number that would be lost).

Ah, ok! Thanks for the info. Yep, I'm at 150MHz mem clock. It's to prevent simultaneous nonce finds on different vectors from overwriting the result at the same address, right?

I prefer the tradeoff tbh, I did the math a while ago on the probability of that happening (P=1/(2^32)*1/(2^32)=1/(2^64). On a 1GH/s card, that will happen on average once every ~585 years)

I've been using that optimization tradeoff I posted for more than a year now! ;D
Code:
#elif defined VECTORS2
    uint result = W[117].x ? 0u : W[3].x;
    result = W[117].y ? result : W[3].y;
    if (result)
        SETFOUND(result);
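
For anyone who wants to check the ~585 years figure, the arithmetic can be sketched in a few lines of Python. This only reproduces the estimate as stated (two independent 1-in-2^32 events per hash, a steady 1 GH/s); the function name and the 365.25-day year are assumptions for illustration, and it does not settle whether that probability model covers every way a nonce can be lost.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_between_events(probability_per_hash, hashes_per_second):
    """Average wait, in years, for an event with the given per-hash probability."""
    expected_hashes = 1.0 / probability_per_hash
    return expected_hashes / hashes_per_second / SECONDS_PER_YEAR


# P = 1/(2^32) * 1/(2^32) = 1/(2^64), as in the estimate above
p = (1 / 2**32) * (1 / 2**32)
print(years_between_events(p, 1e9))  # 1 GH/s card
```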
hero member
Activity: 742
Merit: 500
Its as easy as 0, 1, 1, 2, 3
Current code

Code:
./cgminer --scrypt -o http://ltc.kattare.com:9332 -u jasinlee.1 -p 1 -g 1 --thread-concurrency 8000 -I 18 -w 256 --auto-fan


I assume this probably has something to do with it.

Code:
 [2012-10-14 21:35:49] Failed to init GPU thread 0, disabling device 0
 [2012-10-14 21:35:49] Restarting the GPU from the menu will not fix this.
 [2012-10-14 21:35:49] Try restarting cgminer.
Press enter to continue:

 [2012-10-14 21:36:09] Your scrypt settings come to 524288000
 [2012-10-14 21:36:09] Error -61: clCreateBuffer (padbuffer8), decrease CT or increase LG
 [2012-10-14 21:36:09] Failed to init GPU thread 7, disabling device 7

Oh,

Use --thread-concurrency 7168 or 5632. That's cgminer sucking because of ocl memory consumption problems.

Any ideas you could give us on this? We've tried so many things already and it's just not quite working.
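
A rough way to see where the 524288000 in the log comes from: 8000 x 65536 = 524288000, which fits a buffer of 128 bytes x (1024 / lookup gap) x thread concurrency with a lookup gap of 2. The sketch below assumes that formula and that lookup gap; both are inferred from the numbers in the log rather than taken from the cgminer source, and the function name is made up. OpenCL error -61 is CL_INVALID_BUFFER_SIZE, so the request is presumably larger than what the device will allocate in one buffer.

```python
def padbuffer_bytes(thread_concurrency, lookup_gap=2):
    """Estimated scrypt pad buffer size in bytes (assumed formula, see note above)."""
    # Entries kept per thread: ceil(1024 / lookup_gap)
    entries_per_thread = 1024 // lookup_gap + (1 if 1024 % lookup_gap else 0)
    return 128 * entries_per_thread * thread_concurrency


for tc in (8000, 7168, 5632):
    size = padbuffer_bytes(tc)
    print(f"--thread-concurrency {tc}: {size} bytes ({size / 2**20:.0f} MiB)")
```

Under these assumptions, lowering the thread concurrency or raising the lookup gap both shrink the buffer, which matches the "decrease CT or increase LG" hint in the error message.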
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Just updated to 2.8.3.  I have to say I really don't like the new precision on numeric output:

Quote
GPU 0:  73.0C 2900RPM | 604.2M/661.2Mh/s | A:3 R:0 HW:0 U: 7.03/m I: 9
GPU 1:  74.0C 2912RPM | 595.1M/659.3Mh/s | A:5 R:0 HW:0 U:11.72/m I: 9
GPU 2:  74.0C 2922RPM | 611.8M/658   Mh/s | A:10 R:0 HW:0 U:23.44/m I: 9

Could you put a .0 on there instead of all that blank space?  It's a minor thing, but I like it better that way.
This is a side effect of trying to find a single format that stays aligned on the screen and fits values from 0 to 18,446,744,073,709,551,616, while still maintaining adequate precision for the relative rate of that device. It is not entirely straightforward, and what to do about zeroes is never going to be to everyone's satisfaction: 001.0, 01.00 or 1.000?

By the way, that massive value would show up as 18.45EH/s with that current scheme, so that it could show up aligned on the same screen as something with 0.001 H/s.
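
As an illustration of the idea only (this is not cgminer's actual formatting code), here is a small Python sketch that scales a rate by powers of 1000 and keeps four significant digits. The function name, the unit table and the rounding rules are all assumptions.

```python
UNITS = ["", "K", "M", "G", "T", "P", "E"]


def format_rate(hashes_per_second):
    """Format a hashrate with an SI prefix and four significant digits."""
    value = float(hashes_per_second)
    unit = 0
    # Scale down by 1000 until the value fits under 1000 or we run out of prefixes.
    while value >= 1000.0 and unit < len(UNITS) - 1:
        value /= 1000.0
        unit += 1
    # Fewer decimals for larger values keeps the digit count constant.
    if value >= 100:
        decimals = 1
    elif value >= 10:
        decimals = 2
    else:
        decimals = 3
    return f"{value:.{decimals}f}{UNITS[unit]}H/s"


print(format_rate(2**64))   # the massive value mentioned above
print(format_rate(658e6))   # a whole number of MH/s, padded with .0
print(format_rate(0.001))
```

This sketch always pads with zeroes, which is just one of the possible answers to the 001.0 / 01.00 / 1.000 question, and it does not handle values that round up to 1000.0 at a unit boundary.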
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Sorry if this has been asked many times, but why is one version stable and another simply latest? What does it take for a version to become stable?
I ask because 2.8.3 is crashing about every 3 days.
I think you answered your own question... I will only call 2.8.x stable once it is stable everywhere. I'm trying hard to find the remaining bug(s) in 2.8.3, and it appears to only be affecting windows users, but the bug is eluding me.
sr. member
Activity: 336
Merit: 250
Sorry if this has been asked many times, but why is one version stable and another simply latest? What does it take for a version to become stable?
I ask because 2.8.3 is crashing about every 3 days.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I was running cgminer 2.5.0 till today, as the latest versions had a performance hit on my 5850's (win x64, SDK 2.5 on Cat 12.1), since the last phatk update (~400MH/s to ~320MH/s).

Since I had some free time today I went to check what had changed from phatk120223 to phatk120823 and found what was causing me that performance hit. Commenting the problematic lines fixed it:

Code:
//#if defined(OCL1)
#define SETFOUND(Xnonce) output[output[FOUND]++] = Xnonce
//#else
// #define SETFOUND(Xnonce) output[atomic_add(&output[FOUND], 1)] = Xnonce
//#endif

I still find it strange how the atomic_add could be responsible for that much of a mhash hit, since it will only be called on found nonces. :P

(Running 2.8.3 now, will keep monitoring performance for any issues)
Yes that is interesting. I'm guessing you have underclocked your memory exceptionally low, as that was found to be an issue with use of atomic ops. Some people found a bump of 15 in memory was enough to correct it. Lack of atomic functions there could lead to HW errors and loss of shares. It's a tradeoff either way. The change was put in there to make sure no shares were lost, which can happen with the old opencl code (though it's only a very small number that would be lost).
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
This post is the most interesting and do you know why? The GPU code is UNCHANGED between 2.7.6 and 2.8.3. The only thing that is different is the stratum code. Now why would that make your GPUs SICK now when they didn't previously? Because with the stratum code, the device is busier than ever. There is no possible way to keep the device as busy as doing it in the c code internally in cgminer. So your devices are no longer getting any rest between work.

EDIT: This is precisely why stratum was developed by the way; in order to be able to keep much higher hashrate devices busy. If you want to test this theory, use version 2.8.3 and connect to the proxy in http mode by using --fix-protocol.

That is quite interesting, I must admit.
My GPUs are mostly at 99% load. I have the CCC open all the time, but must admit I don't look all the time.
Is it that the load sometimes drops for a short period, which lets the GPU cool or recover so it appears to last longer, whereas on stratum it is at 99% load 24h a day?

Also, I have a request:
Is it possible to show the fan percentage along with the RPM? I use CCC to monitor fan percentage, because that is my measure of when and if my computers need to be air conditioned.
As long as fan percentage is below my set max of 55% I declare them sufficient and do not reduce engine clock. (But on the other hand, it seems I need to, for stratum to work?)
Yes, but the load only drops for absolutely tiny amounts of time, which may not even be visible in the reported GPU load, which isn't an accurate measure anyway.

I only show fan percentage when fan RPM is not available for that particular device. The reason? The fan percentage is just the value you have told the device to run... it could happily say 55% and the fan may have stopped spinning for some reason which is inherently dangerous. The fan speed rpm is a monitor, it is not a setting.
legendary
Activity: 2576
Merit: 1186
Just updated to 2.8.3.  I have to say I really don't like the new precision on numeric output:

Quote
GPU 0:  73.0C 2900RPM | 604.2M/661.2Mh/s | A:3 R:0 HW:0 U: 7.03/m I: 9
GPU 1:  74.0C 2912RPM | 595.1M/659.3Mh/s | A:5 R:0 HW:0 U:11.72/m I: 9
GPU 2:  74.0C 2922RPM | 611.8M/658   Mh/s | A:10 R:0 HW:0 U:23.44/m I: 9

Could you put a .0 on there instead of all that blank space?  It's a minor thing, but I like it better that way.
You might prefer BFGMiner's format:
Code:
BFL 0:  68.3C         | 877.2/868.3/865.6Mh/s | A:14233 R:164 HW:  28 U:12.09/m
BFL 1:  68.3C         | 1.477/1.468/1.465Gh/s | A:28533 R:164 HW:  28 U:24.18/m

Cgminer now displays the actual share difficulty target it hit as well as the current pool difficulty like so:
Code:
 [2012-10-12 21:00:31] Accepted 2687d42a Diff 6/3 GPU 1 pool 2
 [2012-10-12 21:00:41] Accepted 00f98044 Diff 262/3 GPU 3 pool 2
 [2012-10-12 21:00:45] Accepted 3840818b Diff 4/3 GPU 2 pool 2
 [2012-10-12 21:00:55] Accepted 35777786 Diff 4/3 GPU 1 pool 2

I'm not understanding this.  Aren't these difficulty numbers multipliers of current difficulty?  How/why would a miner be working on a higher difficulty than the pool is requesting?
Miners are trying to find hashes that meet a minimum of the current difficulty. So if your hash only hit difficulty 1, it is not good enough for a difficulty 3 pool. If your hash hits the Bitcoin difficulty, you found a block.
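
To put numbers on that, here is a minimal Python sketch of how a share's difficulty relates to its hash. It assumes the usual Bitcoin difficulty-1 target (0xFFFF followed by 52 hex zeroes) and treats the hash as a plain 256-bit integer; the function names are made up for illustration, and pools may use a slightly different difficulty-1 constant.

```python
DIFF1_TARGET = 0xFFFF << 208  # assumed difficulty-1 target: 0x00000000FFFF0000...0000


def share_difficulty(hash_value):
    """Difficulty actually reached by a hash: the smaller the hash, the higher it is."""
    return DIFF1_TARGET / hash_value


def meets(hash_value, difficulty):
    """True if the hash is at or below the target for the given difficulty."""
    return hash_value * difficulty <= DIFF1_TARGET


# A synthetic hash that lands exactly on difficulty 6, like the "Diff 6/3" log line
h = DIFF1_TARGET // 6
print(share_difficulty(h), meets(h, 3))  # good enough for a difficulty 3 pool
print(meets(DIFF1_TARGET, 3))            # a difficulty 1 hash is not
```

The miner has no control over which difficulty a hash reaches. It just hashes, and anything at or above the pool's difficulty gets submitted, so "Diff 262/3" is simply a lucky hash on a difficulty 3 pool.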
sr. member
Activity: 406
Merit: 250
This post is the most interesting and do you know why? The GPU code is UNCHANGED between 2.7.6 and 2.8.3. The only thing that is different is the stratum code. Now why would that make your GPUs SICK now when they didn't previously? Because with the stratum code, the device is busier than ever. There is no possible way to keep the device as busy as doing it in the c code internally in cgminer. So your devices are no longer getting any rest between work.

EDIT: This is precisely why stratum was developed by the way; in order to be able to keep much higher hashrate devices busy. If you want to test this theory, use version 2.8.3 and connect to the proxy in http mode by using --fix-protocol.

That is quite interesting, I must admit.
My GPUs are mostly at 99% load. I have the CCC open all the time, but must admit I don't look all the time.
Is it that the load sometimes drops for a short period, which lets the GPU cool or recover so it appears to last longer, whereas on stratum it is at 99% load 24h a day?

Also, I have a request:
Is it possible to show the fan percentage along with the RPM? I use CCC to monitor fan percentage, because that is my measure of when and if my computers need to be air conditioned.
As long as fan percentage is below my set max of 55% I declare them sufficient and do not reduce engine clock. (But on the other hand, it seems I need to, for stratum to work?)
hero member
Activity: 591
Merit: 500
I'm not understanding this.  Aren't these difficulty numbers multipliers of current difficulty?  How/why would a miner be working on a higher difficulty than the pool is requesting?
It has to be able to hit a higher difficulty or else you'd never find a block. :P
legendary
Activity: 3583
Merit: 1094
Think for yourself
Cgminer now displays the actual share difficulty target it hit as well as the current pool difficulty like so:
Code:
 [2012-10-12 21:00:31] Accepted 2687d42a Diff 6/3 GPU 1 pool 2
 [2012-10-12 21:00:41] Accepted 00f98044 Diff 262/3 GPU 3 pool 2
 [2012-10-12 21:00:45] Accepted 3840818b Diff 4/3 GPU 2 pool 2
 [2012-10-12 21:00:55] Accepted 35777786 Diff 4/3 GPU 1 pool 2

I'm not understanding this.  Aren't these difficulty numbers multipliers of current difficulty?  How/why would a miner be working on a higher difficulty than the pool is requesting?
hero member
Activity: 591
Merit: 500
$readonly = false;
$poolinputs = true;

As per the API-README at the end about miner.php

Also see API-README about enabling privileged access to cgminer via the API
(since changing cgminer is privileged access)
Wow, that's really cool. Thanks. :)
hero member
Activity: 518
Merit: 500
Manateeeeeeees
Just updated to 2.8.3.  I have to say I really don't like the new precision on numeric output:

Quote
GPU 0:  73.0C 2900RPM | 604.2M/661.2Mh/s | A:3 R:0 HW:0 U: 7.03/m I: 9
GPU 1:  74.0C 2912RPM | 595.1M/659.3Mh/s | A:5 R:0 HW:0 U:11.72/m I: 9
GPU 2:  74.0C 2922RPM | 611.8M/658   Mh/s | A:10 R:0 HW:0 U:23.44/m I: 9

Could you put a .0 on there instead of all that blank space?  It's a minor thing, but I like it better that way.
Vbs
hero member
Activity: 504
Merit: 500
I was running cgminer 2.5.0 till today, as the latest versions had a performance hit on my 5850's (win x64, SDK 2.5 on Cat 12.1), since the last phatk update (~400MH/s to ~320MH/s).

Since I had some free time today I went to check what had changed from phatk120223 to phatk120823 and found what was causing me that performance hit. Commenting the problematic lines fixed it:

Code:
//#if defined(OCL1)
#define SETFOUND(Xnonce) output[output[FOUND]++] = Xnonce
//#else
// #define SETFOUND(Xnonce) output[atomic_add(&output[FOUND], 1)] = Xnonce
//#endif

I still find it strange how the atomic_add could be responsible for that much of a mhash hit, since it will only be called on found nonces. :P

(Running 2.8.3 now, will keep monitoring performance for any issues)
legendary
Activity: 1540
Merit: 1001
I'm getting very high CPU usage on Win7 using ver 2.7.7. Basically 1 core is almost always near 100%. Anyone else with the same issue?

Usually this is intensity related.  Anything higher than 9 will spike CPU usage.

M
newbie
Activity: 43
Merit: 0
In the night all my cgminers on win32 died all at the same time.

I believed you the first time. What I don't have, as yet, is an explanation or anything in particular to debug since I can't reproduce it yet.

Maybe if you run one instance in debug mode (-D -T) you can see what the last thing is that is logged.

To find the error, I am running one instance with --fix-protocol.
My idea is that the exception may occur when the DSL router reconnects and stratum loses its TCP connection.
Logging is also enabled now.

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
In the night all my cgminers on win32 died all at the same time.

I believed you the first time. What I don't have, as yet, is an explanation or anything in particular to debug since I can't reproduce it yet.

Maybe if you run one instance in debug mode (-D -T) you can see what the last thing is that is logged.
newbie
Activity: 43
Merit: 0
In the night all my cgminers on win32 died all at the same time.