
Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 756. (Read 5805728 times)

donator
Activity: 1218
Merit: 1079
Gerald Davis
Quote
Are you saying that these stales do not show up on cgminer ?
I can't comment on slush as he uses his own custom pool software.  I suspect he's probably eating the aux chain stales as well.  Even if he's sending LP when the aux chains update and dealing with the double LP problem he still won't be able to force cgminer clients to switch.

It's probably worth noting that a proportional pool won't actually incur obvious costs from this problem.  Unless they go looking the problem will likely only become apparent after a while when they calculate pool luck for the alt chains.
[/quote]

I think I can provide some insight.  IIRC Slush doesn't calculate shares on NMC.   The NMC reward for all NMC blocks found during one BTC block is simply split based on each miner's % of the BTC block (modified because not all miners are collecting NMC rewards).  Slush eventually wants to change this but this method was a way to get NMC off the ground quickly.  If the block is good for BTC then it isn't rejected as stale by Slush pool.

Thus an individual miner isn't going to see a problem on Slush pool.  I certainly haven't for my two rigs pointed there.  However if what you say is right then the efficiency of NMC collection is affected.  Meaning the pool as a whole isn't getting the max NMC for the raw hashpower it has, which lowers the average NMC reward for all miners (regardless of whether they use cgminer or not).

Bingo on the "luck" for NMC.  I don't know if anyone has run any NMC luck stats.  Doing so might be tough with the limited information available on NMC blocks (at least for slush pool), however I suspect you are right: as the length of history increases we would expect to see pools being "unlucky" when it comes to NMC blocks.   If 5% of hashes worked on by a pool have invalid NMC data then it isn't "luck" (they never had a chance) but it will show up as 5% unlucky (5% less NMC collected than the expected value given hashrate if all hashes are valid).
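
To put rough numbers on that 5% example, here is a minimal sketch.  The hashrates and block count are made-up figures, and the function names are just for illustration; it only shows how hashes that never had a chance on the aux chain would surface as apparent bad luck.

```python
def expected_blocks(pool_hashrate, network_hashrate, network_blocks):
    """Blocks a pool would expect to find if every hash it did was valid."""
    return network_blocks * pool_hashrate / network_hashrate


def apparent_luck(expected, invalid_fraction):
    """Ratio of blocks realistically found to blocks expected.

    Hashes done on invalid aux-chain data can never win a block, so only
    the valid fraction of the hashrate contributes to blocks found.
    """
    found_on_average = expected * (1.0 - invalid_fraction)
    return found_on_average / expected


# Made-up example: pool has 10% of the NMC network over 2000 NMC blocks.
expected = expected_blocks(pool_hashrate=1.0, network_hashrate=10.0,
                           network_blocks=2000)
luck = apparent_luck(expected, invalid_fraction=0.05)
print(f"expected {expected:.0f} blocks, apparent luck {luck:.2%}")
```

Over a short history this gets lost in normal variance, which is why it would only become visible as the history gets longer.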
sr. member
Activity: 266
Merit: 254
Quote
Are you saying that these stales do not show up on cgminer ?

It depends how the pool handles them.  As far as I can tell for most pools that use MMP the problem will be hidden so the miner is unlikely to see the stales even though they're occurring.  When psj-mm edition got to the part about handling these (because psj-mm has its own native implementation and doesn't use merged-mining-proxy) we started seeing mass stales and began investigating.  Of course psj got the blame initially because it appeared other pools were working fine (even I thought it was something wrong with psj).  But after getting all the per chain block tracking working it became pretty clear what was going on...

I know for a fact that NMCBit is wearing the losses at the moment.  They are using PSJ + MMP.  Psj is NOT compatible with MMP, the problem being that psj doesn't know when to switch to new blocks because MMP doesn't send the right signals.  He is copping a LOT of stales but you are not seeing them, and because the pool is PPS it's him that's wearing the cost, he's not passing it on to miners.  When he switches to psj-mm edition this will probably change a little as you'll actually start seeing yr own real stales.  To clarify, atm nmcbit is suffering two kinds of stales: the normal kind that is miner related (which miners should be paying for) and the bonus stales from the PSJ MMP incompatibility.  He's taking the losses for both right now.  With psj-mm the second kind won't happen but the first kind will be transferred back to the miners where it belongs.

I can't comment on slush as he uses his own custom pool software.  I suspect he's probably eating the aux chain stales as well.  Even if he's sending LP when the aux chains update and dealing with the double LP problem he still won't be able to force cgminer clients to switch.

It's probably worth noting that a proportional pool won't actually incur obvious costs from this problem.  Unless they go looking the problem will likely only become apparent after a while when they calculate pool luck for the alt chains.
hero member
Activity: 868
Merit: 1000
Shads,

Are you saying that these stales do not show up on cgminer ?

Because I am mining at both Slush and NMCBit and I have stales of less than 0.1%

I know Davinci at NMCBit is using your PoolserverJ, but Slush is running his own modified pool software and the stales there (as far as cgminer is reporting them) are equally impressively low

On 1 miner I have 8.990 Accepted and 8 Rejected

I hardly ever see a rejected share after LP finds a new block. I did in the beginning with S=60, but after I brought it down to S=8 it took care of that problem

Brat
newbie
Activity: 73
Merit: 0
I'd like to defend Merged Mining though I'm not a fan of it. Despite all the drawbacks, I can see one big advantage - more profit, so Merged Mining helps to pay the electricity bills in the time of crisis ;)
sr. member
Activity: 266
Merit: 254
c_k: yes and I don't see why it would cripple cgminer.  As I understand it (I'm no expert on cgminer) it lowers stales two ways, by prefetching work and by monitoring other pools to see if they move to a new block before the one yr mining does.

I'm not proposing to disable prev_block_hash checks, just to accept that when a LP comes in (as long as it's a proper longpoll response with a full payload) it should start working on that work and replace its cached work.  Prefetching still happens, it isn't nerfed at all, it just happens a little more often.  Monitoring other pools still gives potential advantages, it's just limited to detecting new blocks on the parent chain (bitcoin).

So no functionality is lost as far as I can see.  Some of the benefits of cgminer won't work as well for aux chain changes but there'll be no loss of functionality for the main chain so the net effect is +ve and still in excess of what other mining sw does.

kano: I'm not here to debate the merits of merged mining.   I'm not a fan either.  But the market has spoken and miners want it so if cgminer is crippled for merged-mining compared to other mining s/w (and I'd call stales several times higher competitively crippled) it's going to rapidly lose users.  I'm just presenting a solution to yet another problem MM has caused.  It's not really even in my interests do so.  Most of the miner specific issues I've had to deal with for poolserverj are issues with cgminer trying to be too clever so quite frankly it would make my life easier if conman gave MM the finger and refused to deal with it.  Still it would be a shame to lose such a well engineered tool to obsolescence.
c_k
donator
Activity: 242
Merit: 100
From my understanding, the simple solution is making a command line switch for a kind of obedient/dumb mode to disable internal checking and obey long polls sent by the pool(s).

Is this essentially all that needs to happen?
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
The issue is that it's a case of undoing/removing the code that helps make cgminer so good at avoiding stales and keeping the stats high.
It uses that information to do that.
Disabling it is, well, as I said above, crippling it.

My personal opinion of it would be that hopefully this merged mining fad will disappear soon.
It's pointless anyway.
I still don't see why people think it's a good idea to:
1) Define Namecoins as valueless (you get them for free why pay for them?) thus why bother mining them.
2) Add extra data into the bitcoin block-chain (yeah it's tiny - 46 bytes per block, but it is still extra)
3) Tie the namecoiners to the bitcoin block-chain so they can keep themselves alive - lots of people would have been mining them before if they actually wanted them ... they didn't ... ergo who gives a damn? It's keeping it alive when it should be dying (or already dead and gone)
sr. member
Activity: 266
Merit: 254
yes it will... I would personally suggest that miners who want this fixed pool together and put up a bounty.  I've looked through cgminer code and I'd guess the work involved in building it is in the same order as poolserverj (i.e. months).  I enjoyed writing poolserverj but one of the least fun things about it is supporting new systems like merged-mining which you don't have any interest in.  I'm fairly sure I wouldn't have bothered supporting MM if it weren't for tempting bounties waved in front of me.

legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
What needs fixing?

The only possible problem is if the pool passes back LP/block info on 2 different block-chains.

If that is the case, then your so called "fix" is to cripple cgminer - so no I can't see that happening.
From the previous page ...

It actually keeps track of the blocks on the chain it is mining ... so yeah that will stuff it up won't it ...
sr. member
Activity: 266
Merit: 254
I'm not sure if conman feels terribly inclined to make changes that accommodate merged mining but I've come across a serious issue that is cgminer specific so perhaps in this case he may make an exception.  If some of my assumptions here are wrong my apologies, but I've observed its behaviour and looked through the source code and I'm fairly sure I've interpreted it correctly...

The problem is summarised here:
https://bitcointalksearch.org/topic/m.593497

Code:
cgminer may be a bit too clever for its own good.  It does its own checks on whether work is valid and I presume it uses prev_block_hash to check or maybe the X-Blocknum header.  If an NMC block is found but it isn't a BTC solution then the pseudo block number will advance but the X-Blocknum header (which is for the BTC block in psj) and the prev_block_hash won't change.  So cgminer may think it's the same block and carry on using its cached work.

and a bit more detail here:
https://bitcointalksearch.org/topic/m.594007

The end result is that cgminer refuses to acknowledge that the block has changed as it doesn't accept LP as an authoritative indicator that it should discard its local work and use the new work.  This means, particularly in the case of PPS pools, that pools are likely to penalize cgminer users by either not awarding the share at all or only giving them a partial credit.

Depending on which policy they choose this could end up costing the miner either a significant chunk of their namecoin rewards or possibly even a large chunk of bitcoin rewards.  I'll go through the scenario and the possible effects:

pool rejects partial shares:
What happens is this:  The BTC and NMC blocks change but at slightly different times.  If the BTC block changes first cgminer gets the LP and acts on it as it sees a new prevblockhash.  A few seconds later the NMC block changes.  This doesn't change the prevblockhash so cgminer carries on with the work it already has.  When it submits a share the pool sees that it's only valid for one of the blockchains and rejects it as stale.  This continues for as long as cgminer would normally hold onto the same work (up to 60 seconds) or until the BTC block changes again.  In between BTC blocks you'd usually expect several NMC blocks as the difficulty is lower.  So even if cgminer has given up the work after 60 seconds, as soon as another NMC block comes along the same situation occurs.  The miner is working on work that is only valid for one chain.

This is not theoretical, it's really happening and it's being hugely under-reported on most pools (see next para for explanation of that).  I've seen it in testing and on production pools.  Overall stales are a little higher than they were pre-merged mining, which is partly to do with more frequent block changes and partly due to some design clashes between merged mining and longpolling.  But cgminer stales are typically many times higher than for other miners.  The 'dumb' miners (i.e. the ones that just accept new work from the LP and don't check if it's a new block) work fine because they trust that the pool is right when it says 'new block'.

Many merged mining pools which aren't using poolserverj probably have this problem also but it's invisible to them.  Merged-mining-proxy does not do any sort of LP and unless the pool ops have made some fairly invasive changes to pushpool it won't be internally aware of changing blocks on different chains either.   I suspect they've just been accepting shares they shouldn't have.  When they all start to realise what's going on expect some policy changes from these pools.

The fix seems fairly simple on the surface but the devil is always in the details...  I think conman is not keen on blindly accepting LPs as some pools send out bullshit LPs occasionally (even poolserverj does if the block doesn't change after 10mins).  But it needs to be an option.  Perhaps a command line switch that miners can use only if they are merged mining.  All it needs to do is replace the work it's currently hashing with the contents of the longpoll (regardless of whether prevblockhash is different) and clear its cache.  There's no cost to doing this except perhaps a few getworks to fill up its cache again.
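
As a rough illustration of the behaviour being asked for (this is not cgminer's code, which is C; the class names and the `obey_longpoll` switch are hypothetical), a sketch of a work queue that either checks prev_block_hash before acting on a longpoll or simply obeys it:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Work:
    prev_block_hash: str  # parent (BTC) chain only; aux chain changes don't touch it
    data: str


@dataclass
class WorkQueue:
    obey_longpoll: bool = False          # hypothetical "dumb mode" switch
    current: Optional[Work] = None
    cached: List[Work] = field(default_factory=list)

    def on_longpoll(self, new_work: Work) -> bool:
        """Handle a longpoll response; return True if we switched work."""
        same_block = (self.current is not None and
                      new_work.prev_block_hash == self.current.prev_block_hash)
        if same_block and not self.obey_longpoll:
            # "Clever" mode: prev_block_hash unchanged, so keep hashing old work.
            return False
        # Obedient mode (or a genuinely new BTC block): replace work, clear cache.
        self.current = new_work
        self.cached.clear()
        return True


# An NMC-only block change: new work arrives with the same BTC prev_block_hash.
old = Work("btc-block-100", "work-A")
new = Work("btc-block-100", "work-B")

clever = WorkQueue(current=old, cached=[Work("btc-block-100", "spare")])
dumb = WorkQueue(obey_longpoll=True, current=old,
                 cached=[Work("btc-block-100", "spare")])

print(clever.on_longpoll(new), clever.current.data)
print(dumb.on_longpoll(new), dumb.current.data)
```

In this sketch the only cost of the obedient path is the emptied cache, which matches the "few getworks to fill up its cache again" point above.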


hero member
Activity: 812
Merit: 510
It was something that I hadn't fully planned out, but since I found the option to load balance, I'm happy.
donator
Activity: 1218
Merit: 1079
Gerald Davis
I have a feeling this might be a windows problem, and I'll be forced to move to linux for the future, but I cannot run more than two separate instances of CGMiner...

The first two run just fine, but every subsequent one I try to open will try to start cgminer, then the miner will crash.

I have tried moving each miner into its own folder and it still didn't help the problem.

OS is Windows 7 x64.

Why are you running multiple instances of cgminer?  The entire reason for cgminer is to avoid the need for multiple instances.  It can mine on 8 GPUs from a single console plus manage multiple pools, failovers, overclocking, fan speeds, etc.  Throw in monitoring of output, temps, speeds, stales, etc.

hero member
Activity: 812
Merit: 510
I have a feeling this might be a windows problem, and I'll be forced to move to linux for the future, but I cannot run more than two separate instances of CGMiner...

The first two run just fine, but every subsequent one I try to open will try to start cgminer, then the miner will crash.

I have tried moving each miner into its own folder and it still didn't help the problem.

OS is Windows 7 x64.
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
... read the FAQ in the README ...
legendary
Activity: 2955
Merit: 1049
I am running 2.0.7 and really liking it.  With my setup, when a worker dies it won't restart; it stays on "DEAD" forever.  If I try to quit by pressing q, it hangs and "ps" shows the process as defunct.  "kill" and "kill -9" don't seem to restart it.  If I reboot the computer while it is locked up, when I try to start cgminer again it doesn't see my GPUs and instead starts mining the CPU.  If I reboot it again it comes up working, but then the autofan and the overclocks don't work.

Same here:
after a short time on the remote machine GPU 1 is dead and it stays on "DEAD" forever. After q it hangs and "ps" shows the process as defunct.  "kill" and "kill -9" don't seem to work.

Code:
PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                  
2535 xxx       20   0     0    0    0 Z  100  0.0   2247:46 cgminer
If I try to kill the process or its parent process with "kill -9 xxxx", it does not work,
and "ps -A | grep defunct" still shows the same thing afterwards.

I have to hard reset the machine to get it booting again.

1. So is there any other way to kill the process without needing to reboot?

2. I have now upgraded this machine to Ubuntu 11.10 and cgminer now has 100% CPU usage - any hints how to change this?
(it was not like this with Ubuntu natty...)
TIA
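
On question 1, some general background: a defunct (zombie) process is already dead, so "kill -9" on it does nothing; the entry only disappears once its parent reaps it or the parent itself exits.  A small sketch for finding zombies and their parent PIDs is below.  The sample text is made up (only PID 2535 comes from the output above), and whether dealing with the parent helps in this particular case is an assumption: a process stuck inside a GPU driver call may not go away without a reboot.

```python
import subprocess


def find_zombies(ps_output):
    """Parse `ps -eo pid,ppid,stat,comm` output; return (pid, ppid, command)
    for every process whose state starts with Z (zombie/defunct)."""
    zombies = []
    for line in ps_output.splitlines()[1:]:      # skip the header row
        parts = line.split(None, 3)
        if len(parts) == 4 and parts[2].startswith("Z"):
            zombies.append((int(parts[0]), int(parts[1]), parts[3]))
    return zombies


def list_zombies():
    """Run ps on the local machine and return its zombies."""
    out = subprocess.run(["ps", "-eo", "pid,ppid,stat,comm"],
                         capture_output=True, text=True, check=True).stdout
    return find_zombies(out)


# Made-up sample output for illustration.
sample = """  PID  PPID STAT COMMAND
 2501     1 Ss   bash
 2535  2501 Zl   cgminer
"""
for pid, ppid, name in find_zombies(sample):
    print(f"{name} (pid {pid}) is defunct; its parent is pid {ppid}")
```

If the parent turns out to be a shell or wrapper script, ending that parent is what lets the zombie entry be cleaned up.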

legendary
Activity: 2688
Merit: 1240
Oh no, it's not running on my mac, it is just an ssh shell to my linux box :-)
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
Got an explanation for this ?

http://img820.imageshack.us/img820/8074/bildschirmfoto20111025u.png

It happens sometimes after 5, sometimes after 10 and sometimes after 15 minutes.

CGMiner sometimes "catches itself" and runs again fine for 5-10 Minutes or it crashes completely leaving the system
unstable and I have to reboot completely.

If I do not use auto-fan and auto-gpu it works fine..

(also tried with 2.0.7)
I guess you should also ask:
Is anyone else with a Mac getting the same problems with those versions (and who built that version you are using?)
(no I've no idea but you'll need someone with the same setup as you to be able to help you)
legendary
Activity: 2688
Merit: 1240
Got an explanation for this ?

http://img820.imageshack.us/img820/8074/bildschirmfoto20111025u.png

It happens sometimes after 5, sometimes after 10 and sometimes after 15 minutes.

CGMiner sometimes "catches itself" and runs again fine for 5-10 Minutes or it crashes completely leaving the system
unstable and I have to reboot completely.

If I do not use auto-fan and auto-gpu it works fine..

(also tried with 2.0.7)
hero member
Activity: 742
Merit: 500
I am running 2.0.7 and really liking it.  With my setup, when a worker dies it won't restart; it stays on "DEAD" forever.  If I try to quit by pressing q, it hangs and "ps" shows the process as defunct.  "kill" and "kill -9" don't seem to restart it.  If I reboot the computer while it is locked up, when I try to start cgminer again it doesn't see my GPUs and instead starts mining the CPU.  If I reboot it again it comes up working, but then the autofan and the overclocks don't work.

This is what it looks like right now after I started it after a crash

Code:
 cgminer version 2.0.7 - Started: [2011-10-24 22:04:16]
--------------------------------------------------------------------------------
 (5s):891.6 (avg):877.0 Mh/s | Q:56  A:37  R:6  HW:0  E:66%  U:8.32/m
 TQ: 2  ST: 3  SS: 1  DW: 1  NB: 3  LW: 9  GF: 1  RF: 9
 Connected to http://arsbitcoin.com:8344 with LP as user xxxx.xxxx
 Block: 000006862cdca4523deef46bf7e85089...  Started: [22:06:13]
--------------------------------------------------------------------------------
 [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0: 321.6/320.7Mh/s | A:16 R:2 HW:0 U:3.60/m I:4
 GPU 1: 321.6/316.2Mh/s | A:14 R:1 HW:0 U:3.15/m I:4
 GPU 2: 248.5/243.5Mh/s | A:7 R:4 HW:0 U:1.57/m I:4
---------------------------------------------------------

When I first ran the program, it displayed my fan RPMs and was able to change the gpu and memory clocks.  But those options are not showing.

If I open up "AMDOverdriveCtrl" I can overclock my cards and get above 1000Mh/s, but I like having everything in cgminer.  Is there some file that I need to delete or something?  I don't know how I got it working again fully last time, all I did was reboot it.  What am I missing?

I expect to see something like

Code:
 cgminer version 2.0.7 - Started: [2011-10-24 22:04:16]
--------------------------------------------------------------------------------
 (5s):891.6 (avg):877.0 Mh/s | Q:56  A:37  R:6  HW:0  E:66%  U:8.32/m
 TQ: 2  ST: 3  SS: 1  DW: 1  NB: 3  LW: 9  GF: 1  RF: 9
 Connected to http://arsbitcoin.com:8344 with LP as user xxxx.xxxx
 Block: 000006862cdca4523deef46bf7e85089...  Started: [22:06:13]
--------------------------------------------------------------------------------
 [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0: 321.6/320.7Mh/s 4000RPM | A:16 R:2 HW:0 U:3.60/m I:4
 GPU 1: 321.6/316.2Mh/s | A:14 R:1 HW:0 U:3.15/m I:4
 GPU 2: 248.5/243.5Mh/s 2500RPM | A:7 R:4 HW:0 U:1.57/m I:4
---------------------------------------------------------

Thanks again for this awesome program.

EDIT: Well I was able to get the fan RPMs and clocks working again.  I had to "export DISPLAY=:0" (which I also have to do for AMDOverdriveCtrl) since I am starting this from a remote ssh session and I'm not on the box.  I still don't know why the GPUs aren't detected after a crash though.
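
For anyone else starting cgminer over ssh, a minimal sketch of a launcher that sets DISPLAY before starting the miner.  The helper names are made up, and the command is just whatever you normally run; no particular cgminer options are assumed.

```python
import os
import subprocess


def env_with_display(display=":0", base=None):
    """Return a copy of the environment with DISPLAY pointing at the local X server."""
    env = dict(os.environ if base is None else base)
    env["DISPLAY"] = display
    return env


def launch(cmd):
    """Start the given command (e.g. ["cgminer"]) with DISPLAY set."""
    return subprocess.Popen(cmd, env=env_with_display())


# Show what the child process would see, without actually starting a miner.
print(env_with_display(base={"PATH": "/usr/bin"}))
```

This does the same job as typing "export DISPLAY=:0" in the ssh session first.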
donator
Activity: 1218
Merit: 1079
Gerald Davis
If you don't see anything in netstat other than the TCP Wait, I would guess there is only communication to check for new blocks and to submit a found block.  In that case, presumably a found block would be cached if communication was interrupted, but I don't know if it would be quickly discarded the way a share is.  Also, while I don't know, I would guess that something would indicate the communication problem when it became apparent to the miner (puddinpop's cuda miner will tell you communication is interrupted and continue mining and telling you it doesn't have communication).  If it does, watching for that would make more sense than keeping track of rejected shares that don't benefit the server anyway.  However, since communication isn't likely to be a miner problem, anything that might cause a communication problem could probably be monitored elsewhere (monitor bitcoind or Internet connectivity or whatever you need for what you're doing).


who wants to go thru all that...  a quick glance at cgminer humming along showing rejected shares is a lot easier......  How about a menu option for it?

How about an explicit error message instead.  "Unable to connect to bitcoind @ 192.168.0.x".  Shares don't exist for solo mining so reporting them is ambiguous at best and misleading at worst.  I agree the current implementation of "nothing" isn't ideal but going back to providing false data isn't ideal either.

Simply have the miner report solo errors w/ timestamp.
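
Something along these lines, as a sketch only: the function name is made up, and the timestamp format simply copies the one cgminer already uses in its status header.

```python
from datetime import datetime


def solo_error(host, when=None):
    """Format a timestamped connection error for solo mining."""
    when = when or datetime.now()
    return f"[{when:%Y-%m-%d %H:%M:%S}] Unable to connect to bitcoind @ {host}"


print(solo_error("192.168.0.x"))
```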