
Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 223. (Read 5805531 times)

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
So... would it be useful to run w/o a backup pool to see if the problem persists? 
It certainly would not hurt to try. I have yet to upload debug builds of 3.4.3 and will do so later today, but the underlying stratum code has not changed since 3.4.2, it just had added features rather than changing the codebase.
legendary
Activity: 1540
Merit: 1001
I'm not sure of the threading level within stratum when on a single pool.
ckolivas will know the answer to that.
All writes are done from one thread. All reads are done from one thread. Therefore there is no chance of there being a race on reads or a race on writes. I very much doubt windows has an issue of writing and reading the same socket at the same time, nor would there be a problem using two different sockets at the same time. Debugging points to something in socket code, possibly on reconnecting or trying to communicate with a backup pool. I keep auditing the code and come up none the wiser, but I will continue to do so, or give up entirely and stop supporting windows (yeah right, 85% of cgminer users are on windows sigh).

So... would it be useful to run w/o a backup pool to see if the problem persists?  FYI I've only had it happen once.  But I regularly only let it run 24 hours before I restart it for one reason or another.  Until recently.  I'm remote from my miners for a bit ... from the looks of things my erupters ran for about 3 days before crashing.  I'm hoping my wife can restart it, but it's asking too much to get a screenshot of the error.  Last time I had the crash it was after 36 hours.  And my machine with only 2 erupters is still running.

M
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I'm not sure of the threading level within stratum when on a single pool.
ckolivas will know the answer to that.
All writes are done from one thread. All reads are done from one thread. Therefore there is no chance of there being a race on reads or a race on writes. I very much doubt windows has an issue of writing and reading the same socket at the same time, nor would there be a problem using two different sockets at the same time. Debugging points to something in socket code, possibly on reconnecting or trying to communicate with a backup pool. I keep auditing the code and come up none the wiser, but I will continue to do so, or give up entirely and stop supporting windows (yeah right, 85% of cgminer users are on windows sigh).
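A minimal sketch of the threading model described above: one thread does every write and one thread does every read on the same socket. This is not cgminer's code. The `run_demo` and `echo_pool` names are hypothetical, and a local `socketpair` stands in for the pool connection.

```python
import socket
import threading


def run_demo(messages):
    """Send messages from a single writer thread while a single reader thread receives."""
    # socketpair() returns two connected sockets; `ours` plays the miner side,
    # `pool` plays a fake pool that echoes everything back.
    ours, pool = socket.socketpair()
    received = []

    def writer():
        # The only thread that ever writes to `ours`.
        for msg in messages:
            ours.sendall(msg.encode() + b"\n")

    def reader():
        # The only thread that ever reads from `ours`.
        buf = b""
        while buf.count(b"\n") < len(messages):
            chunk = ours.recv(4096)
            if not chunk:
                break
            buf += chunk
        received.extend(buf.decode().splitlines())

    def echo_pool():
        # Fake pool: echo bytes back until the miner side closes.
        while True:
            data = pool.recv(4096)
            if not data:
                break
            pool.sendall(data)

    threads = [threading.Thread(target=f) for f in (echo_pool, writer, reader)]
    for t in threads:
        t.start()
    threads[1].join()
    threads[2].join()
    ours.close()          # lets echo_pool see end-of-stream and exit
    threads[0].join()
    pool.close()
    return received
```

Because each direction has exactly one thread, the messages come back in the order they were sent without any locking around the socket.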
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
I'm not sure of the threading level within stratum when on a single pool.
ckolivas will know the answer to that.
legendary
Activity: 1540
Merit: 1001
Hmm. Thing is, the stratum code is all done raw now, not using any other libraries precisely because I had problems with libcurl and friends. So the code does raw socket calls which directly use winsock calls. So the problem is either in cgminer or the MS dll.

I hate to be the bearer of bad news, but I doubt it's the winsock dll.  A lot of programs use that, and if there was a native problem in there, we'd be getting crashes elsewhere from other apps.
Yes a lot use the winsock dll. Note the issue is in mswsock.dll though, and it is actually very rare that applications use raw sockets on windows so I wouldn't say it's quite the same - firefox only recently started using it and lo and behold it's been getting crashes in the same library. Nonetheless, logically my code which is always in heavy development is far more likely to be at fault than an OS provided library (though we've been burnt by that enough already to know it's not impossible). Every attempt so far to get debugging with a full debug build, it always ends up crashing with an overflow in mswsock.dll though, so not sure where to go next...

Firefox is under continual development as well.  socket coding isn't simple, especially windows socket coding.  I personally have not seen any app crashes with firefox, and that's all I use.  if it is indeed a problem with winsock, for microsoft to address it you'll need to be able to provide poc code that causes the crash. 

Kano: would it help if I changed my cgminer instance to not have any backup pools?  then it's only communicating with one pool. 

M
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
mswsock.dll ... hmm I wonder how many times that's been reported in here ... and you're not running the latest version :)

I have now updated to 3.4.3. I just searched through this thread and didn't seem to find any solutions for fixing the mswsock.dll issue. Any thoughts?
Unfortunately we have no solution for a microsoft provided dll crashing. It should not be possible for our software to crash the dll unless there's a bug in the dll. The only suggestions are checking you have the latest dll and there are no virus/trojans that have attached to it.

I've been using windows and troubleshooting windows problems long enough to know that the DLL that "crashed" is regularly not the DLL with the problem.  I think the error reporting mechanism from windows is flawed (imagine that), and it's a routine that's calling winsock that is the problem, not winsock.  Thinking through it, a possible example would be a multithreaded app crashing while one thread is in winsock.  Or it could be winsock crashing when trying to perform a requested call back to the owning app.

I've also seen this type of misleading error message when DLL #1 needs DLL #2.  The error message will state DLL #1 isn't found, and when you look, you'll see DLL #1 is there.  The reality is DLL #1 is looking for DLL #2 and not finding it, but the error message captures only part of the information and so is misleading.

Overall, I don't believe this is an mswsock.dll issue.  It's probably not a cgminer issue either, per se, but one of the 3rd party DLLs you are using that is calling mswsock.dll.

M
... of course there are other possibilities as I came across with libusb ...

I am doing something with libusb that ... it would appear ... no one else on the planet does?

Running a multi-threaded application to many devices through the libusb library that states it is thread safe.
However, it is not thread safe.
I found that the comments on the libusb site that stated it was thread safe in fact suggested to me that it wasn't.
One of the libusb fixes was indeed to make libusb calls thread safe in cgminer using locks.

So ... a mining application that is talking to many pools at once through a single dll that "May" be thread safe, might actually NOT be thread safe and since it may be indeed rare for applications to do this on their own, it may be that cgminer is the rare case finding the problem ...

Of course it MAY not be - but certainly no proof to discount it at this stage.
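To illustrate the kind of fix described above (serializing calls into a library that turns out not to be thread safe), here is a small sketch. It is not the cgminer or libusb code: `FakeUsbLibrary`, `LockedUsb` and `hammer` are hypothetical names, and the fake library only models a non-atomic update of shared state.

```python
import threading


class FakeUsbLibrary:
    """Hypothetical stand-in for a library whose calls are not thread safe."""

    def __init__(self):
        self.transfers = 0

    def transfer(self):
        # Read-modify-write on shared state: unsafe if two threads interleave here.
        current = self.transfers
        self.transfers = current + 1


class LockedUsb:
    """Wrapper that lets only one thread at a time into the library."""

    def __init__(self, lib):
        self._lib = lib
        self._lock = threading.Lock()

    def transfer(self):
        with self._lock:
            self._lib.transfer()


def hammer(device, threads=8, calls=1000):
    """Call device.transfer() from many threads at once."""
    def worker():
        for _ in range(calls):
            device.transfer()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
```

With the wrapper, every call is counted no matter how the threads are scheduled; calling the unwrapped library from several threads gives no such guarantee.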

Edit: of course I am replying to this also:
Hmm. Thing is, the stratum code is all done raw now, not using any other libraries precisely because I had problems with libcurl and friends. So the code does raw socket calls which directly use winsock calls. So the problem is either in cgminer or the MS dll.

I hate to be the bearer of bad news, but I doubt it's the winsock dll.  A lot of programs use that, and if there was a native problem in there, we'd be getting crashes elsewhere from other apps.
...
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Hmm. Thing is, the stratum code is all done raw now, not using any other libraries precisely because I had problems with libcurl and friends. So the code does raw socket calls which directly use winsock calls. So the problem is either in cgminer or the MS dll.

I hate to be the bearer of bad news, but I doubt it's the winsock dll.  A lot of programs use that, and if there was a native problem in there, we'd be getting crashes elsewhere from other apps.
Yes a lot use the winsock dll. Note the issue is in mswsock.dll though, and it is actually very rare that applications use raw sockets on windows so I wouldn't say it's quite the same - firefox only recently started using it and lo and behold it's been getting crashes in the same library. Nonetheless, logically my code which is always in heavy development is far more likely to be at fault than an OS provided library (though we've been burnt by that enough already to know it's not impossible). Every attempt so far to get debugging with a full debug build, it always ends up crashing with an overflow in mswsock.dll though, so not sure where to go next...
legendary
Activity: 1540
Merit: 1001
Hmm. Thing is, the stratum code is all done raw now, not using any other libraries precisely because I had problems with libcurl and friends. So the code does raw socket calls which directly use winsock calls. So the problem is either in cgminer or the MS dll.

I hate to be the bearer of bad news, but I doubt it's the winsock dll.  A lot of programs use that, and if there was a native problem in there, we'd be getting crashes elsewhere from other apps.

Since you code in linux for linux, and port to Windows, what's doing the conversion of your code to what windows uses?  From my experience, problems that make themselves apparent after a period of time indicate a memory leak (a), a multithreaded issue (b), or an unmanaged code issue (c). 

(a) should be apparent by increased memory usage or your internal memory leak testing (which I think I've seen you guys say you have)
(b) is awful to troubleshoot.  such problems are caused by different multithreading problems like race conditions or deadly embraces
(c) is generally code accessing memory regions that it shouldn't be accessing, like exceeding an array boundary

(b) and (c) are particularly troublesome because the problem may not always cause a symptom and the OS might not always detect it and clamp down on the offending app

third party code analyzers are the best way to spot these for complex code.  But if there are third party libraries involved, I doubt the code analyzers would be able to detect such things.

somehow I doubt I'm telling you anything you don't already know though.

M
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Hmm. Thing is, the stratum code is all done raw now, not using any other libraries precisely because I had problems with libcurl and friends. So the code does raw socket calls which directly use winsock calls. So the problem is either in cgminer or the MS dll.
legendary
Activity: 1540
Merit: 1001
mswsock.dll ... hmm I wonder how many times that's been reported in here ... and you're not running the latest version :)

I have now updated to 3.4.3. I just searched through this thread and didn't seem to find any solutions for fixing the mswsock.dll issue. Any thoughts?
Unfortunately we have no solution for a microsoft provided dll crashing. It should not be possible for our software to crash the dll unless there's a bug in the dll. The only suggestions are checking you have the latest dll and there are no virus/trojans that have attached to it.

I've been using windows and troubleshooting windows problems long enough to know that the DLL that "crashed" is regularly not the DLL with the problem.  I think the error reporting mechanism from windows is flawed (imagine that), and it's a routine that's calling winsock that is the problem, not winsock.  Thinking through it, a possible example would be a multithreaded app crashing while one thread is in winsock.  Or it could be winsock crashing when trying to perform a requested call back to the owning app.

I've also seen this type of misleading error message when DLL #1 needs DLL #2.  The error message will state DLL #1 isn't found, and when you look, you'll see DLL #1 is there.  The reality is DLL #1 is looking for DLL #2 and not finding it, but the error message captures only part of the information and so is misleading.

Overall, I don't believe this is an mswsock.dll issue.  It's probably not a cgminer issue either, per se, but one of the 3rd party DLLs you are using that is calling mswsock.dll.

M
sr. member
Activity: 280
Merit: 250
Sometimes man, just sometimes.....
Running 3.4.3 but notice now (it seems this issue goes back a few versions) that the accepted blocks field now shows the latest difficulty (x number of blocks found it seems) instead of total block count? Anyone got a fix for this?

Not sure what coin you're mining, but on the bitcoin network, cgminer now shows the number of "diff 1 shares" found for the accepted field.  So, if your pool has a higher share difficulty, the counter will increase by whatever the difficulty of the associated job was each time a share is found.

That's not a bug.  If you really want the number of actual shares found (even though they are not as useful in a world that is moving towards variable difficulty mining), you can get them from the API.
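A small sketch of the accounting described above, where the accepted field grows by the job difficulty rather than by one per share. This is an illustration, not cgminer's code; the `ShareCounter` class and its field names are hypothetical.

```python
class ShareCounter:
    """Tracks accepted work both ways: diff-1 weighted and raw share count."""

    def __init__(self):
        self.accepted_diff1 = 0   # what the accepted field shows, per the post above
        self.accepted_shares = 0  # the raw number of shares accepted

    def accept(self, job_difficulty):
        # One share at difficulty N counts as N "diff 1 shares".
        self.accepted_diff1 += job_difficulty
        self.accepted_shares += 1


counter = ShareCounter()
for difficulty in (16, 16, 32):
    counter.accept(difficulty)
```

Three shares at difficulties 16, 16 and 32 move the diff-1 counter by 64 while the raw share count only reaches 3, which is why the displayed number jumps on higher-difficulty pools.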

He appears to be solo mining, so there really aren't any "accepted" shares.  When he finds a block, his accepted count is the number of coins that were in that block.
member
Activity: 103
Merit: 10
100% CPU problem.  I'm using Xubuntu 12.04, scrypt mining with 6x7950s on cgminer 3.4.2.  This causes the rig to lock up after a few hours.
top gives me
Code:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                            
 1791 xxx  20   0 1485m 119m  56m S   92  0.5  10:16.21 cgminer                                                                                           
   10 root      20   0     0    0    0 S   54  0.0   6:06.87 ksoftirqd/1                                                                                       
    3 root      20   0     0    0    0 R   54  0.0   5:56.12 ksoftirqd/0 


Any ideas as to how to get the CPU load down?
When I had that problem, or at least a similar one, I was building from git while the files were in flux. This happened on a Raspberry Pi with Raspbian, not Ubuntu, but the high ksoftirqd load and very high processor usage look similar. Any chance you built from git?
I didn't build from git.  I just grabbed the compiled binary from http://ck.kolivas.org/apps/cgminer/
I did see your problem and solution a few pages back. Thanks for responding.
sr. member
Activity: 281
Merit: 250
No idea then, sorry. Perhaps your Pi is coincidentally failing or, as you say, some other upgraded package is responsible.

I was able to get a screen cap of the error when the system locks up



I'm not sure what this error means but I'm leaning towards that there was some kind of disk access error. Perhaps the SD card on my Pi is corrupted?

Just compiled 3.4.3 this morning to see if my USB issues with the Erupters were fixed.
I was very pleased to see everything working without errors at max speed.

But I seem to have the same problem on my Raspberry Pi with Debian Wheezy.
It crashes at random, but always around 25 minutes or so.

It's not a bad SD card or faulty hardware: I have three Pis and all three have the same issue when using cgminer 3.4.3, while bfgminer runs like a charm.
For now I keep using that, but I like cgminer more. I stopped using it when the libusb issue came up, and I want it back ;)
newbie
Activity: 42
Merit: 0
I'm not sure what the confusion is but ...
DEVS
 PGA=0,Name=AMU,ID=0

DEVDETAILS
 DEVDETAILS=0,Name=AMU,ID=0

There is no trick to the association ... Name+ID from one to the other

... except ... for a GPU, DEVS has no Name or ID

In miner.php I generate it easily enough:
Code:
function joinsections($sections, $results, $errors)
{
 global $sectionmap;

 // GPU's don't have Name,ID fields - so create them
 foreach ($results as $section => $res)
.
.
.

The names GPU, PGA, ASC in the API are to allow for device selection in the API via a number.
So if you want to affect the first PGA, with a PGA command, it is PGA=0, for the 10th one it is PGA=9
i.e. independent of the screen display - the screen display is Name,ID
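The Name+ID association above can be sketched as follows. This is not miner.php: `parse_record`, `device_key` and `join_sections` are hypothetical helpers, and deriving a GPU's Name and ID from its `GPU` number is an assumption made for the illustration.

```python
def parse_record(text):
    """Turn 'PGA=0,Name=AMU,ID=0' into {'PGA': '0', 'Name': 'AMU', 'ID': '0'}."""
    return dict(field.split("=", 1) for field in text.strip().split(","))


def device_key(record):
    """Return the (Name, ID) pair that ties a DEVS record to a DEVDETAILS record."""
    if "Name" in record and "ID" in record:
        return (record["Name"], int(record["ID"]))
    if "GPU" in record:
        # Assumption: GPUs carry no Name/ID, so synthesize them from the GPU number.
        return ("GPU", int(record["GPU"]))
    raise KeyError("record has neither Name/ID nor GPU")


def join_sections(devs, devdetails):
    """Merge each DEVS record with the DEVDETAILS record sharing its Name+ID."""
    details_by_key = {device_key(d): d for d in devdetails}
    joined = []
    for dev in devs:
        merged = dict(details_by_key.get(device_key(dev), {}))
        merged.update(dev)
        joined.append(merged)
    return joined


devs = [parse_record("PGA=0,Name=AMU,ID=0"), parse_record("GPU=0")]
details = [
    parse_record("DEVDETAILS=0,Name=AMU,ID=0"),
    parse_record("DEVDETAILS=1,Name=GPU,ID=0"),
]
joined = join_sections(devs, details)
```

Note that the numeric index (`PGA=0`, `DEVDETAILS=1`) is only used for selecting a device in an API command; the join itself relies on Name+ID alone.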

I was planning once to also allow selection via the Name,ID but ... as is common I got sidetracked ... and since most of the development I do in cgminer is for myself (I get very few donations) I usually do what I feel needs to be done.
I do of course appreciate the donations I get from a few people indeed - but they are not as common for me ... as others :)

Edit: though of course I get hardware donations and indeed work on the drivers for them as required :D

Thank you Kano. It's clear now. It was my mistake: I assumed something wrong based on previous tests with a GPU and a misinterpretation from a quick look at the api.c file.
hero member
Activity: 546
Merit: 500
Since comments aren't open on GitHub, I'm asking here in case you notice this. In one of the next revisions, could each pool, when multiple pools are set, show its own best share and accepted list somewhere? Maybe as another option in the selection above the mining display. I've got my mining set to rotate since eclipseMC is my default, and it's finding so many blocks that a lot are not being counted, causing the numbers not to add up on the other.

If it's just that my rotating is causing the bad numbers, maybe a "stop accepting and finish" step before rotating, or other management strategies, if that's not already how it works.

OP, pm me if this option kicks in or already exists.
legendary
Activity: 3583
Merit: 1094
Think for yourself
Are those who are having problems with the dll using IPV6 by any chance?

I've got IPv6 unbound from the network adapter altogether.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
mswsock.dll ... hmm I wonder how many times that's been reported in here ... and you're not running the latest version :)
I have now updated to 3.4.3. I just searched through this thread and didn't seem to find any solutions for fixing the mswsock.dll issue. Any thoughts?
Unfortunately we have no solution for a microsoft provided dll crashing. It should not be possible for our software to crash the dll unless there's a bug in the dll. The only suggestions are checking you have the latest dll and there are no virus/trojans that have attached to it.
Not that I think the cgminer devs should in any way be responsible for taking this action, but it is possible to open a ticket with Microsoft.  I don't believe per-incident support is actually as expensive as one would expect, and I think the fee is refunded when a problem is proved to be a problem in their OS.  Getting high enough up the chain on the ticket might be tricky and might require reproducing the issue over and over again, so it could be easier for a dev, but I just thought I'd point out that it's more of an option than anyone might realize.  However, I don't use cgminer with Windows (I don't even have Win7 at home), so I can't really offer to help beyond the suggestion in this case, so this is just information for those more vested (either because they are affected or because someone offers them a donation for taking action).
Further to this discussion, I've seen quite a few bug reports quite recently about this same dll in combination with firefox. While I do believe the bug is in the dll itself, I wonder if there's a common variable bringing it on? Are those who are having problems with the dll using IPV6 by any chance?
hero member
Activity: 807
Merit: 500
mswsock.dll ... hmm I wonder how many times that's been reported in here ... and you're not running the latest version :)
I have now updated to 3.4.3. I just searched through this thread and didn't seem to find any solutions for fixing the mswsock.dll issue. Any thoughts?
Unfortunately we have no solution for a microsoft provided dll crashing. It should not be possible for our software to crash the dll unless there's a bug in the dll. The only suggestions are checking you have the latest dll and there are no virus/trojans that have attached to it.
Not that I think the cgminer devs should in any way be responsible for taking this action, but it is possible to open a ticket with Microsoft.  I don't believe per-incident support is actually as expensive as one would expect, and I think the fee is refunded when a problem is proved to be a problem in their OS.  Getting high enough up the chain on the ticket might be tricky and might require reproducing the issue over and over again, so it could be easier for a dev, but I just thought I'd point out that it's more of an option than anyone might realize.  However, I don't use cgminer with Windows (I don't even have Win7 at home), so I can't really offer to help beyond the suggestion in this case, so this is just information for those more vested (either because they are affected or because someone offers them a donation for taking action).
member
Activity: 103
Merit: 10
Running 3.4.3 but notice now (it seems this issue goes back a few versions) that the accepted blocks field now shows the latest difficulty (x number of blocks found it seems) instead of total block count? Anyone got a fix for this?

Not sure what coin you're mining, but on the bitcoin network, cgminer now shows the number of "diff 1 shares" found for the accepted field.  So, if your pool has a higher share difficulty, the counter will increase by whatever the difficulty of the associated job was each time a share is found.

That's not a bug.  If you really want the number of actual shares found (even though they are not as useful in a world that is moving towards variable difficulty mining), you can get them from the API.
I would like to point out that there is A BUG ASSOCIATED WITH THAT FEATURE.
And that is, that as soon as the "Number of accepted Shares" reaches 10 Million, CGMiner crashes/stops working. I mine on an LTC-Pool with high difficulty and I have to restart CGMiner on a regular basis to make sure the miner just doesn't stop working because it has reached 10 Million diff 1 shares again.

Is there an "official" way to report bugs or is ckolivas going to read this?
Not totally a solution, but zeroing the stats would be easier than restarting.
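One way to zero the stats without touching the screen is through the API, sketched below. This assumes the API is enabled (for example with `--api-listen`), that the caller has been granted privileged access, that the build supports the `zero` command, and that it listens on the default port 4028. The helper names `build_zero_request` and `send_api_request` are hypothetical.

```python
import json
import socket


def build_zero_request(which="all", summary=False):
    """Build the JSON request for the API 'zero' command.

    `which` is 'all' or 'bestshare'; `summary` asks for a summary before zeroing.
    """
    if which not in ("all", "bestshare"):
        raise ValueError("which must be 'all' or 'bestshare'")
    parameter = f"{which},{str(summary).lower()}"
    return json.dumps({"command": "zero", "parameter": parameter})


def send_api_request(request, host="127.0.0.1", port=4028, timeout=5.0):
    """Send one request to the miner's API socket and return the raw reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request.encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Replies may end with a trailing NUL byte; strip it before decoding.
    return b"".join(chunks).rstrip(b"\x00").decode()
```

Usage would be along the lines of `send_api_request(build_zero_request())`, run from a scheduled job before the counter gets near the value that triggers the crash.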