Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 417. (Read 5806004 times)

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
New release: 2.10.3 - 26th December 2012

What a difference xmas makes.


Human readable changelog:

When switching from a stratum pool to another one, the last few shares were dropped and the share submission failure message was displayed.
There may have been stratum messages waiting to be parsed and cgminer would only parse them once a new message was received. This fix could potentially give even lower rejects!
There is now a [Z]ero stats option under the display menu which resets all the visible statistics.
Fixed the best_share displayed bug.
The current pool target diff is now shown in the status line like so:
Code:
 Connected to au.ozco.in diff 2 with stratum as user ckolivas.0

The current block target diff is now shown in the status line like so:
Code:
 Block: 01f8c77ced4b9c2a...  Diff:3.37M  Started: [10:01:27]  Best share: 129

Block solve detection is now supported with scrypt mining as well (BLOCK! written at end of share and solved blocks listed under pool stats).
Fixed a crash with rolltime pools.
Stratum support for scrypt mining.


Full changelog:

- Do not give the share submission failure message on planned stratum
disconnects.
- Parse anything in the stratum socket if it's full without waiting. Empty the
socket even if a connection is not needed in case there are share returns.
- Provide a mechanism to zero all the statistics from the menu.
- Display the current pool diff in the status line.
- Display block diff in status line.
- Generalise the code for solving a block to enable block solve detection with
scrypt mining.
- Generate the output hash for scrypt as well and use the one function to set
share_diff.
- Use the flip80 function in regeneratehash and the correct sized hash array.
- Use one size for scratchbuf as a macro in scrypt.c
- Stage work outside of the stgd lock to prevent attempted recursive locking in
clone_available.
- share_diff needs to be performed on a BE version of the output hash to work,
leading to false best_share values as spotted by luke-Jr.
- Remove the unused sha224 functions.
- Use the flip functions in hashtest.
- Simplify the setting of the nonce data field in work on submitting nonces.
- Scrypt code does not enter the hashtest function.
- Go back to cloning available work under staged lock.
- Updated links to AMD APP SDK
- Updated link to ADL SDK
- scrypt_diff uses a uint64_t as well.
- Correct target for stratum support with scrypt mining.
- libztex: fixed a typo
- libztex: check returnvalue of libusb_claim_interface() and release the
interface in case of early exit
legendary
Activity: 1610
Merit: 1000
Guys,
Thank you very much! Have a nice holiday, merry Christmas and a happy new year!

If something else pops up I will let you know. gdb never lies :)

Best
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I hacked
static bool __clone_available(void)
{
        struct work *work, *tmp;
        bool cloned = false;

        mutex_lock(stgd_lock);
        if (!staged_rollable) {
                mutex_unlock(stgd_lock);
                goto out;
        }
        mutex_unlock(stgd_lock);
For the first function, you should note
Code:
/* Called with stgd_lock held */
static bool __clone_available(void)
Meh the lock got lost in the rework for 2.10, will fix in git. This is likely your problem after all, thanks!
legendary
Activity: 2576
Merit: 1186
I hacked
static bool __clone_available(void)
{
        struct work *work, *tmp;
        bool cloned = false;

        mutex_lock(stgd_lock);
        if (!staged_rollable) {
                mutex_unlock(stgd_lock);
                goto out;
        }
        mutex_unlock(stgd_lock);
I do not know if it is OK, but from what I can see there is a race condition accessing staged_rollable, or at least I think so.
I will let you know if it crashes again.

To me it seems that stage_work, called from __clone_available, crashed.
On the other hand:

static void stage_work(struct work *work)
{
   applog(LOG_DEBUG, "Pushing work from pool %d to hash queue", work->pool->pool_no);
   work->work_block = work_block;

static unsigned int work_block; is a global and it has never been locked, so maybe an instance of test_work_current was trying to change it with
work->work_block = ++work_block; and a race condition occurred?

I am not an expert in either C or threads, but as far as I know each global variable should be locked when it is changed. Maybe when it is read as well, but that depends on gcc, the OS and so on. Is that true? If so, we have code that could cause core dumps because of it.

I do not know how to lock work->work_block = ++work_block; so I am running the same version with staged_rollable locked, and I expect the same crash to appear within a day or two.
10X
For the first function, you should note
Code:
/* Called with stgd_lock held */
static bool __clone_available(void)
So adding another lock does nothing and only risks recursive lock deadlocks or livelocks.
Someone might mention to Con that it is in fact called without the lock held, and that the deadlock occurred when stage_work (called by __clone_available) tried to lock the same lock.
I fixed this in BFGMiner (since 2.10.0) by staging the cloned work outside of the lock in clone_available.
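The deadlock described here can be sketched in miniature. This is only an illustration with pthreads, not the actual cgminer or BFGMiner code: `staged_lock`, `staged_rollable`, `staged_count`, `stage_item` and `clone_and_stage` are hypothetical stand-ins. The helper that stages an item takes the lock itself, so the caller makes its decision under the lock and only stages after dropping it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a staged-work queue; not cgminer's code. */
static pthread_mutex_t staged_lock = PTHREAD_MUTEX_INITIALIZER;
static int staged_rollable = 1;   /* items that can still be cloned */
static int staged_count = 0;      /* items currently on the queue */

/* Lock wrappers that abort on failure instead of ignoring it. */
static void lock_checked(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    assert(rc == 0);
    (void)rc;
}

static void unlock_checked(pthread_mutex_t *m)
{
    int rc = pthread_mutex_unlock(m);
    assert(rc == 0);
    (void)rc;
}

/* Takes staged_lock itself, so callers must NOT already hold it.
 * Calling this with the lock held would block forever on a default
 * (non-recursive) mutex. */
static void stage_item(void)
{
    lock_checked(&staged_lock);
    staged_count++;
    unlock_checked(&staged_lock);
}

/* Decide under the lock, then stage after the lock is released. */
static bool clone_and_stage(void)
{
    bool cloned = false;

    lock_checked(&staged_lock);
    if (staged_rollable > 0) {
        staged_rollable--;
        cloned = true;
    }
    unlock_checked(&staged_lock);

    if (cloned)
        stage_item();   /* safe: the lock is no longer held here */
    return cloned;
}
```

With one rollable item, the first call to `clone_and_stage()` returns true and leaves `staged_count` at 1; the second call returns false and changes nothing.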
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
It fixes the shares looking high because the (I am guessing backup Bitcoin) target is far higher. I would rather see the highest share achieved, not the highest accepted. Cool.

Seems like a solution in search of a problem.
Both before and after intend to show the highest achieved. But the cgminer code calculated the hash twice, in two different ways, and the hash-to-difficulty code assumed it was one of those ways. When the share doesn't meet the pool target, the hash-to-difficulty code was run on it with its hash calculated the opposite way, and as a result gave the wrong result. My rewrite cleans up the code so it's actually readable, and makes the share->hash value always consistent with SHA256 and the share_diff function expectations.

I also wrote a much-less-changed fix for BFGMiner 2.8.x and 2.9.x: https://github.com/luke-jr/bfgminer/commit/006faac
This one doesn't clean up the code to make it more readable, though. But as a diff, it is easier to see what the problem was.
The function you replaced works fine and does exactly what I wrote it to do 15 months ago.
Check and see if it was a block based on the block header difficulty.
... though as I said, it's way faster than your replacement - which most likely is code you just copied out of elsewhere (and now say it's yours)

Oddly enough that's still required in cgminer - I guess the clone doesn't need that any more :P
There is nothing to clean up except your retarded brain not being able to understand the simple original code I wrote.

I think there's a valid point here though. His version of your code is not the issue here but rather a problem with the hash when it hits the share_diff function in cgminer.
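To make the byte-order issue concrete, here is a rough sketch of why a hash-to-difficulty routine gives nonsense when it reads the hash in the wrong order. It is not the cgminer or BFGMiner `share_diff` code; all function names are hypothetical, and it assumes the commonly used difficulty-1 target of 0xFFFF followed by 26 zero bytes. Bitcoin compares the hash to the target as a little-endian 256-bit number, so the same 32 bytes read the other way round represent a completely different value.

```c
#include <assert.h>

#define HASH_LEN 32

/* Read the 32 bytes as one number with byte 31 most significant
 * (little-endian interpretation). */
static double hash_value_le(const unsigned char hash[HASH_LEN])
{
    double v = 0.0;
    for (int i = HASH_LEN - 1; i >= 0; i--)
        v = v * 256.0 + hash[i];
    return v;
}

/* Same bytes, read with byte 0 most significant (big-endian). */
static double hash_value_be(const unsigned char hash[HASH_LEN])
{
    double v = 0.0;
    for (int i = 0; i < HASH_LEN; i++)
        v = v * 256.0 + hash[i];
    return v;
}

/* Assumed difficulty-1 target: 0xFFFF * 2^208. */
static double diff1_target(void)
{
    double t = 65535.0;
    for (int i = 0; i < 26; i++)
        t *= 256.0;
    return t;
}

/* Difficulty for a hash value; the caller decides the byte order. */
static double share_diff_from_value(double hash_value)
{
    assert(hash_value > 0.0);
    return diff1_target() / hash_value;
}
```

For a hash whose bytes 26 and 27 are 0xFF and all others zero, the little-endian reading gives a difficulty of exactly 1, while the big-endian reading of the same buffer gives an absurdly large number. Which order is "right" depends on how the buffer was produced; the point is only that the buffer and the converter have to agree.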
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I hacked
static bool __clone_available(void)
{
        struct work *work, *tmp;
        bool cloned = false;

        mutex_lock(stgd_lock);
        if (!staged_rollable) {
                mutex_unlock(stgd_lock);
                goto out;
        }
        mutex_unlock(stgd_lock);
I do not know if it is OK, but from what I can see there is a race condition accessing staged_rollable, or at least I think so.
I will let you know if it crashes again.

To me it seems that stage_work, called from __clone_available, crashed.
On the other hand:

static void stage_work(struct work *work)
{
   applog(LOG_DEBUG, "Pushing work from pool %d to hash queue", work->pool->pool_no);
   work->work_block = work_block;

static unsigned int work_block; is a global and it has never been locked, so maybe an instance of test_work_current was trying to change it with
work->work_block = ++work_block; and a race condition occurred?

I am not an expert in either C or threads, but as far as I know each global variable should be locked when it is changed. Maybe when it is read as well, but that depends on gcc, the OS and so on. Is that true? If so, we have code that could cause core dumps because of it.

I do not know how to lock work->work_block = ++work_block; so I am running the same version with staged_rollable locked, and I expect the same crash to appear within a day or two.
10X
For the first function, you should note
Code:
/* Called with stgd_lock held */
static bool __clone_available(void)
So adding another lock does nothing and only risks recursive lock deadlocks or livelocks.

For the second value, work_block is simply an integer value and even if it's wrong can not lead to a crash since there's nothing being dereferenced at any stage.
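On the question of how one would lock an increment like `++work_block`: a minimal sketch with a pthread mutex follows. It is not taken from cgminer; `counter_lock`, `block_counter`, `next_block_id` and `current_block_id` are hypothetical names. It makes the increment well defined across threads, although a stale integer on its own cannot cause a segfault.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared counter and its lock; not cgminer's code. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int block_counter;

/* Lock wrappers that abort on failure instead of ignoring it. */
static void lock_checked(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    assert(rc == 0);
    (void)rc;
}

static void unlock_checked(pthread_mutex_t *m)
{
    int rc = pthread_mutex_unlock(m);
    assert(rc == 0);
    (void)rc;
}

/* Increment the counter and return the new value as one step with
 * respect to every other caller of these helpers. */
static unsigned int next_block_id(void)
{
    unsigned int id;

    lock_checked(&counter_lock);
    id = ++block_counter;
    unlock_checked(&counter_lock);
    return id;
}

/* Read the counter under the same lock so readers never race a writer. */
static unsigned int current_block_id(void)
{
    unsigned int id;

    lock_checked(&counter_lock);
    id = block_counter;
    unlock_checked(&counter_lock);
    return id;
}
```

A work item would then copy the value with something like `work->work_block = current_block_id();`, and the code that detects a new block would call `next_block_id()`.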
sr. member
Activity: 303
Merit: 250
Can't run CG miner...

"Too many values passed to set temp cutoff"

This is in 2.8.7

Just tried 2.10.2... same thing... window just closes. Help?

Also, Diablominer runs just fine on this box... I just would rather run CGminer.

Sounds like you have an additional character value in the conf file.  Either edit it out or delete the file and try running CGMiner again.
legendary
Activity: 1876
Merit: 1000
Is there anywhere in cgminer where I can see if I solved a block, other than having to quit to see it in the stats?
From the [p]ools menu, if you select one pool, it will show up if a block was found for that pool, otherwise it will say nothing about blocks if you found none.

You can also get it from the api
Code:
 [SUMMARY] => Array
        (
            [0] => Array
                (
                    [Elapsed] => 145712
                    [MHS av] => 2682.86
                    [Found Blocks] => 0
                    [Getworks] => 12894
                    [Accepted] => 33838
                    [Rejected] => 370
                    [Hardware Errors] => 780
                    [Utility] => 13.93
                    [Discarded] => 10252
                    [Stale] => 0
                    [Get Failures] => 1
                    [Local Work] => 131292
                    [Remote Failures] => 0
                    [Network Blocks] => 232
                    [Total MH] => 390923779.3094
                    [Work Utility] => 37.24
                    [Difficulty Accepted] => 89064.42331579
                    [Difficulty Rejected] => 921.00453193
                    [Difficulty Stale] => 0
                    [Best Share] => 2550226
                )
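If you poll the API from your own program instead of the PHP example, pulling out the block count can be as simple as the sketch below. It assumes the plain-text reply format, where the summary comes back as comma-separated `Key=Value` pairs; check API-README for the exact format your version emits. `parse_found_blocks` is a hypothetical helper, not part of cgminer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of the "Found Blocks" field from a text-format
 * summary reply, or -1 if the field is not present.
 * Assumes fields look like "Found Blocks=0" (an assumption about the
 * reply format, see API-README). */
static long parse_found_blocks(const char *reply)
{
    static const char key[] = "Found Blocks=";
    const char *p;

    assert(reply != NULL);
    p = strstr(reply, key);
    if (p == NULL)
        return -1;
    /* strtol stops at the first non-digit, e.g. the next comma. */
    return strtol(p + strlen(key), NULL, 10);
}
```

For a reply containing `Found Blocks=0` this returns 0, which matches the array output shown above; any value above 0 means a block was solved since cgminer started.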
sr. member
Activity: 310
Merit: 250
Can't run CG miner...

"Too many values passed to set temp cutoff"

This is in 2.8.7

Just tried 2.10.2... same thing... window just closes. Help?

Also, Diablominer runs just fine on this box... I just would rather run CGminer.
legendary
Activity: 1610
Merit: 1000
Hi Kano,

Here is what happens

[New LWP 2833]
[New LWP 2832]
[New LWP 2831]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/bin/cgminer '.
Program terminated with signal 11, Segmentation fault.
#0  0x000000000040be10 in __clone_available () at cgminer.c:3011
3011    cgminer.c: No such file or directory.
(gdb) bt
#0  0x000000000040be10 in __clone_available () at cgminer.c:3011
#1  0x000000000041e1b3 in main (argc=1, argv=0x7fffececa7b8) at cgminer.c:6859

(gdb) bt full
#0  0x000000000040be10 in __clone_available () at cgminer.c:3011
        work = 0x2424f70
        tmp = 0x40688f
        cloned = false
#1  0x000000000041e1b3 in main (argc=1, argv=0x7fffececa7b8) at cgminer.c:6859
        pool = 0x12b9290
        lagging = false
        ce = 0x1630e90
        ts = 1
        max_staged = 1
        cp = 0x12b9290
        work = 0x2420ed0
        pools_active = true
        handler = {__sigaction_handler = {sa_handler = 0x40b72a ,
            sa_sigaction = 0x40b72a }, sa_mask = {__val = {
              0 }}, sa_flags = 0, sa_restorer = 0x1}
        thr = 0x16325a8
        block = 0x12b7e30
        k = 31
        i = 1
        j = 1
        s = 0x12b7e10 "8332"

[57817.341874] cgminer[2450]: segfault at 260 ip 000000000040be10 sp 00007fffecec84c0 error 4 in cgminer[400000+57000]
[83170.561524] cgminer[32284]: segfault at 260 ip 000000000040be10 sp 00007fff6fb85250 error 4 in cgminer[400000+57000]
I hacked
static bool __clone_available(void)
{
        struct work *work, *tmp;
        bool cloned = false;

        mutex_lock(stgd_lock);
        if (!staged_rollable) {
                mutex_unlock(stgd_lock);
                goto out;
        }
        mutex_unlock(stgd_lock);
I do not know if it is OK, but from what I can see there is a race condition accessing staged_rollable, or at least I think so.
I will let you know if it crashes again.


You can see that the bug happens at the exact same location (ip 000000000040be10 in cgminer[400000+57000]), which makes me think that the PC memory is OK.

I can send you the core and the cgminer binary if you need them. Do you have any idea what might be wrong?

Thanks

PS: this time we got a little further


[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/bin/cgminer.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000410ef4 in stage_work (work=0x326fc10) at cgminer.c:3584
3584            applog(LOG_DEBUG, "Pushing work from pool %d to hash queue", work->pool->pool_no);
(gdb) bt full
#0  0x0000000000410ef4 in stage_work (work=0x326fc10) at cgminer.c:3584
No locals.
#1  0x000000000040bebd in __clone_available () at cgminer.c:3021
        work_clone = 0x326fc10
        work = 0x32ad230
        tmp = 0x0
        cloned = false
#2  0x000000000041e1f3 in main (argc=1, argv=0x7fff030ba5c8) at cgminer.c:6869
        pool = 0x1f0a270
        lagging = false
        ce = 0x7fe030531120
        ts = 1
        max_staged = 1
        cp = 0x1f0a270
        work = 0x328f2e0
        pools_active = true
        handler = {__sigaction_handler = {sa_handler = 0x40b72a ,
            sa_sigaction = 0x40b72a }, sa_mask = {__val = {
              0 }}, sa_flags = 0, sa_restorer = 0x1}
        thr = 0x26fc698
        block = 0x1f08e10
        k = 31
        i = 1
        j = 1
        s = 0x1f08df0 "8332"
(gdb)
To me it seems that stage_work, called from __clone_available, crashed.
On the other hand:

static void stage_work(struct work *work)
{
   applog(LOG_DEBUG, "Pushing work from pool %d to hash queue", work->pool->pool_no);
   work->work_block = work_block;

static unsigned int work_block; is a global and it has never been locked, so maybe an instance of test_work_current was trying to change it with
work->work_block = ++work_block; and a race condition occurred?

I am not an expert in either C or threads, but as far as I know each global variable should be locked when it is changed. Maybe when it is read as well, but that depends on gcc, the OS and so on. Is that true? If so, we have code that could cause core dumps because of it.

I do not know how to lock work->work_block = ++work_block; so I am running the same version with staged_rollable locked, and I expect the same crash to appear within a day or two.
10X




-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Is there anywhere in cgminer where I can see if I solved a block, other than having to quit to see it in the stats?
From the [p]ools menu, if you select one pool, it will show up if a block was found for that pool, otherwise it will say nothing about blocks if you found none.
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
Is there anywhere in cgminer where I can see if I solved a block, other than having to quit to see it in the stats?
API-README

java API summary
  Found Blocks
member
Activity: 110
Merit: 10
Is there anywhere in cgminer where I can see if I solved a block, other than having to quit to see it in the stats?
legendary
Activity: 1610
Merit: 1000
Kano,

It might be memory. But my PC has been running fine for almost 5 months without a hang. If it were memory, I would expect some random PC freezes as well. Anyway, as I told you, I am not sure whether the core dump was generated from the same version of cgminer. I upgrade it frequently :) I will wait for another crash and then send the proper debug info.

thank you very much!
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
Kano,
There might be a version mismatch between the core and the executable.
I will post the correct info when it crashes again.
I am not quite sure whether this info is useful at all.
Following your advice, I got some gdb core info to share. cgminer 2.10.2 core dumps:
[91932.126721] cgminer[2138]: segfault at 260 ip 000000000040be10 sp 00007fff4c6b48b0 error 4 in cgminer[400000+57000]

[New LWP 2680]
[New LWP 2731]
[New LWP 2618]

warning: Error reading shared library list entry at 0x780000003c

warning: Corrupted shared library list: 0x0 != 0x4830408b48d00148
Core was generated by `/usr/local/bin/cgminer xxxxxx.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000410ea1 in stage_thread (userdata=0x1c4a080) at cgminer.c:3576
3576    cgminer.c: No such file or directory.


line 3576   tq_freeze(mythr->q);


10X
At a guess, a corrupted shared library list is corrupted RAM.
So either a runaway pointer, or you need to run a memory check.
Although it's not impossible, it doesn't seem likely that the actual transfer address to the library function was corrupted.
Also note that when you run memtest86+ you should actually let it complete the full test (which takes quite a while) even if the first 4 tests return OK.
I've had one case in the past of RAM that only failed much later, I think it was test #7, but succeeded on all the tests before it.
If it's not RAM then it's gonna need a lot of info to work out what corrupted the library transfer address.
sr. member
Activity: 467
Merit: 250
...... If it were a getwork pool it could take up to 60 seconds. With stratum it's over 2 mins.

Perfect, that's what I'm seeing. Appreciate the response.

sr. member
Activity: 383
Merit: 250

Just upgraded from 2.9.7 to 2.10.2 (pulled from git, compiled myself), and I'm noticing VERY long startup times before mining begins when primary pool is down.. by very long, I'm talking 2-3 minutes. I don't recall it taking that long under previous versions.

Is there some new timeout value I might be missing, or need to set?


Quote
[2012-12-20 14:00:19] Started cgminer 2.10.2
 [2012-12-20 14:00:19] Probing for an alive pool
 [2012-12-20 14:02:49] Pool 0 slow/down or URL or credentials invalid
 [2012-12-20 14:02:49] Unable to get work from pool 0 http://btcguild.com:8332
 [2012-12-20 14:02:50] Switching pool 1 http://de.btcguild.com:8332 to stratum+tcp://176.9.42.247:3333

in this case, the URL changed for btcguild primary connection, once I fixed that, and the pool is 'reachable' it starts up immediately... invalid or down pool as pool #1 still gives almost 3 minutes delay on startup


I think they were under DDOS. Maybe that has something to do with it, as in btcguild was not completely down.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/

Just upgraded from 2.9.7 to 2.10.2 (pulled from git, compiled myself), and I'm noticing VERY long startup times before mining begins when primary pool is down.. by very long, I'm talking 2-3 minutes. I don't recall it taking that long under previous versions.

Is there some new timeout value I might be missing, or need to set?


Quote
[2012-12-20 14:00:19] Started cgminer 2.10.2
 [2012-12-20 14:00:19] Probing for an alive pool
 [2012-12-20 14:02:49] Pool 0 slow/down or URL or credentials invalid
 [2012-12-20 14:02:49] Unable to get work from pool 0 http://btcguild.com:8332
 [2012-12-20 14:02:50] Switching pool 1 http://de.btcguild.com:8332 to stratum+tcp://176.9.42.247:3333

in this case, the URL changed for btcguild primary connection, once I fixed that, and the pool is 'reachable' it starts up immediately... invalid or down pool as pool #1 still gives almost 3 minutes delay on startup

That's just the nature of how long the timeout needs to be on a raw socket to reasonably know it's dead with stratum. There is nothing you can manually change to alter that apart from not putting a dead pool first in line. This is not a new issue with 2.10.2, it's just that btcg is suffering a DDoS. If it were a getwork pool it could take up to 60 seconds. With stratum it's over 2 mins.
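For anyone wondering why a dead peer is only noticed after a wait: on a plain socket, silence looks the same whether the other end is slow or gone, so the only signal is a timer running out. A generic sketch of that wait using `select()` follows. It is not how cgminer implements its stratum timeouts; `wait_readable` and `demo_timeout` are hypothetical names and the timeout value is up to the caller.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Wait up to timeout_sec seconds for data to arrive on fd.
 * Returns 1 if fd is readable, 0 if the timeout expired, -1 on error. */
static int wait_readable(int fd, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/* Usage sketch: a connected pair where the peer never sends anything.
 * The caller learns nothing until the full timeout has elapsed. */
static int demo_timeout(void)
{
    int sv[2];
    int rc;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    rc = wait_readable(sv[0], 1);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

A short timeout detects dead pools faster but also gives up on pools that are merely slow, which is the trade-off behind the long wait described above.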
sr. member
Activity: 467
Merit: 250

Just upgraded from 2.9.7 to 2.10.2 (pulled from git, compiled myself), and I'm noticing VERY long startup times before mining begins when primary pool is down.. by very long, I'm talking 2-3 minutes. I don't recall it taking that long under previous versions.

Is there some new timeout value I might be missing, or need to set?


Quote
[2012-12-20 14:00:19] Started cgminer 2.10.2
 [2012-12-20 14:00:19] Probing for an alive pool
 [2012-12-20 14:02:49] Pool 0 slow/down or URL or credentials invalid
 [2012-12-20 14:02:49] Unable to get work from pool 0 http://btcguild.com:8332
 [2012-12-20 14:02:50] Switching pool 1 http://de.btcguild.com:8332 to stratum+tcp://176.9.42.247:3333

in this case, the URL changed for btcguild primary connection, once I fixed that, and the pool is 'reachable' it starts up immediately... invalid or down pool as pool #1 still gives almost 3 minutes delay on startup
legendary
Activity: 1610
Merit: 1000
Kano,
There might be a version mismatch between the core and the executable.
I will post the correct info when it crashes again.
I am not quite sure whether this info is useful at all.
Following your advice, I got some gdb core info to share. cgminer 2.10.2 core dumps:
[91932.126721] cgminer[2138]: segfault at 260 ip 000000000040be10 sp 00007fff4c6b48b0 error 4 in cgminer[400000+57000]

[New LWP 2680]
[New LWP 2731]
[New LWP 2618]

warning: Error reading shared library list entry at 0x780000003c

warning: Corrupted shared library list: 0x0 != 0x4830408b48d00148
Core was generated by `/usr/local/bin/cgminer xxxxxx.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000410ea1 in stage_thread (userdata=0x1c4a080) at cgminer.c:3576
3576    cgminer.c: No such file or directory.


line 3576   tq_freeze(mythr->q);


10X