
Topic: cgminer - CPU/GPU miner in C for linux/windows - page 3. (Read 81916 times)

newbie
Activity: 15
Merit: 0
Seems to be alternating between "Long poll received" and "New block detected" messages now; first the poll message, then the other, sometimes with a second or less between the messages. Here's a sample:

Code:
[2011-07-10 08:24:28] LONGPOLL received - new block detected and work flushed already
[2011-07-10 08:24:30] New block detected, possible missed longpoll, flushing work queue
[2011-07-10 08:27:23] LONGPOLL received - new block detected and work flushed already
[2011-07-10 08:27:36] New block detected, possible missed longpoll, flushing work queue
[2011-07-10 08:32:39] LONGPOLL received - new block detected and work flushed already
[2011-07-10 08:32:54] New block detected, possible missed longpoll, flushing work queue
[2011-07-10 08:38:41] LONGPOLL received - new block detected and work flushed already
[2011-07-10 08:39:16] New block detected, possible missed longpoll, flushing work queue
[2011-07-10 08:47:03] LONGPOLL received - new block detected and work flushed already
[2011-07-10 08:47:04] New block detected, possible missed longpoll, flushing work queue
[2011-07-10 08:56:18] LONGPOLL received - new block detected and work flushed already
[2011-07-10 08:56:19] New block detected, possible missed longpoll, flushing work queue
[Accepted] [CPU 0] [0.9 Mh/s] [Q:530  A:4  R:3  HW:0  E:1%  U:0.01/m]
[2011-07-10 09:36:24] LONGPOLL received - new block detected and work flushed already
[2011-07-10 09:36:51] New block detected, possible missed longpoll, flushing work queue
[2011-07-10 09:50:50] LONGPOLL received - new block detected and work flushed already
[2011-07-10 10:10:05] New block detected, possible missed longpoll, flushing work queue

I'm on commit 4bb13bda68fca91a8f96ec3c17cf6f99ecf70342

Edit: before 6:46 I was only getting this message: "LONGPOLL detected new block, flushing work queue", now I'm not getting it at all.
newbie
Activity: 47
Merit: 0
Updated tree:

Now I've fixed it for real, and incremented the version number and tag to v1.1.1 so people know which is the current good version. Everyone should upgrade.

Still having problems, seeing if they'll happen with -D now.

EDIT: With -D, after 411 requests and 352 accepts, the only odd thing is that I start doing work before a getwork has completed, and these cause "HW" errors:
Code:
[2011-07-10 14:48:14] [thread 1: 134217728 hashes, 73762500 khash/sec]
[2011-07-10 14:48:15] [thread 0: 268435456 hashes, 244103467 khash/sec]
[2011-07-10 14:48:15] [thread 0: 134217728 hashes, 77439887 khash/sec]
[2011-07-10 14:48:15] GPU 0 found something?              
[2011-07-10 14:48:15] [thread 1: 134217728 hashes, 122021441 khash/sec]
[2011-07-10 14:48:15] No best_g found! Error in OpenCL code?

Trying to get the failure again without -D

EDIT2:
That was quick.  Any ideas on how to debug this?
Code:
[(5s):360.4  (avg):367.1 Mh/s] [Q:29  A:26  R:3  HW:3  E:90%  U:4.70/m]
[Accepted] [GPU 0] [369.5 Mh/s] [Q:29  A:28  R:3  HW:3  E:97%  U:5.06/m]
[(5s):360.4  (avg):367.1 Mh/s] [Q:29  A:26  R:3  HW:3  E:90%  U:4.70/m]
[(5s):248.1  (avg):363.8 Mh/s] [Q:30  A:28  R:3  HW:3  E:93%  U:4.97/m]
[(5s):124.3  (avg):357.4 Mh/s] [Q:30  A:28  R:3  HW:3  E:93%  U:4.88/m]
^C5s):0.1  (avg):304.4 Mh/s] [Q:30  A:28  R:3  HW:3  E:93%  U:4.15/m]

EDIT3: Ah-ha!  Finally: http://pastebin.com/vRQFkmRs
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Updated tree:

You can now choose which device(s) to start cgminer on:

--device|-d   Select device to use, (Use repeat -d for multiple devices, default: all)
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Updated tree:

Now I've fixed it for real, and incremented the version number and tag to v1.1.1 so people know which is the current good version. Everyone should upgrade.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
No wait, it's still not right. Everyone hang on while I fix it properly...
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Important updated tree:

I didn't see the problem.  But when I ran it again w/o -D, I saw it after 63.  Which got me thinking, when I reported this problem the first time a few days ago I know it was when I had a network hiccup.  So I forced a hiccup (ifdown, sleep 5, ifup), but that didn't do it.  So, I'm not sure what's up.

Okay that got me thinking and I did some more instrumenting. I seem to have had a logic failure in the queueing of locally generated work, which may have caused this issue! It was also preventing local work generation ever occurring. Please update to the latest tree.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Updated tree:

Implemented never idle logic. During periods of network or server problems it takes existing work and generates more work from that till the server starts responding properly or fast enough. This means that hash rates should -never- drop now with cgminer.

This is it in action:

[Accepted] [GPU 2] [435.0 Mh/s] [Q:111  A:21  R:0  HW:0  E:19%  U:22.20/m]                
[2011-07-09 10:26:01] Server not providing work fast enough, generating work locally
[Accepted] [GPU 3] [429.4 Mh/s] [Q:135  A:21  R:1  HW:0  E:16%  U:18.13/m]                
[2011-07-09 10:26:53] Resumed retrieving work from server

If your network is down for extensive periods eventually this will generate more rejected blocks, but for transient blips this makes a massive difference.

Can you explain this further? So my card just hashes to keep its hash rate up? Why is this needed? I think I'd rather save power than generate rejected blocks.

Until the next block is solved, which happens every ten minutes on average, there is a valid way to take existing work from your pool, modify it and make it new work for your machine. The shares generated from solving that work are actually valid if returned to the server before the next block is solved. Thus it is not just consuming power, it is actually generating valid shares by deriving new work off the work received from the pool.
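
As a rough illustration of deriving new work from existing work: one commonly used approach is to advance the timestamp field of the block header ("rolling ntime") and restart the nonce search. The thread does not say which method cgminer uses at this commit, so treat the following as a sketch under that assumption. It operates on a raw 80-byte header in serialized (little-endian) form; getwork data would need its byte order handled first. The name roll_work and the offset macros are hypothetical, not taken from the cgminer source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN      80  /* serialized block header size in bytes */
#define NTIME_OFFSET 68  /* 4-byte timestamp, little-endian */
#define NONCE_OFFSET 76  /* 4-byte nonce, little-endian */

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: copy an existing header, move its timestamp
 * forward by one second and reset the nonce, so the full nonce range
 * can be searched again on what is effectively a new work item. */
static void roll_work(uint8_t out[HDR_LEN], const uint8_t in[HDR_LEN])
{
    assert(out != NULL && in != NULL);
    memmove(out, in, HDR_LEN);  /* memmove so out == in is allowed */
    write_le32(out + NTIME_OFFSET, read_le32(in + NTIME_OFFSET) + 1);
    write_le32(out + NONCE_OFFSET, 0);
}
```

Shares found on a header derived this way only count if the pool accepts the altered timestamp and the block has not changed in the meantime, which matches the caveat above about returning them before the next block is solved.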
newbie
Activity: 47
Merit: 0
With the current tree, under Linux, SDK 2.4, 6950 card, as soon as I hit 100 accepted blocks it stops getting new work.  Any ideas?
That sounds very wrong. Can you do it while running with -D and store the log please? All of my miners are in the thousands of accepted shares.

OK, I let it go with -D for almost 200 accepted and aside from some odd looking output:
Code:
[2011-07-09 22:58:09] [Rate (5s):35.5  (avg):267.47 Mhash/s] [Requested:196  Accepted:186  Rejected:8  HW errors:2048  Efficiency:95%  Utility:3.66/m]

I didn't see the problem.  But when I ran it again w/o -D, I saw it after 63.  Which got me thinking, when I reported this problem the first time a few days ago I know it was when I had a network hiccup.  So I forced a hiccup (ifdown, sleep 5, ifup), but that didn't do it.  So, I'm not sure what's up.
full member
Activity: 182
Merit: 100
Updated tree:

Implemented never idle logic. During periods of network or server problems it takes existing work and generates more work from that till the server starts responding properly or fast enough. This means that hash rates should -never- drop now with cgminer.

This is it in action:

[Accepted] [GPU 2] [435.0 Mh/s] [Q:111  A:21  R:0  HW:0  E:19%  U:22.20/m]                
[2011-07-09 10:26:01] Server not providing work fast enough, generating work locally
[Accepted] [GPU 3] [429.4 Mh/s] [Q:135  A:21  R:1  HW:0  E:16%  U:18.13/m]                
[2011-07-09 10:26:53] Resumed retrieving work from server

If your network is down for extensive periods eventually this will generate more rejected blocks, but for transient blips this makes a massive difference.

Can you explain this further? So my card just hashes to keep its hash rate up? Why is this needed? I think I'd rather save power than generate rejected blocks.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
If you're getting this fairly often:
Code:
New block detected, possible missed longpoll, flushing work queue
Does that mean you should reduce the scan time? If it's not too often, is it okay to just leave it be?

I'm on slush's pool, which I think doesn't support longpolling, so seeing the message doesn't surprise me.

That's a harmless message that you need do nothing about. It means the automated new block detection which I implemented for cgminer is detecting the block before longpoll is telling it there's a new block. It also means that longpoll is activated, but auto detection is beating it. If longpoll was disabled it would have just said "New block detected, flushing work queue".
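
For anyone curious how a miner could notice a new block without longpoll: the post doesn't describe the mechanism, but a plausible sketch, assuming detection is done by comparing the previous-block hash carried in each incoming work item (bytes 4 to 35 of the 80-byte header), might look like this. The struct and function names are made up for illustration; only the two message strings come from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PREVHASH_OFFSET 4   /* previous-block hash follows the 4-byte version */
#define PREVHASH_LEN    32

/* Hypothetical state: the last previous-block hash seen in any work item. */
struct block_tracker {
    uint8_t prevhash[PREVHASH_LEN];
    bool    seen;  /* false until the first work item has been recorded */
};

/* Record the work item's previous-block hash and report whether it differs
 * from the one seen before, i.e. whether the chain has moved on. */
static bool work_starts_new_block(struct block_tracker *t,
                                  const uint8_t header[80])
{
    assert(t != NULL && header != NULL);
    const uint8_t *ph = header + PREVHASH_OFFSET;
    bool changed = t->seen && memcmp(t->prevhash, ph, PREVHASH_LEN) != 0;

    memcpy(t->prevhash, ph, PREVHASH_LEN);
    t->seen = true;
    return changed;
}

/* Pick the log line quoted in this thread, depending on whether longpoll
 * is active when the change is noticed. */
static const char *new_block_message(bool longpoll_active)
{
    return longpoll_active
        ? "New block detected, possible missed longpoll, flushing work queue"
        : "New block detected, flushing work queue";
}
```

With a check like this running on every getwork response, whichever path sees the new previous-block hash first reports it, which would explain the two messages arriving in either order.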
newbie
Activity: 15
Merit: 0
If you're getting this fairly often:
Code:
New block detected, possible missed longpoll, flushing work queue
Does that mean you should reduce the scan time? If it's not too often, is it okay to just leave it be?

I'm on slush's pool, which I think doesn't support longpolling, so seeing the message doesn't surprise me.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
With the current tree, under Linux, SDK 2.4, 6950 card, as soon as I hit 100 accepted blocks it stops getting new work.  Any ideas?
That sounds very wrong. Can you do it while running with -D and store the log please? All of my miners are in the thousands of accepted shares.
newbie
Activity: 47
Merit: 0
With the current tree, under Linux, SDK 2.4, 6950 card, as soon as I hit 100 accepted blocks it stops getting new work.  Any ideas?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Updated source.

I cleaned up the output to make it clear what's going on instead of spitting out the meaningless network errors:

When it cannot submit work:
[2011-07-09 17:05:11] Upstream communication failure, caching submissions

and it will continue to mine unhindered (either with fresh work it can retrieve or locally generated work) until:
[2011-07-09 17:05:26] Upstream communication resumed, submitting work         

and then all the work finished in that time will be pushed upstream. If the work is no longer valid because a new block has since appeared, cgminer knows this since it can detect a block change itself, and will not submit a whole lot of rejects and will instead say:

[2011-07-10 01:19:20] Stale work detected, discarding       

This may happen immediately after a longpoll as well, and is harmless, but minimises your reject count.
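
To illustrate the idea of discarding stale work before submission, here is a minimal sketch. It assumes each cached submission remembers the previous-block hash it was built on, and that the miner knows the current one; the struct layout and the name drop_stale are hypothetical and not from the cgminer source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PREVHASH_LEN 32

/* Hypothetical cached submission waiting for the upstream link to return. */
struct cached_share {
    uint8_t  prevhash[PREVHASH_LEN];  /* block the work was built on */
    uint32_t nonce;                   /* solution found for that work */
};

/* Compact shares[] in place, keeping only entries built on the current
 * block. Returns how many remain to be submitted; *discarded receives the
 * number of stale entries dropped. */
static size_t drop_stale(struct cached_share *shares, size_t n,
                         const uint8_t current_prevhash[PREVHASH_LEN],
                         size_t *discarded)
{
    size_t kept = 0;

    assert(shares != NULL || n == 0);
    assert(current_prevhash != NULL && discarded != NULL);

    for (size_t i = 0; i < n; i++) {
        if (memcmp(shares[i].prevhash, current_prevhash, PREVHASH_LEN) == 0)
            shares[kept++] = shares[i];
    }
    *discarded = n - kept;
    return kept;
}
```

Each dropped entry would correspond to one "Stale work detected, discarding" line instead of a reject from the pool.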

I've also updated the efficiency and utility values printed to be per-device when reporting on share submissions:

[Accepted] [GPU 1] [411.0 Mh/s] [Q:425  A:410  R:17  HW:0  E:96%  U:5.67/m]                 
[Rejected] [GPU 0] [411.1 Mh/s] [Q:425  A:384  R:24  HW:0  E:90%  U:5.31/m]                 
[Accepted] [GPU 1] [411.1 Mh/s] [Q:426  A:411  R:17  HW:0  E:96%  U:5.68/m]                 
[Accepted] [GPU 0] [410.9 Mh/s] [Q:426  A:385  R:24  HW:0  E:90%  U:5.32/m]                 
[(5s):1696.1  (avg):1643.1 Mh/s] [Q:1704  A:1538  R:81  HW:0  E:90%  U:21.25/m]         

The status line will continue to be a grand total with overall efficiency and utility.
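
For reference, the E and U figures in the sample lines are consistent with efficiency being accepted shares as a percentage of work requested, and utility being accepted shares per minute. A small sketch of that arithmetic follows; the function names and the elapsed-time parameter are assumptions for illustration, and the format string simply mimics the output shown above.

```c
#include <assert.h>
#include <stdio.h>

/* Efficiency: accepted shares as a percentage of work items requested. */
static double efficiency_pct(unsigned accepted, unsigned requested)
{
    return requested ? 100.0 * accepted / requested : 0.0;
}

/* Utility: accepted shares per minute of running time. */
static double utility_per_min(unsigned accepted, double elapsed_secs)
{
    return elapsed_secs > 0.0 ? accepted * 60.0 / elapsed_secs : 0.0;
}

/* Format the counters in the style of the lines above. Returns the value
 * of snprintf (number of characters that the full string needs). */
static int format_stats(char *buf, size_t len, unsigned q, unsigned a,
                        unsigned r, unsigned hw, double elapsed_secs)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "[Q:%u  A:%u  R:%u  HW:%u  E:%.0f%%  U:%.2f/m]",
                    q, a, r, hw,
                    efficiency_pct(a, q), utility_per_min(a, elapsed_secs));
}
```

Checking against the first GPU 1 line: 410 accepted out of 425 requested is about 96.5%, which is printed as E:96%.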
member
Activity: 98
Merit: 10
Updated tree:

Implemented never idle logic. During periods of network or server problems it takes existing work and generates more work from that till the server starts responding properly or fast enough. This means that hash rates should -never- drop now with cgminer.

This is it in action:

[Accepted] [GPU 2] [435.0 Mh/s] [Q:111  A:21  R:0  HW:0  E:19%  U:22.20/m]                 
[2011-07-09 10:26:01] Server not providing work fast enough, generating work locally
[Accepted] [GPU 3] [429.4 Mh/s] [Q:135  A:21  R:1  HW:0  E:16%  U:18.13/m]                 
[2011-07-09 10:26:53] Resumed retrieving work from server

If your network is down for extensive periods eventually this will generate more rejected blocks, but for transient blips this makes a massive difference.

Awesome feature :)
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
New windows build with latest no-idle logic included.
http://ck.kolivas.org/apps/cgminer-ycros-2011-07-09.zip
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Updated tree:

Implemented never idle logic. During periods of network or server problems it takes existing work and generates more work from that till the server starts responding properly or fast enough. This means that hash rates should -never- drop now with cgminer.

This is it in action:

[Accepted] [GPU 2] [435.0 Mh/s] [Q:111  A:21  R:0  HW:0  E:19%  U:22.20/m]                 
[2011-07-09 10:26:01] Server not providing work fast enough, generating work locally
[Accepted] [GPU 3] [429.4 Mh/s] [Q:135  A:21  R:1  HW:0  E:16%  U:18.13/m]                 
[2011-07-09 10:26:53] Resumed retrieving work from server

If your network is down for extensive periods eventually this will generate more rejected blocks, but for transient blips this makes a massive difference.
sr. member
Activity: 378
Merit: 250
Question: Which is more optimal?
Code:
movdqa xmm1, [rdx]
pshufd xmm2, xmm1, 0x55
paddd xmm5, xmm2
pshufd xmm6, xmm1, 0xAA
paddd xmm4, xmm6
pshufd xmm11, xmm1, 0xFF
paddd xmm3, xmm11
pshufd xmm1, xmm1, 0
paddd xmm7, xmm1

movdqa xmm1, [rdx+4*4]
pshufd xmm2, xmm1, 0x55
paddd xmm8, xmm2
pshufd xmm6, xmm1, 0xAA
paddd xmm9, xmm6
pshufd xmm11, xmm1, 0xFF
paddd xmm10, xmm11
pshufd xmm1, xmm1, 0
paddd xmm0, xmm1

movdqa [hash+0*16], xmm7
movdqa [hash+1*16], xmm5
movdqa [hash+2*16], xmm4
movdqa [hash+3*16], xmm3
movdqa [hash+4*16], xmm0
movdqa [hash+5*16], xmm8
movdqa [hash+6*16], xmm9
movdqa [hash+7*16], xmm10
or
Code:
movdqa xmm1, [rdx]
pshufd xmm2, xmm1, 0x55
pshufd xmm6, xmm1, 0xAA
pshufd xmm11, xmm1, 0xFF
pshufd xmm1, xmm1, 0

paddd xmm5, xmm2
paddd xmm4, xmm6
paddd xmm3, xmm11
paddd xmm7, xmm1

movdqa xmm1, [rdx+4*4]
pshufd xmm2, xmm1, 0x55
pshufd xmm6, xmm1, 0xAA
pshufd xmm11, xmm1, 0xFF
pshufd xmm1, xmm1, 0

paddd xmm8, xmm2
paddd xmm9, xmm6
paddd xmm10, xmm11
paddd xmm0, xmm1

movdqa [hash+0*16], xmm7
movdqa [hash+1*16], xmm5
movdqa [hash+2*16], xmm4
movdqa [hash+3*16], xmm3
movdqa [hash+4*16], xmm0
movdqa [hash+5*16], xmm8
movdqa [hash+6*16], xmm9
movdqa [hash+7*16], xmm10

Oddly enough, I'm seeing better performance with the first version, which is part of my modifications to the sse2_64_atom code.
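
For readers not fluent in SSE2 assembly, both listings compute the same thing and differ only in instruction order: each of eight 32-bit words at [rdx] is broadcast to all four lanes and added to the matching accumulator register, and the eight results are stored to hash. A rough C equivalent using SSE2 intrinsics is sketched below, assuming an x86 target; the function name and the array form of the accumulators are hypothetical, and the load is unaligned here (the assembly uses the aligned movdqa). The compiler is free to schedule the shuffles and adds in either order.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical C rendering of the assembly above: for each of the eight
 * words in init[], broadcast it across a 128-bit vector (pshufd) and add
 * it lane-wise to the matching accumulator (paddd). */
static void add_broadcast_words(__m128i hash[8], const __m128i acc[8],
                                const uint32_t init[8])
{
    assert(hash != NULL && acc != NULL && init != NULL);

    for (int half = 0; half < 2; half++) {
        /* One 128-bit load covers four consecutive words. */
        __m128i v = _mm_loadu_si128((const __m128i *)(init + 4 * half));
        int base = 4 * half;

        hash[base + 0] = _mm_add_epi32(acc[base + 0], _mm_shuffle_epi32(v, 0x00));
        hash[base + 1] = _mm_add_epi32(acc[base + 1], _mm_shuffle_epi32(v, 0x55));
        hash[base + 2] = _mm_add_epi32(acc[base + 2], _mm_shuffle_epi32(v, 0xAA));
        hash[base + 3] = _mm_add_epi32(acc[base + 3], _mm_shuffle_epi32(v, 0xFF));
    }
}
```

Which ordering runs faster depends on how the particular CPU issues shuffles and adds, so measuring both on the target chip, as done above, is the reliable way to decide.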