
Topic: Help With Miner Comms to a stratum+tcp Pool (Read 1243 times)

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
July 24, 2016, 09:14:05 AM
#5
Perhaps some issue with the compac fork of cgminer in initial communications. I didn't think they changed that much though and others have been mining fine with them for ages. Anyway don't mess with it now that it's mining :P
sr. member
Activity: 475
Merit: 265
Ooh La La, C'est Zoom!
The Compac stick communicates fine. I point my Compac at solo.ckpool.org (actually stratum.ckpool.org which is where it eventually winds up if I start at solo.ckpool.org) and specify --suggest-diff 17, which is where it is now after bouncing back and forth between 1000 and 17, and it still takes a seemingly random amount of time before the pool begins accepting shares.

Code:
./cgminer -o stratum+tcp://stratum.ckpool.org:3333 -u .Compac1 -p x --compac-freq 325 --suggest-diff 17

Well that didn't work... I let the Compac run while I got some sleep, and 10 hours after starting the miner, still no joy using the above command.




I restarted the Compac a few minutes ago using this command:

Code:
./cgminer -o stratum+tcp://stratum.ckpool.org:3333 -u .Compac1 -p x --compac-freq 325

To see how long it runs before the submitted shares are accepted.

Edit: Below this is new (~11.5 hours after original post)

Gave up waiting after 90 minutes and switched over to this command:

Code:
./cgminer -o stratum+tcp://solo.ckpool.org:3333 -u .Compac1 -p x --compac-freq 325

That command took over two hours before a share was submitted that didn't have any retransmits. The response that came back was a reconnect:

Code:
{"id":null,"method":"client.reconnect","params":["stratum.ckpool.org","443",0]}

That resulted in another 40 or so minutes of attempting to submit shares before one was accepted with a 1000 Diff. Five minutes later another submitted share:

Code:
{"params": [".Compac1", "5786cf6b000071a3", "9625000000000000", "5793d7c4", "5c73552b"], "id": 128, "method": "mining.submit"}

and the pool responds with:

Code:
{"params":[24],"id":null,"method":"mining.set_difficulty"}

immediately followed by:

Code:
{"error":null,"result":true,"id":128}

Hooray! Another accepted share. From that point on the Compac has been doing fine.
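
For anyone following along, the exchange above can be read mechanically. Here is a minimal Python sketch, assuming one JSON message per line as in the captures. The function name describe is made up for illustration and is not part of cgminer or ckpool:

```python
import json

def describe(line):
    """Return a one-line, human-readable summary of a stratum JSON message."""
    msg = json.loads(line)
    method = msg.get("method")
    if method == "mining.submit":
        return "miner submits share, id %s" % msg["id"]
    if method == "mining.set_difficulty":
        return "pool sets difficulty to %s" % msg["params"][0]
    if method == "client.reconnect":
        return "pool asks miner to reconnect to %s:%s" % (
            msg["params"][0], msg["params"][1])
    if "result" in msg:
        # A reply carries no method; it is matched to a request by its id.
        ok = msg["result"] is True and msg.get("error") is None
        return "pool reply to id %s: %s" % (msg["id"], "ok" if ok else "not ok")
    return "unrecognised message"

# The messages quoted in this post, in the order they appeared.
captured = [
    '{"id":null,"method":"client.reconnect","params":["stratum.ckpool.org","443",0]}',
    '{"params": [".Compac1", "5786cf6b000071a3", "9625000000000000", "5793d7c4", "5c73552b"], "id": 128, "method": "mining.submit"}',
    '{"params":[24],"id":null,"method":"mining.set_difficulty"}',
    '{"error":null,"result":true,"id":128}',
]

for line in captured:
    print(describe(line))
```

The point of matching on id is that the final {"result":true,"id":128} only means "accepted" because it answers the mining.submit with id 128.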




Cheers,

- zed
sr. member
Activity: 475
Merit: 265
Ooh La La, C'est Zoom!
I glossed over your complete data set, but you're probably communicating fine. It's just that finding 1k diff shares is a rare event on a Compac stick, which is why you see the S7 communicating fine, since it makes 1k shares far more frequently. There's nothing to fix there, unless you want the feedback faster, in which case mine on a pool that supports the stratum specification command --suggest-diff (like kano.is or solo.ckpool.org), use cgminer, which supports the command, and choose your starting diff.

@-ck, Thanks. Those are actually the two pools I used. In fact the images are from when I had the S7-LN pointed at solo.ckpool.org, but the results on kano.is are essentially the same and is where I have the S7-LN pointed now.

The Compac stick communicates fine. I point my Compac at solo.ckpool.org (actually stratum.ckpool.org which is where it eventually winds up if I start at solo.ckpool.org) and specify --suggest-diff 17, which is where it is now after bouncing back and forth between 1000 and 17, and it still takes a seemingly random amount of time before the pool begins accepting shares.

Code:
./cgminer -o stratum+tcp://stratum.ckpool.org:3333 -u .Compac1 -p x --compac-freq 325 --suggest-diff 17

I restarted the Compac over an hour ago with the above command and it has yet to receive an "Accepted" share message. I've got Wireshark running to capture the packets. I'll edit this message later today (eastern US timezone) with the time it took to get the first accepted share. Meanwhile, here is what I see a lot of in the terminal window:

Code:
[2016-07-23 02:48:15.875] Stratum connection to pool 0 interrupted
[2016-07-23 02:48:16.518] Pool 0 difficulty changed to 1000
[2016-07-23 02:48:16.579] Pool 0 message: Authorised, welcome to solo.ckpool.org !
[2016-07-23 02:48:16.895] Pool 0 difficulty changed to 17
[2016-07-23 02:48:16.895] Stratum from pool 0 requested work restart
[2016-07-23 02:48:35.015] Lost 7 shares due to no stratum share response from pool 0


On the S7-LN I have not been able to figure out how to specify --suggest-diff. Bitmain doesn't make it easy to adjust their miners, especially across a reboot/power cycle. On kano.is I use the worker management page to specify a diff of 1072 which is what the pool commands when the miner is finally able to get a few accepted shares. However, even doing that it doesn't seem to make a difference.

I guess the thing that I find most odd is that the pool servers always seem to go "deaf" when the miner sends the packet that contains the mining.submit:

Code:
{"params": [".Compac1", "5786cf6b00006ab9", "0e01000000000000", "57930c36", "69c70e16"], "id": 336, "method": "mining.submit"}

By deaf I mean that there is no TCP message coming back from the pool at all. The non-response triggers a bunch of retransmits from the miner, and when those fail, the connection times out and the miner sends a TCP RST, which "wakes up" the pool and the cycle starts all over again. If the pool doesn't like the submitted share, for any reason, I would expect the pool to send a stratum message indicating the share was rejected with a "result": false and optionally some error code.
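
To pick the "deaf" submits out of a capture, one option is to pair each mining.submit with its reply by id and list whatever is left over. This is a minimal Python sketch, assuming the stratum lines have already been extracted from the packet capture as one JSON message per line. The function name unanswered_submits is hypothetical:

```python
import json

def unanswered_submits(lines):
    """Return the ids of mining.submit requests that never got a reply."""
    pending = {}
    for line in lines:
        msg = json.loads(line)
        if msg.get("method") == "mining.submit":
            # Remember the request until a reply with the same id shows up.
            pending[msg["id"]] = msg
        elif "result" in msg and msg.get("id") in pending:
            # Any reply counts, whether the result is true or false.
            del pending[msg["id"]]
    return sorted(pending)

# The submit quoted above, with no reply of any kind following it.
deaf_capture = [
    '{"params": [".Compac1", "5786cf6b00006ab9", "0e01000000000000", "57930c36", "69c70e16"], "id": 336, "method": "mining.submit"}',
]

print(unanswered_submits(deaf_capture))
```

A rejected share would still be removed from the pending list here, which is the distinction being drawn: a rejection is a response, and what I am seeing is no response at all.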

Eventually the pool accepts the submitted share and responds appropriately. On the Compac that means it happily mines away until cgminer is stopped for one reason or another. On the S7-LN it's as if only the secondary pool gets accepted shares, and it only lasts for a little while (couple of minutes or so). Inevitably the miner fails back to the primary pool address and goes back to the "deafness" issue.

Cheers,

- zed
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I glossed over your complete data set, but you're probably communicating fine. It's just that finding 1k diff shares is a rare event on a Compac stick, which is why you see the S7 communicating fine, since it makes 1k shares far more frequently. There's nothing to fix there, unless you want the feedback faster, in which case mine on a pool that supports the stratum specification command --suggest-diff (like kano.is or solo.ckpool.org), use cgminer, which supports the command, and choose your starting diff.
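
To put rough numbers on "rare event": a share at difficulty D takes about D x 2^32 hashes on average, so the expected wait is that figure divided by the hashrate. This is a minimal Python sketch. Both hashrates below are assumed round figures for illustration only, not measurements from this thread:

```python
# Average number of hashes needed to find one difficulty-1 share (approximate).
HASHES_PER_DIFF1_SHARE = 2 ** 32

def expected_seconds_per_share(difficulty, hashrate_hs):
    """Average time between shares at a given difficulty and hashrate (H/s)."""
    return difficulty * HASHES_PER_DIFF1_SHARE / hashrate_hs

# ASSUMED hashrates, chosen only to show the scale of the difference.
COMPAC_HS = 10e9    # assumption: a USB stick around 10 GH/s
S7_HS = 2.7e12      # assumption: an S7-class miner around 2.7 TH/s

for name, rate in (("Compac", COMPAC_HS), ("S7", S7_HS)):
    for diff in (17, 1000):
        secs = expected_seconds_per_share(diff, rate)
        print("%s at diff %d: about %.1f s per share" % (name, diff, secs))
```

Under those assumptions the stick averages roughly seven minutes per share at diff 1000 but only a few seconds at diff 17. These are averages of a random process, so individual gaps can be much longer.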
sr. member
Activity: 475
Merit: 265
Ooh La La, C'est Zoom!
Hello pool software/stratum protocol/miner software experts,

I've got several miners that seem to take random amounts of time to establish communications to a pool such that shares are accepted. My GekkoScience Compac miner can take anywhere from just a few minutes to multiple hours before I start seeing messages like:

Code:
[2016-07-22 19:36:16.626] Accepted 2bbe2621 Diff 1.5K/17 COMPAC

In the case of some pools it seems that the Compac never does establish communications with the pool such that shares are accepted. Clearly you want to point to a pool that is relatively close by so that there isn't a lot of lag in communications. By close by I mean round trip ping times are low and do not fluctuate too much.

In addition to the Compac I have an Antminer S7-LN that I cannot get to mine successfully on any pool I have pointed it at. What I see on the Miner Stats page looks like this:



As you can see, this miner has been working for over an hour, but has not had the DiffA# or DiffR# shares counters pegged. That means the pool is not happy with what it is receiving, or more specifically should be receiving, but probably is not. When miner and pool are communicating well (my Compac with cgminer running on my Mac Mini), it looks like this:

Code:
Mac-mini:~ boris$ netstat -alnt | awk 'NR == 2 || /a.b.c.d/'
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)    
tcp4       0      0  192.168.1.101.51812    a.b.c.d.3333       ESTABLISHED
Mac-mini:~ boris$ netstat -alnt | awk 'NR == 2 || /a.b.c.d/'
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)    
tcp4       0      0  192.168.1.101.51812    a.b.c.d.3333       ESTABLISHED
Mac-mini:~ boris$

And when the miner and pool are not communicating well (my S7-LN) it looks like this:

Code:
root@antMiner:~# netstat -alnt | awk 'NR == 2 || /a.b.c.d/'
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0   5217 192.168.1.196:53729     a.b.c.d:3333        FIN_WAIT1  
tcp        0    815 192.168.1.196:53730     a.b.c.d:3333        ESTABLISHED
netstat: /proc/net/tcp6: No such file or directory
root@antMiner:~# netstat -alnt | awk 'NR == 2 || /a.b.c.d/'
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0   5217 192.168.1.196:53729     a.b.c.d:3333        FIN_WAIT1  
tcp        0   2934 192.168.1.196:53730     a.b.c.d:3333        ESTABLISHED
netstat: /proc/net/tcp6: No such file or directory
root@antMiner:~# netstat -alnt | awk 'NR == 2 || /a.b.c.d/'
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0   5217 192.168.1.196:53729     a.b.c.d:3333        FIN_WAIT1  
tcp        0   3097 192.168.1.196:53730     a.b.c.d:3333        ESTABLISHED
netstat: /proc/net/tcp6: No such file or directory
root@antMiner:~# netstat -alnt | awk 'NR == 2 || /a.b.c.d/'
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0   3097 192.168.1.196:53730     a.b.c.d:3333        ESTABLISHED
netstat: /proc/net/tcp6: No such file or directory
root@antMiner:~#

For those reading this thread who are unfamiliar with TCP/IP communications, the state FIN_WAIT1 means that the local side (the miner in this case) has sent a FIN to close the connection and is still waiting for the remote side to acknowledge it. Here the miner is in the FIN_WAIT1 state on this particular connection because it has exhausted all of the TCP retransmissions for the most recent message, and the connection has timed out.

If you look at the output from netstat there is a column called Send-Q. When things are working normally (see the netstat output from my Mac Mini) the value in that column is zero. That means that there is no data in the output queue waiting to be sent to the remote side of the connection. When things are not working normally (see the netstat output from the Antminer) there is a number greater than zero in that column.
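
The Send-Q check can be automated rather than eyeballed. This is a minimal Python sketch, assuming netstat output laid out like the samples above (protocol, Recv-Q, Send-Q, local address, remote address, state). The function name stalled_connections is made up for illustration:

```python
def stalled_connections(netstat_text):
    """Return (local, remote, state, send_q) for TCP rows with data stuck in Send-Q."""
    stalled = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip the header row and the "netstat: /proc/net/tcp6" error line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        _proto, _recv_q, send_q, local, remote, state = fields[:6]
        if int(send_q) > 0:
            stalled.append((local, remote, state, int(send_q)))
    return stalled

# First sample from the Antminer output above.
sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0   5217 192.168.1.196:53729     a.b.c.d:3333        FIN_WAIT1
tcp        0    815 192.168.1.196:53730     a.b.c.d:3333        ESTABLISHED
netstat: /proc/net/tcp6: No such file or directory
"""

for row in stalled_connections(sample):
    print(row)
```

Run against the Mac Mini output, the same function returns an empty list, since that connection has a Send-Q of zero.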

I put a wireshark packet sniffer onto the network and I can "see" that the miner is communicating with the pool and the "conversation" seems to be going OK right up until the miner sends a stratum mining.submit message. The mining.submit message is never ACKnowledged by the pool. Eventually that causes the miner to retransmit the message. The miner will continue to retransmit (waiting an exponentially longer time before sending the message again) until it has reached the TCP/IP time-out limit which is 30 seconds.
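
As a rough picture of that retransmit pattern, here is a minimal Python sketch of a doubling backoff schedule. The 1 second starting timeout and the strict doubling are assumptions for illustration (real TCP stacks derive the timeout from measured round-trip time). The 30 second limit is the figure quoted above:

```python
def retransmit_times(initial_rto, limit):
    """Seconds after the first send at which each retransmit would go out.

    The wait doubles after every attempt, and attempts stop once the next
    one would fall past the overall time limit.
    """
    times = []
    elapsed = 0.0
    rto = initial_rto
    while elapsed + rto <= limit:
        elapsed += rto
        times.append(elapsed)
        rto *= 2
    return times

# ASSUMED initial timeout of 1 s; 30 s limit as described in the post.
print(retransmit_times(1.0, 30.0))
```

With those inputs the retransmits land at 1, 3, 7 and 15 seconds, and the next one (at 31 seconds) would fall past the limit, so only a handful of attempts fit before the connection is given up.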


Can any of the pool software/stratum protocol/miner software experts help me figure out what is happening here? Initially I thought it was network bandwidth issues, but I have increased my bandwidth and I'm still seeing the same behavior. I know that the Compac miner eventually "syncs" with the mining pool, and the S7-LN has occasional "moments" when it is submitting DiffA# or DiffR# accepted shares, but seemingly only when I have a second pool destination URL specified, and only intermittently.

Edit: Added the below info...

Here is the image of the miner configured with two pool addresses. The addresses are the same, with the miner somehow successful in submitting shares to the "backup" pool, but not to the "primary" address.



The netstat from the miner looks like:

Code:
root@antMiner:~# netstat -alnt | awk 'NR == 2 || /a.b.c.d/'
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0   7173 192.168.1.196:58286     a.b.c.d:3333        FIN_WAIT1   
tcp        0   3423 192.168.1.196:58295     a.b.c.d:3333        ESTABLISHED
tcp        0      0 192.168.1.196:58247     a.b.c.d:3333        ESTABLISHED
netstat: /proc/net/tcp6: No such file or directory
root@antMiner:~# netstat -alnt | awk 'NR == 2 || /a.b.c.d/'
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0   3586 192.168.1.196:58295     a.b.c.d:3333        ESTABLISHED
tcp        0      0 192.168.1.196:58247     a.b.c.d:3333        ESTABLISHED
netstat: /proc/net/tcp6: No such file or directory
root@antMiner:~# netstat -alnt | awk 'NR == 2 || /a.b.c.d/'
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0   1630 192.168.1.196:58298     a.b.c.d:3333        ESTABLISHED
tcp        0   7336 192.168.1.196:58296     a.b.c.d:3333        FIN_WAIT1   
tcp        0      0 192.168.1.196:58247     a.b.c.d:3333        ESTABLISHED
netstat: /proc/net/tcp6: No such file or directory
root@antMiner:~#


And of course when the miner "fails" back to the primary it isn't able to submit shares.

Thanks,

- zed