I'm still seeing disconnects: after running for a while, units disappear and are then rediscovered. I'm on a Raspberry Pi running 10 GS units with the girnyau version of cgminer, which I built from source, plus Scripta as the web front end. Most of the units are on an Orico P10 U2.
After a disconnect, the device always reconnects a couple of seconds later. This leaves a "zombie" in the API because cgminer doesn't mark the device's status as dead. Sometimes more than one device at a time disconnects.
Here's an example of a disconnect from the log:
[2014-04-06 06:45:03] Stratum from pool 0 detected new block
[2014-04-06 06:45:03] GSD 8 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT
[2014-04-06 06:45:03] work prepare failed, exiting mining thread 8
[2014-04-06 06:45:05] Device found, firmware version 01140113, driver version v3.8.5.20140210.02
The errors are *always* in the "prepare_work" method of the driver. In the sources, the function name is gridseed_prepare_work. I never see any other USB errors in the log. I've reviewed the various Linux system logs but don't see a correlation to the problem there.
When this error happens, the thread associated with the device exits, but the code doesn't mark the device as dead. Scripta decides whether a device is "live" by checking that its last 5-second hash rate is non-zero, so the combination results in irritating, bogus device info and hash rate totals in the web front end.
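To make the zombie problem concrete, here is a rough sketch of that kind of liveness check. It is not Scripta's actual code. I'm assuming the device list from the API has already been parsed into dictionaries and that the 5-second rate is in a field called "MHS 5s"; the function name and the sample entries are made up for illustration.

```python
def split_live_and_zombie(devs, rate_key="MHS 5s"):
    """Split device entries into (live, zombies) by their 5-second hash rate.

    rate_key is an assumed field name; adjust it to whatever the API reports.
    """
    live, zombies = [], []
    for dev in devs:
        if float(dev.get(rate_key, 0)) > 0:
            live.append(dev)
        else:
            zombies.append(dev)
    return live, zombies


# Hypothetical entries: ID 8 lost its mining thread, and the same physical
# unit is assumed to have been rediscovered under a new ID.
sample = [
    {"ID": 7, "MHS 5s": 0.35},
    {"ID": 8, "MHS 5s": 0.0},
    {"ID": 14, "MHS 5s": 0.36},
]

live, zombies = split_live_and_zombie(sample)
print([d["ID"] for d in live], [d["ID"] for d in zombies])
```

The zombie entry never goes away because nothing marks it dead, so a front end that lists everything the API returns ends up showing more devices than are physically attached.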
I don't have a spare 10-port hub, so I can't swap that out. I do have some poorly behaving Gridseeds with excessive hardware failures, so I've tried under-clocking them further, but I still get disconnects.
One thing I've noticed is that all of the failures in the log fall within a few seconds of a 5-minute boundary. Here are the most recent ones:
[2014-04-05 13:45:03] GSD 6 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT
[2014-04-05 20:35:02] GSD 9 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT
[2014-04-06 02:35:03] GSD 5 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT
[2014-04-06 05:10:03] GSD 10 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT
[2014-04-06 05:55:03] GSD 12 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT
[2014-04-06 06:45:03] GSD 8 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT
[2014-04-06 07:40:04] GSD 13 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT
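The alignment can be checked mechanically rather than by eye. This is a minimal sketch that parses the timestamps of the failures listed above and reports how many seconds past the nearest earlier 5-minute mark each one occurred. The regular expression and function name are my own and only match this particular error line.

```python
import re
from datetime import datetime

# Matches only the SendWork timeout lines shown in the log excerpt.
LOG_LINE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (GSD \d+) SendWork usb write err"
)


def seconds_past_boundary(line, period_minutes=5):
    """Return seconds elapsed since the last period boundary, or None if no match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return (ts.minute % period_minutes) * 60 + ts.second


failures = [
    "[2014-04-05 13:45:03] GSD 6 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT",
    "[2014-04-05 20:35:02] GSD 9 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT",
    "[2014-04-06 02:35:03] GSD 5 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT",
    "[2014-04-06 05:10:03] GSD 10 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT",
    "[2014-04-06 05:55:03] GSD 12 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT",
    "[2014-04-06 06:45:03] GSD 8 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT",
    "[2014-04-06 07:40:04] GSD 13 SendWork usb write err:(-7) LIBUSB_ERROR_TIMEOUT",
]

offsets = [seconds_past_boundary(line) for line in failures]
print(offsets)  # [3, 2, 3, 3, 3, 3, 4]
```

Every failure lands 2 to 4 seconds after a 5-minute mark, which is about what you'd expect if something kicked off by cron on the minute takes a couple of seconds to reach cgminer.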
Scripta has several cron jobs that run on 5-minute boundaries. Most of them call the cgminer API for status, which should be harmless.
But the timing of these events is very suspect. I'm starting to think that there's a cgminer race condition where sometimes an API status call can interfere with the general operation of cgminer.
As a binary chop, I'm going to try disabling the Scripta cron jobs to see if the problem goes away.
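A complementary test would be to go the other way and poll the API far more often than Scripta does, to see whether the timeouts become more frequent. This is only a sketch under a few assumptions: cgminer was started with its API enabled, it listens on the usual port 4028, it accepts JSON requests such as {"command": "devs"}, and (as far as I know) it terminates the reply with a NUL byte. The helper names are mine, not from cgminer or Scripta.

```python
import json
import socket
import time


def encode_request(command):
    """Build the JSON request bytes for a single API command."""
    return json.dumps({"command": command}).encode("ascii")


def decode_reply(raw):
    """Strip the trailing NUL byte (assumed) and parse the JSON reply."""
    return json.loads(raw.rstrip(b"\x00").decode("ascii", errors="replace"))


def query(command, host="127.0.0.1", port=4028, timeout=5.0):
    """Send one command and read until the miner closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_request(command))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return decode_reply(b"".join(chunks))


def hammer(count, delay=0.1, command="devs"):
    """Issue `count` status queries in a row, pausing `delay` seconds between them."""
    for _ in range(count):
        query(command)
        time.sleep(delay)
```

Running something like hammer(1000) on the Pi while watching the cgminer log should show fairly quickly whether status calls alone can trigger the SendWork timeout.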
If anyone else has any suggestions, I'm all ears...