
Topic: HashFast BabyJet users thread - page 38. (Read 69011 times)

member
Activity: 83
Merit: 10
February 04, 2014, 02:33:12 PM
Holy shit, this is probably the most unstable miner I've ever seen.
More like a bad prototype, to be honest.
newbie
Activity: 28
Merit: 0
February 04, 2014, 02:28:25 PM
Phil,

I'm a Sierra customer. I'm having problems with cgminer 3.12 crashing after a 12+ hour run on Ubuntu, and I can't get multiple Sierras to work well on the same host. If I stop cgminer or it crashes, I can't get them all to reattach by themselves, and the whole power-off-and-wait-a-couple-of-minutes routine is not easy in a data center or remote location.

I see a lot of this when it's trying to reacquire the units.

 [2014-02-04 09:11:39] HFA 0: OP_USB_INIT failed! Operation status 9 (Secondary board communication error)
 [2014-02-04 09:11:39] HFA 0 HFClearRead usb read err:(-1) LIBUSB_ERROR_IO
 [2014-02-04 09:11:39] HFA 0 attempted reset got err:(-5) LIBUSB_ERROR_NOT_FOUND
 [2014-02-04 09:11:39] FAIL: USB get_lock not found (2:42)
 [2014-02-04 09:11:39] FAIL: USB remove not already in use (2:42)

Any help? New firmware needed maybe?
We have new firmware coming soon.  But what I recommend is to set up a bash script to automatically restart cgminer if it crashes, and possibly something to monitor hashrate and kill cgminer if it falls off.  When restarting "warm", it will appear to fail, but if you keep restarting cgminer, usually within a few tries it will eventually "take".  This is basically how I've got the RPI set up to work, and it yields a reliable system.

Sort of a "duct tape and baling wire" solution for now, but it'll keep you mining, and hopefully the coming improvements to both our firmware and Con's cgminer will reduce the need for these tricks.

-Phil
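
Editor's sketch of the restart approach Phil describes above: a minimal Python supervisor loop, not HashFast's own script. The function name, the 10-second delay, and the commented-out command line are assumptions for illustration; the `run` and `sleep` parameters exist only so the loop can be exercised without a miner attached.

```python
import subprocess
import time


def run_with_restarts(cmd, max_restarts=None, delay_s=10.0,
                      run=subprocess.call, sleep=time.sleep):
    """Run cmd, and start it again every time it exits.

    max_restarts=None keeps restarting forever; otherwise the command is
    run at most 1 + max_restarts times. Returns the exit codes seen.
    """
    exit_codes = []
    restarts = 0
    while True:
        # Blocks until the miner process exits (crash or clean stop).
        exit_codes.append(run(cmd))
        if max_restarts is not None and restarts >= max_restarts:
            return exit_codes
        restarts += 1
        # Short pause before the next attempt; a "warm" restart may
        # need a few tries before the boards re-attach.
        sleep(delay_s)


# Example call (hypothetical path, loops forever):
# run_with_restarts(["cgminer", "--config", "/path/to/cgminer.conf"])
```

A hashrate monitor that kills cgminer when the rate falls off would sit alongside this loop; it is left out here because the thread does not say how the rate is read.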
newbie
Activity: 28
Merit: 0
February 04, 2014, 02:17:52 PM
Quote
[2014-02-04 09:45:09] HFA 0 NOTICE: ######################### WARNING: Work Watchdog Reboot Imminent!
 [2014-02-04 09:45:11] HFA 0 HFGetHeader usb read err:(-99) LIBUSB_ERROR_OTHER
 [2014-02-04 09:45:11] HFA 0 attempted reset got err:(-5) LIBUSB_ERROR_NOT_FOUND
 [2014-02-04 09:45:11] HFA 0: device disappeared, disabling
 [2014-02-04 09:45:11] HFA 0 failure, disabling!
 [2014-02-04 09:45:11] HFA 0: hfa_send_frame: USB Send error, ret -4 amount 0 vs. tx_length 8
 [2014-02-04 09:45:17] Hotplug: Hashfast added HFA 1

More than 30 times a day!
This is not a problem, the watchdog is designed to reset a board that ends up in a strange state for any reason.  This could be a USB comms glitch, a firmware bug, cgminer, etc.  A few times an hour will not have much effect on your overall hashrate.   Be glad it's there, otherwise your system would probably have stopped working by now.

Once some of the bugs get worked out we should see that happen less and less.  I know it's not ideal, but this way you get to have a system that mines instead of one that doesn't, or worse, one that we didn't ship until we decided we had a 100% glitch-free setup.  (And if you've been in the mining game for any length of time, you know that is never gonna happen!)

-Phil
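
Editor's sketch: to judge whether resets really happen "a few times an hour", one option is to count the watchdog warnings per device and per hour in a saved cgminer log. The regular expression below is written against the log lines quoted in this thread only; other cgminer versions may format the line differently, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# [2014-02-04 09:45:09] HFA 0 NOTICE: #### WARNING: Work Watchdog Reboot Imminent!
WATCHDOG_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2}\] (HFA \d+) "
    r".*Work Watchdog Reboot Imminent"
)


def watchdog_resets_per_hour(log_lines):
    """Count watchdog warnings, keyed by (device, 'YYYY-MM-DD HH:00')."""
    counts = Counter()
    for line in log_lines:
        match = WATCHDOG_RE.search(line)
        if match:
            date, hour, device = match.groups()
            counts[(device, f"{date} {hour}:00")] += 1
    return counts
```

Feeding it the lines of a log file (for example `watchdog_resets_per_hour(open(path))`, with `path` pointing at wherever the cgminer output was saved) gives a per-hour tally that can be compared before and after a clock or version change.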
sr. member
Activity: 446
Merit: 250
February 04, 2014, 02:07:38 PM
newbie
Activity: 28
Merit: 0
February 04, 2014, 01:34:04 PM
Guys, you've gotta understand that these errors, including the ones that cause firmware issues (triggering the work watchdog resets) are always going to happen, but statistically they happen more when you push a chip harder, on average.  You need a longish sample period before you decide that one configuration or the other has better performance.   Personally, I would allow at least an hour to pass, watching the hashrate reported at the pool (on Eligius only watch the 22.5 minute stats or higher, 128 and 256 seconds never seem stable enough).  Write the average hashrate down, then make a change, noting what you did, and watch it for another hour.   An even better way to do this is to write a shell script or a program to run different versions of cgminer, different hash clocks, etc, then you can come back a day later and look at the graphs.

Taking a short sample period to assess your current configuration results in too much "noise" in your stats to be of any value.
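
Editor's sketch of the comparison described above, assuming you have written down a series of pool-reported hashrates (in GH/s) for each configuration. The two-standard-error threshold is my own rule of thumb, not something from Phil's post, and the function names are illustrative.

```python
import statistics


def summarize(samples_ghs):
    """Return (mean, standard error of the mean) for hashrate samples."""
    mean = statistics.mean(samples_ghs)
    sem = statistics.stdev(samples_ghs) / len(samples_ghs) ** 0.5
    return mean, sem


def clearly_better(old_samples, new_samples, z=2.0):
    """True if the new config beats the old one by more than z combined
    standard errors, i.e. by more than the noise in the samples."""
    old_mean, old_sem = summarize(old_samples)
    new_mean, new_sem = summarize(new_samples)
    noise = (old_sem ** 2 + new_sem ** 2) ** 0.5
    return (new_mean - old_mean) > z * noise
```

With only two or three short samples per configuration the standard error stays large and `clearly_better` will usually return False, which is the point being made about short sample periods.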

Back in this post I describe how to change the cgminer version on your RPI's if you want, though I recommend 3.9.0h2, which is what I personally am running, and it works best: https://bitcointalksearch.org/topic/m.4919972

If you are running on Windows or Linux and want to try 3.9.0h2, here's the source code: http://setup.hashfast.com/cgminer-3.9.0h2.tar.gz
(MD5: fdef15ae73b180deef74bc51df482eb0)
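
Editor's sketch: one way to check a downloaded tarball against the MD5 quoted above, using Python's standard `hashlib`. The function names are illustrative; the file path is whatever you saved the download as.

```python
import hashlib

# MD5 quoted in the post for cgminer-3.9.0h2.tar.gz
EXPECTED_MD5 = "fdef15ae73b180deef74bc51df482eb0"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

An MD5 match only guards against a corrupted or truncated download; it is not a strong guarantee of authenticity.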

See the README for build instructions; basically, you'll need to run autogen.sh.

My test RPI's are running stable on eligius with 3.9.0h2.  I've got one setup here that generates a typical amount of errors with this firmware, but has a good average hashrate.  Here's the terminal if you'd like to take a look:
http://setup.hashfast.com/rpi/
And the pool results:
http://eligius.st/~wizkid057/newstats/userstats.php/17ao2fT7gbnnKYpMd7x29E8p4oGdE87Ahd

Again, I do not make policy at HashFast, and I cannot help outside of technical or engineering areas.  I cannot tell you when MPP or anything of that nature is happening, I have no idea.  If you aren't civil with me, I will ignore you.  I'd like to help, so allow me.

-Phil
member
Activity: 84
Merit: 10
February 04, 2014, 11:54:08 AM
For all the people here that are running on something other than the RPI's, Here's the source code for cgminer 3.9.0h2 (same version we're currently using on the RPI).  This version has somewhat correct hashrate reporting, and has a few other things, such as the per-die voltage and temperature readout.  It seems to be the most stable during our testing.  One of our top Engineers is currently working with Con Kolivas right now to get some of these things improved.  Thanks Con!

cgminer 3.9.0h2 source (can be compiled on most Linux distros): http://setup.hashfast.com/cgminer-3.9.0h2.tar.gz

I'd love reports back to see if this improves your mining performance/stability.  Again; please ignore the strings of HW errors, this should not be a problem.

-Phil

To run cgminer 3.9.0h2 on another Linux platform, compile the source code above, provided by Phil from HF.

member
Activity: 97
Merit: 10
February 04, 2014, 11:49:34 AM

I had issues with cgminer 3.11 and 3.12, so I rolled back to the HashFast image with cgminer 3.9.0h2, so far the best for me. No watchdog errors or CRC errors.

Okay I can get 3.9.0, but where are you getting 3.9.0h2 from? If this is from your minepeon raspi then nevermind...
sr. member
Activity: 307
Merit: 250
February 04, 2014, 11:43:39 AM
*Each chip has its own sweet spot. My OC sweet spots were 625, 645, and 655.

You are lucky.  I can't get mine that high.


Same here, anything above 588 will cause the watchdog to trip constantly. I can't even get 3.09 to work for some reason. Doesn't detect the miner.
member
Activity: 84
Merit: 10
February 04, 2014, 11:27:03 AM
*Each chip has its own sweet spot. My OC sweet spots were 625, 645, and 655.

You are lucky.  I can't get mine that high.

On the watchdog business, fubly, if you are using 3.12, try using 3.11 or, as people here say, 3.09.  I got watchdog errors on 3.12 when, at the same clock rate, I've never seen a watchdog problem with 3.11.

I had issues with cgminer 3.11 and 3.12, so I rolled back to the HashFast image with cgminer 3.9.0h2, so far the best for me. No watchdog errors or CRC errors.
member
Activity: 97
Merit: 10
February 04, 2014, 11:21:55 AM

No, the current p2pool code will always yield at best a 20% DOA rate for all asic based miners.


NO!

avalon gen2  5%
knc jupiter 5-8%

Okay, I'll consider myself KICKED!!

Hmmm, just for my information: So in effect p2pool is for asics only then, right? When did the BIG CHANGE happen, i.e. non-asic miners get kicked out? THAT sounds non-trivial...


And also, why didn't baloo_kiev's presumption of a 1-2% DOA for asics happen? Hmmmmmmm.

Awww man, sounds like p2pool has a one-tool-fits-all mentality. I can smell it now.

Sheeze, do the fork already....! ASICs Rule! Optimize p2pool for ASICS!!!!

Sorry, kinda out of the loop on p2pool since July.

Note to Ponder: If p2pool was in the 1-2% DOA for ASIC (hashfast, kncminer etc.) I and I'm sure a whole mass of ASIC miner owners would quit GHASH in an instant and switch over and therefore no need to worry about potential 51% attacks/exploits!

And of course do this over TOR or something custom so that we don't get our p2pool public node DDOS'd!!!!
sr. member
Activity: 392
Merit: 250
February 04, 2014, 11:19:54 AM
*Each chip has its own sweet spot. My OC sweet spots were 625, 645, and 655.

You are lucky.  I can't get mine that high.

On the watchdog business, fubly, if you are using 3.12, try using 3.11 or, as people here say, 3.9.0.  I got watchdog errors on 3.12 when, at the same clock rate, I've never seen a watchdog problem with 3.11.

Edit:  Sorry about 3.9.0 version confusion.
fhh
legendary
Activity: 1206
Merit: 1000
February 04, 2014, 11:11:16 AM

No, the current p2pool code will always yield at best a 20% DOA rate for all asic based miners.


NO!

avalon gen2  5%
knc jupiter 5-8%
member
Activity: 97
Merit: 10
February 04, 2014, 11:06:21 AM
And again, a miner that fails on p2pool: 20% rejected shares.
Has anybody tried cgminer 3.11? Does it give better performance?
I'm not seeing anything like these reject levels. Only when I first start p2pool and first connect the device. After a while diff rises, p2pool stops chewing up CPU and then it all calms down to very normal reject and DOA rates and good efficiency.

after 5 hours I still got >20% rejected shares on p2pool.
running cgminer 3.9 from the minepeon on rpi shipped with the bj

will give 3.12 or newer a shot on a linux host PC

No, the current p2pool code will always yield at best a 20% DOA rate for all asic based miners.

If you do a bitcointalk.org search on asicminer blades (I have seven (7) each of Fried Cat's 13GH/s blades lying around collecting dust --open to all offers) and p2pool efficiency, you will see that this is just how things are sans two separate p2pools (one for asic miners and the other for non-asic miners). Attempting to modify p2pool to accommodate both will actually penalize both.

Forking the p2pool code (one p2pool network for asics, and a separate p2pool for legacy) is pretty trivial and with all new bitcoin miners coming onboard being 100% asic, then it is high time to create this asic-only p2pool mining pool.

And when that happens, the asic miners' p2pool DOA rate would be under 1%.

Say, that would be a good name for it: ASIC-P2Pool.

Below is this PM that I got from baloo_kiev that explains all this much better (the "blades" he refers to are 13 GH/s miners) Dated July 9th, 2013 like 7 months ago:

Hello!

Current estimated DOA for eruptor blades is about 42%. After switch to new share period of 30 s, which will hopefully occur within several days, it will drop to about 18%, which is still unacceptable.

The solution I was talking about can reduce estimated DOA to 1-2%, which is about the same as amount of stales you must get on any 'traditional' pool. The solution is simply forking the p2pool with 5 minute share period. It will be a new p2pool with new share chain, not sharing work with 'old p2pool'.

For instance, this means that you will need to gain some other miners' support before launching it. If you want the new pool to generate at least 1 block per 24 hours at current difficulty, you will need at least 100 blades mining in it. Here's a table showing estimated time to block and time to share for 1 blade at current difficulty at 5-minute share period:

Total hashrate, GH/s (total number of blades)   Time to share (for 1 blade), h   Time to block, h
500 (50)                                        4                                48
1000 (100)                                      8                                24
2000 (200)                                      16                               12
3000 (300)                                      24                               8

Note that the new p2pool will be suitable for any device, not only the blades, but miners with less than 10 GH total hashrate won't probably want to mine there because of high share difficulty. The coding here must be primitive (as I stated in the post, it's all about a couple of lines) and I can easily do it, but it only makes sense if you make sure first that enough hashrate will be put into the pool!
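
Editor's sketch of the arithmetic behind the table, under two stated assumptions: the pool as a whole finds one share per share period (300 s here), and the expected number of hashes per block is difficulty x 2^32. The 10 GH/s per blade figure is read off the table itself (500 GH/s for 50 blades). The network difficulty is left as a parameter because the PM does not quote it; `implied_difficulty` just back-calculates it from one table row. All function names are illustrative.

```python
def time_to_share_hours(total_ghs, miner_ghs, share_period_s=300):
    """Expected hours between shares for one miner.

    The pool finds one share per share period, so a miner holding a
    fraction f of the hashrate waits share_period / f on average.
    """
    return share_period_s * (total_ghs / miner_ghs) / 3600


def time_to_block_hours(total_ghs, difficulty):
    """Expected hours for the whole pool to find one block."""
    hashes_per_block = difficulty * 2 ** 32
    return hashes_per_block / (total_ghs * 1e9) / 3600


def implied_difficulty(total_ghs, hours_to_block):
    """Difficulty implied by a (hashrate, time-to-block) table row."""
    return hours_to_block * 3600 * total_ghs * 1e9 / 2 ** 32
```

For a 10 GH/s blade this gives roughly 4.2, 8.3, 16.7 and 25 hours to share at 500, 1000, 2000 and 3000 GH/s, which the table appears to round to 4, 8, 16 and 24. Time to block scales inversely with total hashrate, matching the 48/24/12/8 column.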


====================

So fellow Space Cadets, get in touch with user baloo_kiev. Note that forking p2pool code i.e. making it mandatory for all nodes in a viable manner, is as you can expect, politically non-trivial.

KICK ME IN THE ASS TIME:  I  haven't kept up with p2pool ever since July 2013, so maybe all this has already been addressed and if so, I apologize!
member
Activity: 84
Merit: 10
February 04, 2014, 11:04:54 AM
Quote
[2014-02-04 09:45:09] HFA 0 NOTICE: ######################### WARNING: Work Watchdog Reboot Imminent!
 [2014-02-04 09:45:11] HFA 0 HFGetHeader usb read err:(-99) LIBUSB_ERROR_OTHER
 [2014-02-04 09:45:11] HFA 0 attempted reset got err:(-5) LIBUSB_ERROR_NOT_FOUND
 [2014-02-04 09:45:11] HFA 0: device disappeared, disabling
 [2014-02-04 09:45:11] HFA 0 failure, disabling!
 [2014-02-04 09:45:11] HFA 0: hfa_send_frame: USB Send error, ret -4 amount 0 vs. tx_length 8
 [2014-02-04 09:45:17] Hotplug: Hashfast added HFA 1

More than 30 times a day!

It seems you may be overclocking too much for the chip. I have tested my BJ and only get that error when I OC to the max allowed by the current setup*. I have mine set to a core clock above 600, but I see that error when I go to a core clock of 660+. So, realistically, do as Phil the HF engineer stated in this thread: clock the core back 5 clicks and see what you get, and repeat until you stop seeing "WARNING: Work Watchdog Reboot Imminent!"

*Each chip has its own sweet spot. My OC sweet spots were 625, 645, and 655.

Cheers
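
Editor's sketch of the "back off until the watchdog stops" procedure from this post. It reads "5 clicks" as 5 MHz steps, which is an assumption, and the `floor_mhz` default is a placeholder rather than a known stock clock. `is_stable` is a hypothetical callback you would supply, for example "ran for an hour with no watchdog warning in the log".

```python
def find_stable_clock(start_mhz, is_stable, step_mhz=5, floor_mhz=550):
    """Step the core clock down from start_mhz until is_stable(clock)
    returns True. Returns that clock, or None if the floor is passed.

    is_stable: caller-supplied function taking a clock in MHz and
    returning True when no watchdog reboots were seen at that clock.
    """
    clock = start_mhz
    while clock >= floor_mhz:
        if is_stable(clock):
            return clock
        clock -= step_mhz
    return None
```

Since each chip has its own sweet spot, the search would be run per device, with a long enough test at each step for the result to mean something.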
sr. member
Activity: 307
Merit: 250
February 04, 2014, 08:04:17 AM
Not sure why so many users are having Windows problems. It took me less than an hour to set mine up, and I just recently put cgwatcher on there as well. For those of you with Windows problems, this is what I did.

Step 1. Download cgminer from the official site. I personally used 3.11, as 3.12 was giving me problems.
Step 2. Connect the BabyJet device and download zadig from the cgminer site. This step was key for me; downloading from the official zadig site didn't work.
Step 3. Install the winusb driver on the m1 module (may take a while).
Step 4. Run cgminer and it should work right away.

+1

This is what I did, works great. I have it OC'd to 581 and it has been running stable for 12+ hours.
However, I only get ~350GH/s reported on Eligius and BTCGuild running stock speeds, and running stock with RPi.
OC gets ~420GH/s.
legendary
Activity: 1630
Merit: 1000
February 04, 2014, 06:46:43 AM
Not sure why so many users are having Windows problems. It took me less than an hour to set mine up, and I just recently put cgwatcher on there as well. For those of you with Windows problems, this is what I did.

Step 1. Download cgminer from the official site. I personally used 3.11, as 3.12 was giving me problems.
Step 2. Connect the BabyJet device and download zadig from the cgminer site. This step was key for me; downloading from the official zadig site didn't work.
Step 3. Install the winusb driver on the m1 module (may take a while).
Step 4. Run cgminer and it should work right away.
sr. member
Activity: 322
Merit: 250
February 04, 2014, 05:53:28 AM
Also, you have to prevent the Eligius pool from giving the BJ a diff of 512, because the hashrate drops dramatically,
followed by many CRC errors and then a return to diff 256.

What kind of BS is this?

I believe this is a bug in the cgminer display. It doesn't just drop dramatically - it exactly halves when the diff doubles.

I've watched it carefully through the stings when it does that - the Hashrate on Eligius itself doesn't seem to be affected, it's just the cgminer display. I'm not seeing CRC errors after switchover though. (3.11 on Windows).
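
Editor's sketch of why a real hashrate should not change when the pool doubles the share difficulty. It uses the standard estimate that a share at difficulty d represents about d x 2^32 hashes; this is not cgminer's display code, and the function name is illustrative.

```python
def estimated_hashrate_ghs(accepted_shares, share_diff, seconds):
    """Estimate hashrate in GH/s from shares found in a time window.

    Each share at difficulty share_diff stands for about
    share_diff * 2**32 hashes of work.
    """
    return accepted_shares * share_diff * 2 ** 32 / seconds / 1e9
```

At diff 512 a miner finds about half as many shares as at diff 256, so 500 shares at diff 512 and 1000 shares at diff 256 over the same hour give the same estimate. A display that exactly halves when the diff doubles would be consistent with counting shares without weighting them by the new difficulty, which fits the display-bug theory above, though the thread does not confirm the cause.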



Hi HF,

When will you ship the Batch 1 MPP boards?

HF-Engineer (Phil), does not have any input or knowledge of the MPP & Upgrade programs. That's an off-topic conversation for this thread anyway (see post 1) - please keep to technical issues.
hero member
Activity: 561
Merit: 521
Trustless IceColdWallet
February 04, 2014, 05:32:26 AM




Eligius is always special. KNC Jupiters also had lower hashrates/higher HW errors at Eligius until wizkid published an eligius-cgminer version.

OK! Then we also need an HF Eligius cgminer!

Come on HF, take more money in your hand and pay some btc to wizkid!!!

Hi HF,

When will you ship the Batch 1 MPP boards?
ImI
legendary
Activity: 1946
Merit: 1019
February 04, 2014, 05:28:06 AM
Code:
Rejected c4b91fb5 Diff 333/256 HFB 1 pool 0 (high-hash)

HIGH HASH on eligius.st

On ghash.io only low-hash will be rejected!!!

Hi HF,

Can you explain, please? Why is the handling on different pools not equal?

thx

fubly

Eligius is always special. KNC Jupiters also had lower hashrates/higher HW errors at Eligius until wizkid published an eligius-cgminer version.
hero member
Activity: 561
Merit: 521
Trustless IceColdWallet
February 04, 2014, 04:46:53 AM
Quote
[2014-02-04 09:45:09] HFA 0 NOTICE: ######################### WARNING: Work Watchdog Reboot Imminent!
 [2014-02-04 09:45:11] HFA 0 HFGetHeader usb read err:(-99) LIBUSB_ERROR_OTHER
 [2014-02-04 09:45:11] HFA 0 attempted reset got err:(-5) LIBUSB_ERROR_NOT_FOUND
 [2014-02-04 09:45:11] HFA 0: device disappeared, disabling
 [2014-02-04 09:45:11] HFA 0 failure, disabling!
 [2014-02-04 09:45:11] HFA 0: hfa_send_frame: USB Send error, ret -4 amount 0 vs. tx_length 8
 [2014-02-04 09:45:17] Hotplug: Hashfast added HFA 1

More than 30 times a day!