Topic: HashFast BabyJet users thread - page 33. (Read 68982 times)

newbie
Activity: 28
Merit: 0
February 06, 2014, 04:49:50 PM
Several things:

1. I see this a lot: "it keeps rebooting".  If the RPI reboots, it will show a "rainbow" image on the monitor (if you have one connected), and if you check the status page on your web interface, you'll see the "System Uptime" value reflect this reboot.  You should not be seeing the RPI reboot often, if at all.  If you are, it indicates a problem, possibly with your SD card, its filesystem, or the power supply for the RPI.  For instance, some USB peripherals are known to overload the RPI.  We recommend keeping only the BJ plugged in when mining.  Some people have installed WiFi adapters, and some of these can draw more peak power than the RPI can output, resulting in a reboot.  Now, cgminer restarting is somewhat normal.  We've set it up so that in the event of any problems it restarts cgminer.  Until all the bugs get worked out, this is so you at least keep mining.  You'll see this reflected in the "miner uptime" on the status screen.  Mine usually lasts a day or 2.  Sometimes when cgminer restarts it can take up to a minute to get going again.  This seems to be due to a USB locking problem, which we are aware of and are trying to fix.  If you unplug your USB cable from the RPI and plug it back in, this delay is usually reduced, but either way, it will start mining again if left to its own devices.  If you believe it really is your RPI rebooting, then maybe it's best to back up your settings and reload the SD card.  It's also a good idea to get a spare SD card so you have one on hand; this way you can test a "virgin" image if problems develop.  Please be clear when you post here in the future as to the difference between RPI system reboots and cgminer restarts.  It will help me help you if I understand exactly what's happening on your system.

2. Errors and "work watchdog resets" are not an indication your system is broken.  Most of these are due to bugs which are being worked on and will be fixed in upcoming software/firmware releases.  Please be patient!   As long as your average hashrate as reported by the pool is around 400Gh/s, you are ok.  The point of the watchdog is to ensure, regardless of glitches, you keep mining.  If your average pool reported hashrate is below 400Gh/s, then there may actually be a problem.  Post it!  Be clear about the symptoms, and post a log in verbose mode if possible.  All the engineers here want to see them!

3. If your die temps vary from each other by more than about 4 degrees C, there is likely something up with your cooler.  Note that you must be running the custom hf version of cgminer to see all 4 die temps at once.  The standard version only shows the max per board in the status.  The temps seem to prefer being in the 70-80C range.  Don't think cooler is better, because it usually isn't!  (Though this varies somewhat from die to die.)  Get a tube of high-quality (grey) thermal compound, remove your cooler head, re-apply the compound (read online how to do this, but a little goes a long way!) and re-install the cooler head.  There should be washers under each screw, and the screws should be tightened until they stop.  Do not force them beyond snug!

4. Unfortunately we cannot support you if you are running Windows or some other miner program.  We only officially support you mining with the RPI, but if you are running our version of cgminer (right now it's 3.9.0h2) on a Linux box, I'll still try to help you.  If you are running anything else, don't bother asking, as there are plenty of known issues that we are aware of that can cause problems.  If you want to make the most BTC and not have your miner dead when you check it, you should be running what we recommend.  If you are looking at this post and it's more than a week old, this information may be out of date.  Read newer posts!

5. Tech Support.  I plan to remain active here as much as I can.  I'll help if I can.  But we are also improving tech support.  We have hired new people, developed new procedures, and engineering is monitoring TS and providing guidance.  If you didn't get good help, try again!  Please be patient and CIVIL.  Everyone wants to help; make it easier for them, not harder.

6. Warranty; I am not authorized to give you any information on what voids it or how long it lasts, etc.  But if you have a broken system, most likely we can get you going again.  Don't panic, be patient!
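
For anyone who wants to automate the check in point 3, here is a rough sketch in Python. It is not part of cgminer or the HashFast image; the function names and the sample readings are made up for illustration, and the 4 degree threshold is just the rule of thumb from point 3.

```python
# Illustrative helper, not part of cgminer: flag a board whose die
# temperatures differ by more than a threshold.

def die_temp_spread(temps_c):
    """Return the gap between the hottest and coolest die, in degrees C."""
    return max(temps_c) - min(temps_c)


def cooler_suspect(temps_c, max_spread_c=4):
    """True if the dies differ by more than max_spread_c (rule of thumb: ~4 C)."""
    return die_temp_spread(temps_c) > max_spread_c


# Hypothetical readings for two boards (four dies each), typed in by hand.
even_board = [75, 76, 74, 77]
uneven_board = [78, 80, 66, 65]

print("even board suspect:", cooler_suspect(even_board))
print("uneven board suspect:", cooler_suspect(uneven_board))
```

You would have to copy the four die temps from the hf version of cgminer yourself; the sketch does not try to parse the status line.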

-Phil
 
newbie
Activity: 14
Merit: 0
February 06, 2014, 04:31:41 PM
No, that is a 2.54mm pitch connector, whereas the chaining connector uses a 2mm pitch.  I think we use a Samtec 2x8 IDC connector in the cables.

It's the same pitch as the old 2.5" laptop IDE drive cables were.  So if you had one of those lying around, you might be able to cut it down.  (They were 44 pin IIRC)

-Phil

How about these: http://www.ebay.com/itm/10pcs-Flat-cable-connector-2mm-0-079-IDC-SOCKET-16Pin-2x8Pin-/291060588602?pt=LH_DefaultDomain_0&hash=item43c490203a

Then we just need the ribbon cable. A bit of D.I.Y.
newbie
Activity: 4
Merit: 0
February 06, 2014, 04:20:31 PM
sr. member
Activity: 307
Merit: 250
February 06, 2014, 04:08:50 PM
wait - seems to be working now. am thinking it might have been too warm in the room. doh! :)

I keep mine outside in the garage! 58C on all cores!


What Clock speed are you getting now ?

I run it at 581, as it is the most stable ATM. Once I go above that, the watchdog resets the unit so often that the hashrate drops below 400GH/s :(

Clocked at 615 it only reaches 63C, but watchdog resets it ~3x/min!!
member
Activity: 84
Merit: 10
February 06, 2014, 01:08:02 PM
wait - seems to be working now. am thinking it might have been too warm in the room. doh! :)

I keep mine outside in the garage! 58C on all cores!


What Clock speed are you getting now ?
sr. member
Activity: 307
Merit: 250
February 06, 2014, 11:30:15 AM
wait - seems to be working now. am thinking it might have been too warm in the room. doh! :)

I keep mine outside in the garage! 58C on all cores!
newbie
Activity: 7
Merit: 0
February 06, 2014, 10:53:19 AM
wait - seems to be working now. am thinking it might have been too warm in the room. doh! :)
newbie
Activity: 7
Merit: 0
February 06, 2014, 09:52:42 AM
thanks - hmm.. got a bit of a problem then: BJ is throwing up "HFB 0: Unhandled operation code 12" in cgminer-3.12 every 20 seconds or so and rebooting.
also not working on the Pi

anyone seen anything like that before?
thanks
warren
legendary
Activity: 1112
Merit: 1000
February 06, 2014, 09:29:29 AM
anyone in the UK know if a standard 10A power lead will work with the BJ instead of the 13A one provided by HF?

Yes, because you are only loading the power supply up to 450 Watts, which at UK mains voltage (nominally 230 V) is less than 2 Amps.
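
The arithmetic behind that answer can be sketched as below. This is only a back-of-the-envelope check in Python: the 230 V figure is the nominal UK mains voltage (an assumption added here, not stated in the post), the function name is made up, and PSU efficiency and power factor are ignored.

```python
# Rough current estimate: I = P / V.
# Assumes nominal UK mains of 230 V and ignores PSU efficiency and power factor.

def mains_current_amps(load_watts, mains_volts=230.0):
    """Approximate current drawn from the wall for a given load."""
    return load_watts / mains_volts


current = mains_current_amps(450)
print(f"{current:.2f} A")  # comfortably below a 10 A lead rating
```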
newbie
Activity: 7
Merit: 0
February 06, 2014, 08:36:34 AM
anyone in the UK know if a standard 10A power lead will work with the BJ instead of the 13A one provided by HF?
BJ is rebooting itself every few minutes with this error: "HFA 0 NOTICE: ######################### WARNING: Work Watchdog Reboot Imminent!",
then the Raspberry Pi reboots (sometimes)

edit: also getting this error a lot "HFB 0: Unhandled operation code 12"

edit: MinePeon Version:  0.2.4.3hf8
HashFast Firmware : 735ccca5
hero member
Activity: 991
Merit: 500
February 06, 2014, 05:12:02 AM
Thanks for your ongoing input. If anyone finds these online from the US, let me know. They seem hard to come by.
newbie
Activity: 28
Merit: 0
February 06, 2014, 04:30:36 AM
No, that is a 2.54mm pitch connector, whereas the chaining connector uses a 2mm pitch.  I think we use a Samtec 2x8 IDC connector in the cables.

It's the same pitch as the old 2.5" laptop IDE drive cables were.  So if you had one of those lying around, you might be able to cut it down.  (They were 44 pin IIRC)

-Phil
newbie
Activity: 28
Merit: 0
February 06, 2014, 04:21:31 AM
So, when daisy-chaining boards together, the 5 pin power connector needs to be attached to the LAST board in the chain?
If multiple boards are being supplied with enough power, how many could be PRACTICALLY chained together?
We have not tested the firmware with chains longer than 3 boards.  This is the configuration present in the Sierra product.

Practically, the limit is getting all the boards powered.  In the Sierra, there are 2 PSUs; the left side of each board is connected to one PSU, and the right sides are connected to the other.  It's important that the grounding be the same on the left side to ensure reliability.  This will be what ultimately limits how many you can practically chain.

-Phil
newbie
Activity: 28
Merit: 0
February 06, 2014, 04:16:29 AM
To overclock (may void your warranty!), add the following parameter to the cgminer command line:
Code:
--hfa-hash-clock xxx
Where xxx is the speed in MHz you'd like to run.  Valid values are 125 and up; the default is 550.

If you are running a RPI with our image, you can add this line to the "Extra cgminer parameters" section in the settings page of the web interface.

In a perfect world the hashrate is MHz × 768 (the total number of hashing units built into the ASIC).  Each of the 4 dies has 96 cores, and each core has two hashing units, for a total of 768.  Note: It's normal to have a few defective cores per die, and it's also considered normal for some cores to produce occasional errors.  This means the maximum possible hash rate is 422.4 Gh/s when running at 550 MHz, but in the real world it will typically be lower for the reasons I just mentioned.
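
The formula above is easy to tabulate for other clock settings. The following Python sketch just restates it; the constant and function names are made up for illustration, and the result is the ideal ceiling only (no defective cores, no hardware errors, no watchdog resets).

```python
# Ideal hashrate per the formula above: clock (MHz) x number of hashing units.
DIES_PER_ASIC = 4
CORES_PER_DIE = 96
UNITS_PER_CORE = 2
HASH_UNITS = DIES_PER_ASIC * CORES_PER_DIE * UNITS_PER_CORE  # 768


def ideal_hashrate_ghs(clock_mhz):
    """Theoretical ceiling in Gh/s for a given hash clock in MHz."""
    return clock_mhz * HASH_UNITS / 1000.0


# 550 is the default clock; 581 and 600 are values mentioned in this thread.
for mhz in (550, 581, 600):
    print(f"{mhz} MHz -> {ideal_hashrate_ghs(mhz):.1f} Gh/s")
```

Treat the numbers as an upper bound to compare against what the pool reports, not as a prediction.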

Some boards will like certain clock rates and some will like others.  If you do experiment with values, and I'm not recommending you do, try small increments and watch the error rates and/or if the ASIC stalls.  When it attempts to draw too much power, the power supply can momentarily dip and cause it to stop sending work.  This is when you'll see the watchdog timer get invoked.

On an average board, the buck converters (take 12V down to core voltage) are usually the limiting factor for ASIC performance.  They put out over 400 amps of low voltage power for the ASIC's cores to operate with.  Right now in the present version of the firmware and cgminer, the regulators cannot be adjusted, but soon we will release firmware that supports adjustment of these regulators as well as independent clock speed adjustment per die.

We are also presently performing lab qualification of the silicon, so I'll soon have some numbers for you guys as to what the silicon is capable of.

The on-chip programmable PLL is capable of pushing the clock rate to well over 1 GHz.  (You would need about 2X the power and cooling to hit this rate though!)

-Phil
newbie
Activity: 14
Merit: 0
February 06, 2014, 01:16:00 AM

Once you have that, simply mount the second board, connect the chaining cable from the top to the bottom board (using the 2 closest 16 pin connectors), MOVE the 5-pin cable to the second board, and then connect the spare set of PCI-E cables to the new lower board.

-Phil

So, when daisy-chaining boards together, the 5 pin power connector needs to be attached to the LAST board in the chain?
If multiple boards are being supplied with enough power, how many could be PRACTICALLY chained together?

newbie
Activity: 4
Merit: 0
February 06, 2014, 12:49:17 AM
I was getting great performance when my 3 BJs arrived, at least on two of them; one seemed a bit flaky and seems to be getting worse. Now all 3 seem to be getting worse. One of them keeps shutting down and restarting, grabbing a new ID each time. I posted a small capture from my monitor so that maybe others who have had this and fixed it can help. The flaky one was delivered in poor condition: all but one screw were sliding around inside the box, and 4 risers that prop the board up were also floating around. I thought the board was damaged, and it may be. Worst assembly I have ever seen, and I used to build PCs. That point is moot if they all work, but they don't.

Here is what I need help answering.
1. How do I get back to my 400+ GH performance? All three of these were pushing +-420GH.
2. Worst-case scenario: how do I raise the clock speed to get up to the 400GH range? I was promised by Erin at Hashfast that it would not void the warranty, but the delivery paperwork stated otherwise. I will have to deal with that warranty issue separately, I know.
3. I seem to be getting huge error rates. I think it is on the flaky board, but I'm not 100% sure that is the only place. 70% to 80% errors are common. Eligius reports I am only getting 900 to 950 GH/s AVG with 3 boards, and I should be getting closer to 1200. Not good for profits. Any help would be appreciated. I hate to try and send a board back under warranty since I might be stuck for months. I am still waiting on my 3 upgrades; I would hate to be down to 2 processors if I can avoid it.
Sorry about the loose parts.  The problem wasn't bad assembly, it was the parts coming loose during shipping.  They are using threadlock now to stop that problem.   It's unlikely your boards are damaged if they are hashing on all 4 die, which sounds like they are.

What are your die temps and voltages?  (if you run the "hf" version of cgminer it reports all dies/voltages)

Since your unit obviously had quite a ride during shipping, I'd look at each die temp and see if there is a discrepancy.  You might need to take the cooler head off, re-grease the thermal interface, and re-tighten the cooler.  Apparently some of those earlier orders that had loose standoffs have a high chance of having a loose cooler.

Also, what hash clock speed are you using?   Have you tried other speeds?

-Phil



Thanks for your time Phil, I know this is on your own and a favor to the newly created Hashfast community.

By the way, I would love to try other speeds, but I am not sure how to adjust them, and it all seems pretty hush-hush. Before I ordered I was told I would be able to overclock without voiding my warranty, but now I am worried about doing it based on the paperwork I received. I may try it on the machine that is struggling anyway if someone will tell me how. I run the machine headless, so there seems to be little access. If you can email me instructions to overclock, that would be great. kenpofighta at gmail dot com.

Details from the struggling miner: (You can see it stop an id and start new ones throughout the day, error rates stay around 80%. Seems 2 cores are much cooler than the other 2)
 cgminer version 3.9.0h2 - Started: [2014-02-05 16:45:10]
                                              
 (5s):322.7G (avg):367.5Gh/s | A:1038456  R:7040  HW:13622629  WU:5999.8/m                                                      
 ST: 2  SS: 0  NB: 38  LW: 1772615  GF: 0  RF: 0                                                                                
 Connected to stratum.mining.eligius.st diff 128 with stratum as user 1Bk9VSwaXn9UhpdowTwHEPEWw95i5F7N2_Miner2                  
 Block: c019bc52...  Diff:2.62G  Started: [22:23:14]  Best share: 699K                                                          
                                              
 [P]ool management Settings [D]isplay options [Q]uit                                                                          
 HFA 2: 61C/.23V796C/.23V1429C/.26V661C/.23V OFF   /40.21Gh/s | A: 96447 R:1472 HW: 1362431 WU:  650.5/m                        
 HFA 4:  75C/.80V 77C/.80V 62C/.82V 61C/.82V 368.7G/369.1Gh/s | A:810420 R:4800 HW:10748045 WU: 6021.6/m    

Here are stats from Eligius (about 4 hours ago I broke them down by naming the miners so that I could better analyse the data; Miner 2 is the struggling miner)
Miner2   3 Hours        214.79 Gh/s   540096
         22.5 Minutes   196.28 Gh/s   61696
Miner1   3 Hours        400.76 Gh/s   1007744
         22.5 Minutes   416.19 Gh/s   130816
Miner3   3 Hours        387.32 Gh/s   973952
         22.5 Minutes   374.65 Gh/s   117760

I have the stats for the other two below for comparison. These units all sit side by side in the same room and run about the same regardless of how cool I keep the room.

I appreciate that the loose parts may not have been purely an assembly issue but I have never seen a shipped piece of hardware in this condition after being shipped. I hope the threadlock will stop the issue going forward. I am hoping what you say is correct and that the issue is due to loose parts and nothing more.

I am a software architect by profession, and eventually I will port some of these features to Windows. I know there are many haters, but it is a very stable platform. I am running Windows Server 2012 and can spin up multiple VMs to handle mining as I grow my operation. I can post on how that goes as I get to it.


 cgminer version 3.9.0h2 - Started: [2014-02-05 16:45:32]
                                      
 (5s):440.3G (avg):421.4Gh/s | A:1900827  R:17408  HW:7210  WU:11114.1/m                                                        
 ST: 2  SS: 0  NB: 38  LW: 2047136  GF: 0  RF: 0                                                                                
 Connected to stratum.mining.eligius.st diff 256 with stratum as user 1Bk9VSwaXn9UhpdowTwHEPEWw95i5F7N2_Miner1                  
 Block: 12f30e18...  Diff:2.62G  Started: [22:31:53]  Best share: 551K                                                          
                                            
 [P]ool management Settings [D]isplay options [Q]uit                                                                          
 HFA 0: 77C/.80V 82C/.80V 73C/.79V 72C/.79V 425.9G/421.4Gh/s | A:1899803 R:17408 HW:7215 WU:11117.8/m                          
                                              

 cgminer version 3.9.0h2 - Started: [2014-02-05 16:46:44]
                                          
 (5s):408.1G (avg):415.5Gh/s | A:1883755  R:16256  HW:6779  WU:10806.0/m                                                        
 ST: 2  SS: 0  NB: 38  LW: 2015649  GF: 0  RF: 0                                                                                
 Connected to stratum.mining.eligius.st diff 256 with stratum as user 1Bk9VSwaXn9UhpdowTwHEPEWw95i5F7N2_Miner3                  
 Block: 12f30e18...  Diff:2.62G  Started: [22:31:54]  Best share: 4.79M                                                        
                                            
 [P]ool management Settings [D]isplay options [Q]uit                                                                          
 HFA 0: 81C/.79V 76C/.79V 73C/.79V 73C/.79V 410.5G/415.6Gh/s | A:1884011 R:16256 HW:6780 WU:10808.2/m                          
--------------------------------------------------------------------------------    
newbie
Activity: 28
Merit: 0
February 06, 2014, 12:27:59 AM
Hi Phil,
I would like to OC to 600 MHz, but does that void the warranty? Anyway, since the warranty is only 10 days (which does not make sense for such an expensive device), what do you think the chances are that I will need it, given that it has been running smoothly for about a day already?

Or should I make a warranty claim on the grounds that the device does not perform as described and let HashFast "FIX" it for me? If the official fix is to OC it, then I don't think that would void the ongoing warranty, right (even if it's not long)?  :D
I'm not allowed to comment on what constitutes overclocking and/or what particular events will void the warranty.  As far as I am aware (my personal opinion alone, not legal advice and not the position of HashFast), "Overclocking COULD void your warranty".

I can tell you from personal experience that I have seen zero boards damaged by anything you might be able to do externally via software.  If you don't physically mess with the board, I don't see how you can harm it.

Sorry, I wish I could give you better answers, but the lawyers will probably then tell me I have to stay out of the forums.

-Phil

newbie
Activity: 19
Merit: 0
February 06, 2014, 12:04:39 AM
Also Phil, I have two more questions:
1. I can see some errors in the cgminer output. How do I fix them?
[image snipped]
As you can see, the hardware error count keeps going up. Is this normal?
Also, the first line of the scrollable zone shows an error message:
HFA 0 attempted reset got err:(-5) LIBUSB_ERROR_NOT_FOUND
How do I fix it?

2. Second, with the default configuration, cgminer's interface shows the hash rate always around 400GH/s, but the pool statistics average is only between 320 and 380. How do you explain that? Can I say my BJ does not perform as described in the specification?
Thank you!
[image snipped]

In both these cases, we hope the new firmware will help correct these issues.  It's absolutely normal to have some hardware errors.  We have fixed the source of some of these in the upcoming firmware.  Con Kolivas is also working with us to make many improvements in cgminer, so we will release both together as soon as we can.

You should be able to get about 400Gh/s at the pool (average), though you might have to clock to 600 MHz to get it at present.

My test system at http://setup.hashfast.com/rpi/ is running at 600, it's got a bunch of errors, but as you can see from the pool stats on Eligius it's cruising along at about 428Gh/s for the 12 hour average:
http://eligius.st/~wizkid057/newstats/userstats.php/17ao2fT7gbnnKYpMd7x29E8p4oGdE87Ahd

The longer timeframe stats on Eligius tend to be more accurate and useful.  The stuff under 3 hours is hit-or-miss.

-Phil

Hi Phil,
I would like to OC to 600 MHz, but does that void the warranty? Anyway, since the warranty is only 10 days (which does not make sense for such an expensive device), what do you think the chances are that I will need it, given that it has been running smoothly for about a day already?

Or should I make a warranty claim on the grounds that the device does not perform as described and let HashFast "FIX" it for me? If the official fix is to OC it, then I don't think that would void the ongoing warranty, right (even if it's not long)?  :D

newbie
Activity: 28
Merit: 0
February 05, 2014, 11:58:36 PM
Hi Phil,
Yes, I compiled cgminer with --enable-hashfast on my Xubuntu box. It can detect the HF device but cannot enable it, which is why I think it's a permission issue.
You'll need to follow the instructions for enabling the udev rules if you want to run as a non-root user:

Create a file called "01-hashfast.rules" in the /etc/udev/rules.d/ directory containing the following lines:
Code:
ATTRS{idVendor}=="297c", ATTRS{idProduct}=="0001", MODE="0660", GROUP="plugdev"
ATTRS{idVendor}=="297c", ATTRS{idProduct}=="8001", MODE="0660", GROUP="plugdev"

You will have to be root or sudo'd to create the file.

-Phil