Topic: The Habanero - 650GH/s - OOS - page 5. (Read 95985 times)

SVK
sr. member
Activity: 378
Merit: 250
October 13, 2014, 01:35:22 AM
Oh any suggestions for bringing die temps down? I just got a corsair H110 for my hab and am using AS 5 TIM

Dies 1 and 3 are running 15-20C hotter than 0&2.

Tried readjusting torque. Re applying TIM. BUT still 15-20 difference.

And this is limiting me to 675MHz@885mV using HFTool before I run into a die hitting 100C

I have a similar problem with ASIC 1. It runs far too hot, to the point that I have turned that board off completely.

No point in running it.

legendary
Activity: 1274
Merit: 1004
October 11, 2014, 09:31:23 PM
Oh any suggestions for bringing die temps down? I just got a corsair H110 for my hab and am using AS 5 TIM

Dies 1 and 3 are running 15-20C hotter than 0&2.

Tried readjusting torque. Re applying TIM. BUT still 15-20 difference.

And this is limiting me to 675MHz@885mV using HFTool before I run into a die hitting 100C
Check the flatness of the waterblock. I've noticed a few of the Asetek waterblocks I've seen have been very convex.
hero member
Activity: 658
Merit: 500
CCNA: There i fixed the internet.
October 11, 2014, 07:52:18 PM
Oh any suggestions for bringing die temps down? I just got a corsair H110 for my hab and am using AS 5 TIM

Dies 1 and 3 are running 15-20C hotter than 0&2.

Tried readjusting torque. Re applying TIM. BUT still 15-20 difference.

And this is limiting me to 675MHz@885mV using HFTool before I run into a die hitting 100C
legendary
Activity: 1274
Merit: 1004
October 11, 2014, 06:34:43 PM
Uh oh.... Dabs may have cashed in her chips.

I shut it off to put a kill-a-watt on it, and when I switched back on - nil.  Lights blink for just a sec and then the PSU shuts down.

PSU roulette with another habanero eliminates the PSU as a problem.

Can someone point me in the right direction to troubleshoot it?

Did you take the Kill-a-Watt out of the loop whilst troubleshooting?

Yes - my "production" mining rack is 240v.  Test bed is 120v(my K-A-W is 120v). Tested both PSUs on 120 and 240 and with two different boards.  The problem is iso'd to this board.

@Taugeran - Thanks for the suggestion - die 1 input is the culprit.  Where do I go from here?
You can disable that die by using the hftool.

$ ./hftool.py -w 0:VLT@FRQ,1:0@0,2:VLT@FRQ,3:VLT@FRQ

The board itself is causing the PSU to turn off, it's probably a hardware fault. PM me and I can arrange an RMA.
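
For anyone scripting this, the `-w` argument above can be assembled per die. This is only a minimal sketch, assuming the `die:VLT@FRQ` layout shown in the command and that `0@0` is what disables a die. `build_write_arg` is a hypothetical helper, not part of hftool, and the 885/675 figures are just the numbers mentioned elsewhere in this thread.

```python
from typing import Dict, Optional, Tuple


def build_write_arg(dies: Dict[int, Optional[Tuple[int, int]]]) -> str:
    """Build the comma-separated value for hftool's -w option.

    Each die maps to (voltage, frequency), or to None to disable it.
    The "die:VLT@FRQ" layout and "0@0" for a disabled die are taken
    from the command quoted in the post; everything else is an assumption.
    """
    parts = []
    for die in sorted(dies):
        setting = dies[die]
        if setting is None:
            # Disabled die: zero voltage and zero frequency.
            parts.append(f"{die}:0@0")
        else:
            voltage, freq = setting
            parts.append(f"{die}:{voltage}@{freq}")
    return ",".join(parts)


# Disable die 1 and keep the other three at the same (example) setting.
arg = build_write_arg({0: (885, 675), 1: None, 2: (885, 675), 3: (885, 675)})
print("./hftool.py -w " + arg)
```

Check the units and ranges hftool expects on your board before writing anything; the helper only formats the string.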
hero member
Activity: 539
Merit: 500
October 11, 2014, 05:55:26 PM
Uh oh.... Dabs may have cashed in her chips.

I shut it off to put a kill-a-watt on it, and when I switched back on - nil.  Lights blink for just a sec and then the PSU shuts down.

PSU roulette with another habanero eliminates the PSU as a problem.

Can someone point me in the right direction to troubleshoot it?

Did you take the Kill-a-Watt out of the loop whilst troubleshooting?

Yes - my "production" mining rack is 240v.  Test bed is 120v(my K-A-W is 120v). Tested both PSUs on 120 and 240 and with two different boards.  The problem is iso'd to this board.

@Taugeran - Thanks for the suggestion - die 1 input is the culprit.  Where do I go from here?
legendary
Activity: 1358
Merit: 1001
https://gliph.me/hUF
October 11, 2014, 11:20:24 AM
For the board with the regulator programming error, can you verify that 12V is good to each of the connectors?

I'll be able to check that on Monday.


For the other error, I've only seen that once in relation to multipool. Does that happen immediately after a block is detected and you get a stratum restart? I've talked with Con a little about this, and his opinion is also that it is a problem with the GWQ. The longterm solution is probably to completely replace the HF driver and allow cgminer to schedule work, but I honestly don't think that will ever happen. Try removing backup pools temporarily to see if the issue goes away. I don't have 4.6.x on any machines, mostly a mix of 4.5 and 4.3.

I have removed the backup pools and still get that error.

For finding the block detections and stratum restarts I guess I have to pipe the cgminer output to a log file? Or is there another way other than sitting in front of it and waiting for a block to happen?
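
If the output does end up in a log file, the "Bad work sequence" events can be pulled out afterwards instead of watching the screen. A rough sketch, assuming the log lines look like the ones pasted in this thread (with the terminal line wrap undone); `find_bad_sequences` is a made-up name, not something cgminer provides.

```python
import re

# Pattern for lines such as:
# [2014-10-09 16:56:52] HFB hab5: Bad work sequence tail 1633 head 322 devhead 322 devtail 1730 sequence 2048
BAD_SEQ = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"HFB (?P<dev>\S+): Bad work sequence "
    r"tail (?P<tail>\d+) head (?P<head>\d+) "
    r"devhead (?P<devhead>\d+) devtail (?P<devtail>\d+) "
    r"sequence (?P<sequence>\d+)"
)

NUMERIC_FIELDS = ("tail", "head", "devhead", "devtail", "sequence")


def find_bad_sequences(lines):
    """Return one dict per 'Bad work sequence' line found in `lines`."""
    events = []
    for line in lines:
        match = BAD_SEQ.search(line)
        if match is None:
            continue
        event = match.groupdict()
        for key in NUMERIC_FIELDS:
            event[key] = int(event[key])
        events.append(event)
    return events


sample = [
    "[2014-10-09 16:56:52] HFB hab5: Bad work sequence tail 1633 head 322 "
    "devhead 322 devtail 1730 sequence 2048",
    "[2014-10-09 16:56:52] HFB 1 failure, disabling!",
]
for event in find_bad_sequences(sample):
    print(event["ts"], event["dev"], event["head"], event["devhead"])
```

The timestamps can then be compared by hand with the pool's block times or with the syslog entries to see whether the errors line up with stratum restarts.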


JakeTri cgminer 4.4.0
Code:
[...] --hfa-options "Chip:950@980,Dabs:950@0@980@980@970:0:0:0:-25"

In the meantime I have read through the HF-Tool thread as well. But no matter how I send the info to my boards, it only picks up the first two values and assigns the second value to the other 3 boards as well. To illustrate:

Code:
--hfa-options "hab2:800@890,hab3:875@935,hab4:850@920,hab5:875@935,hab6:850@920"

This results in hab2 at 800 and hab3, hab4, hab5 and hab6 at 875. I'm using 4.4.1

I am going out on a limb and guessing that nobody ever tried that with more than 2 devices?
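
To spell out what the option string above is meant to express, here is a small sketch that splits the simple `name:number@number` form into per-device values, so the intended settings can be compared with what cgminer actually reports. It assumes the first number is the clock (as the observed 800/875 results suggest) and that the second is the voltage; it does not handle the longer `Dabs:...` form, and `parse_simple_hfa_options` is a hypothetical name, not a cgminer function.

```python
def parse_simple_hfa_options(value):
    """Split 'name:clock@second,...' into {name: (clock, second)}.

    Only the short form used in the example above is handled. Treating
    the second number as a voltage is an assumption.
    """
    settings = {}
    for entry in value.split(","):
        name, _, rest = entry.partition(":")
        clock, _, second = rest.partition("@")
        settings[name] = (int(clock), int(second))
    return settings


intended = parse_simple_hfa_options(
    "hab2:800@890,hab3:875@935,hab4:850@920,hab5:875@935,hab6:850@920"
)
for name, (clock, second) in intended.items():
    print(name, clock, second)
```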



Uh oh.... Dabs may have cashed in her chips.

I shut it off to put a kill-a-watt on it, and when I switched back on - nil.  Lights blink for just a sec and then the PSU shuts down.

PSU roulette with another habanero eliminates the PSU as a problem.

Can someone point me in the right direction to troubleshoot it?

Did you take the Kill-a-Watt out of the loop whilst troubleshooting?
hero member
Activity: 658
Merit: 500
CCNA: There i fixed the internet.
October 11, 2014, 02:16:15 AM
Uh oh.... Dabs may have cashed in her chips.

I shut it off to put a kill-a-watt on it, and when I switched back on - nil.  Lights blink for just a sec and then the PSU shuts down.

PSU roulette with another habanero eliminates the PSU as a problem.

Can someone point me in the right direction to troubleshoot it?

Maybe try each individual atx connector by itself? See if one in particular shorts. Cuz if the PSU shuts off that sounds like a short
hero member
Activity: 539
Merit: 500
October 10, 2014, 05:30:49 PM
Uh oh.... Dabs may have cashed in her chips.

I shut it off to put a kill-a-watt on it, and when I switched back on - nil.  Lights blink for just a sec and then the PSU shuts down.

PSU roulette with another habanero eliminates the PSU as a problem.

Can someone point me in the right direction to troubleshoot it?
hero member
Activity: 658
Merit: 500
CCNA: There i fixed the internet.
October 10, 2014, 01:44:52 AM
I know it will sound odd but I've enjoyed a certain amount of success using bfgminer
[...]

Thank you for the input. Which version are you using?

On 4.9.0 I get a ton of:

Code:
[2014-10-10 07:32:08] hashfast fd=40: SEND (aa0b035203000050) => -1 errno=5(Input/output error)
[2014-10-10 07:32:08] hashfast fd=40: SEND (aa0b035303000046) => -1 errno=5(Input/output error)
[2014-10-10 07:32:08] hashfast fd=40: SEND (aa0b035403000024) => -1 errno=5(Input/output error)
[2014-10-10 07:32:08] hashfast fd=40: SEND (aa0b035503000032) => -1 errno=5(Input/output error)

and others which go past too quick for a copy and paste.

After a few minutes it quits with:
Code:
Segmentation fault (core dumped)


4.2.0
legendary
Activity: 1358
Merit: 1001
https://gliph.me/hUF
October 10, 2014, 01:43:06 AM

MrTeal and xjack, also thanks for the input. I will try those suggestions.
legendary
Activity: 1358
Merit: 1001
https://gliph.me/hUF
October 10, 2014, 01:41:54 AM
I know it will sound odd but I've enjoyed a certain amount of success using bfgminer
[...]

Thank you for the input. Which version are you using?

On 4.9.0 I get a ton of:

Code:
[2014-10-10 07:32:08] hashfast fd=40: SEND (aa0b035203000050) => -1 errno=5(Input/output error)
[2014-10-10 07:32:08] hashfast fd=40: SEND (aa0b035303000046) => -1 errno=5(Input/output error)
[2014-10-10 07:32:08] hashfast fd=40: SEND (aa0b035403000024) => -1 errno=5(Input/output error)
[2014-10-10 07:32:08] hashfast fd=40: SEND (aa0b035503000032) => -1 errno=5(Input/output error)

and others which go past too quick for a copy and paste.

After a few minutes it quits with:
Code:
Segmentation fault (core dumped)
hero member
Activity: 539
Merit: 500
October 09, 2014, 06:43:02 PM
@ Newar - fwiw, I run JakeTri cgminer 4.4.0 on ubuntu 13.10, and JakeTri 4.4.1 on Debian/Beaglebone.  Both are rock solid 24/7.

Here are my configs...

BBB - runs speed/voltage set in firmware - hfa-hash-clock 1.
Code:
screen -dmS hab /root/cg-hab/cgminer -c /root/cg.conf

root@beaglebone:~# cat cg.conf
{
"pools" : [
        {
         ...snip...
        }
],
"hfa-hash-clock" :"1",
"hfa-fan" : "100",
"hfa-temp-target" : "0",
"hfa-temp-overheat" : "104",
"hfa-fail-drop" : "10",
"api-allow" : "W:127.0.0.1,W:192.168.2.0/24",
"api-listen" : true,
"api-port" : "4028",
"failover-only" : true,
"widescreen" : true
}

ubuntu - same cgminer.conf, but with hfa-hash-clock removed.
Code:
screen -dmS hab /home/cg-hab/cgminer -c /home/cgminer.conf --hfa-options "Chip:950@980,Dabs:950@0@980@980@970:0:0:0:-25"
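
A quick sanity check on a config like the one above: another post in this thread reports that the stock firmware shuts down at 105C no matter what `hfa-temp-overheat` is set to. The sketch below reads the JSON and shows the value that would actually apply under that assumption. `effective_overheat` and the 105 constant are illustrative only, not anything cgminer exposes, and the sample config is a trimmed stand-in for the real file.

```python
import json

# Thermal shutdown reported in this thread for the stock firmware (assumption).
FIRMWARE_CUTOFF_C = 105


def effective_overheat(conf_text, cutoff=FIRMWARE_CUTOFF_C):
    """Return the overheat limit that applies once the firmware cutoff is considered."""
    conf = json.loads(conf_text)
    # cgminer configs in this thread store option values as strings.
    requested = int(conf.get("hfa-temp-overheat", cutoff))
    return min(requested, cutoff)


sample_conf = """
{
"pools" : [],
"hfa-fan" : "100",
"hfa-temp-target" : "0",
"hfa-temp-overheat" : "104",
"hfa-fail-drop" : "10"
}
"""
print(effective_overheat(sample_conf))
```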
hero member
Activity: 658
Merit: 500
CCNA: There i fixed the internet.
October 09, 2014, 02:08:26 PM
Sort-of-update, nothing new or resolved, but maybe some more detail/accuracy:

I have isolated the Zombie to its own cgminer instance and it returns only this on startup:
Code:
ERR: Asked to memcpy 0 bytes from usbutils.c _usb_read():3170
HFA : OP_USB_INIT failed! Operation status 20 (Regulator programming error)

Repeat.


I was not able to detect a short on the board.


For the good boards, even with fail-over only I still get the:
Code:
[2014-10-09 16:56:52] HFB hab5: Bad work sequence tail 1633 head 322 devhead 322 devtail 1730 sequence 2048
[2014-10-09 16:56:52] HFB 1 failure, disabling!
[2014-10-09 16:56:52] HFB 3 failure, disabling!
[2014-10-09 16:56:53] HFB hab4: Bad work sequence tail 1409 head 1881 devhead 1881 devtail 1505 sequence 2048
[2014-10-09 16:56:53] HFB hab2: Bad work sequence tail 115 head 726 devhead 726 devtail 239 sequence 2048
[2014-10-09 16:56:53] HFB 0 failure, disabling!
[2014-10-09 16:56:53] HFB 4 failure, disabling!

Whilst at the same time in syslog:

Code:
Oct  9 16:56:52 lubuntu kernel: [171751.096002] cdc_acm 1-4.3:1.0: ttyACM0: USB ACM device
Oct  9 16:56:52 lubuntu kernel: [171751.096370] cdc_acm 1-2.3:1.0: ttyACM1: USB ACM device
Oct  9 16:56:53 lubuntu kernel: [171751.618102] cdc_acm 1-4.4:1.0: ttyACM2: USB ACM device
Oct  9 16:56:53 lubuntu kernel: [171751.622705] cdc_acm 1-2.2:1.0: ttyACM3: USB ACM device

This happens roughly once per hour.


I'd still be interested to hear what cgminer version fellow miners are running. Not that it will solve the problems above, but maybe get some additional stability.

I know it will sound odd but I've enjoyed a certain amount of success using bfgminer (which uses the other protocol supported by the HF boards). It has very good, off-the-bat detection/disablement of bad hash cores. Though I have seen oddities where the device(s) must be manually added to bfgminer to initialize properly:
Code:
bfgminer  -S HFA:noauto --set HFA:clock=650
M
+
HFA:auto

my two bitcents

and it can dump out the whole HF_Frame using the commandline flags:
Code:
-D --device-protocol-dump 2> HF.Logfile.log
legendary
Activity: 1274
Merit: 1004
October 09, 2014, 01:13:46 PM
Sort-of-update, nothing new or resolved, but maybe some more detail/accuracy:

I have isolated the Zombie to its own cgminer instance and it returns only this on startup:
Code:
ERR: Asked to memcpy 0 bytes from usbutils.c _usb_read():3170
HFA : OP_USB_INIT failed! Operation status 20 (Regulator programming error)

Repeat.


I was not able to detect a short on the board.


For the good boards, even with fail-over only I still get the:
Code:
[2014-10-09 16:56:52] HFB hab5: Bad work sequence tail 1633 head 322 devhead 322 devtail 1730 sequence 2048
[2014-10-09 16:56:52] HFB 1 failure, disabling!
[2014-10-09 16:56:52] HFB 3 failure, disabling!
[2014-10-09 16:56:53] HFB hab4: Bad work sequence tail 1409 head 1881 devhead 1881 devtail 1505 sequence 2048
[2014-10-09 16:56:53] HFB hab2: Bad work sequence tail 115 head 726 devhead 726 devtail 239 sequence 2048
[2014-10-09 16:56:53] HFB 0 failure, disabling!
[2014-10-09 16:56:53] HFB 4 failure, disabling!

Whilst at the same time in syslog:

Code:
Oct  9 16:56:52 lubuntu kernel: [171751.096002] cdc_acm 1-4.3:1.0: ttyACM0: USB ACM device
Oct  9 16:56:52 lubuntu kernel: [171751.096370] cdc_acm 1-2.3:1.0: ttyACM1: USB ACM device
Oct  9 16:56:53 lubuntu kernel: [171751.618102] cdc_acm 1-4.4:1.0: ttyACM2: USB ACM device
Oct  9 16:56:53 lubuntu kernel: [171751.622705] cdc_acm 1-2.2:1.0: ttyACM3: USB ACM device

This happens roughly once per hour.


I'd still be interested to hear what cgminer version fellow miners are running. Not that it will solve the problems above, but maybe get some additional stability.
For the board with the regulator programming error, can you verify that 12V is good to each of the connectors?

For the other error, I've only seen that once in relation to multipool. Does that happen immediately after a block is detected and you get a stratum restart? I've talked with Con a little about this, and his opinion is also that it is a problem with the GWQ. The longterm solution is probably to completely replace the HF driver and allow cgminer to schedule work, but I honestly don't think that will ever happen. Try removing backup pools temporarily to see if the issue goes away. I don't have 4.6.x on any machines, mostly a mix of 4.5 and 4.3.
legendary
Activity: 1358
Merit: 1001
https://gliph.me/hUF
October 09, 2014, 12:36:45 PM
Sort-of-update, nothing new or resolved, but maybe some more detail/accuracy:

I have isolated the Zombie to its own cgminer instance and it returns only this on startup:
Code:
ERR: Asked to memcpy 0 bytes from usbutils.c _usb_read():3170
HFA : OP_USB_INIT failed! Operation status 20 (Regulator programming error)

Repeat.


I was not able to detect a short on the board.


For the good boards, even with fail-over only I still get the:
Code:
[2014-10-09 16:56:52] HFB hab5: Bad work sequence tail 1633 head 322 devhead 322 devtail 1730 sequence 2048
[2014-10-09 16:56:52] HFB 1 failure, disabling!
[2014-10-09 16:56:52] HFB 3 failure, disabling!
[2014-10-09 16:56:53] HFB hab4: Bad work sequence tail 1409 head 1881 devhead 1881 devtail 1505 sequence 2048
[2014-10-09 16:56:53] HFB hab2: Bad work sequence tail 115 head 726 devhead 726 devtail 239 sequence 2048
[2014-10-09 16:56:53] HFB 0 failure, disabling!
[2014-10-09 16:56:53] HFB 4 failure, disabling!

Whilst at the same time in syslog:

Code:
Oct  9 16:56:52 lubuntu kernel: [171751.096002] cdc_acm 1-4.3:1.0: ttyACM0: USB ACM device
Oct  9 16:56:52 lubuntu kernel: [171751.096370] cdc_acm 1-2.3:1.0: ttyACM1: USB ACM device
Oct  9 16:56:53 lubuntu kernel: [171751.618102] cdc_acm 1-4.4:1.0: ttyACM2: USB ACM device
Oct  9 16:56:53 lubuntu kernel: [171751.622705] cdc_acm 1-2.2:1.0: ttyACM3: USB ACM device

This happens roughly once per hour.


I'd still be interested to hear what cgminer version fellow miners are running. Not that it will solve the problems above, but maybe get some additional stability.
legendary
Activity: 1358
Merit: 1001
https://gliph.me/hUF
October 09, 2014, 01:22:20 AM

Thanks for the quick reply. I'll see if the additional hub helps and report back.

Let me add to the experimental firmware request too ;)
legendary
Activity: 1274
Merit: 1004
October 08, 2014, 06:26:52 PM

MrTeal, is there any chance you could release a purely experimental, big-red-warning firmware with the 110 ceiling?  I've got a three-die card that isn't staying cool enough and I'd like to raise the ceiling, even if it means burning it up.
+1
Hmmm.. I suppose I could. I have some changes I've added to the code base anyway that should probably be pushed out once I get them tested fully. Might help identify Newar's regulator programming error as well.
legendary
Activity: 1274
Merit: 1004
October 08, 2014, 06:24:38 PM
sr. member
Activity: 272
Merit: 250
October 08, 2014, 03:45:33 PM
Anybody have boards for sale? PM please
legendary
Activity: 1593
Merit: 1004
October 08, 2014, 03:43:19 PM
I got my hands on some Habaneros and I'm looking for some answers that I couldn't find in the thread(s) on here, the website or duckduckgo-ing:

Am I correct that hfa-temp-overheat does not work for temps over 105? It seems that is the cut off set by the firm-/hardware? I.e. I set it to 110 (as a few posted in this thread) and it still cuts off at 105.


I got this error, for which I couldn't find an explanation; occasionally cgminer (4.6.1) also crashes after it:
Code:
Bad work sequence tail 1153 head 1792 devhead 1792 devtail 1258 sequence 2048        #(sometimes other numbers)
HFB 0 failure, disabling!


Is there a way to find out which firmware they are flashed with?


What is everybody running? JakeTri cgminer or the "stock" cgminer?


My impression from the thread is that the USB is somewhat finicky. I have one Zombie, that I think is because of USB troubles. It sort of reminds me of the CM-1s. Apart from the power up sequence (12V to boards first, wait, plug in USB) is there anything else that can be done about it?


Also, anyone selling Habaneros in Europe, send me a PM ;)
105 is the thermal shutdown in the firmware. I know HF increased that value to 110C in later firmwares in order to allow the Evos to clock higher, but I didn't feel comfortable with that and left it at the lower value.
With that bad work sequence error, are you running multiple pools with load balance or balance in cgminer? I have seen that before in that specific instance, and it has to do with the HF global work queue. Disabling multipool should fix that error.

What kind of error do you get on the zombie?

MrTeal, is there any chance you could release a purely experimental, big-red-warning firmware with the 110 ceiling?  I've got a three-die card that isn't staying cool enough and I'd like to raise the ceiling, even if it means burning it up.
+1