
Topic: The Chili – 30+GH/s BFL based Bitcoin Miner Assembly - page 13. (Read 138078 times)

sr. member
Activity: 280
Merit: 250
Helperizer
Thanks, not sure of the numbering and I wasn't looking really well when I shorted the wrong pins (hence the mistake), but I *think* it would have been either one of the inner row of pins (farthest away from the edge of the PCB) right next to the pair I was trying to short.  So, if it looks like this:

-------------------edge of PCB
2  4  6  8  10
1  3  5  7  9

then I think it might have been pins 3 and 6 or even 3+5+6, but again that's a guess since the whole mis-reset happened because I was too quickly trying to reset the board when I really should have just turned off and on the PSU (was trying to keep the other Chili on the same PSU running during the reset).  All I know is that the USB comms didn't seem to work after that and so I replaced the FTDI chip and then all seemed pretty normal (once I had reprogrammed the FTDI).

Alright, so now I'm even more confused.  I've moved it upstairs to work on it with my laptop (running the same version of Ubuntu as the machine it was on previously), and now it's been running solid for almost 4 hours.  A few bfgminer crashes after 10 minutes at first, and now it's solid.  I'm reluctant to move it back to where it was, but it's not like it's in a good place.

Anyway, anyone run into this kind of an on-again off-again problem moving from machine to machine?  Thoughts?
What kind of hashrate were you getting before it would reset?

It seems to reset now (no more comms errors, just bfgminer crashing/stopping though it's solid with other units (chili and other USB miners)).  But it does so randomly, between a few minutes to several hours (even up to 12 hrs as the longest so far).  No raise/lower in GH/s, getting a solid 34-35 GH/s depending on pool and no noticeable change right before crashing.

Are there some test points I can measure while it's hashing that could help now that it's no longer got comms errors but instead leads to bfgminer crashes?

BTW, I can "make" it crash bfgminer by bumping the table it's on or moving it.  Must be a loose connection somewhere?  (maybe USB port/cable?  Will try a different cable tonight).
legendary
Activity: 1274
Merit: 1004
Has anyone of you tried to use just the thermal compound instead of thermal pads?
Something like AC MX-4?

Or is the thermal pad absolutely necessary because the chips might have slightly different height?
We were worried about that at the start, but after building them it turns out to not be much of a problem as long as you can support behind the chips with a backplate. I run mine with paste and they work great.
legendary
Activity: 1274
Merit: 1004
I run 3 chilis on a 1000 watt PSU.

If your PSUs won't turn on with the Chilis disconnected, then the PSU has a problem of some kind. PSUs are pretty simple. Are you jumpering them correctly to get them to turn on? Using a paperclip?

Yes, paperclipping it. As I said, everything runs fine for a while and then the PSU just turns off. I have not actually sat beside it to see it happen, as it will run fine for 6-10 hours at a time with no problems. Then it will be off, and I will be unable to get the power supply to turn back on for quite a few hours.

Edit: As a side note, what is the best way to power off the power supply when using the paperclip method? Just flip the switch, or should I remove the paperclip and then power down? Thanks!
Just use the switch on the back.
I have never heard of anything like you're talking about.
What are the model numbers of the PSUs you're using? I have an older Corsair unit that won't run mining hardware unless I put a load on the 5V rail.
legendary
Activity: 3220
Merit: 2334
I fix broken miners. And make holes in teeth :-)
I turn mine off using the power switch. I have the paperclip taped in pretty tight so it won't fall out.

It doesn't sound like this is a chili issue but rather a PS issue.
I just cut the green wire and crimped it to a black wire. Yes it ruined a $50 power supply for resale, but I value that more than dumping a day's worth of mining due to a paperclip. :-)

C
full member
Activity: 195
Merit: 100
Has anyone of you tried to use just the thermal compound instead of thermal pads?
Something like AC MX-4?

Or is the thermal pad absolutely necessary because the chips might have slightly different height?

I am only using thermal paste-goo-stuff. I run 6 Chilis and get results of 33-38 GH/s on them, depending on whether I have bottom cooling or not. I had very little luck with pads.

I've used thermal compound and thermal pads with mixed results. I think you can only get away with using just thermal compound if you've got a solid backplate. Otherwise the board may flex to the point where some chips aren't getting any contact. That is obviously less than ideal regardless of paste/pad but the thickness of the pad often times allows for contact where paste alone wouldn't.

On my air cooled chili I'm using a pad and on my water cooled chili I'm just using the paste that came on it. Both run ~37.
legendary
Activity: 910
Merit: 1000
I turn mine off using the power switch. I have the paperclip taped in pretty tight so it won't fall out.

It doesn't sound like this is a chili issue but rather a PS issue.
member
Activity: 80
Merit: 10
I run 3 chilis on a 1000 watt PSU.

If your PSUs won't turn on with the Chilis disconnected, then the PSU has a problem of some kind. PSUs are pretty simple. Are you jumpering them correctly to get them to turn on? Using a paperclip?

Yes, paperclipping it. As I said, everything runs fine for a while and then the PSU just turns off. I have not actually sat beside it to see it happen, as it will run fine for 6-10 hours at a time with no problems. Then it will be off, and I will be unable to get the power supply to turn back on for quite a few hours.

Edit: As a side note, what is the best way to power off the power supply when using the paperclip method? Just flip the switch, or should I remove the paperclip and then power down? Thanks!
legendary
Activity: 910
Merit: 1000
Has anyone of you tried to use just the thermal compound instead of thermal pads?
Something like AC MX-4?

Or is the thermal pad absolutely necessary because the chips might have slightly different height?

I am only using thermal paste-goo-stuff. I run 6 Chilis and get results of 33-38 GH/s on them, depending on whether I have bottom cooling or not. I had very little luck with pads.
legendary
Activity: 910
Merit: 1000
I run 3 chilis on a 1000 watt PSU.

If your PSUs won't turn on with the Chilis disconnected, then the PSU has a problem of some kind. PSUs are pretty simple. Are you jumpering them correctly to get them to turn on? Using a paperclip?
member
Activity: 80
Merit: 10
I've recently purchased a couple of used Chilis and am having a very tough time keeping them mining constantly. They will run fine for a couple of hours or so, then the power supply will turn off and will not turn back on for quite some time. At first I had 2 Chilis on an older 450 watt Thermaltake. I ended up blowing that supply, I think from turning it on and off a few times while trying to get the orientation and configuration correct. I then moved over to a 500 watt Corsair supply, brand new out of the box. Both were up and running when I went to bed, and when I got up the PSU was off and I was unable to get it to come back on for several hours.

Next I tried with only 1 Chili hooked up, thinking maybe they draw way more power than what I was led to believe. Nope, same issue. After that I thought maybe too much heat, so I outfitted the bottom of the board with heatsinks and added more cooling. The board would run below 70C after hours of running, but again it would fail and the PSU would not power back on for numerous hours.

I have yet a 3rd PSU on the way to re-test everything. But does anyone have any ideas on what's happening? What's really confusing me is that the PSU takes so long to be able to be turned back on again. The fan will not come back on until several hours have passed with it unplugged. It's almost like the PSU is having to dump stored-up current or something. Very odd. The boards, once they have power, don't seem to miss a beat. But something is causing the power supply to be knocked out?

When the miners are running they do a good job, hashing at around 33/34 GH/s, and are magnitudes quieter than the BFL Single I have. Any help would be appreciated in this matter. The seller I got these boards from has a no refund policy, obviously. So thanks in advance!
legendary
Activity: 1274
Merit: 1004
Thanks, not sure of the numbering and I wasn't looking really well when I shorted the wrong pins (hence the mistake), but I *think* it would have been either one of the inner row of pins (farthest away from the edge of the PCB) right next to the pair I was trying to short.  So, if it looks like this:

-------------------edge of PCB
2  4  6  8  10
1  3  5  7  9

then I think it might have been pins 3 and 6 or even 3+5+6, but again that's a guess since the whole mis-reset happened because I was too quickly trying to reset the board when I really should have just turned off and on the PSU (was trying to keep the other Chili on the same PSU running during the reset).  All I know is that the USB comms didn't seem to work after that and so I replaced the FTDI chip and then all seemed pretty normal (once I had reprogrammed the FTDI).

Alright, so now I'm even more confused.  I've moved it upstairs to work on it with my laptop (running the same version of Ubuntu as the machine it was on previously), and now it's been running solid for almost 4 hours.  A few bfgminer crashes after 10 minutes at first, and now it's solid.  I'm reluctant to move it back to where it was, but it's not like it's in a good place.

Anyway, anyone run into this kind of an on-again off-again problem moving from machine to machine?  Thoughts?
What kind of hashrate were you getting before it would reset?
legendary
Activity: 1274
Merit: 1004
How can I fix a temperature problem?
It says 100-105C (1 flashing light).
But it was running fine after that. I just restarted the PC and then it did not work any more.
I disconnected all the cables for a couple of hours, but 2 lights kept flashing and would not go off.
Then I turned the PSU off and on again. The Chili was flashing 1 light again, with temperature 100-105 and SICK again.
Please help.

I would remove the heatsink, clean off any thermal paste you have on there, and reapply it. It sounds like you have one or more ASICs making really poor or no contact with the heatsink.

It was running fine. But I removed the heatsink and reapplied thermal paste. The problem is still the same.
Do you have any new firmware for it?

No firmware that would fix that. If your temperatures are that high there is something seriously wrong.
When you turn it on, the 7th LED continually flashes, correct? While it's doing that, open up the COM port it's on in a terminal program like HyperTerminal or PuTTY, and send "ZlX". You'll have to copy that and paste it instead of typing it letter by letter. That will return the temps of each chip.
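
For anyone who would rather script that query than paste it into a terminal, here is a minimal Python sketch. Only the command string "ZlX" comes from the post above. The function name, the read size, and the FakePort stand-in are hypothetical, and the reply format is not described in this thread, so the reply is returned as raw bytes and not parsed.

```python
def query_chip_temps(port, command=b"ZlX", read_size=256):
    """Send the temperature command in one write and return the raw reply.

    `port` is any object with write() and read(), for example an
    already-opened serial port object (assumption: the board answers
    on the same port it was written to).
    """
    port.write(command)          # whole command at once, like pasting it
    flush = getattr(port, "flush", None)
    if callable(flush):
        flush()                  # push the bytes out if the port buffers
    return port.read(read_size)  # raw bytes; format is not documented here


class FakePort:
    """Hypothetical stand-in so the sketch runs without hardware."""

    def __init__(self, reply):
        self.reply = reply
        self.sent = b""

    def write(self, data):
        self.sent += data
        return len(data)

    def read(self, size):
        return self.reply[:size]


if __name__ == "__main__":
    # The reply below is a made-up placeholder, not real board output.
    port = FakePort(b"<placeholder reply>")
    print(query_chip_temps(port))
```

With real hardware you would pass in a serial port object instead of FakePort, for example one from the third-party pyserial package. The port name and serial settings are not given in this thread, so those would have to come from your own setup.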
legendary
Activity: 1190
Merit: 1002
How can I fix a temperature problem?
It says 100-105C (1 flashing light).
But it was running fine after that. I just restarted the PC and then it did not work any more.
I disconnected all the cables for a couple of hours, but 2 lights kept flashing and would not go off.
Then I turned the PSU off and on again. The Chili was flashing 1 light again, with temperature 100-105 and SICK again.
Please help.

I would remove the heatsink, clean off any thermal paste you have on there, and reapply it. It sounds like you have one or more ASICs making really poor or no contact with the heatsink.

It was running fine. But I removed the heatsink and reapplied thermal paste. The problem is still the same.
Do you have any new firmware for it?
sr. member
Activity: 280
Merit: 250
Helperizer
Somewhere between 0.85V and 1.15V.
Most crashes are caused by the power supply turning off for some reason, but this sounds like it might be different. Do you know which other pin you shorted out? Some of the other ones around there are the communication pins between the VRM and the microcontroller. The micro will shut the power down and reset if it reads funky values from the power supply, perhaps that could be a source of the error.

Thanks, not sure of the numbering and I wasn't looking really well when I shorted the wrong pins (hence the mistake), but I *think* it would have been either one of the inner row of pins (farthest away from the edge of the PCB) right next to the pair I was trying to short.  So, if it looks like this:

-------------------edge of PCB
2  4  6  8  10
1  3  5  7  9

then I think it might have been pins 3 and 6 or even 3+5+6, but again that's a guess since the whole mis-reset happened because I was too quickly trying to reset the board when I really should have just turned off and on the PSU (was trying to keep the other Chili on the same PSU running during the reset).  All I know is that the USB comms didn't seem to work after that and so I replaced the FTDI chip and then all seemed pretty normal (once I had reprogrammed the FTDI).

Alright, so now I'm even more confused.  I've moved it upstairs to work on it with my laptop (running the same version of Ubuntu as the machine it was on previously), and now it's been running solid for almost 4 hours.  A few bfgminer crashes after 10 minutes at first, and now it's solid.  I'm reluctant to move it back to where it was, but it's not like it's in a good place.

Anyway, anyone run into this kind of an on-again off-again problem moving from machine to machine?  Thoughts?
legendary
Activity: 1274
Merit: 1004
Is there a site, web page or PDF out there with the schematic of the Chili board? I ask for reference and to troubleshoot my fan.

Thanks.
No, there isn't. What kind of problem are you having with your fan?
legendary
Activity: 1274
Merit: 1004
How can I fix a temperature problem?
It says 100-105C (1 flashing light).
But it was running fine after that. I just restarted the PC and then it did not work any more.
I disconnected all the cables for a couple of hours, but 2 lights kept flashing and would not go off.
Then I turned the PSU off and on again. The Chili was flashing 1 light again, with temperature 100-105 and SICK again.
Please help.

I would remove the heatsink, clean off any thermal paste you have on there, and reapply it. It sounds like you have one or more ASICs making really poor or no contact with the heatsink.
legendary
Activity: 1190
Merit: 1002
How can I fix a temperature problem?
It says 100-105C (1 flashing light).
But it was running fine after that. I just restarted the PC and then it did not work any more.
I disconnected all the cables for a couple of hours, but 2 lights kept flashing and would not go off.
Then I turned the PSU off and on again. The Chili was flashing 1 light again, with temperature 100-105 and SICK again.
Please help.
member
Activity: 110
Merit: 10
Is there a site, web page or PDF out there with the schematic of the Chili board? I ask for reference and to troubleshoot my fan.

Thanks.
sr. member
Activity: 280
Merit: 250
Helperizer
Somewhere between 0.85V and 1.15V.
Most crashes are caused by the power supply turning off for some reason, but this sounds like it might be different. Do you know which other pin you shorted out? Some of the other ones around there are the communication pins between the VRM and the microcontroller. The micro will shut the power down and reset if it reads funky values from the power supply, perhaps that could be a source of the error.

Thanks, not sure of the numbering and I wasn't looking really well when I shorted the wrong pins (hence the mistake), but I *think* it would have been either one of the inner row of pins (farthest away from the edge of the PCB) right next to the pair I was trying to short.  So, if it looks like this:

-------------------edge of PCB
2  4  6  8  10
1  3  5  7  9

then I think it might have been pins 3 and 6 or even 3+5+6, but again that's a guess since the whole mis-reset happened because I was too quickly trying to reset the board when I really should have just turned off and on the PSU (was trying to keep the other Chili on the same PSU running during the reset).  All I know is that the USB comms didn't seem to work after that and so I replaced the FTDI chip and then all seemed pretty normal (once I had reprogrammed the FTDI).
legendary
Activity: 1274
Merit: 1004
I'm still trying to get my second Chili back up and reliable.  Right now, it stops talking to cgminer/bfgminer after 2-8 minutes and needs to be powered off/on and the miner restarted to resume mining.  Rinse and repeat.  I've posted the errors earlier (https://bitcointalksearch.org/topic/m.4557778).  Anyone have any ideas/suggestions?

So far, I've tried: reflashing the 1.1V-limited firmware, reflowing the FTDI chip, and reseating a nearby small SMD capacitor that might have gotten dislodged, but so far it's still unstable. Before I had the pin 5-6 shorting mishap (hit an adjacent pin at the same time during a reset, trashing my FTDI chip), it was pretty solid, though I did have to reset it about once a day. Replaced the FTDI chip with a new one, programmed it (thanks Mr. Teal!), and was back hashing, but never stable or even usable.

Help?
So, it doesn't reset or anything, it just stops talking?
Do you have a multimeter, and the next time it stops talking can you measure the voltage across C19 (one of the unpopulated ceramic capacitors between the PSU and ASICs)?

When that happens, the indicator lights go dark and the miner can't talk to it anymore. It will actually reset some/most of the time, but of course the miner program doesn't notice that and has to be restarted before it'll catch again. Once it's back up it repeats the crash with these three errors randomly streaming/repeating (mostly the timed out and unexpected queue result ones):

Code:
[2014-01-16 22:57:42] BFL 1: Failed to send queue
[2014-01-16 22:57:43] BFL 1: Error: Get temp returned empty string/timed out
[2014-01-16 22:57:43] BFL 1: Received unexpected queue result response:
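
To see which of these errors dominates over a long run, the log lines can be tallied with a short script. This is only a sketch: the regular expression is inferred from the three lines quoted above, not from any documented bfgminer log format, and the names LOG_LINE and count_errors are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the three quoted lines (an assumption):
# [YYYY-MM-DD HH:MM:SS] <device name> <index>: <message>
LOG_LINE = re.compile(
    r"\[(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+"
    r"(?P<device>\w+ \d+):\s+(?P<message>.*)"
)


def count_errors(lines):
    """Count how often each (device, message) pair appears in the log."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match:
            # Drop a trailing colon so "response:" and "response" tally together.
            message = match.group("message").rstrip(": ").strip()
            counts[(match.group("device"), message)] += 1
    return counts


# The three lines quoted in the post above.
SAMPLE = """\
[2014-01-16 22:57:42] BFL 1: Failed to send queue
[2014-01-16 22:57:43] BFL 1: Error: Get temp returned empty string/timed out
[2014-01-16 22:57:43] BFL 1: Received unexpected queue result response:
"""

if __name__ == "__main__":
    for (device, message), n in count_errors(SAMPLE.splitlines()).most_common():
        print(f"{n:4d}  {device}  {message}")
```

Feeding it the full miner log, read line by line from a file, would show whether the timeouts or the unexpected queue results come first before each crash.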

I'll hook it back up again once I have a free PSU and look at the voltage across C19.  What should it be, and what are you hoping to see from that?
Somewhere between 0.85V and 1.15V.
Most crashes are caused by the power supply turning off for some reason, but this sounds like it might be different. Do you know which other pin you shorted out? Some of the other ones around there are the communication pins between the VRM and the microcontroller. The micro will shut the power down and reset if it reads funky values from the power supply, perhaps that could be a source of the error.
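
If you end up logging several multimeter readings, the range check above is easy to script. A minimal sketch: the 0.85V and 1.15V limits are the ones quoted in this post, while the function name and the "low"/"ok"/"high" labels are hypothetical and say nothing about what the board itself does with an out-of-range value.

```python
C19_MIN_VOLTS = 0.85  # lower bound quoted in the post
C19_MAX_VOLTS = 1.15  # upper bound quoted in the post


def classify_c19_voltage(measured_volts):
    """Label a reading across C19 as 'low', 'ok', or 'high'."""
    if measured_volts < C19_MIN_VOLTS:
        return "low"
    if measured_volts > C19_MAX_VOLTS:
        return "high"
    return "ok"


if __name__ == "__main__":
    # Example readings are made up for illustration.
    for reading in (0.0, 0.95, 1.30):
        print(reading, classify_c19_voltage(reading))
```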

Thanks, but supposedly the non-Lucko boards don't have the hairdryer problem.  Still, I might try that for good luck.  Maybe the initial problem somehow duplicated the strangeness of the Lucko boards...
Lucko and another user who has both types of boards reported they could get similar results on one board if they cool them down substantially (I believe Lucko did it at -10C). I tried a couple of mine at -15C and couldn't reproduce it, but I can't rule out that there is some kind of temperature dependent thing happening. I've only heard of the two units displaying those symptoms though, and none with the ambient above freezing.