
Topic: ZTEX USB-FPGA Modules 1.15x and 1.15y: 215 and 860 MH/s FPGA Boards - page 37. (Read 182443 times)

brand new
Activity: 0
Merit: 250
catfish, besides the obvious (FPGA), it would be interesting to see what temperatures the two buck controllers and the MOSFET reach after, say, an hour of continuous mining. Especially the AOZ1025 8-amp buck controller in its tiny DFN-16 case, which has one MOSFET built in. I have a 1.15x, but no infrared thermometer.

Will report back. Just added a fan on a separate circuit out of sheer paranoia.

If it's hours before I add to this thread, then bad news :(

donator
Activity: 305
Merit: 250
Code:
[quote author=catfish link=topic=49180.msg767929#msg767929 date=1330164218]
[quote author=CAcoins link=topic=49180.msg767881#msg767881 date=1330158965]
Well, I have to give it to you.  If you're willing to code support for ztex on a Hackintosh, you are a better man than I am.  I also run VMware.  I notice the workstation version runs much better with win7 as a host than Ubuntu.  Believe it or not, I ran a Win2k8 server as a VM inside Win7 (don't ask me why) running a web/SQL server.  No issues for 5 straight months.

Regardless, good luck with your endeavor.  I see you have a nice setup with the 1.15d although I have to say I like how compact the 1.15x is (big where it counts=heatsink  ;D)
[/quote]
Cheers :)

The board is very much a Test Unit as per the child-like writing on the piece of MDF :D

I like the additional length and 'wind tunnel' feature of the 1.15d / Exp. Board 1.3 combo - and not just because I can bolt in a big SDHC card to run rainbow table apps etc. Purely focusing on BTC mining right now, I've run into big thermal problems with a 9 GHash GPU-based frame rig contraption (pics on the other thread here, won't clutter this one up). Running high-density units (in my first case, GPUs) horizontally, with the fan having to blow air *down* onto a heatsink, with maybe another hot unit sitting above or to the side, is just thermodynamically inefficient. The fans are working against physics, and if there's another GPU directly above the fan, then it can only blow red-hot air from the back of the upper GPU core onto the heatsink. Cue high temperatures and fans running at 85-100% capacity.

I've had loads of fan failures on GPUs and eventually switched to a vertical layout, where the main GPU exhaust points straight upwards. This allows convection to provide cool air for the fans to blow sideways onto the heatsinks... but with a high-density setup, radiated heat from the backside of the GPU next door is *still* a problem. The retail GPUs don't cool the rear side of their circuit boards and I've measured them at up to 90˚C in places using an IR thermometer.

With the 1.15x all-in-one solutions, the design is very much like a traditional GPU / CPU. A fan blows air down onto a big heatsink. Of course, the TDP is so much lower that my considerations are almost moot... but with the comment that the FPGA must have hit 90˚C (by Stefan, discussing when the active fan failed), there's room for making use of convective cooling as well as relying on downward-blowing physics-fighting fans :)

OK, so I'm a mad fan of the Apple G4 Cube. If there was a hope in hell of compiling Stefan's SDK for PowerPC then I'd be using my example (pride of place on my desk, kept alive at all costs, though not doing much right now after having burnt out 4 GPUs, 3 hard drives and a PSU) to control the FPGA array I'll eventually build.

Using the same concepts, look at the 1.15d and how it connects to the Exp Board 1.3. My picture above is my Test Unit as explained. My quad-board rig design (had to downsize from 5-board because the fans pushed the power budget too close to the PSU max output!!!! heh) will have each unit mounted vertically, with the power and USB connections at the bottom, in a Cube-styled enclosure with adequate venting for cool air to enter from below, rise up through the boards, and exit at the top.

I don't plan to use ANY active fans on the 25mm heatsinks supplied with the 1.15d boards. The cheap crappy fan on my picture above is there for safety, and I'll be removing it during testing to see whether the horizontal fan keeps the board cool enough.

You see, there's a tunnel between the FPGA board and the Experimental Board. Even with convection only, cool air will flow through this 'tunnel' and provide cooling to the immediate *reverse* of the FPGA, which will be getting hot. The Experimental Board itself will provide a heat barrier to prevent the back of the FPGA heating the front of the board next to it (remember it's high-density, like the GPU solutions).

I think this 'tunnel' and the ability to direct cooling air to the *back* of the board, something most people ignore with GPUs (and a very significant heat load in high-density rigs), is a great feature of the 1.15d setup, which is why I'll be using these rather than the optimised 1.15x boards.

Without tall, small-diameter fans on each FPGA heatsink, my enclosures can be made more elegant whilst still maintaining high density. I don't know what Stefan's 1.15x active cooling product is like, as I can't use it, but my experience with small-diameter cooling fans as per those used on older GPUs etc. is that they can make a horrible high-pitched whining noise. I'm sure Stefan's is a quality item but in 6 months, will it be making a racket like some of my GPU fans?

This is all moot if my design doesn't work, of course. I'm doing a Cube but having two 80mm PC case fans blowing cool air up from the bottom, and two identical fans at the top of the enclosure blowing hot air out. Inside a sealed acrylic enclosure, this should provide enough pressure to ensure air is *forced* through the 'tunnels' and through the pins on the FPGA heatsinks. We're only talking about 10W per board max, after all...

If there's a way to read the temperature of the FPGA core and feed this back to the fans, then a *really* neat automatic solution could be built - but to begin with I'll simply chuck a potentiometer in with the fans. I'm hoping that four 80mm 3W fans is massive overkill for my design, and the fans can be run slowly and *quietly*.


Put it this way, my girlfriend has put up with my GPU frame rigs for months now. The noise made by all the fans is enough to affect *my* sleep and I'm accepting it, not getting irritated by it. Hopefully my FPGA rigs, when complete, will be damn-near silent. It would be poetic to have the G4 Cube managing the FPGAs... but the Cube doesn't even have USB2 and I still haven't managed to compile the SDK on a reasonably-standard Intel Snow Leopard Mac yet... so my old first-gen AppleTV, which (with a bit of hacking) can be made to run the regular OS, may be put into use if there are no OS snags. Otherwise it's a Mac Mini.

Regardless, it's going to look cool, hopefully run cool, and whilst replacing 9 GH/sec is beyond my financial means at the moment, once I've got this SDK built on the Mac, I'll be discussing volume discounts and best deals with Stefan ;)
[/quote]

I can see where you're coming from, and it looks like your setup will probably work pretty well with the fan blowing in from the side. I actually remember your rig from when I was setting up a mining rig for the first time and looking at the pictures of all the different rigs. And I agree with you, most people underestimate the thermal issue. I ran into that problem setting up my first Beowulf cluster a long time ago (and have learned my lessons). GPUs are a pain in the butt in terms of heat. (If I can't touch the back-plate of the GPU because it is too hot, it doesn't belong in my computer.) And forget blowers, the noise is too much!

Coming from that, I think you would be pleasantly surprised about the 1.15x.  It's cool, the fan is quiet, and even with the top-blowing-down of the chipset cooler design, it is almost self-sufficient.  It's winter right now, but I had a bunch of them running in a somewhat enclosed rack with no additional fans and the heatsink temp never got above 30C (measured with a probe).  All the boards clocked above 200MHz.  I have some Silverstone fm 121's blowing on them at like 1000 RPM, and you don't hear the chipset fan and the airflow is more than enough.

The LX150 has a high junction temperature rating but low thermal conductivity, so you have to make sure the heat is being conducted away from it efficiently. But with a TDP of around 9 watts, I think you, and your girlfriend, will appreciate the decrease in heat/noise. I am glad I didn't go the 6990 route :D
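
For a feel of how little airflow loads of this size need, here is a rough back-of-envelope sketch. It assumes air near room temperature at sea level (density about 1.2 kg/m^3, specific heat about 1005 J/(kg*K)) and that all of the board heat ends up in the air stream. The function name and the 5 K air temperature rise are illustrative choices, not figures from this thread.

```python
AIR_DENSITY = 1.2         # kg/m^3, assumed: air near 20 C at sea level
AIR_SPECIFIC_HEAT = 1005  # J/(kg*K), assumed
M3S_TO_CFM = 2118.88      # cubic metres per second -> cubic feet per minute


def required_airflow_cfm(power_w, delta_t_k):
    """Airflow needed to carry power_w away with a delta_t_k rise in air temperature."""
    if power_w < 0 or delta_t_k <= 0:
        raise ValueError("power must be >= 0 and temperature rise must be > 0")
    flow_m3s = power_w / (AIR_DENSITY * AIR_SPECIFIC_HEAT * delta_t_k)
    return flow_m3s * M3S_TO_CFM


if __name__ == "__main__":
    # Four boards at roughly 10 W each, letting the air warm by 5 K on the way through.
    print(round(required_airflow_cfm(4 * 10, 5), 1))
```

Under those assumptions a four-board, 40 W rig needs on the order of 14 CFM in total, which fits the expectation that the case fans can be run slowly.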
donator
Activity: 305
Merit: 250
Well, I have to give it to you.  If you're willing to code support for ztex on a Hackintosh, you are a better man than I am.  I also run VMware.  I notice the workstation version runs much better with win7 as a host than Ubuntu.  Believe it or not, I ran a Win2k8 server as a VM inside Win7 (don't ask me why) running a web/SQL server.  No issues for 5 straight months.

Regardless, good luck with your endeavor. I see you have a nice setup with the 1.15d although I have to say I like how compact the 1.15x is (big where it counts=heatsink ;D)
brand new
Activity: 0
Merit: 250
Bring on the Fear!

Test Board number 1 is connected up, 25mm heat sink applied with supplied thermal tape, baby 12V fan screwed on top (a bit wobbly - but it's not going to fall off, and I have a case fan (if needed) for horizontal airflow), and 12V feeds soldered onto the Experimental Board (I've used the Vin pin and GND since the board will be receiving a nice 12.14V from a switching bench test supply)...

Basically it's ready to switch on... I have an infra-red laser-dot thermometer handy and will be monitoring the temperature of the key components on the PCB.

What temperature should I consider 'something is wrong - panic' and cut the power? These FPGAs are rated to 70˚C according to Stefan's website - is this a sensible working temperature for the IC, or a 'maximum, not recommended continuous' temperature?

Obviously I'll be running it flat out as a Bitcoin miner, but I'd expect that Stefan's software doesn't overclock the hell out of the unit like all my GPUs, which are slowly dying due to severe lifespan reduction :)

I'll watch the software output, but if I've done something really idiotic and the board doesn't even report to the OS, but is consuming power somehow and overheating (shorts, etc.) then when should I pull the plug?

Also, since I'm running the 1.15d with the *much* smaller heatsink and non-Ztex-approved fan, what FPGA temperature (area temperature, as I can only measure with the IR thermometer) is a 'happy BTC mining stable temperature'?

Basically, I've decided to make sure I can get these things running first by mounting my first 1.15d board on a bench rig, then if all goes well and I've sold this spare Macbook Air and iPhone 4, I'll be contacting Stefan with a 4 or 5 board batch order. Hence the first rig is using the supplied heatsink but a tiny fan on top. My 'display case' multi-board design will have an enclosed tube with four 80mm PC case fans, two blowing air in and two sucking air out. This *should* give enough airflow to allow each board to use the heatsink only, and not require HSF stacks, which make each board too tall for my aesthetic design sensibilities :)

I'll also test the bench-test board with no fan on the heatsink but a single 80mm case fan mounted horizontally behind the board unless advised not to.


If I'm just about to kill my board, shout now! :D Power meters and thermometers at the ready - this is going to be interesting!

PS. Stefan - BTW, you're right, I've been able to fit a microSD card onto the board with the 25mm heatsink fitted as per your instructions. It's a bit dodgy getting it in and out, a full power-down and anti-static job - but if any future further-optimised bitstreams that make use of onboard memory and mass storage become available, it's up to the job.
donator
Activity: 305
Merit: 250
Hey Catfish, just curious, does OS X support running VMs? I would think the Java process/drivers should have no problem running inside a VM (as opposed to GPUs). It would be nice to have OS X software, but maybe run Linux inside a VM if you run into more roadblocks.
donator
Activity: 305
Merit: 250
The software clocks up/down based on the error rate. Use the latest version (120221.jar) and it should shut the board down automatically if the rate drops significantly (e.g. after a fan failure). It's running fine for me but I haven't tested the auto-shutdown feature yet.
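
As a rough illustration of what "clocking up/down based on the error rate" can look like, here is a minimal sketch. It is not BTCMiner's actual code: the function name, the step size, the error thresholds and the upper clock limit are all made-up values for the example.

```python
def next_clock_mhz(current_mhz, error_rate, step_mhz=6,
                   max_error=0.01, low_error=0.002, max_mhz=240):
    """Return the clock setting to use for the next adjustment round.

    All thresholds and the step size are illustrative assumptions.
    """
    if error_rate > max_error:
        # Too many bad results: back off one step.
        return current_mhz - step_mhz
    if error_rate < low_error and current_mhz + step_mhz <= max_mhz:
        # Results are clean and there is headroom: try one step faster.
        return current_mhz + step_mhz
    # In between, or already at the ceiling: hold the current clock.
    return current_mhz
```

A board that starts overheating produces more errors, so a loop like this walks the clock down step by step, which is the pattern visible in the fan-failure logs discussed in this thread.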
donator
Activity: 367
Merit: 250
ZTEX FPGA Boards
What temperature should I consider 'something is wrong - panic' and cut the power?

Provided that the heat sink is installed properly you should make sure that the temperature is less than 60°C. For optimal operation the heat sink temperature should be less than 40°C.

I'm not sure whether IR thermometers work on Al surfaces.

Quote
These FPGAs are rated to 70˚C according to Stefan's website - is this a sensible working temperature for the IC, or a 'maximum, not recommended continuous' temperature?

This is the board limit and defined by other components.

The absolute maximum junction temperature of the FPGA is 125°C; the maximum operating junction temperature is 85°C.
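
For anyone logging heat sink readings, the two heat sink limits given in this post can be turned into a simple check. This is only a sketch: the two thresholds come from the post, while the function name and the three labels (including treating 60°C and above as the point to cut power) are assumptions made for the example.

```python
OPTIMAL_MAX_C = 40.0  # from the post: heat sink below 40 C for optimal operation
SAFE_MAX_C = 60.0     # from the post: heat sink should stay below 60 C


def classify_heatsink_temp(temp_c):
    """Map a heat sink temperature reading to a rough status label (labels are assumed)."""
    if temp_c < OPTIMAL_MAX_C:
        return "optimal"
    if temp_c < SAFE_MAX_C:
        return "acceptable"
    return "cut power"
```

Note that these limits apply to the heat sink surface, not to the FPGA junction, which will be hotter than whatever a thermometer reads on the outside.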
sr. member
Activity: 448
Merit: 250
catfish, besides the obvious (FPGA), it would be interesting to see what temperatures the two buck controllers and the MOSFET reach after, say, an hour of continuous mining. Especially the AOZ1025 8-amp buck controller in its tiny DFN-16 case, which has one MOSFET built in. I have a 1.15x, but no infrared thermometer.
donator
Activity: 367
Merit: 250
ZTEX FPGA Boards
A new BTCMiner version with improved overheat protection has been released. Read https://bitcointalksearch.org/topic/m.761071 for details.
legendary
Activity: 1022
Merit: 1000
BitMinter
Actually, on second thought, you may be right. During one of the "downclock of death" episodes I encountered, it went as low as 126 MHz before I stopped it. Thus, I now think you're right - it simply stabilized at 174 MHz, worked at 174 MHz for a while, before the microcontroller crashed.

That's correct. I had several DCODs with my 1.15d and the controller never crashed!

Apples and oranges.

DCOD...caused by the bitstream not doing something 100% right when programming the DCM (digital clock manager) on the FPGA, turning DCM programming into something like a lottery.

Cypress controller crash...believed to be caused by excessive conducted and/or radiated heat from a PASSIVELY cooled FPGA.


I never had a DCOD caused by anything other than passive or bad cooling, ever. But you are right that the layout of the boards is different.
donator
Activity: 367
Merit: 250
ZTEX FPGA Boards
Interesting to see what happens when a fan fails during mining. I guess the FPGA doesn't completely clock down to 0 right away due to the heatsink, but the USB controller poops out first. Good to see that no serious damage to the board though. Anybody suspect that it causes any damage to the USB controller/board components when the fan dies?

If the original heat sink is used and there is at least free convection, the board is still safe if the fan fails. This only costs about 5-10 MH/s of performance. (With some well-placed case fans and good airflow the 40mm fans are unnecessary.)

AFAIR Inspector2211 did not use the original cooler. The board was shut down either by the USB controller overheating (there is only a small gap between heat sink and USB controller) or, more likely, by the overcurrent protection being triggered by the overheating FPGA or by a malfunction caused by overheating.

Quote
No, Stefan set the minimum at 174 MHz, but at this clock rate a PASSIVE heat sink (which my active heat sink turned into after the fan failure) seems to completely suffice (read the log that I posted).

There is no such limit. (There is an internal limit in the firmware at 100 MHz, but this limit is not visible to the host software.)

I don't agree that your (non-original) heat sink without fan was sufficient. There must have been temperatures around 100°C.

But it is a good idea to shut down the board if a frequency drop (of 10%) occurs. I will add this feature to BTCMiner.
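
The shutdown rule suggested here is easy to sketch. The following is only an illustration of the idea, not the BTCMiner implementation: the class name, the use of the highest clock seen so far as the reference, and the way readings are fed in are all assumptions. Only the 10% figure comes from the post.

```python
class FrequencyWatchdog:
    """Tracks the highest clock seen and flags a drop beyond a set fraction of it."""

    def __init__(self, max_drop=0.10):
        self.max_drop = max_drop  # 10% as suggested in the post
        self.peak_mhz = None      # highest clock observed so far (assumed reference)

    def update(self, current_mhz):
        """Record a clock reading; return True if the board should be shut down."""
        if self.peak_mhz is None or current_mhz > self.peak_mhz:
            self.peak_mhz = current_mhz
            return False
        return current_mhz < self.peak_mhz * (1.0 - self.max_drop)
```

With a peak of 200 MHz, a single downward step to 194 MHz would be tolerated, while a fall to 174 MHz as in the fan-failure log would trigger a shutdown.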

donator
Activity: 367
Merit: 250
ZTEX FPGA Boards
For the 1.15d boards - worth sticking a passive heatsink on the USB chip then?

The power dissipation of the USB controller during Bitcoin mining should be less than 100 mW.

donator
Activity: 367
Merit: 250
ZTEX FPGA Boards
Yup - though the Xilinx SDK *is* required if I want to change the 'code' the FPGA runs, or write a new app for the FPGA as hinted at above - am I correct?

Yes, Xilinx ISE is required if you want to do FPGA development.

sr. member
Activity: 448
Merit: 250
Actually, on second thought, you may be right. During one of the "downclock of death" episodes I encountered, it went as low as 126 MHz before I stopped it. Thus, I now think you're right - it simply stabilized at 174 MHz, worked at 174 MHz for a while, before the microcontroller crashed.

That's correct. I had several DCODs with my 1.15d and the controller never crashed!

Apples and oranges.

DCOD...caused by the bitstream not doing something 100% right when programming the DCM (digital clock manager) on the FPGA, turning DCM programming into something like a lottery.

Cypress controller crash...believed to be caused by excessive conducted and/or radiated heat from a PASSIVELY cooled FPGA.
brand new
Activity: 0
Merit: 250
For the 1.15d boards - worth sticking a passive heatsink on the USB chip then? Small heatsinks intended for RAM (on GPU boards and the like) are easily found and small enough to sit on that second IC on the board.

On the 1.15x you probably couldn't do this because the main heatsink covers part of the second chip, but the 1.15d has more room, and the USB chip is a couple of centimetres from the FPGA.

I've got plenty of these little RAM heatsinks... could easily equip each 1.15d board with one... reckon it'd cause harm rather than good?

Ultimately I'd like a thermistor somewhere on the board to monitor temps, but I've found a 25mm fan to sit on my 1.15d heatsink for testing, and it has a three-wire connector intended for PC logic boards, so the speed should be easily monitored with extra logic or a separate fan controller...
legendary
Activity: 1022
Merit: 1000
BitMinter
Actually, on second thought, you may be right. During one of the "downclock of death" episodes I encountered, it went as low as 126 MHz before I stopped it. Thus, I now think you're right - it simply stabilized at 174 MHz, worked at 174 MHz for a while, before the microcontroller crashed.

That's correct. I had several DCODs with my 1.15d and the controller never crashed!
sr. member
Activity: 448
Merit: 250
Quote
No, Stefan set the minimum at 174 MHz, but at this clock rate a PASSIVE heat sink (which my active heat sink turned into after the fan failure) seems to completely suffice

I see. I didn't know Stefan set the minimum at 174. It doesn't look like the error rate was too high either at 174 from your log. I was initially concerned that in a cluster, some fans are bound to fail at some point. I wouldn't want to have to check on the boards every so often, but I guess you will be able to pick up that a fan failed based on the block rates.

Actually, on second thought, you may be right. During one of the "downclock of death" episodes I encountered, it went as low as 126 MHz before I stopped it. Thus, I now think you're right - it simply stabilized at 174 MHz, worked at 174 MHz for a while, before the microcontroller crashed.
sr. member
Activity: 448
Merit: 250
For the 1.15d boards - worth sticking a passive heatsink on the USB chip then? Small heatsinks intended for RAM (on GPU boards and the like) are easily found and small enough to sit on that second IC on the board.

On the 1.15x you probably couldn't do this because the main heatsink covers part of the second chip, but the 1.15d has more room, and the USB chip is a couple of centimetres from the FPGA.

I've got plenty of these little RAM heatsinks... could easily equip each 1.15d board with one... reckon it'd cause harm rather than good?

Ultimately I'd like a thermistor somewhere on the board to monitor temps, but I've found a 25mm fan to sit on my 1.15d heatsink for testing, and it has a three-wire connector intended for PC logic boards, so the speed should be easily monitored with extra logic or a separate fan controller...

>passive heat sink on the USB chip

I don't believe in trying to fix SYMPTOMS as opposed to CAUSES.
The USB chip got too warm by conducted [via the PCB] and radiated heat - putting a black heat sink on it
might actually cause it to pick up MORE radiated heat. Thus, I'd say "no" to your heat sink idea.

>speed should be easily monitored with extra logic

Stefan does feed the fan's RPM signal into the USB microcontroller, but AFAIK it's not [yet] being monitored by the microcontroller's firmware. While that would be easy enough to do, it'll then thwart people like me who intentionally want to use [low-profile] 2-wire fans. 2-wire meaning, they don't have an RPM signal output.
(I have this vision of slotting several 1.15x boards into some kind of "card cage", vertically, and the 2 inch or 3 inch tall Xilence fan would not be compatible with that.)

Btw., I don't think it is my low-profile fan that failed - quite likely it was just my makeshift fan power wiring that came undone.
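
If the RPM signal were monitored, the arithmetic would be simple. This sketch assumes the common PC fan convention of two tachometer pulses per revolution, which may not hold for every fan. The function names and the 500 RPM stall threshold are illustrative and are not taken from the ZTEX firmware.

```python
def fan_rpm(pulse_count, interval_s, pulses_per_rev=2):
    """Convert tachometer pulses counted over interval_s seconds into RPM."""
    if interval_s <= 0 or pulses_per_rev <= 0:
        raise ValueError("interval and pulses per revolution must be positive")
    return pulse_count / pulses_per_rev * 60.0 / interval_s


def fan_stalled(rpm, min_rpm=500):
    """True if the measured speed is below an assumed minimum healthy speed."""
    return rpm < min_rpm
```

A 2-wire fan has no tachometer output, so a check like this would always read 0 RPM and report a stall, which is exactly the compatibility problem described above.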
donator
Activity: 305
Merit: 250
Quote
No, Stefan set the minimum at 174 MHz, but at this clock rate a PASSIVE heat sink (which my active heat sink turned into after the fan failure) seems to completely suffice

I see. I didn't know Stefan set the minimum at 174. It doesn't look like the error rate was too high either at 174 from your log. I was initially concerned that in a cluster, some fans are bound to fail at some point. I wouldn't want to have to check on the boards every so often, but I guess you will be able to pick up that a fan failed based on the block rates.
sr. member
Activity: 448
Merit: 250
Quote
Module 1.15x fan failure during mining

Interesting to see what happens when a fan fails during mining.  I guess the FPGA doesn't completely clock down to 0 right away due to the heatsink, but the USB controller poops out first.  Good to see that no serious damage to the board though.  Anybody suspect that it causes any damage to the USB controller/board components when the fan dies?

>clocks down to 0

No, Stefan set the minimum at 174 MHz, but at this clock rate a PASSIVE heat sink (which my active heat sink turned into after the fan failure) seems to completely suffice (read the log that I posted).

>damage to the USB controller/board components

I think it was just operating out of spec, and thus failing to execute its program properly.
As I said, there is no evidence of any damage, and it's mining just fine.

That said, prolonged exposure to out-of-spec temperatures may cause electrolytic capacitors, but also other components, to age more rapidly than usual. But this is typically not a "bang, you're dead" event, more like "instead of an anticipated life span of, say, 6 years, we're now moving closer to 3 or 4 years".
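
To put a rough number on that kind of ageing, a common rule of thumb for aluminium electrolytic capacitors is that expected life halves for every 10°C rise in operating temperature. The sketch below applies that rule. It is an approximation brought in for illustration, not something measured on these boards, and the function name and example temperatures are made up.

```python
def estimated_life_years(base_life_years, base_temp_c, actual_temp_c):
    """Scale an expected life by the 'halves per 10 C hotter' rule of thumb."""
    return base_life_years * 2.0 ** ((base_temp_c - actual_temp_c) / 10.0)


if __name__ == "__main__":
    # A part expected to last 6 years at 50 C, run at 60 C instead.
    print(estimated_life_years(6, 50, 60))
```

By this rule a sustained 10°C increase would take an anticipated 6 years down to about 3, the same order of magnitude as the figures quoted above.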