
Topic: Water Cooling for large FPGA mining cluster (hundreds of FPGAs) (Read 4488 times)

member
Activity: 84
Merit: 10
Quote from: Gerald Davis
In theory yes.  In practice it likely is not worth it unless the power draw is very high and the output PER CHIP is very high.  The waterblock (the part that attaches to the heat source) is a large part of the cost of any system.  And 50 GH/s made up of 500 MH/s chips means 100 waterblocks.  50 GH/s made up of 200 MH/s chips means 250 waterblocks.

One should have a LOT of experience water cooling before looking into custom setups.  Personally I think BFL Singles are a good candidate for watercooling and would like to experiment with a couple of them.  Sadly BFL has made that all but impossible, which is a shame.  No other FPGA chip has enough "density" to make watercooling even "interesting".



The new heatpipe design of the Singles may very well have a mounting system that would lend itself to aftermarket cooling solutions.
hero member
Activity: 489
Merit: 500
Immersionist
I didn't realize that. Would be interesting to know the price history during those 4 years and the availability. I could imagine the Spartan 6 wasn't available to small-scale operations like ngzhang, Ztex and BFL at the beginning, and not at attractive prices.

My company once had a pre-release flash IC sample (A7?) from Intel many years ago, but they were very interested in our application, otherwise they wouldn't even have talked to us. It took forever to get those few samples and dev notes, then forever again to get the actual production chips for the factory. I'd say a year just for that. Later it became available everywhere and was used in many products.

donator
Activity: 1218
Merit: 1079
Gerald Davis
Quote from: antirack
Sure, but there is always another Artix 7 around the corner. That's not an argument in my opinion.

Well we can agree to disagree then.  Spartan-6 launched almost 4 years ago.  So the corner comes every 4 years or so and you are 80%+ of the way to the next corner.
hero member
Activity: 489
Merit: 500
Immersionist
Sure, but there is always another Artix 7 around the corner. That's not an argument in my opinion.
donator
Activity: 1218
Merit: 1079
Gerald Davis
Oops missed that.

Yeah I do think there is some value in building your own cluster, but I wouldn't want to build out a large number of Spartan-6 nodes.  Artix-7 should be in volume production by Q1 2013 and should offer 50% to 80% higher MH/W and at least 30% to 50% better MH/$.

The only point of buying 100 or so ztex boards now would be to have a template for rapidly building another unit based on Artix-7 boards.  Buying 100 boards now would likely let you talk ztex into selling you the entire first run of Artix-7 boards.
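To make those ranges concrete, here is a quick back-of-the-envelope sketch. The Spartan-6 baseline figures are placeholders for illustration only, not measured numbers; the percentage gains are the ones from this post.

Code:
# Projecting Artix-7 efficiency from the ranges above.
# Assumption: the Spartan-6 baseline of 20 MH/s per watt and
# 2.0 MH/s per dollar is a placeholder, not a measurement.
s6_mh_per_w = 20.0
s6_mh_per_usd = 2.0

for gain_w, gain_usd in ((0.50, 0.30), (0.80, 0.50)):
    a7_mh_per_w = s6_mh_per_w * (1 + gain_w)
    a7_mh_per_usd = s6_mh_per_usd * (1 + gain_usd)
    print(f"+{gain_w:.0%} MH/W -> {a7_mh_per_w:.0f} MH/W, "
          f"+{gain_usd:.0%} MH/$ -> {a7_mh_per_usd:.1f} MH/$")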

Quote from: antirack
That 4-6 weeks is also my point. Why wait 4-6 weeks (and then 4-6 weeks again) if you can just build a cluster by yourself in the same time frame?

Simply put, unless you have very high power costs, the BFL Singles are (at the current price) too hard to ignore.  Yeah, BFL has mismanaged the launch horribly, the multiple fans and cut cases are a total hack job, and they use more power, but the Singles are still impressive in terms of total hashing power vs. total cost.
hero member
Activity: 489
Merit: 500
Immersionist
D&T, I didn't miss the point. I said in my second post I would forget about water cooling. Everything I said afterwards was in general terms, not about water cooling.

Quote from: antirack
As for water cooling, thanks for your clarification. Makes my planning a bit easier since I can just cross out water cooling.

These now infamous 4-6 weeks are another argument for my case. Why wait 4-6 weeks (and then 4-6 weeks again) if you can just build a cluster by yourself in the same time frame?
donator
Activity: 1218
Merit: 1079
Gerald Davis
antirack, I think you are missing the point.

ztex boards have very low power density: 1 chip = 9W.  Cooling 9W with waterblocks is going to be horribly expensive.

Now BFL boards, at 1 chip = 80W, are starting to get to the power densities where it gets interesting.  The problem is BFL has made water cooling impossible.

So you have high thermal output chips which are impossible to water cool, and low thermal output chips which don't make any economic sense to water cool.  Starting to see the problem?

Now if BFL sold a "naked" Single with no heatsink, power supply, case, or fans and knocked $50 off the price, well, I would be watercooling that tomorrow, er, "4-6 weeks". Smiley
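A few lines make the density argument obvious. The $40 waterblock price below is a placeholder assumption, not a quote; the chip wattages are the ones above.

Code:
# Waterblock cost per watt of heat removed, low- vs. high-density chips.
# Assumption: $40 per waterblock is a hypothetical price.
BLOCK_COST_USD = 40.0

for name, watts in (("ztex chip", 9), ("BFL Single chip", 80)):
    print(f"{name}: {watts} W -> ${BLOCK_COST_USD / watts:.2f} per watt removed")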
member
Activity: 80
Merit: 10
I have been tossing the idea of FPGA water cooling around a bit, based on the Icarus (the only type I have).

I think conventional water blocks and pipes are out: too many parts, flow balancing, etc.

My thinking at this point is a 'G' clamp type of setup. The back of the clamp is a metal square tube, about 100mm x 50mm, that the water runs through. The top of the clamp would be 10mm flat bar that covers the top of both chips and is welded at a right angle to the tube. The clamp itself would be a nut welded to a rod, also at a right angle to the tube, leaving enough room for the board to slide in between the rod/nut and the flat bar. A bolt in the nut tightens a square of flat bar and a thermal pad against the underside, pushing the chips against the flat bar on top.

In this way a 1m length of tube with a clamp every 10cm could host 20 boards (10 each side) and would be quick to fabricate, without the messy water hoses and connections, eliminating most of the potential leaks (see the rough flow estimate below).

At this stage I don't have enough boards for heat to be an issue so I am not planning on making one anytime soon.
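For anyone curious how little water flow such a rail would actually need, here is a rough sketch. The ~20 W per Icarus board and the 5 C coolant temperature rise are assumptions, not measurements.

Code:
# Water flow needed to carry the heat of a 20-board clamp rail.
# Assumptions: ~20 W per Icarus board, 5 C water temperature rise.
BOARDS = 20
WATTS_PER_BOARD = 20.0   # assumed
DELTA_T_C = 5.0          # coolant rise across the tube
CP_WATER = 4186.0        # J/(kg*C), specific heat of water

heat_w = BOARDS * WATTS_PER_BOARD
flow_kg_s = heat_w / (CP_WATER * DELTA_T_C)
# 1 kg of water is roughly 1 L, so kg/s * 60 gives L/min.
print(f"{heat_w:.0f} W total -> about {flow_kg_s * 60:.1f} L/min of water")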

hero member
Activity: 489
Merit: 500
Immersionist
A single large 20 GH/s or 50.4 GH/s unit would be less (technical) hassle, definitely. But if it fails, your complete income stream comes to a grinding halt and you are at the mercy of people you don't know. If you have multiple of them, it's a bit different. If, in contrast, one of your few hundred FPGA boards fails, or a fan or a USB hub, you replace it with minimal downtime and minimal cost. A price worth paying in my opinion.
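A toy expected-loss calculation shows how strongly this argument cuts. Every rate below is a placeholder guess, purely for illustration:

Code:
# Expected monthly output lost to a failure: one big unit vs. a fleet
# of boards with spares on the shelf. All figures are placeholder guesses.
P_FAIL_MONTH = 0.05   # chance a given unit fails in a month (assumed)
RMA_WEEKS = 4.0       # shipping/repair turnaround for the big unit (assumed)
SWAP_WEEKS = 0.05     # ~8 hours to swap in a spare board (assumed)
MONTH_WEEKS = 4.0

# Big unit: a failure halts 100% of the hashrate for the whole RMA period.
big_loss = P_FAIL_MONTH * (RMA_WEEKS / MONTH_WEEKS)
# Fleet: a failure halts only one board, and only until it is swapped,
# so the same rate applied per board barely dents total output.
fleet_loss = P_FAIL_MONTH * (SWAP_WEEKS / MONTH_WEEKS)

print(f"single big unit: {big_loss:.1%} of expected monthly output lost")
print(f"board fleet:     {fleet_loss:.3%} of expected monthly output lost")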

And if you are not in Vancouver or Kansas City, shipping will be a nightmare, very risky and costly (admittedly more so for a rig box, which will be bigger than a 1U case, not only because of the 2 external PSUs). So: unpredictable downtime (but very likely weeks) in a worst-case scenario, untested and unproven hardware, an unpredictable release date (they only exist in the developers' minds after all), an unreliable company history and track record (maybe not LC, but BFL). At the moment at least.  This all might change. Or not. I am neither pro nor con, I am just saying.

As a person willing to invest some cash (that I had to work for) at this moment, FPGA is a much safer bet. There are a handful of products to choose from and many of the variables are known: power consumption, heat, delivery time, cost; source code is available, FPGA types are known, there is third-party miner support, people already have small and medium-size clusters, and more. It may not be the final solution, but sitting around waiting for better times has never led to anything good, has it?

I wouldn't mind filling a few racks with FPGA boards *NOW*. FPGA also allows you to grow slowly by adding smaller units one by one. You don't have to wait, put all your money on the new turnkey solutions, and do nothing in the meantime. And putting your money on FPGA right now doesn't mean you can't upgrade to something else later on. This will probably never stand still. The current bitstream/design is being improved all the time (and may see an unexpected boost soon, if I didn't understand this wrong), Artix 7 is on the way, other companies have FPGAs too, sASICs are apparently being worked on, maybe somebody else has something different up his sleeve, who knows. Exciting times ahead, and you have to keep an open mind. At least that's how I see it.

Icarus is open source, Ztex has a licensing program, and there are a few other Spartan 6 based designs available here on the forum; the X6400 I didn't look at. But it seems to me that if you invest time & money (not just money), you could build a cluster with equivalent hashing power for a price not far from that of the so-called turnkey solutions, while having total control over the whole process instead of relying on the claims and promises of third parties. (If anyone is interested in this, feel free to PM me - don't worry, I am not selling anything, just looking for like-minded or technically interested persons.)

And after all, we are all in this for Bitcoin, so we don't mind "getting our hands dirty". If I just wanted to 'invest' money and lean back, then GLBSE or more traditional stock markets are probably a better option, or even speculating on Bitcoins. I would think a typical investor would agree; I don't think the availability of a turnkey solution will change that.
rjk
sr. member
Activity: 448
Merit: 250
1ngldh
Quote from: rjk
Large-format FPGA people won't bother with most commodity hardware in such a manner. They know the value of their time, and want something fast and reliable, so they choose something turnkey like LargeCoin.
Quote from: antirack
By the same logic, Google would also choose something turnkey instead of off the shelf hardware, since they know the value of their time.
The key thing to remember is whether your custom solution is actually cheaper than the off-the-shelf solution. Let's take the example of Singles vs. LargeCoin: LargeCoin is supposed to get 20 GH/s at 100 watts, in a 1U rack form factor. BFL Singles do ~830 MH/s at 70 watts (before PSU losses). So, about 25 Singles to each LargeCoin. 70x25 = 1750 watts. Space taken up would be a lot more than 1U; not sure how you would fit them into a rack properly anyway. ~$599x25 = $14,975 plus shipping. LargeCoin was (for a brief moment) $15,000, but presumably that is sold out at that price and it will go back to $30,000.

Now, assuming you were a lucky first-25 LC customer at 15 grand, we can compare the 2 solutions based on price. Most anyone has 1U somewhere and 100 watts is just another lightbulb's worth of power and heat. 25 Singles is probably going to take up a bit of square footage, but if you have the space that is cool too. The main killer I think would be the power - sure 1750 watts is the same as a powerful quad 6990 rig with an OC, but it is 17.5x the LC wattage, not to mention associated cooling. If you have free power, BFL is the way to go. It just has drastically reduced scalability.

Now let's say you were too late, and LC is going to cost you the full 30k. If you have free power and lots of space, BFL is certainly the winner. You could even develop a watercooling system just for them. But if you want to put them in a datacenter, I'd say that 25 of them on shelves with some fans for cooling would probably take up 6 to 8U, plus a 1U Atom-based rig to run them. Calculations based on a 19x29 inch rack and BFL measurements of about 4.5 inches cubed, plus space for air and wires (15 per 4U shelf). Since power and space are the primary costs of any datacenter, LC starts looking better (although still not very competitive).

The only other main point is reliability. Where there are more parts, or more moving parts, or more anything, there is more to fail and go wrong. LC being a single solid-state appliance with presumably just a few fans would likely be less hassle than 25 units with 3 fans each, statistically speaking.
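Condensing the comparison above into a few lines (all figures taken from this post; only the rounding-up of the Single count is mine):

Code:
import math

# Singles vs. LargeCoin, using the figures quoted above.
SINGLE = {"ghs": 0.83, "watts": 70, "usd": 599}
LC = {"ghs": 20.0, "watts": 100}

n = math.ceil(LC["ghs"] / SINGLE["ghs"])   # about 25 Singles per LargeCoin
farm = {"ghs": n * SINGLE["ghs"], "watts": n * SINGLE["watts"],
        "usd": n * SINGLE["usd"]}

for name, rig in ((f"{n}x Single", farm),
                  ("LC @ $15k", dict(LC, usd=15_000)),
                  ("LC @ $30k", dict(LC, usd=30_000))):
    print(f"{name}: ${rig['usd'] / rig['ghs']:,.0f} per GH/s, "
          f"{rig['watts'] / rig['ghs']:.0f} W per GH/s")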
hero member
Activity: 489
Merit: 500
Immersionist
Quote from: rjk
Large-format FPGA people won't bother with most commodity hardware in such a manner. They know the value of their time, and want something fast and reliable, so they choose something turnkey like LargeCoin.

Who would those large-format FPGA people be? LargeCoin doesn't exist yet and I am not aware of any other turnkey products in existence.

By the same logic, Google would also choose something turnkey instead of off the shelf hardware, since they know the value of their time.

There are only around $43 million worth of Bitcoins in existence now and we all know the estimated network hashrate. What percentage of either of those would you consider "large scale"? And how much would it cost to fill these "large scale" dreams with the two imaginary products LargeCoin and Rig Box, or other turnkey solutions you know of? (I am not even asking how long you'd have to wait and what risk you'd have to take before they could start generating any ROI.)

Could it be that those "large scale people" who have powers we mere mortals don't possess only exist in our fantasies? Last time I looked, even "syndicates" had started to invest in FPGA boards and were keeping an open mind, but nothing else.

You guys are two of the most knowledgeable people posting on these forums. I have been here only for a few weeks but I do read a lot of your posts. Thanks for your valuable contributions, really.

As for water cooling, thanks for your clarification. Makes my planning a bit easier since I can just cross out water cooling.
donator
Activity: 1218
Merit: 1079
Gerald Davis
In theory yes.  In practice it likely is not worth it unless the power draw is very high and the output PER CHIP is very high.  The waterblock (the part that attaches to the heat source) is a large part of the cost of any system.  And 50 GH/s made up of 500 MH/s chips means 100 waterblocks.  50 GH/s made up of 200 MH/s chips means 250 waterblocks.

One should have a LOT of experience water cooling before looking into custom setups.  Personally I think BFL Singles are a good candidate for watercooling and would like to experiment with a couple of them.  Sadly BFL has made that all but impossible, which is a shame.  No other FPGA chip has enough "density" to make watercooling even "interesting".
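To put the block-count arithmetic in one place (the per-block price is a placeholder assumption, not a quote; the chip speeds are the ones above):

Code:
# Waterblocks needed for a 50 GH/s cluster at two per-chip speeds.
# Assumption: $40 per waterblock is a hypothetical price.
TARGET_GHS = 50.0
BLOCK_COST_USD = 40.0

for chip_mhs in (500, 200):
    blocks = int(TARGET_GHS * 1000 / chip_mhs)   # one block per chip
    print(f"{chip_mhs} MH/s chips: {blocks} waterblocks, "
          f"~${blocks * BLOCK_COST_USD:,.0f} in blocks alone")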

rjk
sr. member
Activity: 448
Merit: 250
1ngldh
Quote from: antirack
Would it be feasible and affordable to do water cooling on a large FPGA mining cluster with hundreds of FPGA chips instead of AC and airflow cooling?

For instance, each radiator (that's what they are called, right?) could cover 4 FPGA chips. As you can probably tell, I have no idea about water cooling and I have no idea if stock equipment is available that could be used.

But if FPGA prices come down, and somebody is going to build a larger cluster of hundreds or thousands of boards, they will eventually run into cooling problems.

Say you are building a rig box equivalent of 50.4 GH/s; that means you already have around 250 FPGAs right there.

Large-format FPGA people won't bother with most commodity hardware in such a manner. They know the value of their time, and want something fast and reliable, so they choose something turnkey like LargeCoin. Watercooling is difficult at best, and FPGA-based bitcoin miners don't really have a standard way of mounting a waterblock, for example.
hero member
Activity: 489
Merit: 500
Immersionist
Would it be feasible and affordable to do water cooling on a large FPGA mining cluster with hundreds of FPGA chips instead of AC and airflow cooling?

For instance, each radiator (that's what they are called, right?) could cover 4 FPGA chips. As you can probably tell, I have no idea about water cooling and I have no idea if stock equipment is available that could be used.

But if FPGA prices come down, and somebody is going to build a larger cluster of hundreds or thousands of boards, they will eventually run into cooling problems.

Say you are building a rig box equivalent of 50.4 GH/s; that means you already have around 250 FPGAs right there.
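The arithmetic behind that estimate, assuming ~200 MH/s per chip and the 4-chips-per-block grouping suggested above (both are assumptions, not product specs):

Code:
import math

# Chips and cooling blocks for a 50.4 GH/s target.
# Assumptions: ~200 MH/s per chip, 4 chips per cooling block.
TARGET_GHS = 50.4
MHS_PER_CHIP = 200.0
CHIPS_PER_BLOCK = 4

chips = math.ceil(TARGET_GHS * 1000 / MHS_PER_CHIP)   # 252 chips
blocks = math.ceil(chips / CHIPS_PER_BLOCK)           # 63 blocks
print(f"{chips} chips -> {blocks} four-chip cooling blocks")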