Avalon went with a very simple chip on a large (110nm) process. I imagine this strategy gave them very high yields per wafer because of the small chip size. This came at the cost of a large PCB to route to all the chips.
Well, that's certainly going to be part of the reason. However, take a look at the distance between chips on the BFL board and compare it to the distance between chips on the Avalon board. Now remember Josh's comment about the PCB ground plane not being able to dissipate enough heat due to the thermal density of their design, and how having to redesign the chips to fix this delayed shipping whilst Avalon shipped on schedule using similar QFN chips.
Excellent point. I have minimal knowledge of using the ground pad and vias to transmit heat through the PCB, but it makes logical sense that Avalon would use a PCB larger than electrically necessary to give each chip some buffer from neighboring chips.
Wouldn't this also require the chips to be physically farther apart to avoid heat-soaking the inner ones? Plus, wouldn't moving them farther apart increase the delay while communicating?
I guess a person could just make a huge board and run copper out to the edges, but the thermal transfer isn't as effective as a nearer solution. The size of the piece of whatever metal would dictate the maximum amount of heat that would flow at a given thermal difference. It seems like going with a flip chip (à la P3) rather than the older QFN would save materials and allow a more efficient board design. By efficient I mean all data would transfer faster going over a shorter connection. Yes, these are short trips at nearly the speed of light, but small changes in location on other devices have caused >10% increases in speed.
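To put rough numbers on the point about the size of the metal limiting heat flow, here is a minimal sketch using the standard 1-D conduction formula Q = k * A * dT / L. The strip dimensions, the temperature difference, and the function name conduction_watts are all assumptions picked for illustration, not measurements of the Avalon or BFL boards.

```python
# Rough 1-D conduction estimate (Fourier's law): Q = k * A * dT / L.
# All dimensions below are illustrative assumptions, not data from any real board.

COPPER_K_W_PER_M_K = 400.0  # approximate thermal conductivity of copper


def conduction_watts(k, width_m, thickness_m, length_m, delta_t_k):
    """Heat flow in watts through a uniform strip of material."""
    cross_section_m2 = width_m * thickness_m
    return k * cross_section_m2 * delta_t_k / length_m


# Assumed: a 20 mm wide strip of 35 um (1 oz) copper, 40 K hotter at the chip end.
near = conduction_watts(COPPER_K_W_PER_M_K, 0.020, 35e-6, 0.050, 40.0)
far = conduction_watts(COPPER_K_W_PER_M_K, 0.020, 35e-6, 0.100, 40.0)

print(f"50 mm run:  {near:.3f} W")
print(f"100 mm run: {far:.3f} W")
```

Under these assumptions a thin copper plane moves only a fraction of a watt sideways, and doubling the run length halves it, which is why a heatsink close to the die looks better than copper run out to the board edge.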
I think the more compact design would do better if it can be cooled. Thus far I see no reason it can't be cooled.
The communication between the chips and the controller is minimal. I am also of the opinion that the more compact design would be better but for compute density reasons. BFL excites me for the possibility of watercooling their chips and ending up with an insanely dense mining cluster. I hope in a few generations BTC ASICs are something that we could order like FPGAs now and assemble custom PCBs to suit our own desires.
I agree. I guess my point was that with each speed increase the unit either needs to roll work (a more complicated design per chip) or get new work more often, which makes latency a larger issue, since even fetching from on-board RAM takes time (not much). I don't mean that the latency is huge per se. Say a current Single has a latency of about 0.01 second, I think it was, with the proper bus speed (mostly USB delay, and fixed in the ASICs). Now the Jalapeno will hash 5x as fast. If the latency stayed the same, then instead of 0.01 second per 5.1 seconds of work, which is about 0.2%, it would be 0.01 second per about 1 second of work, which is about 1%. The percentage is small enough at low rates, but as speed increases latency becomes a speed limiter. Again, current to new isn't apples to apples. The current units get work, work, finish, submit, then get new work. The new ones, they said, will cache the next work: get work, work, get more work, finish work, work, get more work. This will remove a ton of latency.
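The arithmetic above can be checked with a quick sketch. It uses the 0.01 second latency and the 5.1 second and roughly 1 second work times quoted in the post. The function names and the simple model (latency either added serially to every work unit, or fully hidden behind hashing when the next work is cached) are my assumptions, not anything BFL has published.

```python
# Latency overhead per work unit, using the figures quoted in the post.
# The serial vs. cached model is a simplifying assumption for illustration.

LATENCY_S = 0.01  # quoted per-work-unit latency


def serial_overhead(latency_s, work_time_s):
    """Idle time as a fraction of hashing time when work is fetched after each unit."""
    return latency_s / work_time_s


def cached_overhead(latency_s, work_time_s):
    """With the next work fetched while hashing, only latency beyond the work time stalls."""
    return max(0.0, latency_s - work_time_s) / work_time_s


for label, work_time_s in [("current Single", 5.1), ("5x faster unit", 1.0)]:
    serial = serial_overhead(LATENCY_S, work_time_s)
    cached = cached_overhead(LATENCY_S, work_time_s)
    print(f"{label}: serial {serial:.2%}, cached {cached:.2%}")
```

In this model the serial overhead grows in step with hash rate, while the cached case stays at zero until a work unit finishes faster than the next one can be fetched.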
I like the sound of what BFL said they changed. I like the higher density board. I like the heatsink getting less thermal resistance to the core. All these things will make for a better end product as long as they are accurate and work as expected. Not that I am saying an FCBGA isn't a viable chip, just that caching and FC will help a lot.