The way the power supply works is that the 1 volt line is attached to the 12v line for a short time, then to the 0v line for a longer time. The capacitors and the chips smooth out the resulting voltage, and the monitoring circuit picks this up as a stable 1 volt. What this also means is that the bottom chips (the 1v) are on longer, but the top chips (12v) have a higher inrush current. One set of FETs is going to take a lot more current than the other. This might be the reason the "hot FET" issue popped up in the beginning, and why BFL put bigger FETs on one side (and beefed up all FETs on the singles). Mystery makes sense.
Interesting note for those who want to follow up on what lightfoot said. The highside (top) FET doesn't actually conduct any higher peak current than the lowside (synchronous or bottom) FETs, and its average current is actually quite a bit lower than the lowside FETs' because the duty cycle is quite low. That is why you often see designs where the lowside FETs have a higher current rating and lower resistance.
The power dissipated in each of the lowside FETs is pretty simple to calculate. It's just (1-Vout/Vin) * Rds(on) * [(Io/n)^2+(Ir*nphase/n)^2/12]: the fraction of time the lowside is on, times the resistance when it's on, times the square of the RMS current through each FET. n there is just the number of lowside FETs in total (6 for the Jalapeno and LS), nphase is the number of phases (2 for the Jalapeno and LS), Io is the output current, and Ir is the ripple current. Ir is usually designed to be about 20%-50% of the output current, and depends only on switching frequency, output voltage and inductance. For the BFL design Ir is ~6.8A. For the lowside FETs the power dissipated is directly proportional to the resistance (Rds(on)) of the MOSFETs, so you often see a lot of big, beefy FETs used since they have lower Rds(on).
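To make the lowside formula concrete, here is a minimal Python sketch of it. The function name and the Rds(on) value are my own placeholders: the post doesn't give the resistance it used, so 2 mOhm is just an illustrative assumption, not a datasheet figure. The other inputs (1V from 12V, 100A, Ir of 6.8A, 6 FETs, 2 phases) come from the thread.

```python
def lowside_conduction_loss(v_out, v_in, rds_on, i_out, i_ripple, n_fets, n_phases):
    """Per-FET lowside conduction loss in watts, using the formula quoted above."""
    on_fraction = 1.0 - v_out / v_in            # fraction of the cycle the lowside conducts
    i_dc = i_out / n_fets                       # DC current share per FET
    i_rip = i_ripple * n_phases / n_fets        # ripple current share per FET
    i_rms_squared = i_dc ** 2 + i_rip ** 2 / 12.0
    return on_fraction * rds_on * i_rms_squared

# ASSUMPTION: 2 mOhm is a placeholder Rds(on), not a value taken from the post.
RDS_ON_ASSUMED = 2.0e-3

p_low = lowside_conduction_loss(
    v_out=1.0, v_in=12.0, rds_on=RDS_ON_ASSUMED,
    i_out=100.0, i_ripple=6.8, n_fets=6, n_phases=2,
)
print(f"{p_low:.2f} W per lowside FET")
```

With that placeholder resistance the result lands at roughly half a watt per FET. Note how little the ripple term matters here: at 100A the DC term dominates almost completely.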
The topside MOSFETs are a little trickier, since there are two things to calculate. The first is the conduction losses, which use the same formula as for the lowside FETs except that instead of (1-Vout/Vin) (0.917 for the 1V from 12V that BFL uses) you use Vout/Vin; in this case 0.083. Obviously if you used the same FETs for the top as you do for the bottom, the dissipation from conduction would be a lot lower, 11x lower in this case. However, you never use the same FETs in the top because there's another kind of loss that the highside FETs have to deal with: switching losses.
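The 11x figure falls straight out of the two duty-cycle factors. A quick sketch, with variable names of my own choosing:

```python
v_out, v_in = 1.0, 12.0

highside_on_fraction = v_out / v_in          # about 0.083
lowside_on_fraction = 1.0 - v_out / v_in     # about 0.917

# With identical FETs and identical current, conduction loss scales with on-time,
# so this ratio is how much less the highside would dissipate from conduction.
ratio = lowside_on_fraction / highside_on_fraction
print(f"highside {highside_on_fraction:.3f}, lowside {lowside_on_fraction:.3f}, ratio {ratio:.1f}x")
```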
The switching losses aren't the steady state P=I^2R losses you get when the FET is fully turned on and there's very little voltage across it. They come from the fact that real MOSFETs aren't perfect switches. When you switch the highside FET on and off, the change doesn't happen instantly, so there is a time when you have a lot of voltage across the FET and current flowing through it at the same time, and you burn up a bunch of power there.
The formula is P = 2 * f * (Vcc*Io/n) * Rg * (n/nphase) * Ciss. f is the frequency (300kHz for BFL), Vcc is 12V, Rg is the total gate resistance of the driver and FET, and Ciss is the input capacitance of the FET.
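Here is a small Python sketch of that switching-loss formula, written term by term as in the post. The Rg and Ciss values are not given in the thread, so the 2 ohm and 3 nF below are purely illustrative assumptions picked to land in the same ballpark as the figures quoted later; check the real datasheets before trusting any number that comes out.

```python
def highside_switching_loss(freq_hz, v_cc, i_out, r_gate, c_iss, n_fets, n_phases):
    """Per-FET highside switching loss in watts, per the formula quoted above."""
    power_switched = v_cc * i_out / n_fets               # V * I each FET switches
    transition_time = r_gate * (n_fets / n_phases) * c_iss  # driver charges all parallel gates
    return 2.0 * freq_hz * power_switched * transition_time

# ASSUMPTIONS: placeholder gate resistance and input capacitance, not datasheet values.
R_GATE_ASSUMED = 2.0      # ohms, driver plus FET
C_ISS_ASSUMED = 3.0e-9    # farads

p_sw = highside_switching_loss(
    freq_hz=300e3, v_cc=12.0, i_out=100.0,
    r_gate=R_GATE_ASSUMED, c_iss=C_ISS_ASSUMED,
    n_fets=6, n_phases=2,
)
print(f"{p_sw:.2f} W per highside FET")
```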
What's interesting about this is that the n term cancels out; adding more highside FETs per phase doesn't actually decrease the switching losses per FET. Each FET sees less current when you run 2 or 3 in parallel, but the driver takes 2 or 3 times longer to fully turn them on because the total input capacitance is higher. That also means that if you run 3 of the same FETs for the highside, your total switching losses actually go up, instead of down like they do with the conduction losses. That's why designers almost always pick lower current rated FETs with higher Rds(on) for the highside; those have a lot less input capacitance, so your switching losses are lower. It's also why you see arrangements like this one, with one highside and two lowside FETs.
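The cancellation is easy to see numerically. With n cancelled, the per-FET switching loss reduces to 2 * f * Vcc * Io * Rg * Ciss / nphase. The sketch below loops over 1, 2 and 3 highside FETs per phase; Rg and Ciss are again illustrative assumptions (2 ohm, 3 nF), not values from the post.

```python
FREQ_HZ = 300e3
V_CC = 12.0
I_OUT = 100.0
N_PHASES = 2
R_GATE_ASSUMED = 2.0     # ohms, ASSUMPTION for illustration only
C_ISS_ASSUMED = 3.0e-9   # farads, ASSUMPTION for illustration only

results = []
for fets_per_phase in (1, 2, 3):
    n_fets = fets_per_phase * N_PHASES
    # Full formula: the n in (Io/n) and the n in (n/nphase) cancel each other.
    per_fet = 2.0 * FREQ_HZ * (V_CC * I_OUT / n_fets) * R_GATE_ASSUMED * (n_fets / N_PHASES) * C_ISS_ASSUMED
    total = per_fet * n_fets
    results.append((fets_per_phase, per_fet, total))
    print(f"{fets_per_phase} per phase: {per_fet:.2f} W each, {total:.2f} W total")
```

The per-FET number stays put while the total climbs in step with the FET count, which is the opposite of what happens with conduction losses.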
Pulling up some calculations I did on the BFL PSU at 1V and 100A output using the new "cool" BSC014/BSC0902 pair (which is what's in my Single), the conduction losses per lowside FET were 0.52W, or 3.12W for all 6. The conduction losses in the highside FETs are 84mW each and the switching losses are a relatively massive 2.2W per device, or 13.5W in total. At 100W output, the BFL PSU (at least on paper) would dissipate 16.6W in the FETs and 3.25W in the inductors, so about 20W in total, which works out to 83% efficient.
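For anyone who wants to check the arithmetic, the totals and the efficiency can be re-derived from the numbers in the post. Nothing here is new data; the variable names are mine.

```python
P_OUT_W = 100.0                 # 1V at 100A
LOWSIDE_PER_FET_W = 0.52
N_LOWSIDE = 6
HIGHSIDE_TOTAL_W = 13.5         # figure quoted in the post
INDUCTOR_LOSS_W = 3.25

lowside_total = LOWSIDE_PER_FET_W * N_LOWSIDE
fet_total = lowside_total + HIGHSIDE_TOTAL_W
loss_total = fet_total + INDUCTOR_LOSS_W
efficiency = P_OUT_W / (P_OUT_W + loss_total)

print(f"lowside {lowside_total:.2f} W, FETs {fet_total:.2f} W, "
      f"all losses {loss_total:.2f} W, efficiency {efficiency:.1%}")
```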
As an interesting note, if they had used a transistor better suited to the highside, with a lower Ciss, like the BSC052N03LS, those hot highside FETs might run a lot cooler. Those would have 0.93W of switching losses each, and with 1 per phase the conduction losses would be 1.64W each, with 2 per phase 0.41W each, and with 3 per phase 0.18W each. If you ran with two highside BSC052's per phase, not only would your per device dissipation go from 2.28W to 1.34W, but since you'd only have two per phase (four in total) instead of three, the total highside dissipation would drop to 5.35W from 13.5W. You should give that a shot next time you're replacing the FETs on a Jalapeno.