Could work. If you're already doing board-level soldering, all I'd really need is jumper pads to reroute comms. It'd be best to pull the chips so you don't get comms interference; once they're out, you jumper from the Vcore to GND pads where the chip used to be. You could also lift the big node-level cap and jumper across it. Pads are pretty much free, after all.
Actually, my two-chip "L board" for testing the BM1384 is set up a lot like that. It has two power-input jacks, one for Vcore and one for 2*Vcore, and a set of five jumper pads to take comms from either a second chip at the same node (at Vcore) or a second chip at another node (at 2*Vcore). Heck, it's hooked up and running right now (http://eligius.st/~wizkid057/newstats/userstats.php/1BURGERAXHH6Yi6LRybRJK7ybEm5m5HwTr should be showing 22GH, two chips in a string at 200MHz), so I know the concept is sound.
You'd have to remember to take your core voltage down by one node's worth, or your remaining chips will run pretty hot.
On the S5, there are no VRMs at all. 2 chips per node gets them the desired power consumption and hashrate; it's a 30-chip board, just like the S1 had a 32-chip board. Using more chips per node increases your total current (which is one reason the S7 pulls around 400W per board versus the S5's 250W per board: its nodes are 3 chips wide instead of 2), but you also get better balance. If one chip is running a bit high and another on the same node runs a bit low, they kinda cancel each other out. It's also easier to buffer out brief transients with a wider node, because any one chip's ripples get absorbed by the other two chips.
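The current trade-off works out with quick arithmetic. The per-chip power figures below are derived from the board numbers above; the node core voltage is an illustrative assumption (somewhere in the 0.7-0.8V ballpark for these chips), not a measured value.

```python
def board_power(nodes, chips_per_node, watts_per_chip):
    # Topology doesn't change total power: it's just chip count * per-chip power.
    return nodes * chips_per_node * watts_per_chip

def string_current(chips_per_node, watts_per_chip, vcore):
    # The whole series string carries one current; chips at a node share it
    # in parallel, so widening the node raises the board's input current.
    return chips_per_node * watts_per_chip / vcore

VCORE = 0.75  # assumed per-node core voltage, illustrative only

s5_w_per_chip = 250 / 30  # ~8.3W, from 250W over 30 chips
s7_w_per_chip = 400 / 54  # ~7.4W, from 400W over 54 chips

print(string_current(2, s5_w_per_chip, VCORE))  # S5: ~22A through the string
print(string_current(3, s7_w_per_chip, VCORE))  # S7: ~30A through the string
```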
Wider nodes also reduce the number of level shifters required relative to the number of chips. Each node needs a level shifter to bring comm data up to its local ground reference (plus other node-level parts for IO voltage and such). The S5 has 15 nodes and 30 chips, so 1 shifter per 2 chips. The S7 has 18 nodes and 54 chips, so 1 shifter per 3 chips. When considering the cost of parts that aren't directly increasing your hashrate, you want to maximize the ratio of ASICs to non-ASICs. That means more ASICs per node.
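The shifter overhead is just nodes over chips, so the ratios above check out like so:

```python
def shifter_ratio(nodes, total_chips):
    # Each node needs one level shifter, so the non-ASIC overhead per chip
    # is nodes / chips; wider nodes drive this down.
    return nodes / total_chips

print(shifter_ratio(15, 30))  # S5: 0.5  (1 shifter per 2 chips)
print(shifter_ratio(18, 54))  # S7: ~0.33 (1 shifter per 3 chips)
```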
The optimum by that criterion would be to have all chips in one node, but then you have a VRM design and you start factoring in the relatively high cost of VRMs. The more chips you have per VRM the better, so things like the S1 were okay. The standard VRM chip has been the TPS53355, with a maximum current output of 30A, which is great for higher-voltage, lower-current chips like the BM1380 on the S1, but not so great for the low-voltage, high-current BM1384. At top clock, a '55 could power two chips. At a midrange setting (say 275MHz, 15GH/5.4W per chip) you could just barely run four, but it'd probably catch fire if your ambient was warm. The S5 would have needed 15 VRMs (at probably $5 each, minimum) per board to run the same hashrate, and those VRMs would have decreased system efficiency by 10 to 15 percent. By going string on the S5, Bitmain saved $50 in VRMs by adding about $15 in node-level parts, and increased board-level efficiency by at least 10%.
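A rough check of that VRM-capacity arithmetic. The 30A limit and the 5.4W midrange figure are from the text above; the core voltage is an assumed illustrative number, so the per-chip current is approximate.

```python
VRM_MAX_A = 30.0   # TPS53355 rated output current
CHIP_W = 5.4       # BM1384 at ~275MHz / 15GH
VCORE = 0.75       # assumed core voltage, illustrative only

chip_a = CHIP_W / VCORE                   # ~7.2A per chip at midrange
chips_per_vrm = int(VRM_MAX_A // chip_a)  # 4 chips, with almost no headroom
print(chip_a, chips_per_vrm)

# At top clock a '55 only manages two chips, so a 30-chip S5-class
# board at full speed would need:
vrms_needed = 30 / 2
print(vrms_needed)  # 15 VRMs per board
```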
If you can keep the chips in their recommended temperature and voltage range, I'd think the odds of failure would be pretty low. If an auto-rerouting system cost $10 on an otherwise $150 board, you'd need a probability of board failure greater than 10/150, or 6.7%, for it to be worthwhile in the long run. I highly doubt the odds of board failure are that high, or we'd be seeing a lot more threads yelling at Bitmain. I'd be surprised if the odds of Prisma board failure were even that high, and those were famous for spontaneous (and often dramatic) death. I ran 44 boards for six months without any failures.
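That break-even threshold is just expected-value arithmetic: the protection only pays off if the expected loss it prevents exceeds what it costs.

```python
def breakeven_failure_prob(protection_cost, board_value):
    # Expected loss avoided = p * board_value; the protection is worth it
    # only when that exceeds protection_cost, i.e. p > cost / value.
    return protection_cost / board_value

p = breakeven_failure_prob(10, 150)
print(f"{p:.1%}")  # 6.7%
```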