I'd be very surprised if their chip is anywhere near, for example, an Intel Xeon in complexity. While I have no facts to act on, I'd find it very unlikely that their die would suffer the same yield loss per wafer, since I'm pretty sure they ain't packing 6.8b transistors per chip.
Transistor complexity is only one factor that affects yields, and it is one of the smaller ones.
The MOST important factor is defect density of the wafer itself. No matter how good you are at manufacturing, your wafers will have microscopic defects in them, and when the chips are etched over these defects they usually result in non-functional chips. At best, if you get lucky and the defect occurs in a non-critical area of the CPU, then you can shut that off in post-processing and have a "partially functioning" CPU.
This is where smaller die sizes are key, because they result in higher yields. Defect density in the wafers is approximately constant, so with smaller dies each defect takes out a smaller fraction of the chips on the wafer, and you can toss away the defective chips and still have a very high yield. When each CPU starts to occupy a larger and larger surface area, the chance of there being a critical defect goes up, and therefore yield goes down.
It just boils down to simple math, and this is why companies like Intel and AMD always strive to be as efficient as possible in transistors per mm² of die area. They know that as the die gets bigger, the yield drops like a stone, and profits go along with it (even when you own your own fab, like Intel).
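To put rough numbers on the "simple math" above, here is a minimal sketch using the textbook Poisson yield model, Y = exp(-A * D0), where A is die area and D0 is defect density. The defect density of 0.5 defects/cm² and the die areas are made-up illustration values, not figures for any real fab or chip, and the function name `poisson_yield` is just something I picked for the example.

```python
import math


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies expected to have zero defects (Poisson model)."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)


# Assumed defect density, purely for illustration.
D0 = 0.5  # defects per cm^2

# Hypothetical die areas: small, medium, large.
for area in (25, 100, 400):
    y = poisson_yield(area, D0)
    print(f"{area:4d} mm^2 die -> expected yield {y:.1%}")
```

With those assumed numbers the yield goes from roughly 88% at 25 mm² to about 61% at 100 mm² and about 14% at 400 mm². Real fabs use more refined models, but the exponential drop with area is the point.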
Yes, but would this really apply to a large chip that has mostly replicated sections (hashing engines)? I would think that Orsoc would have designed it in such a way that they could shut down sections of the chip that have defects. That would make the most sense economically.
As I have mentioned earlier, it depends on where in the chip the defect is. If the defect lies in certain parts, you can forget about salvaging the chip.
Also, building in "redundancy" like what you have mentioned is a double-edged sword. The bigger you make the chip, the more likely you are to have a defect result in a non-functional chip.
GPU (not CPU) builders have used this for 15 years now. This is why you see AMD/Nvidia release multiple parts with various numbers of active "shader" units in the product (e.g. 7830 vs. 7850 vs. 7870). They are all made from the same design, but this is a way for companies to recoup some of the loss in post-processing, by pulling partially functional chips and selling them at lower price points.
I suspect we will see something similar out of KNC.
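For anyone who wants to play with the salvage trade-off, here is a rough sketch under the same Poisson assumption. It splits a die into one "critical" region (a defect there kills the chip) and a number of identical engines that can be fused off individually. All the numbers (engine count, areas, defect density) and the function name `salvage_probability` are hypothetical and not taken from any actual KNC, BFL or GPU design.

```python
import math


def salvage_probability(n_engines: int,
                        engine_area_mm2: float,
                        critical_area_mm2: float,
                        defects_per_cm2: float,
                        min_good: int) -> float:
    """Probability that a die is sellable: the critical region is
    defect-free and at least `min_good` engines are defect-free."""
    # Per-region chance of zero defects (Poisson model, areas in cm^2).
    p_engine = math.exp(-engine_area_mm2 / 100.0 * defects_per_cm2)
    p_critical = math.exp(-critical_area_mm2 / 100.0 * defects_per_cm2)
    # Binomial tail: at least `min_good` of the engines work.
    p_enough = sum(
        math.comb(n_engines, k) * p_engine**k * (1 - p_engine)**(n_engines - k)
        for k in range(min_good, n_engines + 1)
    )
    return p_critical * p_enough


# Hypothetical die: 16 engines of 20 mm^2 each plus 30 mm^2 of critical logic.
D0 = 0.5  # assumed defects per cm^2
for min_good in (16, 14, 12):
    p = salvage_probability(16, 20.0, 30.0, D0, min_good)
    print(f"need >= {min_good} good engines -> sellable fraction {p:.1%}")
```

Lowering the number of engines you require recovers dies that would otherwise be scrapped, but the critical-area term caps what you can ever get back, which is the "depends on where the defect is" point from earlier.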
Oh, and for those following BFL (who isn't, right?), they are already running into this. It is why you see so much variability in the GH/s of the products they release. Some of the chips are only partially functional.
And the BFL chip is a lot smaller than this one, so just scale that variability up even more for the KNC design (granted, they are 65nm vs. 28nm - but it all comes down to die SIZE for defects, not so much lithography node).