Each board will need its dedicated I2C bus anyway, so why not have a dedicated JTAG bus as well?
We only need one I2C bus; it just needs to be split into separate partitions by a switch. I mentioned one example of such a switch before, the NXP PCA9547PW. The reason I2C needs to be partitioned is the limited number of addresses available on the bus. That is not a problem for the JTAG bus, though. Logically, you can make the chain as long as you like. Electrically, a design with many chips needs buffers on the TCK and TMS lines, because those two signals fan out to every chip in parallel.
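To make the partitioning idea concrete, here is a minimal sketch of how a host could work out which switch and which channel to select for a given board. It assumes the PCA9547 layout as I read the datasheet (7-bit address 0x70 plus the A2..A0 pins, control byte with an enable bit at 0x08 and the channel number in the low three bits), so check those values before relying on them. The one-channel-per-board mapping and the function names are purely hypothetical, and nothing here talks to real hardware.

```python
# Sketch only: computes the bytes a host would send, no bus access.
# Register layout is an assumption taken from the PCA9547 datasheet.

PCA9547_BASE_ADDR = 0x70  # 7-bit address with A2..A0 tied low (assumed)
ENABLE_BIT = 0x08         # B3 of the control register (assumed)
CHANNELS = 8              # downstream channels per switch


def mux_address(pins: int) -> int:
    """7-bit I2C address of the switch whose A2..A0 pins encode `pins`."""
    if not 0 <= pins <= 7:
        raise ValueError("A2..A0 can only encode 0..7")
    return PCA9547_BASE_ADDR | pins


def control_byte(channel):
    """Control byte selecting one channel, or 0x00 to deselect all."""
    if channel is None:
        return 0x00
    if not 0 <= channel < CHANNELS:
        raise ValueError("channel must be 0..7")
    return ENABLE_BIT | channel


def route_for_board(board: int):
    """Hypothetical mapping: one channel per board, 8 boards per switch."""
    mux, channel = divmod(board, CHANNELS)
    return mux_address(mux), control_byte(channel)
```

With this mapping, board 9 sits behind the second switch on its second channel, so the same I2C device addresses can be reused on every board without collisions.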
Given that only one I2C bus was planned and that no more than one JTAG chain is needed, can you clarify why you think more JTAG chains are required?
So we would not use the JTAG or I2C signals on the bus connector at all, just the USB D+ and D- lines? That is a very interesting idea, and it simplifies the design a lot if it works: none of the non-supply signals I mentioned in my last post are needed in that case, as the backplane can detect the presence of a DIMM the "USB" way. So a simple backplane contains wires and a couple of mini-USB connectors? Or does it contain a home-grown USB hub? (I am limiting myself to a cheap backplane in this discussion because the intelligent one with a CPU can be built on top of the cheap design in a second step.)
This basically moves the interface chip completely onto the DIMM, removing the design overhead (though not the material cost) of supporting hybrid DIMMs. Of the different options, it is not the cheapest, but it is certainly elegant:
- slave-only DIMMs, USB-chip only on backplane: cheapest, JTAG and I2C on bus
- hybrid DIMMs, USB-chip only on DIMMs: mid price, simple bus with only USB, but needs hub somewhere
- hybrid DIMMs, USB-chip both on DIMMs and backplane: most expensive, JTAG and I2C on bus
Oh, and don't forget to add a means for boards to interrupt the backplane, e.g. when a share was found or keyspace was exhausted.
Is that actually needed? I agree that a later backplane that contains a CPU may make good use of the interrupt, but for the USB-based devices it is only a question of how much data to transmit: you still need to poll, because USB has no direct IRQ line to the host. I admit that reading the GPIO value of an FT2232 connected to the IRQ signal is quicker than reading the JTAG chain. But how bad is reading the chain, even for 10 boards with 16 FPGAs each?
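A rough back-of-the-envelope estimate helps answer that question. The sketch below assumes every FPGA exposes a status register of a fixed width that is shifted out in a single data-register scan across the whole chain, and it lumps all USB round-trip cost into one fixed overhead. The register width, the TCK frequency and the overhead are made-up illustration values, not measurements.

```python
# Sketch only: estimates the time for one polling pass over the JTAG chain.
# All numeric inputs below are assumptions for illustration.


def chain_shift_bits(boards: int, fpgas_per_board: int, status_bits: int) -> int:
    """Bits shifted in one scan if every FPGA contributes `status_bits`."""
    return boards * fpgas_per_board * status_bits


def chain_poll_seconds(boards: int, fpgas_per_board: int, status_bits: int,
                       tck_hz: float, usb_overhead_s: float) -> float:
    """Shift time at `tck_hz` plus a fixed, assumed USB overhead."""
    bits = chain_shift_bits(boards, fpgas_per_board, status_bits)
    return bits / tck_hz + usb_overhead_s


if __name__ == "__main__":
    # 10 boards x 16 FPGAs from the post; the other three values are assumed.
    t = chain_poll_seconds(10, 16, status_bits=32,
                           tck_hz=6e6, usb_overhead_s=0.002)
    print(f"one polling pass: {t * 1e3:.2f} ms")
```

With these assumed figures a full pass takes on the order of a few milliseconds, so the chain could be polled many times per second. Whether that holds in practice depends mainly on the real USB latency and the TCK rate the buffered chain can sustain.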