I think he meant a merkle tree on top of the blocks, not the one inside the blocks. Read the last paragraph again. Each quorum has its own block chain, not simply a new set of transactions inside the existing chains tree. This solves the problem of people who don't care about DNS not needing to handle those transactions and vice-versa.
So you'd have two separate blocks. Unlike today, the nonce/extraNonce would not necessarily result in a hash lower than the target. It would result in a hash that is then combined with a hash from the other parts of the super-tree to produce a new merkle root that is lower than the target. You'd need to introduce some kind of super-block data structure as well, for instance instead of
Block 1234
- version
- hashPrevBlock
- hashBlockMerkleRoot
- nTime
- nBits (difficulty)
- nNonce
like today, for each chain you'd broadcast/store
SuperBlock 1
- version
- nBits (difficulty)
- merkleBranch
- Block 1234
- version
- hashPrevBlock
- hashBlockMerkleRoot
- nTime
- nNonce
merkleBranch in the superblock would be like
abcdef1234567890 (ROOT)
/ \
hashBitDNS hashBitCoin
or potentially in future
abcdef1234567890 (ROOT)
/ \
hashBitDNS 123456abcdef
/ \
hashBitCoin hashBitVoting
or whatever, as new quorums are introduced. To solve a tree then, you gather up block headers from each network and start adjusting the nonces in each block (or perhaps just one nonce) until the root hash is lower than the global difficulty target. On disk and on the network you can now store/broadcast just the bitcoin block, the hash of the BitVoting block and the hash of the BitDNS block. These can then be combined to prove that the hash is lower than the target.
In this way the block broadcast for a particular network only needs to contain a small number of extra hashes, namely the merkle branch linking that block to the root, thus it scales nicely to very large numbers of applications of the BitCoin technology.
Now it seems Satoshi had some kind of backwards compatibility trick in mind when he wrote that, and the scheme I just described would be the "modernized" version he envisioned a long term migration to. I'm not sure how he imagined retrofitting this onto BitCoin today, exactly.