I'm not assuming it's all in memory. I'm not talking about pointers, but about a graph of hashes.
To be sure we're on the same page - we are talking about a merkle trie with one hash at the root, right?
There does appear to be a disconnect here. Let me explain what my understanding of your proposal is:
(1) There is a database of TxOuts. It has all current, unspent TxOuts in it. TxOuts are added and deleted (or marked deleted) when blocks come in.
(2) The TxOuts need to be organized into a tree of some sort in order to come up with the single "merkle root" to be included in the meta-chain header ("merkle" is in quotes, because using a binary tree or trie it's no longer a "merkle" tree). The relationship between DB elements (1) will be represented either as a Binary Search Tree or trie using "pointers".
(3) Thus, after every block, you update the database of TxOuts (1), and the pointers in binary search tree or trie (2). I was focusing on (2) when I say that you need to store pointers. Not necessarily in RAM... but somewhere, data needs to be held that identifies the structure of the tree/trie associated with the TxOuts in database (1).
(4) In a classic binary search tree held in RAM, it uses pointers -- specifically a left pointer and a right pointer for each node. In this context, whether it's held in RAM or on disk, there's still 8 bytes needed (at minimum) per pointer to represent the relationships. And a binary tree will use about 1 such pointer per node.
This is not a negligible amount of data. If TxOuts are 35 bytes, pointers are 8 bytes, then a "tree" of pointers will be about 25% of the size of the entire TxOut database. It's a rough estimate, not intended to be exact.
So now back to my original point: it looks to me that in order to "save the state" of the tree, you need to save the entire tree of pointers between nodes (2). And you need the full tree. Sure the TxOuts database stores each TxOut once, but saving all the pointers to represent the past states of the tree will triple your space requirements to save just 8 blocks.
Also you keep saying O(1) when you mean O(log N). I don't think you would agree with the following statement:
"Each transaction is associated with a unique hash. There are only 2^256 hashes, therefore the number of transactions is O(1)."
Another disconnect here. I'm not sure where your example came from, but here's what I'm referring to -- let's compare a binary search tree full of Bitcoin addresses to a trie of Bitcoin addresses (the entire set of all addresses on the network containing unspent TxOuts). Assume at a given time there are N such addresses.
Binary Search Tree (red-black or any balanced tree) -- depth is O(log(N)).
-- Querying an element is O(logN)
-- Inserting an element is O(logN)
-- Deleting an element is O(logN)
Basic trie: depth of the tree is equal to the length of the longest key: so the depth is 20 (because addresses are 20 bytes).
-- Querying an element is O(1) -- it takes 20 hops to get from the root node to the leaf you want
-- Inserting an element is O(1) -- it takes 20 hops to get from the root node to the leaf you want
-- Deleting an element is O(1) -- it takes 20 hops to get from the root node to the leaf you want
So even if there are 100 quintillion bitcoin addresses with unspent TxOuts in the master alt-chain tree, it still takes you exactly 20 hops to query, insert or delete a value in the trie (and less if you're using a patricia tree).