Except you can't.
If you carelessly combine token bitcoins with normal bitcoins in the same transaction, then yes, you can't tell from the outputs which coins are the tokens. But an address holding coins from several outputs can certainly tell how many came from each output.
Suppose that you never pay transaction fees, and never merge several tokens in the same transaction. Then there is a linear chain leading from the original output to wherever the token ended up (several chains can start at the original output, but only one chain leads to any given endpoint). When an address wants to show it has tokens, it simply references the output from which it received them. By assumption the transaction with this output has only a single input, so you just follow the chain backwards until you reach the original output.
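To make the backwards walk concrete, here is a rough Python sketch under the same assumptions (no fees, no merging). The `Tx` record, the `txs` lookup table and the function name are made up for illustration and are not any real Bitcoin API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tx:
    """Hypothetical, simplified transaction record."""
    txid: str
    inputs: tuple   # (previous txid, output index) pairs being spent
    outputs: tuple  # output amounts in satoshis

def traces_to_origin(txs, outpoint, origin):
    """Walk backwards from `outpoint` along single-input transactions.

    Returns True if the walk reaches `origin`, False if it hits an
    unknown transaction or one that does not have exactly one input.
    """
    current = outpoint
    while current != origin:
        tx = txs.get(current[0])
        if tx is None or len(tx.inputs) != 1:
            return False
        current = tx.inputs[0]  # step to the output this transaction spent
    return True
```

The holder only has to name the output it received; everyone else can run the walk themselves against the public chain.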
Transaction fees complicate things a little, because you need an unambiguous way to determine which inputs are there only to pay tx fees and needn't be traced back. But it should be possible to agree on such a designation (e.g. tokens are only transferred in multiples of 10 satoshis, and for tx fees the input will be chosen not to be a multiple of 10).
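The multiple-of-10 convention suggested above could be checked like this. It is only a sketch of that one example convention; `TOKEN_UNIT` and `split_inputs` are names invented here.

```python
TOKEN_UNIT = 10  # assumed convention: token amounts are multiples of 10 satoshis

def split_inputs(input_amounts):
    """Separate input amounts into token inputs and fee-only inputs.

    An input whose amount is a multiple of TOKEN_UNIT is treated as
    carrying tokens and must be traced back; any other input is treated
    as a fee input and is ignored when tracing.
    """
    token_inputs = [a for a in input_amounts if a % TOKEN_UNIT == 0]
    fee_inputs = [a for a in input_amounts if a % TOKEN_UNIT != 0]
    return token_inputs, fee_inputs
```

For instance, `split_inputs([100, 20, 1003])` puts 100 and 20 on the token side and 1003 on the fee side.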
Merging tokens also adds complication, because for each output it's possible that several inputs will need to be verified. But the total verification work shouldn't exceed the total number of token transfers made.
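With merging, the walk becomes a recursive check over all inputs. Here is a sketch that remembers which transactions were already checked, so each transfer is examined at most once. It assumes fee inputs have already been filtered out, ignores amounts, and uses a hypothetical `txs` table mapping a txid to the outpoints it spends.

```python
def all_inputs_trace_to(txs, outpoint, origin, cache=None):
    """Check that every ancestor path of `outpoint` ends at `origin`.

    `txs` maps txid -> tuple of (previous txid, output index) inputs.
    Results are cached per transaction, so the work is bounded by the
    number of transactions in the token's history.
    """
    if cache is None:
        cache = {}
    if outpoint == origin:
        return True
    txid = outpoint[0]
    if txid not in cache:
        inputs = txs.get(txid, ())
        # A transaction with no known inputs cannot carry the token.
        cache[txid] = bool(inputs) and all(
            all_inputs_trace_to(txs, prev, origin, cache) for prev in inputs
        )
    return cache[txid]
```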
And this whole thing becomes trivial if a protocol-enforced way is introduced to add markers to outputs. A marker would be the hash of an output, and a transaction is valid only if the marked total in its outputs is at most the marked total in its inputs, where the output whose hash is the marker is itself also considered marked. If not in Bitcoin itself, then in an alternative Bitstock blockchain (or maybe it will be BitAsset to make it more general).
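One possible reading of that validity rule, as a sketch only. The tuple layouts and the name `markers_valid` are invented here, and a real protocol would have to pin down details this glosses over (for example how the hash of an output is computed).

```python
from collections import defaultdict

def markers_valid(inputs, outputs):
    """Check the marker rule for one transaction.

    inputs:  (amount, marker or None, own_hash) for each spent output,
             where own_hash is the hash of that spent output.
    outputs: (amount, marker or None) for each new output.
    For every marker, the marked total in the outputs must not exceed
    the marked total in the inputs.
    """
    available = defaultdict(int)
    for amount, marker, own_hash in inputs:
        if marker is not None:
            available[marker] += amount
        # The output whose hash is the marker counts as marked itself,
        # which is how a new token gets issued.
        available[own_hash] += amount
    spent = defaultdict(int)
    for amount, marker in outputs:
        if marker is not None:
            spent[marker] += amount
    return all(spent[m] <= available[m] for m in spent)
```

Under this reading, spending an unmarked output with hash `h1` into an output marked `h1` issues the token, and after that the marked amount can only stay the same or shrink.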
Address 456: 1s
I make a transaction using 123 and 456 as inputs, sending 2s to 888 and 4999s to 999.
The resulting outputs are:
Address 888: 2s
Address 999: 4999s.