When you say "they just have to install TOR", you're building the Linux experience, where you have to get countless random libs to make it work, and you have to be a wizard in your 20s. And then it will be as popular as Linux, 20 years later. It has to be the Apple experience: "Here's everything you need", "It just works". It has to be literally a button in the client, or normal people won't use it, because it's moonspeak to them. It's not easy for everyone. A month ago someone asked me to help with his computer; he typed the URL into Google and didn't even know Ctrl-C. But he owned a few buildings. This is the target audience, or it should be. He doesn't know TOR but he deserves privacy. The other problem is that to support Bitcoin TORing you then have to support illegal data TORing, it seems. Of course I'm pro-TOR and maybe I'm grasping at straws, but still. I'll address your criticism, it's great, you're awesome.
- Attacks are possible and surely always will be when it's a matter of sending through a proxy. And I was indeed imagining a tracker of nodes, shared between clients, as it's complex enough, though it'd be less Sybil resistant than TOR. The goal is not absolute anonymity, but moderate. My thinking is that it's better than nothing, and enough for most uses. What I mean is that if you're Al Qaida or a big corporation you wouldn't use the fast synchronization idea, like you wouldn't trust TOR 100%; it's for casual use. Maybe the limitation should be reflected more in the name so as not to mislead people, like 'partial synchronization', with what it entails. But I agree there'd be people trying to gather information from the service, because such information surely has value, so it's a sensitive issue, it can't be too easily exploitable. You convinced me that we could probably only query one address at a time. Maybe it should just go through TOR servers then, integrated in the client, but it's annoying to lose control over how it works when it could be adapted to Bitcoin's needs. I'm worried about address-linking. But I guess we shouldn't reinvent the wheel, though it'd be fun. For example nodes could be required to sign with an address owning bitcoins or having solved a block in the past to be favored as relays in the tracker, and other relay slots could be filled by mining pools depending on the node's hashrate, so there could be some Sybil but not infinitely. Tricks like that.
- I didn't think of that timing issue, good point. Honestly, I'm not an expert on TOR. I'd say requests for multiple addresses shouldn't be synchronous (there's time during block downloading; maybe we'd tick the addresses we care about and the rest comes later) and there could be constant dummy queries to hide, at random intervals. Apparently it's an actual strategy, there are many. There's little security in obscurity, but it's not a matter of absolute anonymity, because that probably can't be done.
- The incentive problem was on my mind too. There are people who seed torrents even on Pirate Bay, so I'm thinking many people will turn it on just to help or because they don't care or are too lazy to turn it off after using it, if it costs little to run, like for tx propagation. It'd just become natural that cryptocurrencies are slightly like torrents, clients participate. Many people would turn it on ideologically, because they want to help Bitcoin in the 'fight' against banks & co, but can't mine or run full nodes as it requires massive money or bandwidth. They'd manage transaction TOR-ing, while full nodes do the synchronization, it'd be a beautiful collaborative effort. Also, enabling the feature increases utility, the value of Bitcoin, so that's some minor incentive to have enough such nodes. But at worst it's 3 MB every 10 min (3 relays x 1 MB block), shared across all nodes, so 10 kB per 10 min each with only 300 nodes, plus negligible synchronizing data. Maybe 200 kB per 10 min with fake data, which is about 0.3 kB/s plus low-level overhead. It seems like not much, and we'd set a speed limit in options. At worst, fees, but it'd be beautiful if it could just be free. Integrated proxying between nodes.
- TOR channels data constantly and massively, surely it's not the same, I think changing paths could be afforded here, though the maths should be done and I'm not the best at networking. But if it's the same path then the full node can link addresses too easily. Also I was mistaken, it would not be a path per output, but a path per address, as we'd query by pubKeyHash, so it's output-linking.
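To make the timing and path points above a bit more concrete, here is a rough Python sketch of how a client might mix real address queries with dummy ones, space them at random intervals, and pick a fresh relay path for every query. Everything in it is an assumption for illustration: the function name `build_query_schedule`, the relay list, the exponential gaps, and the defaults (3 decoys per real query, 3 hops) are all made up, and it does not model how TOR or any existing client actually behaves.

```python
import random

def build_query_schedule(real_addresses, decoy_addresses, relays,
                         decoys_per_real=3, mean_gap_s=30.0, hops=3, rng=None):
    """Return a time-ordered list of (send_time_s, address, path, is_real).

    Hypothetical sketch: real queries are hidden among decoys, sent at
    random intervals, each through its own randomly chosen relay path.
    """
    # random.Random is fine for a demo; a real client would want a
    # cryptographically secure source such as random.SystemRandom().
    rng = rng or random.Random()

    # One query per real address, plus decoys drawn from a pool of
    # addresses the client does not care about.
    queries = [(addr, True) for addr in real_addresses]
    n_decoys = decoys_per_real * len(real_addresses)
    queries += [(rng.choice(decoy_addresses), False) for _ in range(n_decoys)]
    rng.shuffle(queries)  # real queries land at unpredictable positions

    schedule = []
    now = 0.0
    for addr, is_real in queries:
        # Exponentially distributed gap, so the send times are not regular.
        now += rng.expovariate(1.0 / mean_gap_s)
        # A new path per query, so no single relay chain sees two addresses.
        path = tuple(rng.sample(relays, hops))
        schedule.append((now, addr, path, is_real))
    return schedule
```

The `is_real` flag is only for the client's own bookkeeping; nothing sent over the wire would distinguish a dummy from a real query. Whether this amount of cover traffic is enough against a determined observer is exactly the open question.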
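The bandwidth estimate from the incentive point above can be checked with a few lines of Python. This is only the back-of-envelope arithmetic from the post, assuming "mb" and "kb" mean megabytes and kilobytes; the helper name `per_node_load` is made up for the example.

```python
def per_node_load(total_bytes, nodes, interval_s):
    """Split total_bytes evenly over nodes; return (bytes each, bytes per second each)."""
    bytes_each = total_bytes / nodes
    return bytes_each, bytes_each / interval_s

BLOCK_BYTES = 1_000_000   # worst case: a full 1 MB block per interval
HOPS = 3                  # each byte crosses 3 relays
RELAY_NODES = 300
INTERVAL_S = 600          # one block every 10 minutes

# Real traffic only: 3 MB spread over 300 relay nodes.
real_bytes, real_rate = per_node_load(BLOCK_BYTES * HOPS, RELAY_NODES, INTERVAL_S)

# Guess from the post: about 200 kB per node per interval once dummy data is added.
padded_rate = 200_000 / INTERVAL_S

print(f"real:   {real_bytes / 1000:.0f} kB per interval, {real_rate:.1f} B/s")
print(f"padded: {padded_rate / 1000:.2f} kB/s, {padded_rate * 8 / 1000:.1f} kbit/s")
```

So the padded figure is roughly 0.33 kB/s, or about 2.7 kbit/s, per relay node, under the assumption that load is spread evenly, which a real network would not guarantee.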
As for the meta-argument, overcomplexity: if it's what it takes then it's what it takes. We want to trustlessly query specific outputs while minimizing who knows which outputs belong to whom. Maybe it's the naive approach, but there are probably not many ways to go about it. Fast & light clients with minimal loss of trust & privacy are worth a complex solution. And it'd be fun stuff to code, if you ask me, I wish I had what it takes. But if someone has an easy solution, I'm all for it. But this is what I would do, and if it was there, I would use it. In fact I don't have a client running because my HDD is full and I have like 0.7 BTC. But if I could just download 5 GB from the chain, for the massive distributed PoW trust it gives, combined with multiple full-node 3rd parties telling me the 1st ledger hash is valid, it'd lighten the load on full nodes and it'd be just perfect for my use, as an attack combining both would have little effect and is very very unlikely. Unlikely enough not to bother with 21 GB and counting.
https://blockchain.info/charts/blocks-size?timespan=1year&showDataPoints=false&daysAverageString=1&show_header=true&scale=0&address=

Maybe it's not that much but it'll be like 100gb in 6 years at this rate. I can't afford it, there are just too many games and movies I can't delete. Worse comes to worst I'll buy a hard drive, but still.
edit: Andytoshi, I didn't understand the part about having to check every transaction in the future, and how it's bad and should be avoided or something. Isn't that already how it works? I should read the code, it seems. I don't think normal nodes check the full validity, that's for full nodes, but they're tracking their own outputs in each block, aren't they? So it would change little, to my understanding. Download a ledger and update it with the transactions from each downloaded block. It's just that instead of starting from the special case of the Genesis block with Ledger = empty set, you start from a later block with a non-empty ledger. There's a trade-off of trust for space and speed, but god knows I'd take it, it's surely unavoidable. And like I said, it could just be to speed up synchronization, then download in reverse to the genesis block if people want, for complete trust.
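Here's a toy Python sketch of what I mean by "download a ledger and update it with blocks". It's a heavily simplified model, not how any real client stores things: the ledger is just a dict of unspent outputs keyed by `(txid, output_index)`, blocks are lists of dicts, there are no signatures, scripts or PoW checks, and the names `apply_block` and `sync` are invented for the example.

```python
def apply_block(ledger, block):
    """Return a new ledger after spending the inputs and adding the outputs of each tx."""
    updated = dict(ledger)
    for tx in block:
        for outpoint in tx["inputs"]:
            # An input must refer to an output that is still unspent.
            if outpoint not in updated:
                raise ValueError(f"input {outpoint} is not an unspent output")
            del updated[outpoint]
        for index, (owner, amount) in enumerate(tx["outputs"]):
            updated[(tx["txid"], index)] = (owner, amount)
    return updated

def sync(start_ledger, blocks):
    """Apply blocks in order on top of start_ledger (empty dict = start from genesis)."""
    ledger = start_ledger
    for block in blocks:
        ledger = apply_block(ledger, block)
    return ledger

# Three toy blocks; transactions with no inputs stand in for coinbase txs.
b0 = [{"txid": "c0", "inputs": [], "outputs": [("alice", 50)]}]
b1 = [{"txid": "c1", "inputs": [], "outputs": [("bob", 50)]},
      {"txid": "t1", "inputs": [("c0", 0)], "outputs": [("carol", 20), ("alice", 30)]}]
b2 = [{"txid": "t2", "inputs": [("t1", 0)], "outputs": [("dave", 20)]}]

full = sync({}, [b0, b1, b2])        # classic sync: genesis with empty ledger
checkpoint = sync({}, [b0, b1])      # the ledger someone hands you at block 1
fast = sync(checkpoint, [b2])        # fast sync: checkpoint ledger + later blocks
assert fast == full
```

The code path is the same in both cases; the only difference is whether you trust the starting ledger because it's empty or because enough independent parties vouch for its hash.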