It's interesting. I have a background in computer architecture (multicore CPU design), and the whole concept of bitcoin as a distributed timestamp network reminds me of the problem of multiprocessor memory coherency (http://en.wikipedia.org/wiki/Cache_coherence). Basically, when many processors share the same memory space, there is the same issue of how to determine the correct global ordering of memory transactions. For instance, if processor A writes to address X at about the same time that processor B writes to address X, what should the final value of memory address X be? The problem arises because each processor keeps frequently accessed memory addresses in its local cache (and additionally each processor is most likely speculative and executes out of order), so it is difficult to determine the "correct" memory transaction ordering without some shared arbitrator. But for the typical shared-bus multicore CPU that most of us are probably using right now, there is a shared bus connecting all the processors' private caches to the last-level shared cache or main memory, so it is relatively easy: the shared bus can act as a global arbitrator that determines the "correct" global order of memory transactions based on the order in which they appear on the shared bus. And the individual caches use special cache coherence protocols (e.g. MESI) that snoop the shared bus and update cached memory values efficiently.
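To illustrate the shared-bus idea, here is a rough Python sketch of a bus acting as the global arbitrator. It is a toy write-invalidate, write-through model, not real MESI (there are no per-line states), and the names `Cache` and `SharedBus` are made up for this example. The only point is that the order in which writes reach the bus decides the final value.

```python
class Cache:
    """Private cache of one core: maps address -> value for valid lines."""

    def __init__(self, name):
        self.name = name
        self.lines = {}

    def snoop_write(self, writer, addr):
        # Another core's write appeared on the bus: drop our stale copy.
        if writer is not self:
            self.lines.pop(addr, None)


class SharedBus:
    """Serializes writes; the order of `log` is the global order."""

    def __init__(self):
        self.caches = []
        self.memory = {}
        self.log = []

    def attach(self, cache):
        self.caches.append(cache)

    def write(self, cache, addr, value):
        # Appearing on the bus fixes this write's place in the global order.
        self.log.append((cache.name, addr, value))
        for c in self.caches:
            c.snoop_write(cache, addr)
        cache.lines[addr] = value
        self.memory[addr] = value  # write-through, to keep the toy simple

    def read(self, cache, addr):
        # Miss: refill from memory, which holds the last write on the bus.
        if addr not in cache.lines:
            cache.lines[addr] = self.memory.get(addr)
        return cache.lines[addr]
```

If core A writes 1 to an address and core B then writes 2 to the same address, A's copy is invalidated by the snoop, and a later read from A refills with 2, the value of the last write in bus order.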
However, shared buses are not scalable, since they become very slow as the number of connected components increases, making them impractical for more than 16 cores or so. Larger shared-memory multiprocessor systems must use slightly more clever schemes for dealing with memory coherency, since the interconnect fabric is much more complex and may not have any centralized arbitrator. There have been hundreds of research papers dealing with this over the past 30 or so years. One such solution is distributed directory-based cache coherency, whereby each processor's cache controller is responsible for handling the transaction ordering for a certain memory range. For instance, if there are 256 cores on a chip arranged in a 2-D grid of 16x16 cores, then the memory address space would be divided up into 256 different sections. Whenever a CPU generates a memory transaction, that transaction is forwarded to the cache that is responsible for keeping track of ordering and sharing for that address, selected by some bits of the memory address.
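As a rough sketch of how "some bits of the memory address" could pick the responsible cache in the 16x16 example: the Python below interleaves cache lines across the 256 tiles using the low bits of the line number. The 64-byte line size and the interleaving scheme are assumptions for illustration only; real designs choose the bits differently.

```python
GRID_DIM = 16                     # 16x16 grid of cores, as in the example
NUM_TILES = GRID_DIM * GRID_DIM   # 256 directory slices
LINE_BITS = 6                     # assumed 64-byte cache lines


def home_tile(addr):
    """Return (row, col) of the tile whose directory owns this address."""
    line = addr >> LINE_BITS      # drop the offset within the cache line
    tile = line % NUM_TILES       # low 8 bits of the line number pick a tile
    return divmod(tile, GRID_DIM)
```

Every core computes the same home tile for a given address, so all transactions to that address meet at one place, and that tile alone decides their order.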
Regarding bitcoin, it seems the current design has the same drawback as shared-bus cache coherence, since miners must process every single transaction (well, technically not "every" transaction, since if a transaction is not put in the block chain then that transaction never *really* happened). It seems to me that as the bitcoin network reaches the scale of millions of users, it will become impractical to broadcast every transaction to every single miner. A better solution would be along the lines of distributed directory-based cache coherency, whereby a particular miner is only responsible for handling transactions from coins belonging to a certain address range. Basically, the ordering of two transactions that use different coins is not important to the problem of double-spending. Rather, it is only important to maintain the proper ordering for transactions that use the same coins. Two transactions that use entirely disjoint sets of coins can be reordered with respect to one another in any manner without any risk of double spending (of course, if two transactions share some coins but not all, then care will need to be taken to ensure proper global ordering). So I would be interested in some modification of the bitcoin protocol whereby different miners may be responsible for hashing transactions from a subset of the coin address space. Of course, selecting which address space I am responsible for mining and broadcasting this info to all bitcoin clients is not trivial. And of course these mini block chains (which only handle part of the coin address space) would eventually need to be reincorporated with the rest of the global block chain whenever there are transactions involving coins from multiple address spaces. But anyway, once this network bandwidth problem becomes a real issue, I am confident that such a fork could be implemented.
(One possible "solution" that doesn't involve *any* code rewriting would be to simply allow multiple block chains to exist simultaneously, and then perform a standard currency exchange between the different competing bitcoin block chains whenever there is a transaction involving coins from different block chains. This would effectively divide up the coin address space into different mutually non-overlapping chunks that could be processed entirely independently. Over time, the exchange rate between the different bitcoin block chains would stabilize, and they would effectively become one currency as far as the layman is concerned. Of course, the downside is that each individual block chain would be weaker than a single, united, powerful global block chain.)
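To make the disjoint-coins argument a bit more concrete, here is a minimal Python sketch. Nothing in it is part of the actual bitcoin protocol: a transaction is modeled as just the set of coin identifiers it spends, and `NUM_SHARDS`, `shard_of`, `conflicts`, and `shards_touched` are hypothetical names. Hashing the coin identifier to pick a range is likewise only one assumed way to split the address space.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of address ranges / miner groups


def shard_of(coin_id):
    """Map a coin identifier to one of NUM_SHARDS equal address ranges."""
    first_byte = hashlib.sha256(coin_id.encode()).digest()[0]  # 0..255
    return first_byte * NUM_SHARDS // 256


def conflicts(tx_a, tx_b):
    """Two transactions need a relative order only if they share a coin."""
    return not set(tx_a).isdisjoint(tx_b)


def shards_touched(tx):
    """All address ranges whose miners would have to see this transaction."""
    return {shard_of(coin) for coin in tx}


def is_cross_shard(tx):
    """True if the transaction spends coins from more than one range."""
    return len(shards_touched(tx)) > 1
```

Transactions for which `conflicts` is false can be ordered either way without enabling a double spend. The ones for which `is_cross_shard` is true are the hard case mentioned above, where the separate chains would somehow have to agree on a common order.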