Alright, I'm putting up a 10 BTC bounty for anyone who can find such a function OR prove that it doesn't exist, with a reasonable description of what combination of features makes it impossible.
All it has to do is this: Define a hashing function GETTHEBLOCK_1( inputs ) which takes a map of (private key, balance) pairs and produces a very large number based on it, in such a way that any pairs with a balance of zero passed as input do not affect the output, and additionally such that an examination of the hash can securely prove that the amount of currency embodied within it has not changed from a reference.
I'm assuming you are asking for a cryptographic hash function because, as has been discussed, merely outputting the input is not acceptable. The Wikipedia page lists four properties of a suitable function, paraphrased here:
- The function can be computed easily
- Given an output, it is infeasible to find a matching input
- Given an input, it is infeasible to find another input with the same output
- It is infeasible to find any two inputs with the same output (birthday attack)
Your requirement that a balance of zero not be included in the hash violates the last two properties. The third property is violated because, given any input, a second input with the same hash can be constructed simply by appending an account with a balance of zero. The final property is violated even more trivially: any time a new (zero) balance is added, the hash is unchanged, so the old and new inputs collide. I'll ignore these for the purpose of discussion, but I consider them alone a valid proof of the nonexistence of such a cryptographic hash function, and claim the bounty.
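To make the collision argument concrete, here is a minimal sketch in Python. The function `zero_blind_hash` is a hypothetical stand-in for GETTHEBLOCK_1, not anything proposed in this thread: it assumes SHA-256 as the underlying hash and simply drops zero balances before hashing. Any function that ignores zero balances would show the same behaviour.

```python
import hashlib

def zero_blind_hash(balances):
    """Hypothetical stand-in for GETTHEBLOCK_1 (an assumption for illustration):
    drop every zero balance, then hash what is left with SHA-256."""
    nonzero = sorted((key, value) for key, value in balances.items() if value != 0)
    return hashlib.sha256(repr(nonzero).encode("utf-8")).hexdigest()

original = {"key_a": 50, "key_b": 25}
padded = dict(original, key_c=0)  # the same accounts plus one empty account

print(original != padded)                                    # True: the inputs differ
print(zero_blind_hash(original) == zero_blind_hash(padded))  # True: the hashes collide
```

The second input was built from the first without any search at all, which is exactly what properties three and four are supposed to rule out.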
As has been discussed, you are looking for a hash function which produces a fixed-size output, which, while not explicitly a property of a hash function, is reasonable for any practical system. Now consider the requirement that the hash can be examined to check that the total amount of currency in circulation has not changed. (I'm a bit unclear on this; I'm assuming this is what you are saying.) A fixed-size output can only store a fixed amount of information: a 256-bit hash can only store 256 bits. The fundamental advantage of hashes like SHA is that they are destructive. You feed in a variable-length message and get a fixed-size output. Information is lost in the process, so it is impossible to reconstruct the original message from its hash. As more and more inputs are added to GETTHEBLOCK_1 (more bits of information), you will eventually not be able to fit the precise allocation of funds into the hash. This is OK because all you want in the hash is a verification of the total amount of currency.
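As a quick illustration of the fixed-size point, the sketch below hashes a short message and a much longer one with SHA-256 from Python's standard `hashlib`. The message contents are made up for the example.

```python
import hashlib

short_message = b"one balance"
long_message = b"balance;" * 1_000_000  # about 8 MB of input

short_digest = hashlib.sha256(short_message).digest()
long_digest = hashlib.sha256(long_message).digest()

print(len(short_digest) * 8)  # 256
print(len(long_digest) * 8)   # 256
```

Both digests are 256 bits, so the longer input cannot be recovered from its digest.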
So let's consider an alternative method: addition. Forget traditional hash functions for a moment. You add the balances of every account in the system together, and take the sum as your output (if it's really critical that the output be fixed-length, pad the result appropriately). This is one of the computationally simplest hashes for any number of inputs (it satisfies property 1). However, it is trivial to find another set of balances that has the same sum (it violates properties 2, 3, and 4).
Now consider the number of addresses in existence. Looking at the SourceForge download statistics page, I'm going to make a conservative estimate of 300,000 clients. Each of these clients produces 100 keypairs the first time it is started, and users will produce, as a conservative estimate, 1 more keypair per week on average. Even though zero balances are not reflected in the hash, they do have to be either added to the running total (with no change) or branched around. Counting only the initial keypairs, that makes 300,000 x 100 = 30 million balances that have to be added together each time a transaction is made (conservative estimate of 5 per minute), and this is still completely ignoring the private keys (which don't even need to factor into the verification). I say this violates the first property even though it runs in linear time, simply because n is very large. It will also be impossible to scale as more and more balances are introduced. The current model works because nodes look at a balance and a transaction, check that it is valid, and then do some SHA hashing, and clients trust that the nodes are doing their math correctly.
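A rough sketch of the addition idea and the workload estimate, in Python. The name `sum_hash` and the sample balances are illustrative assumptions; the client, keypair, and transaction figures are the estimates given above.

```python
def sum_hash(balances):
    """Illustrative "hash": the total of all balances. Zero balances add nothing."""
    return sum(balances.values())

ledger_a = {"key_a": 50, "key_b": 25}
ledger_b = {"key_a": 1, "key_b": 74, "key_c": 0}  # a different allocation, same total

print(sum_hash(ledger_a) == sum_hash(ledger_b))  # True: trivially easy collision

# Workload, using the estimates from the post.
clients = 300_000
keypairs_per_client = 100
transactions_per_minute = 5

balances_per_pass = clients * keypairs_per_client
additions_per_minute = balances_per_pass * transactions_per_minute

print(balances_per_pass)     # 30000000
print(additions_per_minute)  # 150000000
```

The total confirms that no currency was created or destroyed, but it says nothing about who holds it, and recomputing it from scratch on every transaction means about 150 million additions per minute under these estimates.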
Finally, who has all of the private keys in the first place? Who is computing GETTHEBLOCK_1 and how does he "securely" collect private keys?
As for GETTHEBLOCK_2, what about hash(hash(private key_1) x balance_1) + hash(hash(private key_2) x balance_2) + hash(hash(private key_3) x balance_3) and so on, where + is concatenation, x is multiplication, and hash is a standard cryptographic hash function of your choosing?
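One possible reading of that construction, sketched in Python. Several details are my assumptions and not part of the proposal: SHA-256 as the hash, private keys given as bytes, the inner digest read as a big-endian integer before multiplying, and pairs processed in sorted key order so the result is deterministic.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def get_the_block_2(balances):
    """Sketch of hash(hash(key_i) x balance_i), concatenated over all pairs.
    Integer encoding and key ordering are assumptions made for this example."""
    parts = []
    for key in sorted(balances):
        key_number = int.from_bytes(sha256(key), "big")  # inner hash as an integer
        product = key_number * balances[key]             # "x" is multiplication
        length = max(1, (product.bit_length() + 7) // 8)
        parts.append(sha256(product.to_bytes(length, "big")))
    return b"".join(parts)                               # "+" is concatenation

block = get_the_block_2({b"private key 1": 5, b"private key 2": 7})
print(len(block))  # 64: one 32-byte digest per pair
```

Under this reading the output grows by 32 bytes per pair, and a pair with a zero balance still contributes a digest (the hash of zero), so it would have to be filtered out beforehand if empty accounts are not supposed to affect the result.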
Bitcoin address: 1AJnX8Rf29kw72D4L9hBBEdHmZZRMYjW6