I was thinking of the algorithm due to Berry Schoenmakers (reference is unpublished) but Brands describes on p129 of chapter 3.
So the only remaining flexibility with that algorithm is to minimize bitcoin precision its still currently say 40 bits with bitcoin price up to $500 and precision down to 1c US. A bit close to the limit if the price rises. Bitcoin actually uses 51 bits (1c precision up to $1m per coin). Anyway you need to run that proof which involves 40 or 51 subproofs [...]. I cant see that working out at under 14kB to 18kB, probably a lot worse.
An update on this, actual numbers for the size and its a bit better than thought. This is from the above Brands reference, algorithm due to Schoenmakers. Lets call bitcoins precision m=51. x is a secret value and p commitment to the coins value (like pederson commitment) p = g^v*h^x. g and h are bases such that the discrete log of h wrt g is unknown (log_g( h ) = y aka g^y = h , y is unknown). I use exponentiation notation because I prefer it but the same works in EC.
v=bitcoin value, bits of v are v0 ... v50 (ie so that v=sum 0..50(2^i*v_i) with v0 is the LSB.
Choose m=51 random values in Zn (where n is the order of the group) call x0, ..., x50, and set x=sum 0..50(2^i*x_i) mod n. Primary commitment is y=g^v*h^x.
Now do m=51 auxiliary commitments i=0..50 calculate h_i = g^v_i*h^x_i.
Prove for each auxiliary commitment that v_i = 0 or 1 as follows. If v_i = 0 compute v_i=0 proof and forge v_i=1 proof, in such a way that you can only forge sub-proof, if v_i = 1 compute v_i=1 proof and forge v_i=0 proof.
If v_i=0:
prove v_i=0: k=random, a=h^k,c=H(a,h_i), c'=random,r=k+c'*x_i mod n.
forge v_i=1: r'=random, c"=c-c',a'=h^r'*(h_i/g)^-c",
else if v_i=1:
forge v_i=0: c'=random,r=random,a=h^r*h_i^-c',c=H(a,h_i)
prove v_i=1: k=random,a'=h^k,c"=c-c',r'=k'+c"x_i
sub-proof message is: (h_i,a,c',r,a',r") 6x 32 byte values.
sub-proof verification is the same for v_i = 0 or v_i = 1 (as the verifier must not be able to tell the value of v_i).
check a=?h^r*h_i^-c' and a'=?h^r'*(h_i/g)^-c"
Finally the verifier also checks y =? prod 0..50 h_i^(2^i) mod n.
This works because x=sum 2^i*x_i and v=sum 2^i*v_i.
So adding up! Each value is 32 bytes (for 256-bit values/compressed points and 128-bits of security margin) where m is the mantissa in bits we have presuming for now g, and h are shared parameters. We have h, plus m times (h_i,a,c',r,a',r") so that is 1+6m=307*32=9824 bytes. The other values c=H(a,h_i), c"=c-c are computed.
However I spent the weekend on size optimization:
1. note either r'=random (v_i = 0 case) or r=random (v_i = 1 case) and both are public values, so instead of choosing them randomly we can reuse r'=r (v_i=0 case) or r'=r (v_i=1 case). We cant use a seed and KDF to create r' or r respectively because that would reveal whether v_i = 0 or 1. So then we get 1+5m=8192 bytes.
2. same argument for c' being random and public, in this case we can use a public seed of all non-dependent values, so then we have 1+4m=6560bytes.
3. we can send H(a) and H(a') instead (H truncated to 128-bits according to Schnorr though Brands cautions that one must be careful with this depending on the complexity of the formula proven, so some more things have to be checked) and modify the sub-proof verification step to check H(a)=?H(h^r*h_i^-c') and H(a')=?H(h^r*(h_i/g)^-c"). So then 1+3m=4928bytes
4. instead of H(a) from 3 send a (cant do both H(a) and H(a') as we need a) we can compute the public key h_i from the extended schnorr signature/representation proof (analogous to the way you can compute the public key from a DSA signature). As we're verifying h^r=?a*h_i^-c', we can rearrange and compute h_i=(a*h^-r)^{c'^-1}. We need to verify h_i values but we can do that with one hash h'=H(p,h_0,...,h_50). So then we have 1.5+2.5m=4128bytes (or 2+3m=4960 bytes with out optimization 3).
Recap at this point the proof message is { a,H(a'),r } or {a,a',r} the other 3 values are r' (copied from r or vice-versa), h is computed from sub proof (v_i=0 proof/forgery) verification relation, and cross checked with sub proof 2 (v_i=1 proof/forgery) verification relation. c is computed as H(a,h_i) and c' is from the KDF seeded with the non dependent values.
There is one more promising optimization which might take it to 3+2m, I need to verify for feasibility of dependencies.
Also note one could reduce m eg to 24 bits, and include an exponent e also so that bitcoin value is m^(2^e). 24 bits is enough precision for 1c precision for up to $1.6m. The exponent can be public (and signed as an auxiliary message to the proof). Some ambiguity about the range of the value can be achieved by using a unencrypted mantissa offset o=random(-4,4) and from the original mantissa instead encoding: m'=m*2, e'=e+o so the coin value is the same m'^(2^e')==m^(2^e).
The exponent itself could even be homomorphically encrypted and proven in a range and compared for equality using an equality proof proving the e1 and e2 values are the same, or differ by the unencrypted offset q1=g^e1*h^z1 and q2=g^e2*h^z2. To compare offset do a range proof on e1 and e2 as above, and the encrypted range could be eg o/3 (in multiples of 8 bits) and the remaining bits from the public offset. To verify the offset you'd prove q1 has same e value as q2/g^o. As we only need to cover 25 bits, a 3-bit range proof for the exponent is enough which is quite small. recalling (2+3m)*32=352bytes.
You could probably set m=20 and e=31/8=4 for 2+3m+2+3e=2432 bytes.
Finally some interesting implications arising: you can add coins without owning them (owning them is knowing the private keys xi.) You can pay someone by adding your coin to their coin and broadcasting (or sending privately) the private key encrypted with their public key. Anyone can verify publicly that the coin adds up. You cant oveflow mod n to attacks because there are by definition less than 21mil BTC. You can randomize a coin by adding a 0-value like g^0*h^x. How you add is multiply the coins together (or point addition in EC terminology). The new private key is the addition of the old private key and the new one. No one not involved can tell if that was a 0-valued red-herring or 1c payment or $100. The network itself could pre-emptively add the coin without the recipient doing anything and compact them if the encryption is also a proof. (ie I try to add a new value v to your coin g^v*h^x, which comes with a range proof proving v is positive and doesnt wrap mod n and to do that I encrypt with your public encryption key pk (which was signed by the person who sent you the input) the key x and I also prove that the person with the private key corresponding to pk can decrypt to the same value as x (that anyone can verify) proving equality of discrete log. If you had to you could probably reuse the coin public key for encryption (like sharing a schnorr signature public key and an elgamal public key, need to be careful but there is some analysis).
For inputs, you only need to refer to the outputs (and prove knowledge of the private key) without having to do a new range proof, because by definition all bitcoins in existence cant wrap if added up. You do need fresh range-proofs when you split a coin (to spend part of it). You can split the private key too (x = x1+x2).
To create a coin receiving address you just produce a zero-valued coin and prove knowledge of discrete log wrt h (because g^0*h^x = h^x). Then people can send you coins without being able to spend them.
For mining you prove knowledge of representation g^25*h^x which is efficient (no range proof because 25 is public). Then you split the coin with range proofs when you spend (or when the mining pool divides up the reward to the pool workers).
You maybe no longer need an address (hash of public key) that signs, because the coin private key is a signature key, but you do need an encryption public key.
A type of taint mitigation is spending to some statistical number of addresses a 0 or low value spend together with your real spend.
I really need to implement the crypto as a library or stand-alone demo program and double check somethings are not confused but this appears quite practical.
The bloat caused may not be so much because there is some privacy gain that may reduce incentive to have multiple addresses or use mixes.
Adam