Elsewhere I described a number theoretical hash which is provably a hash using pairing cryptography. But since it's slow it may not be of interest. For that you have some paring compatible elliptic curve group which supports an efficient operation to coerce a random value to a point on the curve. You compute point=G*H(data), point2 = to_point(H(point||masked_tx_hash)), proof = point2*H(data). You transmit {point, proof}. The other side computes point2 from point and verify that pairing(G,proof) == pairing(point,point2). This proves that point is a EC curve point for which you know the discrete log. The troof can be omitted from historical blocks (like the P2SH^2 above) to avoid the storage overhead. This also has an advantage that it's replay proof— even if you log proofs you cannot make another transaction reusing the prove hash without knowing the data. The downside of this is the pairing computation probably takes 1ms per transaction.
In either of these cases people could randomly select their data and keep trying until the prefix of the point matches some data they want to send, but this has exponential complexity in the number of bits stored per value.
Client privacy can be improved further for ligher-than-SPV resolvers by using information theoretic PIR: http://percy.sourceforge.net/