Shouldn't this be obvious under inspection? The balance of winning bets vs. losing bets would be skewed due to the non-random selection of the "released" bets.
Nope, because they select the bets to be released such that the laundry customer's net profit before fees (and the house "take") is exactly zero. That's what you'd expect in the long run from true randomness.
There is no bias in the overall win/loss ratios. The only bias is in which wallets win and which lose, but you can't detect that, especially if the customer provides multiple clean wallets and multiple dirty wallets.
Then your wagers on both the losing and the winning side would need to be placed on the same wager option (e.g., all on "less than 8000") so that they can be evened out without introducing any bias.
The customer then spends the money out of the clean wallet so SatoshiDice can't use the unreleased transactions.
That's not even needed. The transaction hash the mixing service uses to decide whether a wager wins or loses changes whenever any of the data in the transaction changes. So even if the INPUT and OUTPUT are the same for each trial, the amount can be varied to produce a different transaction hash each time.
So if I understand your approach correctly, here's the approach with this INPUT re-use variation:
This assumes the desired amount to mix is 1,000 BTC, as you described was the amount of "dirty" coins.
So first there would be construction of the winning wager using the "less than 8,000" bet, which has an 8x payout. That bet amount needs to be about 125 BTC (which will win 1,000 BTC). Multiple wager transactions are created for the same INPUT and OUTPUT, except that the amount for the first one is 125.0 BTC, the second one 124.99999999, the third one 124.99999998, etc. Maybe a few dozen of these are created so that at least one transaction in the batch is almost certainly a winner (a transaction hash that results in a number less than 8,000). These trials are not broadcast to the network but sent out-of-band to the mixing service.
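A rough sketch of that trial-grinding step, in Python. Everything here is illustrative: the "transaction" is just a toy string of input, output and amount (not real Bitcoin serialization), and deriving the lucky number from the first two bytes of an HMAC-SHA512 is my assumption, not necessarily how the service does it. The names `trial_amounts`, `lucky_number` and `pick_trial` are made up for this example.

```python
import hashlib
import hmac
from decimal import Decimal

SATOSHI = Decimal("0.00000001")  # smallest step by which the amount is lowered
THRESHOLD = 8000                 # the "less than 8,000" bet from the post


def trial_amounts(start_btc, count):
    """Amounts for the trials: start, start - 1 satoshi, start - 2 satoshi, ..."""
    start = Decimal(start_btc)
    return [start - i * SATOSHI for i in range(count)]


def lucky_number(secret, tx_input, tx_output, amount):
    """Toy lucky number in 0..65535 (assumed derivation, for illustration only)."""
    message = f"{tx_input}|{tx_output}|{amount:.8f}".encode()
    digest = hmac.new(secret, message, hashlib.sha512).digest()
    return int.from_bytes(digest[:2], "big")


def pick_trial(secret, tx_input, tx_output, amounts, want_win):
    """Return the first (amount, lucky number) with the wanted outcome, else None."""
    for amount in amounts:
        n = lucky_number(secret, tx_input, tx_output, amount)
        if (n < THRESHOLD) == want_win:
            return amount, n
    return None


if __name__ == "__main__":
    secret = b"hypothetical-service-secret"
    batch = trial_amounts("125.0", 36)  # "a few dozen" trials
    print(pick_trial(secret, "dirty-input", "clean-output", batch, want_win=True))
```

The same `pick_trial` call with `want_win=False` and a 150 BTC starting amount would cover the losing wagers. Only the service, which knows its secret, can tell which trial to release.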
Then there would be the construction of the losing wagers. There are a little over seven losing wagers needed to average out the one winner. So to "burn" the 1,000 BTC plus a 5% mixing fee, for example, the target amount of 1,050 BTC is divided by seven to get 150 BTC per losing wager.
Thus seven addresses for the losing wagers are each funded with 150 BTC. Then for each address maybe a dozen trials are created using the same approach as above, one at 150 BTC, the next at 149.99999999, then 149.99999998, etc.
The end result of that is for each of the seven "losing wagers", there is at least one trial with a transaction hash from the batch that is a loser (i.e., transaction hash that results in a number of 8,000 or greater). These trials too are not broadcast to the network but instead are sent out-of-band to the mixing service.
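For anyone checking the numbers, the wager amounts above fall out of simple arithmetic. A minimal sketch, using the 8x payout, 5% fee and seven losers from this example (the function name `stakes` is just for illustration):

```python
from decimal import Decimal


def stakes(dirty, payout_multiple, fee_rate, losers):
    """Return (winning stake, total to burn, stake per losing wager)."""
    dirty = Decimal(dirty)
    winning_stake = dirty / Decimal(payout_multiple)  # wins back `dirty`
    burn_target = dirty * (1 + Decimal(fee_rate))     # dirty coins plus fee
    losing_stake = burn_target / losers               # spread over the losers
    return winning_stake, burn_target, losing_stake


if __name__ == "__main__":
    win, burn, lose = stakes("1000", "8", "0.05", 7)
    print(win, burn, lose)
```

With the figures from the post this gives a 125 BTC winning wager, a 1,050 BTC burn target and 150 BTC per losing wager.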
The mixing service then evaluates the winning batch to find one winner, evaluates each of the seven losing batches to find a loser, and broadcasts those eight transactions to the network. The end result is mixed coins with nearly no detectable bias. (The correct ratio is one winner out of 8.192 trials, so rounding down to 7 losers introduces a slight bias. Much of that bias could be corrected by the mixing service instructing that there be 8 losers (with a lower wager amount requested) once every N "mixing jobs".)
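The 8.192 figure, and a rough value for N, can be worked out as below. This assumes the lucky number is uniform over 65,536 values (which is what 8.192 = 65536 / 8000 implies); the helper `p_at_least_one` is a name I made up, and it also shows why a few dozen winner trials and a dozen loser trials are enough.

```python
from fractions import Fraction

LUCKY_RANGE = 65536  # assumed: lucky number is a 16-bit value
THRESHOLD = 8000     # "less than 8,000" wins

p_win = Fraction(THRESHOLD, LUCKY_RANGE)
trials_per_winner = 1 / p_win                 # 8.192 trials per winner
losers_per_winner = trials_per_winner - 1     # 7.192 losers per winner
extra_losers = losers_per_winner - 7          # shortfall when always using 7
jobs_per_extra_loser = 1 / extra_losers       # N: how often to use 8 losers


def p_at_least_one(p_hit, trials):
    """Chance that at least one of `trials` independent tries has the outcome."""
    return 1 - (1 - float(p_hit)) ** trials


if __name__ == "__main__":
    print(float(trials_per_winner), float(jobs_per_extra_loser))
    print(p_at_least_one(p_win, 36))       # a winner among 36 trials
    print(p_at_least_one(1 - p_win, 12))   # a loser among 12 trials
```

So under these assumptions, using 8 losers in roughly one job out of five would bring the long-run ratio close to the expected one.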
The benefit to a popular wagering service offering this rather than some other mixing service presumably comes from there being an enormous number of unrelated transactions flowing through the service such that the payouts to the "winning" wager (clean) would essentially be unrecognizable as having any amount of taint higher or lower than any other wager payout from that service.