1) Wouldn't higher gap lengths have higher merits? Or is there some feature of gaps that are more isolated that makes them lower merit? Are more isolated gaps more predictable to find?
If they were maximal gaps, yes. But if one looks at current records, they drop in average merit as the gap size increases.
This is almost certainly an artifact of the time taken per test. Gaps at the larger sizes take much longer to find, so for the same processing time we can run many more tests at small sizes, which makes it more likely that we find a large merit there.
Gapcoin ran for a long time with shift 25 and increased the average merit quite a bit in the 5k-7k range. So it finds fewer records in that area now because the threshold is higher. On the other hand, that would still seem to be a good choice if looking for very large merits (34+). It depends what your goal is. My point was mostly that spreading out the shift values would result in more even coverage leading to more records.
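For reference, merit in answer 1 means the gap length divided by the natural log of the prime that starts the gap. Here is a minimal sketch of that calculation for a few small primes. The helper names (is_prime, next_prime, merit) are made up for illustration, and trial division is only practical at toy sizes, not at the sizes discussed in this thread.

```python
import math

def is_prime(n):
    """Trial division; fine for small illustrative values only."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

def next_prime(p):
    """Smallest prime strictly greater than p."""
    q = p + 1
    while not is_prime(q):
        q += 1
    return q

def merit(p):
    """Return (gap, merit) for the gap that starts at prime p."""
    gap = next_prime(p) - p
    return gap, gap / math.log(p)

for p in (89, 113, 1327):
    gap, m = merit(p)
    print(f"p = {p}: gap = {gap}, merit = {m:.2f}")
```

The same gap length is worth less merit as p grows, since ln(p) grows with it. That is why a record at a given gap size is judged by merit rather than by raw length.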
2) If there are tools much better than the current algorithm, what would be the reason for not converting Gapcoin to use the more efficient algorithm?
Probably because very few people actually do the coding, and most have no incentive to. The random selections arguably have some distribution benefits over the traditional ones, and they do simplify things for the project. It wouldn't surprise me if the developers had other reasons.
The non-CRT algorithm is like throwing darts at a target while blindfolded and being spun around. It's nice that you throw quickly, but most of your darts are hitting everything in the room but the wall with the target.
The CRT algorithm does a better job of throwing only when we think we're pointing toward the target wall.
The traditional method is taking off the blindfold and aiming at the target. Measured in speed of throws (primes per second) it is much slower, but each prime is far more likely to find a gap. I'm basing this on seeing a single computer in the shift 512 range generating 10x more records/day than Gapcoin's overall results. There are lots of programming differences as well, but I think the largest difference is the search process itself.
In theory, with Gapcoin's p = sha256(Blockheader) * 2^shift + adder (adder < 2^shift), one would select shift and adder so that p comes out to a prime found by sieving around something like 1234567*193#/210. I don't have any idea if that would be possible for a miner. It might be possible but result in fewer reported blocks (even if the average merits are higher), and so be worth less over time.
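To make the arithmetic concrete, here is a rough sketch of the check described above: given a hash and a shift, see whether any multiple of 193#/210 lands in the window [hash * 2^shift, (hash + 1) * 2^shift), and if so what adder reaches it. Everything beyond the formula quoted above is an assumption for illustration: the function names are made up, the header bytes are a placeholder, and reading a single SHA-256 digest as a big-endian integer may not match how Gapcoin actually derives the hash.

```python
import hashlib

def primorial(n):
    """Product of all primes <= n (written n#), via a small sieve."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    result = 1
    for i in range(2, n + 1):
        if sieve[i]:
            result *= i
            for j in range(i * i, n + 1, i):
                sieve[j] = 0
    return result

def window(header_hash, shift):
    """Half-open range of values reachable as hash * 2^shift + adder."""
    base = header_hash << shift
    return base, base + (1 << shift)

def adder_for(target, header_hash, shift):
    """Adder that hits target, or None if target is outside the window."""
    lo, hi = window(header_hash, shift)
    return target - lo if lo <= target < hi else None

def multipliers_in_window(header_hash, shift, step):
    """Smallest and largest m with m * step inside the window.

    If the first value is greater than the second, no multiple fits.
    """
    lo, hi = window(header_hash, shift)
    return -(-lo // step), (hi - 1) // step

# Placeholder header; a real miner would hash the actual block header.
digest = hashlib.sha256(b"hypothetical block header").digest()
header_hash = int.from_bytes(digest, "big")
step = primorial(193) // 210

for shift in (25, 300):  # arbitrary shift values chosen for the example
    m_lo, m_hi = multipliers_in_window(header_hash, shift, step)
    if m_lo > m_hi:
        print(f"shift {shift}: no multiple of 193#/210 in the window")
    else:
        center = m_lo * step
        adder = adder_for(center, header_hash, shift)
        print(f"shift {shift}: m = {m_lo}, adder is {adder.bit_length()} bits")
```

Note that m * 193#/210 is only the center to sieve around, not a prime itself. The prime found near it would also have to fall inside the same window for a valid adder to exist, and this sketch says nothing about whether that is practical for a miner.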