- Generate the polynomial up to a certain size. I use 200560490130 (or the next primorial) as my base primorial and store a vector of all 48923875 entries.
- Sieve *this* out up to the huge primorial in advance (sketched below).
- Do your operations relative to the huge primorial.
But, as warned - the simple bitvector is still working better for me.
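In sketch form, the lifting step from the list above looks something like this (illustrative only, not the real code; the sextuplet offsets 0, 4, 6, 10, 12, 16, the brute-force divisibility test, and the helper name liftResidues here are just for the example):

#include <cstdint>
#include <vector>

// Sextuplet pattern: k, k+4, k+6, k+10, k+12, k+16.
static const int OFFSETS[6] = {0, 4, 6, 10, 12, 16};

// Take the residues stored modulo the base primorial and keep only the
// translates r + i*basePrimorial whose whole chain also avoids the extra
// primes that divide the huge primorial but not the base one.
std::vector<uint64_t> liftResidues(const std::vector<uint64_t>& baseResidues,
                                   uint64_t basePrimorial,
                                   uint64_t hugePrimorial,
                                   const std::vector<uint64_t>& extraPrimes)
{
    std::vector<uint64_t> lifted;
    for (uint64_t start = 0; start < hugePrimorial; start += basePrimorial) {
        for (uint64_t r : baseResidues) {
            const uint64_t k = start + r;
            bool ok = true;
            for (uint64_t p : extraPrimes) {
                for (int o : OFFSETS)
                    if ((k + o) % p == 0) { ok = false; break; }
                if (!ok) break;
            }
            if (ok) lifted.push_back(k);  // chain survives: keep as a residue mod hugePrimorial
        }
    }
    return lifted;
}

The point is that this is done once in advance, and everything afterwards works relative to the huge primorial using the surviving entries.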
Cool, that is what I am doing, but looking at your numbers I also pre-sieve the possible p6 chains, reducing my candidate count by ~128 times:
const uint64_t primorial = 7420738134810;
const uint32_t sexcount = 14243984;
Then I run a second scan inline to catch the next two dozen or so primes (which lets me avoid gmp and use simple 64-bit math) before I hit the expensive code. The general idea was to get a list of candidates which could be fed into something else (GPU was the thought).
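In rough outline the inline scan is something like this (a sketch only, not my actual code; the exact prime list is illustrative):

#include <cstdint>

// Sextuplet pattern: k, k+4, k+6, k+10, k+12, k+16.
static const int OFFSETS[6] = {0, 4, 6, 10, 12, 16};

// The next two dozen primes after 37 (the 7420738134810 = 37# wheel already
// guarantees the surviving chains have no factors <= 37).
static const uint64_t SCAN_PRIMES[] = {
     41,  43,  47,  53,  59,  61,  67,  71,  73,  79,  83,  89,
     97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151
};

// True if the chain starting at k survives trial division by the scan primes.
// Plain 64-bit arithmetic, no gmp needed at this stage; k is assumed to be far
// larger than 151, so divisibility really does mean composite.
bool passesInlineScan(uint64_t k) {
    for (uint64_t p : SCAN_PRIMES)
        for (int o : OFFSETS)
            if ((k + o) % p == 0)
                return false;
    return true;
}

Candidates that pass go onto the list that would be fed to the expensive code (or the GPU).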
It is much faster than the reference but it is reaching the limit of how fast I can push it.
I have probably made a horrendous error in my algorithm... but coding again was fun...
Regards,
My implementation is a little different from both implementations mentioned above. In fact, the overhead is much less than that of jh00's implementation. It is almost identical to Kim Walisch's primesieve implementation, with a few minor exceptions.
Please see Kim Walisch's description of wheel factorization if you would like to know exactly what I am doing:
http://primesieve.org/
@bsunau7 - mine does the same. I kill any location that fails to produce a six-set. I wonder which of us has a bug? *grin* I'll check my sieving code again. As one way to start comparing, the polynomials for the first few primorials are:
Generator at Pn7 (210)
97
Generator at Pn11 (2310)
97 937 1147 1357 2197
Generator at Pn13 (30030)
97 1357 2407 3457 4717 5557 5767 6817 7867 8077 8287 10177 10597 11647 12907 13747 13957 15007 16057 16267 17107 18367 19417 19837 21727 21937 22147 23197 24247 24457 25297 26557 27607 28657 29917
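If it helps the comparison, a throwaway brute-force enumerator along these lines (a quick sketch, not my sieve) should reproduce the tables above for the small primorials:

#include <cstdint>
#include <cstdio>

// Sextuplet pattern: k, k+4, k+6, k+10, k+12, k+16.
static const int OFFSETS[6] = {0, 4, 6, 10, 12, 16};

// Print every odd k in [1, primorial) whose whole chain avoids the given primes.
void printGenerators(uint64_t primorial, const uint64_t* primes, int nPrimes) {
    for (uint64_t k = 1; k < primorial; k += 2) {
        bool ok = true;
        for (int i = 0; i < nPrimes && ok; ++i)
            for (int o : OFFSETS)
                if ((k + o) % primes[i] == 0) { ok = false; break; }
        if (ok) std::printf("%llu ", (unsigned long long)k);
    }
    std::printf("\n");
}

int main() {
    const uint64_t p[] = {2, 3, 5, 7, 11, 13};
    printGenerators(210,   p, 4);  // expect: 97
    printGenerators(2310,  p, 5);  // expect: 97 937 1147 1357 2197
    printGenerators(30030, p, 6);  // expect the 35 values listed above
    return 0;
}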
@Supercomputing - Did you figure out a way to combine wheel factorization with storing a dense bitvector div 2310 (or div 210)? Or do you just allow a large bitvector and handle it through segmentation? I liked the way the jh implementation saved a lot of sieve space that way; a straightforward prime sieve only achieves a 3-4x less dense packing.
Well, think of a primorial as a wheel with no pre-sieving. For example, 43# guarantees that k, k+4, k+6, k+10, k+12, and k+16 will have no divisors less than or equal to 43. Therefore, the bigger the primorial, the more efficiently the sieve will run. Each bit in your sieve array already represents a potential chain k, and the trick is to segment the sieve so that you do not keep eliminating the same false chains over and over again within your sieve array.
Only a single static table of 32-bit integers (primes interleaved with prime inverses) is needed to coalesce memory access.
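In sketch form, the per-segment elimination using that kind of table looks roughly like this (simplified and illustrative only; the real table layout may differ):

#include <cstdint>
#include <vector>

// Sextuplet pattern: k, k+4, k+6, k+10, k+12, k+16.
static const int OFFSETS[6] = {0, 4, 6, 10, 12, 16};

// One segment of candidates k_m = base + m*primorial, m = 0 .. alive.size()-1.
// `table` interleaves each sieving prime p with inv = (primorial mod p)^-1 mod p,
// so one contiguous read fetches both values.  A flag is cleared as soon as any
// chain member at k_m picks up a factor p.  Candidates are assumed to be far
// larger than every sieving prime.
void sieveSegment(uint64_t base,
                  const std::vector<uint32_t>& table,  // p0, inv0, p1, inv1, ...
                  std::vector<bool>& alive)            // one flag per candidate chain
{
    const uint64_t segSize = alive.size();
    for (std::size_t i = 0; i + 1 < table.size(); i += 2) {
        const uint64_t p   = table[i];
        const uint64_t inv = table[i + 1];
        for (int o : OFFSETS) {
            // Solve base + o + m*primorial == 0 (mod p) for the first index m.
            const uint64_t rem = (base + static_cast<uint64_t>(o)) % p;
            uint64_t m = ((p - rem) % p) * inv % p;
            for (; m < segSize; m += p)
                alive[m] = false;  // chain at k_m is divisible by p
        }
    }
}

The next segment simply advances base by segSize*primorial and reuses the same table.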