The DPs are first stored in memory, and you control how often they are written to the work file: once every minute, hour, 2 hours, 12 hours, however often you like. The right interval depends on how much RAM you have and/or the DP size.
You still didn't get my point. Let's revisit your example:
Keyspace - 129 bits
DP = 32
Let's say we do around 500 Mkeys/second. Since DP 32 means roughly one jump in 2^32 (about 4 billion) lands on a distinguished point, we find a DP once every 8 seconds or so.
Let's also go small and allocate 512 MB of RAM for all the bit gymnastics.
Will you wait 4 years and 3 months before the DPs sitting in memory get written to a file? Because that is roughly how long it takes to fill the allocated RAM: 512 MB holds about 16 million 32-byte entries, and at one DP every ~8 seconds that is over four years.
No, you'll want to dump them sooner than that. This gets messy really quickly, no matter how you look at it. Oh, you want to use 8 GB of RAM? Maybe 16? Oh shit, you may not live long enough even to start that first dump. On the other hand, you also have almost zero chance of a collision inside such a tiny amount of memory (I'm still talking about 16 GB, if not more), so what do we do? Not hard to guess: dump data more often. And merge and merge and merge... maybe now you get my point.
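A quick back-of-the-envelope check of those numbers (assuming 32 bytes per stored DP entry, the figure used later in this thread):

```python
# Back-of-the-envelope check of the claim above
# (assumption: each stored DP entry takes 32 bytes, as used later in the thread).

speed = 500e6              # jumps (keys) per second
dp_bits = 32               # DP 32: roughly one jump in 2^32 is a distinguished point
entry_bytes = 32           # assumed size of one stored DP entry
ram_bytes = 512 * 2**20    # 512 MB buffer

seconds_per_dp = 2**dp_bits / speed
entries_in_ram = ram_bytes // entry_bytes
years_to_fill = entries_in_ram * seconds_per_dp / 86400 / 365.25

print(f"one DP every {seconds_per_dp:.1f} s")         # ~8.6 s
print(f"{entries_in_ram:,} entries fit in 512 MB")    # 16,777,216
print(f"~{years_to_fill:.1f} years to fill the RAM")  # ~4.6 years (the post rounds 8.6 s down to 8 s)
```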
You really do not understand what you are saying, really, at all.
Are you using a CPU and DDR1 RAM in your current PC?
Please, just run the program on a medium-size range so you can actually see how it handles DPs and files.
Writing to file is nothing for the server (or a single CPU), even with a very low DP. But let's stick with the ongoing DP 32.
Let's keep this simple and say each hashtable entry is 256 bits: 128 bits for the point and 128 bits for the distance, i.e. 32 bytes per entry. Now let's calculate 10,000,000 entries:
32 bytes/entry × 10,000,000 entries = 320,000,000 bytes
which is 320,000 kilobytes (KB), 320 megabytes (MB), or 0.32 gigabytes (GB).
However, I do not want to dump the hashtable at 0.32 GB; I want roughly a full GB before storing. So let's say I only save to file every 30,000,000 DPs found (30,000,000 × 32 bytes ≈ 0.96 GB).
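Here is that sizing arithmetic as a tiny sketch, if you want to play with other entry sizes or dump thresholds (the 32-byte entry is the simplification above, not necessarily the program's exact on-disk layout):

```python
# Hashtable sizing from the simplified 32-byte entry above
# (128-bit point + 128-bit distance); tweak the numbers to taste.

ENTRY_BYTES = 32

def table_size_gb(entries):
    return entries * ENTRY_BYTES / 1e9

def entries_for_budget(budget_gb):
    return int(budget_gb * 1e9 // ENTRY_BYTES)

print(table_size_gb(10_000_000))   # 0.32 GB
print(table_size_gb(30_000_000))   # 0.96 GB, roughly the 1 GB target
print(entries_for_budget(1.0))     # ~31.25 million DPs per 1 GB
```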
With your example of 500 MKey/s, which is really slow for modern GPUs, a DP is found and stored in RAM every ~8 seconds. There are 86,400 seconds in a day, so every day about 10,800 DPs would be found and stored in RAM. Now we wait until we have 30,000,000 DPs to write from RAM to the work file; how long will that take? About 2,777 days? But yes, with only 1 card it would take decades and decades to solve, so we run 256 GPUs at 500 MKey/s each. Now we should find 256 × 10,800 = 2,764,800 DPs a day, which means we would write to file roughly every 11 days.

However, like I said, 500 MKey/s is slow, so let's say our GPUs get 1,500 MKey/s, a DP found every ~3 seconds. Now each GPU finds 28,800 DPs a day, so 256 × 28,800 = 7,372,800 DPs a day, and we are writing to file about every 4 days. You can adjust the program to whatever time frame or file size you want. Maybe you want to save every 12 hours, or every hour; it doesn't matter.

When you merge those smaller files, it obviously creates a larger file. When you go to merge the next batch of smaller files, you merge them into that larger file, so you do not have to keep re-merging every file on every merge.
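Here is that dump-interval arithmetic as a small sketch (the exact figures come out slightly higher than the rounded ones above, because the true DP interval at 500 MKey/s is 2^32 / 5×10^8 ≈ 8.6 s, not 8 s):

```python
# How often a fleet of GPUs fills a 30,000,000-DP buffer at DP 32.

DP_BITS = 32
DUMP_THRESHOLD = 30_000_000      # DPs per write to the work file
SECONDS_PER_DAY = 86_400

def days_between_dumps(gpus, mkeys_per_gpu):
    jumps_per_second = gpus * mkeys_per_gpu * 1e6
    dps_per_day = jumps_per_second * SECONDS_PER_DAY / 2**DP_BITS
    return DUMP_THRESHOLD / dps_per_day

print(days_between_dumps(1, 500))     # ~2,983 days with a single slow card
print(days_between_dumps(256, 500))   # ~11.7 days
print(days_between_dumps(256, 1500))  # ~3.9 days
```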
You act like dumping the data more often is a problem; maybe it is if you have the first PC known to man trying to act as the server. Again, dump as much as your RAM holds into one file, or dump every hour; it does not matter, and it does not get messy. It can be as clean as you want or need it to be, or as "messy" as you let it be.
You can merge once a day, or as often as you like. It's not hard; you just start the program with the merge command and that's it (see the sketch below).
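To illustrate why the merging stays manageable, here is a simplified sketch. This is not the real program's work-file format: I'm assuming a hypothetical 32-byte record (16 bytes of DP x-coordinate plus 16 bytes of distance) and not tracking tame vs. wild herds, which the real solver needs in order to decide whether a collision is useful. It only shows the pattern of folding each batch of small dumps into one growing master file:

```python
# Simplified merge illustration (NOT the real work-file format).

import os, glob

RECORD = 32  # hypothetical entry: 16-byte DP x-fragment + 16-byte distance

def load_records(path):
    with open(path, "rb") as f:
        data = f.read()
    for i in range(0, len(data), RECORD):
        rec = data[i:i + RECORD]
        yield rec[:16], rec[16:]              # (dp_x, distance)

def merge_into_master(master_path, dump_paths):
    # Start from the existing master (if any), then fold in each small dump.
    table = dict(load_records(master_path)) if os.path.exists(master_path) else {}
    for path in dump_paths:
        for dp_x, dist in load_records(path):
            if dp_x in table and table[dp_x] != dist:
                # Two different walks reached the same distinguished point;
                # in the real solver a tame/wild pair here yields the key.
                print("collision at DP", dp_x.hex())
            table[dp_x] = dist
    with open(master_path, "wb") as f:
        for dp_x in sorted(table):            # keep the master sorted for cheap future merges
            f.write(dp_x + table[dp_x])

# Merge every small dump produced since the last merge into the master.
merge_into_master("master.work", sorted(glob.glob("dump_*.work")))
```

Each batch of small dumps is folded into the master exactly once, deduplicated, and never touched again, which is why the merge workload does not snowball.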
Puzzle 115 was solved with this program/algo in 13 days, using 256-384 GPUs getting 1,500-1,600 MKey/s each, with a DP of 25, which created a master work file in excess of 300 GB.
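That file size is in the right ballpark for the theory, too. A rough check, assuming the textbook Kangaroo estimate of about 2·sqrt(N) jumps over an interval N = 2^114 keys wide and ~32 bytes per stored entry (both simplifications, not the program's exact internals):

```python
# Rough sanity check of the 300 GB master-file figure for puzzle 115.

from math import isqrt

interval_width = 2**114   # assumed width of the puzzle 115 key range
dp_bits = 25
entry_bytes = 32          # assumed size of one stored DP entry

expected_jumps = 2 * isqrt(interval_width)       # ~2 * sqrt(N) textbook estimate
expected_dps = expected_jumps / 2**dp_bits
expected_file_gb = expected_dps * entry_bytes / 1e9

print(f"~{expected_dps:.2e} DPs, ~{expected_file_gb:.0f} GB expected")  # ~8.6e9 DPs, ~275 GB
```

Actual runs generally need somewhat more than the expected number of jumps, so a master file in excess of 300 GB is consistent with that estimate.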
So no, I still do not get your point.