(link is to wayback machine so you can trust visiting it)
https://web.archive.org/web/20171212224738/http://www.naturehackerproducts.com/2017/12/new-proof-of-work-pow-method-based-on.html?m=1
The take home message is to find a single factor (of a length defined by the difficulty) of a very large number (over 100 digits).
Since it is such a large number GPU's become very slow at attempting it.
I have a quibble with your "GPU-resistant" claims. We don't know that factorization cannot be efficiently randomized. If it can, it can also be trivially parallelized by randomly dividing the search space and sending each range to a different processing unit. In short, you can't claim an upper bound on parallelization without formal proof. Even if no one knows how to parallelize factorization (is this the case?), it doesn't prove that factorization can't be parallelized.
IMO, the effort expended on trying to democratize mining is wasted. Centralization of mining will always occur. Even factors like bandwidth and network latency are favored by centralized mining. Differences in electricity costs and so on only amplify this effect.
But there is no reason that the effort spent solving PoW puzzles has to be useless. Any problem in NP (including factoring) is a potential candidate and I recommend Circuit-SAT as an ideal candidate for PoW puzzles. Using this approach has the benefit that any real-world problem can be encoded as a circuit and then submitted for solution by the mining network.