Anyway, the intention of adding 256x PoW seems reasonable, but it will also have an undesirable impact on nodes (not just miners): when new blocks are synchronised, each node has to run the same loop to verify the hash, which will slow down blockchain sync and make it a bit painful when a new node joins.
It would be more efficient if this were changed so that only the mining algorithm is affected.
* It could be possible to put the loop index in the block header, in nNonce, which is currently a dead field since it is always set to 0xFEEDBEEF.
* This may be the right time to make this change, along with the hard fork at #21111, rather than at a later time should it turn out to be needed.
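To illustrate the suggestion above, here is a minimal sketch of how a stored loop index would let a node verify with a single hash. Everything in it is an assumption for illustration: `pow_hash`, `mine`, `verify_with_stored_index` and `MAX_ROUNDS` are hypothetical names, SHA-256 is only a stand-in for the chain's real hash, and the way the index is mixed into the header is made up.

```python
import hashlib

MAX_ROUNDS = 256  # upper bound on the loop index, as discussed in this thread


def pow_hash(header: bytes, index: int) -> int:
    """Stand-in hash (assumption): SHA-256 over the header plus a one-byte index."""
    digest = hashlib.sha256(header + bytes([index])).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, target: int):
    """Miner side: try each index and return the first one that meets the target."""
    for index in range(MAX_ROUNDS):
        if pow_hash(header, index) <= target:
            return index
    return None  # no index in range satisfies the target


def verify_with_stored_index(header: bytes, nonce: int, target: int) -> bool:
    """Node side: a single hash, using the index the miner stored in the nonce."""
    return 0 <= nonce < MAX_ROUNDS and pow_hash(header, nonce) <= target
```

With this layout the miner still pays for the loop, while a syncing node does one range check and one hash per block. A nonce outside 0..255, such as the current 0xFEEDBEEF, is simply rejected by the range check.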
256x will be enough. The goal was to nudge the amount of PoW just enough to make it a PoT chain that requires more computing resources at a much faster rate than BTCW. Testing has shown that a modern i9 core is loaded to 100% by about one block's worth of UTXOs. There are about 144 blocks per day, which equates to about 144 new cores required every day to keep up with the potential UTXOs, or around 5 modern computers per day. That is about 256 times the rate of BTCW. Each month the network would need about 150 new modern computers to keep up. These numbers are night and day compared with BTCW and will be sufficient to require more computers to mine, while keeping in line with the proof of transaction (PoT) model and preventing much of a benefit from mining with a GPU or ASIC.
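The arithmetic behind these figures can be written out as a quick check. This is only a restatement of the numbers quoted above. The 30-day month and the derived cores-per-computer value are assumptions, not measurements.

```python
BLOCKS_PER_DAY = 144      # figure from the post
CORES_PER_BLOCK = 1       # one i9 core saturated per block's worth of UTXOs
COMPUTERS_PER_DAY = 5     # figure from the post
DAYS_PER_MONTH = 30       # assumption: a 30-day month

# New cores needed per day to keep up with the potential UTXOs.
cores_per_day = BLOCKS_PER_DAY * CORES_PER_BLOCK

# Cores per machine implied by "around 5 modern computers per day".
cores_per_computer = cores_per_day / COMPUTERS_PER_DAY

# New machines needed per month.
computers_per_month = COMPUTERS_PER_DAY * DAYS_PER_MONTH

print(cores_per_day, cores_per_computer, computers_per_month)
```

The 5 computers per day figure works out to roughly 29 cores per machine, and 5 per day over a 30-day month gives the 150 per month quoted above.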
When initially implementing the code update, I was going to embed the number inside the nonce as you suggested. However, there were places in the code where the block height wasn't available to make the distinction, and I did not want to resort to using time as a differentiator for fork determination. I therefore decided to let the network verify each block by brute-forcing up to 256 attempts. On average there will be 128 attempts; once a solution is found, the loop exits. On modern computers this is a negligible number of core cycles and will not be the bottleneck in initial block download or in block-to-block processing.
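For reference, the brute-force verification described here looks roughly like the sketch below. It is not the actual implementation: `pow_hash`, `verify_by_brute_force` and `MAX_ROUNDS` are hypothetical names, SHA-256 stands in for the real hash, and the average assumes the winning index is uniformly distributed over 0..255.

```python
import hashlib

MAX_ROUNDS = 256  # the verifier tries at most this many indices


def pow_hash(header: bytes, index: int) -> int:
    """Stand-in hash (assumption): SHA-256 over the header plus a one-byte index."""
    digest = hashlib.sha256(header + bytes([index])).digest()
    return int.from_bytes(digest, "big")


def verify_by_brute_force(header: bytes, target: int):
    """Try indices in order; return (valid, attempts) and stop at the first hit."""
    for index in range(MAX_ROUNDS):
        if pow_hash(header, index) <= target:
            return True, index + 1  # early exit once a solution is found
    return False, MAX_ROUNDS        # invalid blocks always cost the full loop


# Expected attempts for a valid block if the winning index is uniform over 0..255.
expected_attempts = sum(range(1, MAX_ROUNDS + 1)) / MAX_ROUNDS
print(expected_attempts)  # 128.5
```

Under that uniformity assumption a valid block costs about 128 hashes on average, while an invalid block always costs the full 256 before it is rejected.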