I may not understand what you are trying to accomplish.
If H1 is slow but does not require a lot of random access to memory, then you can run H1 on a GPU or ASIC, then deliver a set of indexes into the blockchain to the node.
That's why H1 has to be fast, as you wrote in the shortcomings.
If the blockchain fits in memory, then you are doing a handful of memory accesses and the other work may dominate. If the blockchain does not fit in memory, then you are giving a huge advantage to people with large solid-state drives (flash or battery-backed DRAM) or, probably better, the ability to store the blockchain in an in-memory KVS across multiple servers.
This may frustrate decentralization, because you are better off just maintaining a connection to a node/pool with such a device than running a node yourself.
Let's do some calculations to see what we have:
To get the hashing data from a block we use:
1. Coinbase outs: usually 10 * 32 = 320 bytes.
2. Tx hashes: 32 * (1 to 80 transactions; 80 is the current Bitcoin transaction flow) = 32 to 2560 bytes.
With 720 blocks per day, the scratchpad will grow by roughly 92 MB to 758 MB per year. That is enough to keep ASICs away but OK for normal miners. Even if we are very successful and reach a tx flow like Bitcoin's within the next ten years, the scratchpad will be about 10 GB, which is not a problem even now.
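The estimate above can be re-derived with a few lines of Python. This is only a back-of-the-envelope sketch: the 365-day year and decimal megabytes are assumptions, and the function name is made up for illustration. Under those assumptions it prints about 92.5 MB and 756.9 MB per year, in line with the figures quoted above.

```python
BLOCKS_PER_DAY = 720
DAYS_PER_YEAR = 365          # assumption: plain 365-day year
COINBASE_BYTES = 10 * 32     # ~10 coinbase outs, 32 bytes each
TX_HASH_BYTES = 32           # one 32-byte hash per transaction


def scratchpad_growth_per_year(txs_per_block: int) -> int:
    """Bytes added to the scratchpad in one year at a given tx rate."""
    per_block = COINBASE_BYTES + TX_HASH_BYTES * txs_per_block
    return per_block * BLOCKS_PER_DAY * DAYS_PER_YEAR


low = scratchpad_growth_per_year(1)    # nearly empty blocks
high = scratchpad_growth_per_year(80)  # Bitcoin-like tx flow
print(f"{low / 1e6:.1f} MB to {high / 1e6:.1f} MB per year")
```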
The real problem, I think, is supporting an SPV client with this approach.
What do you think?
If you want to use the block chain for PoW like Ethereum to require miners to run nodes (but see above), then you can probably do something simple like:
B = block
E = hash function, such as Keccak
B(i) = blockchain data at index i (mod len(blockchain) or some such)
H1=E(B)
H2=E(B+1)
PoW= E(B(H1)+H2)
Could be repeated, but not sure that adds much.
Maybe that is close to what you propose, but again I don't see the point of using a scratchpad at all. The blockchain is essentially your scratchpad.
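For concreteness, the scheme sketched above could look roughly like this in Python. Several things here are assumptions, not part of the proposal: SHA3-256 from hashlib stands in for Keccak, "+" is read as byte concatenation, "B+1" is read as the block bytes with a trailing 0x01 byte, and the blockchain is modeled as a plain list of byte strings. The name `simple_pow` is hypothetical.

```python
import hashlib
from typing import Sequence


def E(data: bytes) -> bytes:
    # Stand-in for Keccak (assumption): SHA3-256 from the standard library.
    return hashlib.sha3_256(data).digest()


def B_at(chain: Sequence[bytes], h: bytes) -> bytes:
    # B(i): blockchain data at index i mod len(blockchain).
    return chain[int.from_bytes(h, "big") % len(chain)]


def simple_pow(chain: Sequence[bytes], block: bytes) -> bytes:
    h1 = E(block)                    # H1 = E(B)
    h2 = E(block + b"\x01")          # H2 = E(B+1), "+1" read as an appended byte
    return E(B_at(chain, h1) + h2)   # PoW = E(B(H1) + H2)


# Toy chain of four fake entries, just to show the call.
chain = [bytes([i]) * 32 for i in range(4)]
print(simple_pow(chain, b"candidate block").hex())
```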
We do something very similar:
E' - first phase hash.
E - final phase hash.
H1' and H1'' - different parts of the same hash (low and high), each used to address a random block
H1=E'(B)
PoW= E(H1 + E(B(H1')) + E(B(H1'')) )
E' can be Keccak (at least it should be as fast as Keccak), but it is better to use a hash with a more complicated instruction set, as I said (64-bit multiplication, AES/SSE).
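A rough Python sketch of the two-phase formula above, under stated assumptions: both E' and E are replaced by SHA3-256 from hashlib (the post suggests something heavier for E'), "+" is read as byte concatenation, H1' and H1'' are taken as the first and second 16 bytes of H1, and the scratchpad is a plain list of byte strings. The names `two_phase_pow`, `e_first` and `e_final` are hypothetical.

```python
import hashlib
from typing import Sequence


def e_first(data: bytes) -> bytes:
    # E': first-phase hash. SHA3-256 is only a placeholder (assumption).
    return hashlib.sha3_256(data).digest()


def e_final(data: bytes) -> bytes:
    # E: final-phase hash, also SHA3-256 here (assumption).
    return hashlib.sha3_256(data).digest()


def two_phase_pow(scratchpad: Sequence[bytes], block: bytes) -> bytes:
    h1 = e_first(block)                           # H1 = E'(B)
    lo = int.from_bytes(h1[:16], "little")        # H1'  (low part, assumed split)
    hi = int.from_bytes(h1[16:], "little")        # H1'' (high part, assumed split)
    a = e_final(scratchpad[lo % len(scratchpad)])  # E(B(H1'))
    b = e_final(scratchpad[hi % len(scratchpad)])  # E(B(H1''))
    return e_final(h1 + a + b)                    # PoW = E(H1 + E(B(H1')) + E(B(H1'')))


# Toy scratchpad of eight fake entries, just to show the call.
pad = [bytes([i]) * 32 for i in range(8)]
print(two_phase_pow(pad, b"candidate block").hex())
```

Each evaluation touches two pseudo-random scratchpad entries, so a miner needs the whole scratchpad available to try nonces at full speed.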