However, I'm not sure how the sub-p2pool would pay out the earnings.
Either the master p2pool share chain would have to know every sub-p2pool address and pay them their micro-payments in the master p2pool block
That's it.
Simple version.
1) Increase the LP interval of p2pool to 60 seconds (and increase the share difficulty by a factor of 6).
2) Expand the protocol from 1 share = 1 address to 1 share = multiple addresses with a weighted split.
3) Build a sub-p2pool protocol which creates its own share chain (unknown and invisible to the main p2pool).
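To make step 2 concrete, here is a rough sketch of what a weighted split of a single share could look like. The function name, the address-to-weight dict and the rule for handing out leftover satoshis are all my own assumptions for illustration, not anything p2pool actually implements.

```python
def split_share_payout(amount_satoshi, weights):
    """Split one share's payout across several addresses by weight.

    weights: dict of address -> non-negative integer weight
             (hypothetical format, assumed for this sketch).
    Returns a dict of address -> satoshis summing exactly to amount_satoshi.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Floor each proportional amount so we never pay out more than we have.
    payouts = {addr: amount_satoshi * w // total for addr, w in weights.items()}
    # Flooring leaves a few satoshis over; give them out one at a time,
    # largest weight first (ties broken by address), so nothing is lost.
    leftover = amount_satoshi - sum(payouts.values())
    for addr in sorted(weights, key=lambda a: (-weights[a], a))[:leftover]:
        payouts[addr] += 1
    return payouts
```

Working in integer satoshis avoids rounding drift, and the weights here would presumably come from the subpool's internal share chain.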
When p2pool finds a block it pays out as normal, except that one share may be paid to more than one address.
When the subpool finds a low-diff share it adds it to its internal share chain.
When the subpool finds a p2pool-diff share it submits it to p2pool (which updates the p2pool split).
When the subpool finds a block it submits it to the Bitcoin network.
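The routing rules above could be sketched roughly like this. It assumes the usual "hash must be at or below the target" comparison and that the three targets are nested (block target <= p2pool target <= subpool target). The function name and return labels are made up for illustration, and whether a higher-tier share should also be recorded in the lower-tier chains is left open.

```python
def route_share(hash_value, subpool_target, p2pool_target, block_target):
    """Return where a found hash should be submitted, or None if it is
    not even a valid subpool share.

    Assumes block_target <= p2pool_target <= subpool_target
    (a lower target means a higher difficulty).
    """
    if hash_value <= block_target:
        return "bitcoin"   # full block: submit to the Bitcoin network
    if hash_value <= p2pool_target:
        return "p2pool"    # p2pool-diff share: submit to the parent p2pool
    if hash_value <= subpool_target:
        return "subpool"   # low-diff share: add to the internal share chain
    return None            # below subpool difficulty: discard
```

Only the highest tier reached is returned here; a real node would need to know the current p2pool target, which is exactly the notification problem discussed below.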
Just thinking out loud...
I'm not sure how the subpool would be notified of share difficulty changes and submit shares to the parent p2pool. Each subpool node needs to learn of share difficulty changes, so it could maintain a connection to the parent p2pool to get them (or can it trust other subpool nodes to forward the information, with only a subset of the subpool in charge of broadcasting it?). If having "proxies" is a security problem and every subpool node must have a direct connection to the parent p2pool, is the p2pool protocol scalable enough to broadcast difficulty changes to thousands of nodes or more (assuming we want every miner to be able to join p2pool) instead of the current 200+ nodes?
I can see how share submission could be efficient: it's supposed to be low traffic.
Another problem is making sure that a subpool node holding a valid p2pool share is forced to reward the other nodes that have found low-diff shares. I'm not familiar enough with Bitcoin to be sure it is doable.
If p2pool can scale to an arbitrarily large number of nodes and still keep latencies low when broadcasting diff changes, the first class of problems doesn't really exist: just make all subpool nodes join both p2pool and the subpool, and submit each share to one or the other according to the difficulty it meets. I suppose it should work; I'm only worried about latencies on large-scale p2p networks (raising the target LP interval would help).
For the second problem, sorry, no clue.