Are all shares paid equally, or is this a scoring-based system?
Shares are paid in proportion to their difficulty. You take the difficulty of a share, divide by the summed difficulty of all shares among the last M (or N for vanilla p2pool), and multiply by the block reward (including fees). That gives you the reward the share would receive for that block if the block were found.
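To make the proportional split concrete, here is a minimal sketch. The function name `share_rewards`, the difficulties, and the 6.25-coin reward are made-up values for illustration, not anything from the p2pool code.

```python
from fractions import Fraction

def share_rewards(difficulties, block_reward):
    """Split block_reward across the shares in the window, in proportion to difficulty."""
    total = sum(difficulties)
    return [Fraction(d, total) * block_reward for d in difficulties]

# Hypothetical window of four shares and a hypothetical reward (subsidy plus fees).
window = [1, 1, 2, 4]
rewards = share_rewards(window, Fraction(25, 4))
for d, r in zip(window, rewards):
    print(d, float(r))
```

A share with four times the difficulty gets four times the payout, and the payouts always sum to the full block reward.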
If I am reading this correctly, all shares are paid equally. That means that when M is higher due to random variation, the value of mining on the pool is lower, and when M is lower, the value of mining on the pool is higher. The steady-state average is not what matters here, because the miner can choose to participate only during certain system states.
In periods where M is higher due to random variation, the probability of a share getting rewarded by another share finding a block increases, but the payout size decreases. These effects cancel out.
For example, if M is 4 (and all shares are the same difficulty), then if I mine a share, it will have 4 chances of probability 1/D to earn 1/4 of a block. If M is 256, then my share will have 256 chances of probability 1/D to earn 1/256th of a block. In either case, the expected revenue is the same.
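The cancellation is easy to check with exact fractions. This is only a sketch of the arithmetic above; `expected_revenue` and the value of D are names and numbers I picked for illustration.

```python
from fractions import Fraction

def expected_revenue(m, d):
    """m chances, each with probability 1/d, each paying 1/m of a block."""
    return m * Fraction(1, d) * Fraction(1, m)

D = 1000  # hypothetical: each share has a 1/D chance of also being a block
print(expected_revenue(4, D), expected_revenue(256, D))  # 1/1000 1/1000
```

The m in the number of chances cancels the m in the payout size, so the result is 1/D for any constant M.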
The problem comes not when M is higher or lower, but when M is rapidly changing.
A share will be payable for a considerable amount of time, and M might change during that time. For example, if N is 4, my share will have its potential reward calculated 4 different times as it travels through different levels of the DAG. Let's say that we start with 1 share per level in all 4 levels of the DAG when my share is mined (my own share being the top level). After this, 1 more level is mined with 1 share per level, and then a level with 5 shares is mined, followed by another two levels with 1 share. The first chance I have to get paid comes from my own share, where the DAG looks like this:
Round 0:
1*
1
1
1
I get one 1/D chance of getting a 1/4-block reward here. (The asterisk denotes my share.) Next round:
Round 1:
1
1*
1
1
1/D chance of getting a 1/4-block reward.
The next round is a bit tricky. Five shares are mined at once, but since they aren't yet aware of their siblings, each one sees the top level of the DAG as having only one share:
Rounds 2a-2e:
1
1
1*
1
Five 1/D chances of getting a 1/4-block reward. Finally:
Round 3:
1
5
1
1*
One 1/D chance of getting a 1/8-block reward.
Total expected reward = 2*(1/(4D)) + 1*(5/(4D)) + 1*(1/(8D)) = 15/(8D) blocks. The fair reward would be 1/D blocks.
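The rounds above can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the example: all shares have equal difficulty, and a share in the newest level counts only itself in that level plus every share in the N-1 levels below it. The function name and the list layout are my own.

```python
from fractions import Fraction

def expected_reward_times_d(levels, t, n):
    """Expected reward, in units of 1/D blocks, for one share in level t.

    levels[i] is the number of equal-difficulty shares mined in level i.
    """
    total = Fraction(0)
    for j in range(t, t + n):
        below = sum(levels[j - n + 1:j])      # shares in the n-1 levels under level j
        payout = Fraction(1, 1 + below)       # our slice if a share in level j finds a block
        chances = 1 if j == t else levels[j]  # siblings in our own level do not see us
        total += chances * payout
    return total

# Levels from oldest to newest; our share is the single share in level 3.
levels = [1, 1, 1, 1, 1, 5, 1]
print(expected_reward_times_d(levels, 3, 4))        # 15/8
print(expected_reward_times_d([1] * 7, 3, 4))       # 1
```

With one share in every level the result is exactly 1 (the fair 1/D), so the whole excess comes from the 5-share level.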
The above was a rather contrived and extreme example. I wrote a script to simulate the rewards for an N=2016 PPLNS system with between 1 and 10 shares per level, and the typical revenue per share varies by less than 1%, with the maximum I've seen being 2.6%.
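For anyone who wants to try this themselves, here is a rough sketch of how such a simulation might look. It is not the script mentioned above. It assumes equal-difficulty shares, level sizes drawn uniformly from 1 to 10, and the same "newest level only sees itself" rule as in the N=4 example; the seed and sampling stride are arbitrary.

```python
import random
from itertools import accumulate

def relative_revenue(levels, t, n, prefix):
    """Expected reward of one share in level t, divided by the fair 1/D."""
    total = 0.0
    for j in range(t, t + n):
        below = prefix[j] - prefix[j - n + 1]  # shares in levels j-n+1 .. j-1
        chances = 1 if j == t else levels[j]   # siblings in our own level do not see us
        total += chances / (1 + below)
    return total

random.seed(0)                                  # arbitrary seed, for repeatability
N = 2016
levels = [random.randint(1, 10) for _ in range(3 * N)]
prefix = [0] + list(accumulate(levels))         # prefix[i] == sum(levels[:i])

# Sample every 40th level in the middle third so each window is fully populated.
ratios = [relative_revenue(levels, t, N, prefix) for t in range(N, 2 * N, 40)]
print("min:", min(ratios), "max:", max(ratios))
```

A ratio of 1.0 means the share earned exactly the fair amount; the spread of the printed values shows how far individual shares drift from it under these assumptions.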
Again, you can't hop based on this, because it depends on events that happen after you mine a share, which can't be predicted. It's a selfish mining vulnerability, but not a hopping vulnerability.
And also again, this variation is completely eliminated by the PPLKD scheme I described earlier.