Bitcoin's criteria has always been that each node adopt the first block it sees which meets the target, which creates an incentive to announce (one which breaks down if your share of the hashpower is over a large threshold, but if any party controls that much hashpower the system has other worse problems).
First off, I really appreciate you taking the time to reply. I am aware of the previous proposals to use lexigraphical mechanisms to create a preference in block choice. However, the proposal here is different in that it has two key new realizations that although subtle are what make this powerful. I know the paper is very dense so I will try to give a more intuitive idea of what is going on.
1) Although each hash is an independent event, each block is a dependent event because it references the hash of the parent on which it is built.
2) The weighting of each block should be based on the expected number of hashes that are needed to reproduce the chain of blocks specified by the tip block.
These two ideas when added together allows us to propose a new weighting mechanism, entropy reduction, which eliminates issues with regards to withholding attacks.
When a block is produced the number of bits (~ leading binary zeros, or = log2(hash)) which exceed the block threshold value follow an exponential distribution b = e^x which has a mean expectation (lambda) of 1. This means a miner will expect on average to get 1 more bit than the threshold, but can achieve many bits within the exponential distribution. However, they could get
luck and get > 1 extra bit or they could get unlucky and get < 1 extra bit. The interesting thing here is that even if they get lucky and say get 3 extra bits there is still a 12.5% chance that if there is a competing block produces they will lose. The likely hood of loosing never goes to zero.
In addition, the weighting using entropy is adding bits, which is the same as multiplying in the linear field. Specifically, Bitcoin: Total Difficulty(n) = diff(n) + Total Difficulty(n-1) whereas PoEM is: Total Difficulty(n) = diff(n) * Total Difficulty(n-1) .This means that even if a miner was to get super "lucky" and find 20 extra bits (1 in 1 million blocks) the extra entropy would only count for 1/3rd of a normal block and they would still have a 0.0000001% chance of losing if they withheld the block. This means that no matter
how lucky a miner gets, they will always maximize revenue by broadcasting the block as quickly as possible. This actually reduces the withholding attack that exists within the current difficulty weighting system.
If we took the same example, but use the current method of adding difficulties to determine block weight, the withholding period for finding 20 extra bits would be 1 million blocks. So compare 1 million to 1. This example, elucidates that this is a very different proposal than the simple proposal of using apparent difficulty. Rather this proposal is doing two things. First, It is realizing apparent
difficulty is beneficial because it creates a rule which makes heavier chains and it actually adds randomness into winning a block to prevent withholding. Second, it is using the proper combinatorics, treating blocks as a series of dependent events, to weight them which in turn solves all of the issues with using apparent difficulty.
I hope that provides some intuition. In the paper we have shown that this is indeed the case both empirically as well as experimentally. If you look at the graph on page 12 we show that for any g (block production rate per network delay) that the confirmation delay is smaller using this method compared to the current Bitcoin weighting.
We also explored the concept of concentration or looking at how the outcome distribution effects the statistical convergence. Bitcoin would be a binomial distribution and PoEM would be the biased gamma.
PS: Another way to think about this is that there is infinite variance in a single sample, so reintroducing randomness in the weighting, creates a better outcome because it means that a miner that has <50% of the hashrate is less likely to get sequentially lucky and produce a sequence of N blocks that is longer than the honest majority.