So, would you say the approach below would be a better way to determine the hash rate?
This of course takes maximum likelihood into account; L() is in fact the likelihood function. I've always been terrible with statistical models, but this one may give you a more coherent distribution than the Poisson, which smells really fishy, wink wink.
Can you explain how the above likelihood function would help? I'm not sure I follow.
Here's how I understand estimates of the network hashrate:
Firstly, if the network hashrate is unchanging, then the number of blocks per unit time is Poisson distributed.
Since any hash has the same chance of being sufficiently under the target to solve a block, the number of hashes per block is geometrically distributed, with p = 1/target. Since p is extremely small, we can approximate this quite closely with an exponential distribution. Assuming the hashrate doesn't change, the amount of time between block solves is exponentially distributed, and thus the block-solving process is a Poisson process.
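A quick numerical sanity check of that approximation (just a sketch with made-up numbers; p and the hashrate here are hypothetical, not real network values):

```python
import numpy as np

rng = np.random.default_rng(0)

p = 1e-9          # per-hash chance of solving a block (hypothetical, = 1/target)
hashrate = 1e6    # hashes per second (hypothetical)

# Hashes needed per block are geometric(p); block time = hashes / hashrate.
block_times = rng.geometric(p, size=100_000) / hashrate

# Exponential approximation: block rate = p * hashrate per second, so the
# mean block time should come out very close to 1 / (p * hashrate).
print("empirical mean block time:", block_times.mean())
print("exponential mean, 1/(p*hashrate):", 1 / (p * hashrate))
```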
However, the block-solve rate changes, sometimes significantly, between any two retargets. This makes the process a non-homogeneous Poisson process.
So you can calculate the hashrate from the number of blocks per unit time, or from the inverse of the time taken per block (or per n blocks), and use confidence intervals based on whichever method you used (the Poisson distribution for blocks per time, the Erlang distribution for the time per n blocks).
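For instance, here's a sketch of the blocks-per-time estimator with an exact Poisson confidence interval; the observation window and difficulty are made-up inputs, and difficulty * 2**32 is the usual convention for expected hashes per block:

```python
from scipy.stats import chi2

blocks = 12            # blocks observed in the window (hypothetical)
seconds = 2 * 3600     # observation window: two hours
difficulty = 5e13      # network difficulty at the time (hypothetical)

# Expected hashes per block at this difficulty (standard convention).
hashes_per_block = difficulty * 2**32

# Block rate and its exact 95% Poisson confidence interval,
# using the chi-squared link: lambda in [chi2(a/2, 2k), chi2(1-a/2, 2k+2)] / 2t.
rate = blocks / seconds
lower = chi2.ppf(0.025, 2 * blocks) / (2 * seconds)
upper = chi2.ppf(0.975, 2 * (blocks + 1)) / (2 * seconds)

# Scale each block rate by the expected work per block to get hashes/s.
for r, label in [(lower, "2.5%"), (rate, "point"), (upper, "97.5%")]:
    print(f"{label}: {r * hashes_per_block:.3e} hashes/s")
```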
My own method is to use a non-parametric smoothing spline. I then time-stretch the data so that the smoothing spline gives 12 blocks every two hours, and optimize such that the actual two-hourly block counts are Poisson-distributed random variables with a mean rate of 12; a rough sketch of the spline step follows.
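I can only guess at the details of that pipeline, but the spline part might look something like this minimal sketch (the data is simulated, the smoothing factor s is an arbitrary stand-in, and the time-stretching and the optimization of s against the Poisson criterion are omitted):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)

# Simulated two-hourly block counts around the target mean of 12,
# with a slow drift standing in for a changing hashrate
# (real input would be counts taken from block timestamps).
bins = np.arange(200)
counts = rng.poisson(12 * np.linspace(0.8, 1.3, bins.size))

# Non-parametric smoothing spline through the counts; s is the
# smoothing knob the optimization step would be tuning.
spline = UnivariateSpline(bins, counts, s=counts.size * counts.var())

smoothed = spline(bins)   # estimated block rate per two-hour bin
print(smoothed[:5])
```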