
Topic: How to calculate Total Network Hash Rate? (Read 5104 times)

full member
Activity: 210
Merit: 100
August 10, 2015, 04:11:15 PM
#10
If you don't want to calculate it yourself but just want to get the number from a web API, you can use the blockchain.info API and fetch it from the URL https://blockchain.info/q/hashrate
You can browse the query API page at https://blockchain.info/q/ for a full list of available API queries.
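Here's a minimal sketch in Python (assuming the endpoint returns a bare number in the response body; the unit is whatever the query page documents, historically GH/s, so check there):

Code:
# Fetch the current network hash rate from blockchain.info's query API.
import urllib.request

with urllib.request.urlopen("https://blockchain.info/q/hashrate") as resp:
    hashrate = float(resp.read().decode())

print(hashrate)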
sr. member
Activity: 362
Merit: 262
The network hash rate (over, say, the last 1008 blocks) is estimated by asking:
1. How long did it take to solve those 1008 blocks?
2. What hash rate would result in 1008 blocks being solved in that time, if they were solved exactly as expected?

This gives us the estimated hash rate over the last 1008 blocks.

Interesting points to note:
- That gives us not the hash rate right now, but an estimate of the average hash rate over those 1008 blocks.
- It's an estimate. The more blocks you include in the average, the closer the estimate gets to the true hash rate.
- The more blocks you include, though, the further back in time your estimate reaches.
- If the network was "lucky" (or unlucky) over that period, it would over- (or under-) estimate the true hash rate. The more blocks we include, the less likely we are just seeing random luck.

So there is a trade-off. If you really want to estimate the recent hash rate you need to use fewer blocks. But the fewer blocks you use, the less stable the hash rate estimate (i.e. the more likely it is to be off from the true hash rate).
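Here's a minimal sketch of that calculation in Python (assuming Bitcoin-style difficulty, where solving one block takes difficulty * 2^32 hashes on average; the timestamps and difficulty are hypothetical inputs you'd pull from block data):

Code:
# Estimate the average network hash rate over a window of blocks.
# Assumes difficulty * 2**32 expected hashes per block (Bitcoin-style).
def estimate_hashrate(first_time, last_time, difficulty, num_blocks=1008):
    elapsed = last_time - first_time                    # seconds to solve the window
    expected_hashes = num_blocks * difficulty * 2**32   # hashes if solved exactly as expected
    return expected_hashes / elapsed                    # hashes per second

# Example: 1008 blocks solved in exactly one week (600 s per block)
print(estimate_hashrate(0, 1008 * 600, 52.0e9))

Note the trade-off described above: shrinking num_blocks makes the estimate more current but noisier.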
legendary
Activity: 924
Merit: 1000
Dark Passenger Bitcoin miner 2013,Bitcoin node
Where can I learn to do the mathy stuff? Regards
newbie
Activity: 17
Merit: 0
February 12, 2014, 09:39:49 PM
#7
Thanks for everyone's replies; this gives me a good jumping-off point. Who would have guessed all that math you learn in college could actually have a use?
newbie
Activity: 27
Merit: 2
February 12, 2014, 08:17:51 PM
#6
Alright, sadly, I think I've been overthinking it way too hard - sadly because I was quite enjoying trying to piece it together.

So the probability of finding a successful hash is equal to the target hash value over the total number of hash values. The target hash value is calculated directly from the difficulty, I believe.

So say p = probability of finding a hash, tg = target hash value (calculated from the difficulty somehow - this would be in the code somewhere), and tot = total hashes possible (2^256 for a 256-bit hash, etc...)

So:

p = tg / tot

But also, due to the (Strong or Weak) Law of Large Numbers (http://en.wikipedia.org/wiki/Law_of_large_numbers), we can say that the relative frequency of event occurrences approaches the probability of event occurrence when you have lots of events (hashes).
I would say that we are dealing with pretty big numbers (though not really very large when compared to the total address space, otherwise reversing a hash would be too easy), so we can probably safely operate under this assumption.

So, we can say f = frequency of successful hashes, all = count of all hashes in a time period, and good = count of successful hashes in the same time period

// Common sense
f = good / all

// Apply law of large numbers
p = f
tg / tot = good / all

// The only unknown is all
all = good * tot / tg

So, to calculate the average hashrate for a given period, simply:
1) Take the total number of blocks generated for that period
2) Multiply it by the total number of hash possibilities (2^256 for a 256-bit hash)
3) Divide it by the number of hashes that would be considered "valid" for a block (the target hash value)

The number of valid hashes can be calculated from the difficulty, which is something that is coded into the protocol.
A good place to start to find out how difficulty is transformed into a target hash value would be: https://github.com/bitcoin/bitcoin/blob/ca1913e8f64fc245157b008d6b37160306aa1d83/src/miner.cpp
Just search for "target".
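Here's that recipe as a Python sketch (assuming the standard Bitcoin relation target = max_target / difficulty, where max_target is the difficulty-1 target 0xffff * 2^208; the block count and period are hypothetical inputs):

Code:
# Sketch of all = good * tot / tg from the derivation above.
MAX_TARGET = 0xFFFF * 2**208    # difficulty-1 target (Bitcoin)
TOTAL_SPACE = 2**256            # tot: all possible hash values

def network_hashrate(blocks_found, period_seconds, difficulty):
    target = MAX_TARGET / difficulty     # tg: threshold for a valid hash
    p = target / TOTAL_SPACE             # p = tg / tot
    all_hashes = blocks_found / p        # all = good * tot / tg
    return all_hashes / period_seconds   # hashes per second

# Example: 1008 blocks in one week at difficulty 52e9
print(network_hashrate(1008, 1008 * 600, 52.0e9))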
newbie
Activity: 27
Merit: 2
February 12, 2014, 07:28:14 PM
#5
A more mathy answer:

So we start with the Binomial distribution:

Bin(n, k) = (n choose k) p^k q^(n-k)

where n is the total number of hashes that have occurred in a time period, and k is the total number of successes (blocks generated).
We can find p (the probability of success) by dividing the total success space by the total address space (so the target hash value over the maximum hash value, all 1s in binary). q is just 1 - p, the probability of failure.

We can take the large-n limit to get the Gaussian distribution (sorry, I said Poisson earlier, my bad). You can look up this process elsewhere; it's really not too hard.

So what we want is to calculate n (the total hashes) from knowing k (the successful hashes) and p (the probability of success of a single hash).

So, make into a Gaussian distribution:

Gauss(x) = 1 / (std_dev * sqrt(2 * pi)) * exp(-(x - avg)^2 / (2 * std_dev^2))
(Sorry for awful formatting, check it out on wikipedia).

So, this itself is not too important, but what is important is that we can integrate this function between two values and get the probability of getting a result between those values.

So we plug in our mean block generation time (avg above, mu on wikipedia), then we plug in our standard deviation (we can calculate the variance in block generation times from the last set of blocks; the window since the last difficulty change would be a good choice).

Then we want a certain probability out of it (which we calculated above as our target address space over our total address space), which we set equal to the integral of our function from one point to another. Assuming equal spread on both sides of the mean for our block generation time, we can then say that our probability P (target / total) = Integrate(Gauss, from avg - t to avg + t), or just that P = 2 * Integrate(Gauss, from -infinity to avg + t) - 1. Once again, sorry for the horrible formatting.

This integral can't actually be done analytically, but numerically it's no problem, and there are tons of lookup tables for the "Standard Normal Distribution", which our problem converts to fairly easily.
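If you'd rather skip the lookup tables, Python's standard library (3.8+) has the standard normal CDF and its inverse, so you can solve P = 2 * Phi(t / std_dev) - 1 for t directly (the std_dev below is a made-up example value):

Code:
# Solve P = 2 * Phi(t / std_dev) - 1 for t via the standard normal.
from statistics import NormalDist

def solve_t(P, std_dev):
    # t = std_dev * Phi^-1((P + 1) / 2)
    return std_dev * NormalDist().inv_cdf((P + 1) / 2)

print(solve_t(0.95, 600.0))  # ~1176 s when std_dev is 600 s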

So really, I probably went too far here. The quantity I was interested in was 't' above, because it's unknown. But to calculate the purported hash rate, you could just look at the average of the distribution of block times...

Hmmm, I'll think about it a bit more....
newbie
Activity: 27
Merit: 2
February 12, 2014, 07:09:34 PM
#4
I think that calculating an actual hash rate would be exceedingly difficult (to nail down anyway, maybe not to estimate).

The difficulty on the other hand is not hard to calculate, because we have a target block time and can measure that very accurately.

The reason the hash rate would be difficult to calculate is simply that hashes are generated by hardware, which is supplied by economies, which is difficult to measure directly in a protocol: how can you account for changes such as new ASICs getting twice the hashes of the previous generation?

Instead we have to assume that hash rate is tied to the difficulty, to which it should at least be correlated. If you want to actually calculate the hash rate, you would have to look at how probable it is to generate a valid hash for a given difficulty by guessing random values (using the current difficulty if you're looking for the current hash rate). So look at the total guessing space (the number of possible hashes that exist), and divide the number of valid hashes (those below the target value) by that total. Then you have your "probability" of getting a correct hash.

Then you can use the same inverse exponential model used for radioactive decay (the Binomial distribution in the low-number limit, or the Poisson distribution, I believe, not sure on that; whatever it is, it's a random event with no memory of previous events) to back out the frequency of events, using the current average block generation rate (with some modification) as the "half-life". Then you could probably back out the frequency of smaller and smaller events, until you got to "probability 1", equivalently "every hash that happens."

Sorry if that made no sense, I learned most of this stuff in German despite speaking English natively.
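Here's a toy sanity check of that memoryless model in Python (all numbers made up for illustration): if each hash succeeds with probability p, inter-block times are exponential with rate H * p, so you can back the hash rate out of the mean interval.

Code:
# Toy simulation: blocks as a memoryless (Poisson) process.
# Each hash succeeds with probability p, so inter-block times are
# exponential with rate H * p, and H = 1 / (p * mean_interval).
import random

p = 1e-9        # assumed per-hash success probability
H_true = 1e7    # assumed true hash rate (hashes/second)

random.seed(42)
intervals = [random.expovariate(H_true * p) for _ in range(10000)]
mean_interval = sum(intervals) / len(intervals)

print(1 / (p * mean_interval))  # should land near H_true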
newbie
Activity: 27
Merit: 2
February 12, 2014, 06:56:35 PM
#3
How about this: https://github.com/bitcoin/bitcoin/blob/16403b42759205461918c652fdc5b71c422c44a2/src/rpcmining.cpp

Just search for "hash" and you'll get the function "GetNetworkHashPS".

Description:
Quote
// Return average network hashes per second based on the last 'lookup' blocks,
// or from the last difficulty change if 'lookup' is nonpositive.
// If 'height' is nonnegative, compute the estimate at the time when a given block was found.

Maybe not quite what you're looking for, but could be a place to start.

Full disclosure: I haven't looked through the function; I just figured the source would be a good place to figure this out.
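Worth noting: that function is exposed over RPC as getnetworkhashps, so if you're running a node (assuming a reasonably recent Bitcoin Core with RPC enabled) you can just run bitcoin-cli getnetworkhashps 120 to get the estimate over the last 120 blocks instead of re-deriving the math.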
member
Activity: 112
Merit: 10
Do you moo?
February 12, 2014, 02:32:44 PM
#2
Quote
I'm trying to understand how to calculate the total network hash rate for a coin.
The only thing I could think of was taking the average time at which blocks are being found and deriving the rate from that.
Is there any other way to do it?

My other related question is: what formula is used to compute difficulty, or rather projected difficulty?

Check out the "hash_rate" field here:
https://blockchain.info/api/charts_api

I don't know how they calculate it, though.
newbie
Activity: 17
Merit: 0
February 12, 2014, 01:05:35 PM
#1
I'm trying to understand how to calculate the total network hash rate for a coin.
The only thing I could think of was taking the average time at which blocks are being found and deriving the rate from that.
Is there any other way to do it?

My other related question is: what formula is used to compute difficulty, or rather projected difficulty?