Pages:
Author

Topic: probability that 2 clients generate the same public key? (Read 4969 times)

full member
Activity: 140
Merit: 100
Shall we stop worrying about this problem now?

But it can happen, so it will happen!
At least so my physics teacher told me in the introduction to quantum mechanics.  Wink
legendary
Activity: 1246
Merit: 1016
Strength in numbers
As it happens, the numbers in bitcoin are so large that it doesn't matter too much.  As you can read above, if every person on the planet generated 77 thousand addresses every second of their life for the next 75 years we would expect a 50% chance of collision.

To be honest, I don't really think that's enough of a margin if we're planning to be secure for a long long time, I would rather see orders of magnitude similar to the age of the universe.  However, it's certainly enough for our practical purposes today.  Possibly by the time it becomes relevant, bitcoin will have evolved to use a different addressing scheme.


When your team of 6 billion finally finds a collision on average you'll get 21 million bitcoins / total number of addresses. That'll be exciting.

I'll keep my wishes for safe car rides.
donator
Activity: 1218
Merit: 1079
Gerald Davis
(a) My concern wasn't with "very low"; it was with "only 365 possible birthdays so if you have 100 people 100 is roughly 1/3rd of 365".  The implication being that 1/3rd is the scaledown factor.  It isn't.  The scaledown factor is 100*99.  You'll agree that that is significantly higher than 3.  That huge disparity between what is expected (1/3) and what is the truth (1/9900) is what makes the "Birthday Surprise" a surprise for people who haven't heard of it.

I was never implying the odds are 1/3rd but that because in a sample of 100 people with each persons having only 1 of 365 possible choices the RATIO between the number of samples and sample range is what result s in a large birthday effect.  In hindsight it may not have been clear.    For example with only 2 people the ratio between the samples and sample range is much larger and the birthday effect is much smaller.

Bitcoin falls into the latter category.  Realistically even a quadrillion users with a quadrillion active keys is an infinitesimal tiny amount (hence large ratio).  

Quote
I'm not really suggesting that collision is likely (I did specifically say earlier that we shouldn't worry about this).  However, to think that my worry is each human generating 77k keys per second is to miss the point.  Bear in mind that miners are measuring their hashing power in hundreds of MEGA hashes per GPU.  Also bear in mind that computing power is likely to continue to increase as time passes.   Each GPU therefore represents about a thousand people -- today; what it will be in ten years is anybodies guess (Moore's law suggests it will be 160 thousand people).  I was therefore merely commenting that 77k per second doesn't seem like an enormous margin of protection when typical cryptographic security measures talk about orders of magnitude that make the number of atoms in the universe seem small.

Of course it does because that computing time has cost and it is the # of active keys which boosts the chance due to the birthday effect.   Your example involves 77k leys per second WITH VALUE to have any possible chance of collision.  If the # of keys that have value remain relatively small (say a quadrillion users with a quadrillion keys each) then even a billion fold increase in computing power puts the chance of a collision in the next millennium at the rounding error range.
donator
Activity: 29
Merit: 252
hero member
Activity: 504
Merit: 502
Despite that long rant you said the same thing.

a) that with 100 people the odds in all having unique birthdays is very low.  I said that.  You said that.
b) that under any realistic scenario that odds in having enough private keys used to make birthday problem a problem isn't a problem.

I would say your every person on the planet generating 77 thousands keys per second to be dubious.  All economic activity on the planet in all forms (even informal) is on the magnitude <1 per second per person.  Of course a collision with a long since forgotten key isn't a security risk.  Even if every human was generating 77 thousand transactions per second continually over their entire lives it is unlikely they would be saving all ~2.5 trillion private keys every year.

So given the risk of a collision only applies if the collided key is currently in use (a viable copy remains) there is no realistic scenario where enough keys would be "in circulation" for the the birthday paradox/problem/surprise has any meaningful effect on the chance of a duplicated key.

It wasn't meant to be a rant; so my apologies if it came across that way.

(a) My concern wasn't with "very low"; it was with "only 365 possible birthdays so if you have 100 people 100 is roughly 1/3rd of 365".  The implication being that 1/3rd is the scaledown factor.  It isn't.  The scaledown factor is 100*99.  You'll agree that that is significantly higher than 3.  That huge disparity between what is expected (1/3) and what is the truth (1/9900) is what makes the "Birthday Surprise" a surprise for people who haven't heard of it.

(b) The problem is further that the scaling factor goes up as the square.  The more keys we generate the faster and faster we approach the danger zone.  The birthday paradox will, eventually, inevitably, be a problem.  In terms of address collision it is the problem.

I'm not really suggesting that collision is likely (I did specifically say earlier that we shouldn't worry about this).  However, to think that my worry is each human generating 77k keys per second is to miss the point.  Bear in mind that miners are measuring their hashing power in hundreds of MEGA hashes per GPU.  Also bear in mind that computing power is likely to continue to increase as time passes.   Each GPU therefore represents about a thousand people -- today; what it will be in ten years is anybodies guess (Moore's law suggests it will be 160 thousand people).  I was therefore merely commenting that 77k per second doesn't seem like an enormous margin of protection when typical cryptographic security measures talk about orders of magnitude that make the number of atoms in the universe seem small.
kjj
legendary
Activity: 1302
Merit: 1026
The probability is effectively zero.  Even if the entire mass of the solar system was converted to computing devices and run for a very long time.
donator
Activity: 1218
Merit: 1079
Gerald Davis
Forgive me for the newbie question, but have such calculations taken into account the "birthday problem"  effect?

That is, when looking for collision probability, by, say, 2020 AD, one should estimate all the keys in use by then, and then adjust the answer according to that number?




The birthday paradox only applies when the number of elements in the set is a significant fraction of the number of potential unique values of the set.

For example there are only 365 possible birthdays so if you have 100 people 100 is roughly 1/3rd of 365.  The odds that all one hundred are unique is very small.

Even if in 2220 if there have been 1 quadrillion users (current and historical) and they have generated a quadrillion addresses each only 1 in  95,780,971,304,118,100,000,000,000,000,000,000,000,000,000,000,000,000 addresses will have been used.  The effect of Birthday paradox is insignificant.


That's not the Birthday Paradox.

The Birthday Paradox is not a paradox at all.  It should be better called the "birthday surprise" (and your description implies that you've fallen for it); in that it requires far fewer people in the pool than the typical non-mathematician would expect before it becomes likely that two of them share a birthday (it's 23 people for a 50% chance).  The odds of a preselected birthday matching for any member of the pool is indeed 1/365, but what people forget is that that is not what we're looking for -- we're looking for any member of the pool to match any other member of the pool.   I dealt with this above in my example, the key line is that the standard odds are reduced by a factor according to the binomial operator, or the n-choose-r function.

The Birthday Problem definitely is relevant in all situations where we're looking for collisions.  The factor that reduces the input odds goes up as the square of the number of members of the pool.

As it happens, the numbers in bitcoin are so large that it doesn't matter too much.  As you can read above, if every person on the planet generated 77 thousand addresses every second of their life for the next 75 years we would expect a 50% chance of collision.

To be honest, I don't really think that's enough of a margin if we're planning to be secure for a long long time, I would rather see orders of magnitude similar to the age of the universe.  However, it's certainly enough for our practical purposes today.  Possibly by the time it becomes relevant, bitcoin will have evolved to use a different addressing scheme.


Despite that long rant you said the same thing.

a) that with 100 people the odds in all having unique birthdays is very low.  I said that.  You said that.
b) that under any realistic scenario that odds in having enough private keys used to make birthday problem a problem isn't a problem.

I would say your every person on the planet generating 77 thousands keys per second to be dubious.  All economic activity on the planet in all forms (even informal) is on the magnitude <1 per second per person.  Of course a collision with a long since forgotten key isn't a security risk.  Even if every human was generating 77 thousand transactions per second continually over their entire lives it is unlikely they would be saving all ~2.5 trillion private keys every year.

So given the risk of a collision only applies if the collided key is currently in use (a viable copy remains) there is no realistic scenario where enough keys would be "in circulation" for the the birthday paradox/problem/surprise has any meaningful effect on the chance of a duplicated key.
hero member
Activity: 504
Merit: 502
Forgive me for the newbie question, but have such calculations taken into account the "birthday problem"  effect?

That is, when looking for collision probability, by, say, 2020 AD, one should estimate all the keys in use by then, and then adjust the answer according to that number?




The birthday paradox only applies when the number of elements in the set is a significant fraction of the number of potential unique values of the set.

For example there are only 365 possible birthdays so if you have 100 people 100 is roughly 1/3rd of 365.  The odds that all one hundred are unique is very small.

Even if in 2220 if there have been 1 quadrillion users (current and historical) and they have generated a quadrillion addresses each only 1 in  95,780,971,304,118,100,000,000,000,000,000,000,000,000,000,000,000,000 addresses will have been used.  The effect of Birthday paradox is insignificant.


That's not the Birthday Paradox.

The Birthday Paradox is not a paradox at all.  It should be better called the "birthday surprise" (and your description implies that you've fallen for it); in that it requires far fewer people in the pool than the typical non-mathematician would expect before it becomes likely that two of them share a birthday (it's 23 people for a 50% chance).  The odds of a preselected birthday matching for any member of the pool is indeed 1/365, but what people forget is that that is not what we're looking for -- we're looking for any member of the pool to match any other member of the pool.   I dealt with this above in my example, the key line is that the standard odds are reduced by a factor according to the binomial operator, or the n-choose-r function.

The Birthday Problem definitely is relevant in all situations where we're looking for collisions.  The factor that reduces the input odds goes up as the square of the number of members of the pool.

As it happens, the numbers in bitcoin are so large that it doesn't matter too much.  As you can read above, if every person on the planet generated 77 thousand addresses every second of their life for the next 75 years we would expect a 50% chance of collision.

To be honest, I don't really think that's enough of a margin if we're planning to be secure for a long long time, I would rather see orders of magnitude similar to the age of the universe.  However, it's certainly enough for our practical purposes today.  Possibly by the time it becomes relevant, bitcoin will have evolved to use a different addressing scheme.
legendary
Activity: 1896
Merit: 1353
Your chances of hitting the megamillion lottery and being killed by a meteorite are far, far greater.

The probability of being hit by a meteorite depend on the mass of the meteorite (and also on the shape and composition of the meteorite, but we will not consider these parameters here).
There is a mass that maximizes the probability; smaller meteorites have an increased probability of being destroyed by the atmosphere, and larger meteorites tend to be less common than smaller ones.

It is possible to express the probability of very unlikely events as the mass of a meteorite having same probability of hitting you during a period of one year.
(some practical work will need to be done to calibrate the conversion, but this is not infeasible)

Armed with this new conceptual framework, we can convert the length of a private key into kilograms.
Hashes per second spent by an attacker on a key change the mass of the object (nonlinearly)

Useful, isn't it?
donator
Activity: 1218
Merit: 1079
Gerald Davis
Forgive me for the newbie question, but have such calculations taken into account the "birthday problem"  effect?

That is, when looking for collision probability, by, say, 2020 AD, one should estimate all the keys in use by then, and then adjust the answer according to that number?




The birthday paradox only applies when the number of elements in the set is a significant fraction of the number of potential unique values of the set.

For example there are only 365 possible birthdays so if you have 100 people 100 is roughly 1/3rd of 365.  The odds that all one hundred are unique is very small.

Even if in 2220 if there have been 1 quadrillion users (current and historical) and they have generated a quadrillion addresses each only 1 in  95,780,971,304,118,100,000,000,000,000,000,000,000,000,000,000,000,000 addresses will have been used.  The effect of Birthday paradox is insignificant.
legendary
Activity: 1274
Merit: 1004
Your chances of hitting the megamillion lottery or being killed by a meteorite are far, far greater.

people winning the lottery: happens every day

people killed by meteorites: it happens

1) Win lottery
2) Killed by meteorite on way to bank
3) Second coming of Christ
4) You get resurrected
5) Win the lottery again, buy the Cubs
6) Cubs win the WS
7) Every member of the Cubs wins the lottery

Would that be about close? I'm thinking 6 might make it a little too difficult.
Bro
full member
Activity: 218
Merit: 100
Your chances of hitting the megamillion lottery or being killed by a meteorite are far, far greater.

people winning the lottery: happens every day

people killed by meteorites: it happens
legendary
Activity: 1072
Merit: 1181
No I meant keys.
Is it not probable the majority of keys ever created will be orphaned at some point?
I suspect lots of private keys will be created for temporary purposes, then deleted or trashed, people will stop using their old wallet, move funds to a new client with a new priv key, stop monitoring the old one.

Certainly; most keys are (and should) only be used once or a limited number of times. For the purposes of this "attack" all that matters is how many addresses exist with a non-zero balance.

The original question is what would happen if you accidentally generate a private key corresponding to an address that already exists. Simple: you get access to any remaining funds available to that key. In case a rescan is done (typically not done when generating a new key, as the software assumes it is fresh), you would indeed see earlier transactions that key's address was involved in.
donator
Activity: 1218
Merit: 1079
Gerald Davis
Similarly, what would be the size of a .txt file (in Megabytes) that had ALL possible Addresses & Privkeys in the following format:
11705361804156796183489345942421835167357080816800975597555641083563606016 MiB.

Are you being sarcastic, or is this seriously a reasonable estimate of the file size?

Would like to see the math behind the result.

There are 2^256 possible private keys (although as pointed out above only 2^160 addresses thus there are 2^96 private keys which share the same address).

Since they are 256 bits each to store them (with no other meta-data) would require
256*(2^256) = 2.96428E+79 bits
2.96428E+79 bits / 8 = 3.70535E+78 bytes
3.70535E+78 / 1024^5 = 3.29101E+63 petibytes
3.70535E+78 / 1024^6 = 3.29101E+60 exibytes
3.70535E+78 / 1024^7 = 3.29101E+60 zettibytes or ~3,213,876,088,517,980,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 zettibytes

For comparison global data storage in all forms (from data centers to smartphones) is ~0.3 zettibytes.
If you could acheive a storage density of 1 bit per atom it would require more atoms than the estimated universe (including dark matter).

Obviously that would only store just the private keys which is simply a sequence of all 256 bit numbers (in hex) from

0x0000000000000000

to

0xFFFFFFFFFFFFFFFF

Someone said the probability of two people generating identical keys randomly are like getting hit by an asteroid or winning the powerball.  Someone else said it like BOTH happening.  It is probably more like having both happen to you every day for an entire year (and likely that under estimates it by many orders of magnitude).

TL/DR version: 2^256 is very big




legendary
Activity: 1072
Merit: 1181
Assume there are N different bitcoin addresses possible (2^256).

There are only 2^160 possible addresses, but that hardly matters for the conclusion Wink
hero member
Activity: 504
Merit: 502
This is just quickly banged out, so please forgive me if I've made a mistake...

Assume there are N different bitcoin addresses possible (2^160).

Assume there are M different bitcoin addresses generated.

What M is needed before the probability of a collision is greater than 50%?

Number of different pairs in a pool of M: nCr(M,2) or (M)(M-1)/2

Pick one member at random.  The chance that a second randomly chosen member matches it is 1/N.

Probability that any of the possible pairs match from a pool of M is therefore:

  p = 1/N*nCr(M,2)

or

  p = M(M-1)/2N

Our target p is 0.5; so we are solving the quadratic:

  0 = M^2 - M - N

Schoolboy maths will solve it; but I'm too lazy...

http://www.wolframalpha.com/input/?i=solve+x^2-x-2^160

approx 1.20893 x 10^24

That's 184.18 x 10^12 addresses per person or 77 thousand addresses generated every second of every person on the planet's life (assuming a 75 year life expectancy)

Shall we stop worrying about this problem now?


Edit: Correct my bad 2^256 to 2^160.  Thanks to Pieter Wuille below.
sr. member
Activity: 434
Merit: 250
100%
I know the chances of 2 exact keys is mindbogglingly small, but what would happen if there were?

One of the two would probably be orphaned with a 0 balance on it.

One of the two what? You seem to be talking about transactions instead of keys here.

No I meant keys.
Is it not probable the majority of keys ever created will be orphaned at some point?
I suspect lots of private keys will be created for temporary purposes, then deleted or trashed, people will stop using their old wallet, move funds to a new client with a new priv key, stop monitoring the old one.

I guess of all those billions of billions of private keys ever created most will end up not being used for a long time. So even if (let's say 20 years from now) you happen to generate a priv key that someone had been using today chances are the person has long stopped using it.

It's just going to be funny to see someone else's transaction history pop up in your client.

The chances of two people ever using the same key at the same time should be extremely small.

(Or am I missing something there? You're the developer)
legendary
Activity: 1072
Merit: 1181
I know the chances of 2 exact keys is mindbogglingly small, but what would happen if there were?

One of the two would probably be orphaned with a 0 balance on it.

One of the two what? You seem to be talking about transactions instead of keys here.
sr. member
Activity: 434
Merit: 250
100%
I know the chances of 2 exact keys is mindbogglingly small, but what would happen if there were?

One of the two would probably be orphaned with a 0 balance on it.


sr. member
Activity: 476
Merit: 250
I know the chances of 2 exact keys is mindbogglingly small, but what would happen if there were?
they could spend each others coins.

The keys don't need to be identical by the way - only their corresponding addresses.

True that. Well spotted Smiley
Pages:
Jump to: