What are the chances of an address collision? and what happens when it does? - page 3.

flatfly

legendary

Activity: 1120

Merit: 1016

090930

Quote from: flynn on November 03, 2012, 09:58:58 AM

Quote

* getting hit by lightning while taking a crap for how many years in a row is equally probable as finding a collision: log(1E-65) / log(6.8E-12) = 5.8

I don't get why you put logs here

perhaps because we're talking about taking a crap?

flynn

hero member

Activity: 728

Merit: 540

Quote

* getting hit by lightning while taking a crap for how many years in a row is equally probable as finding a collision: log(1E-65) / log(6.8E-12) = 5.8

I don't get why you put logs here

molecular

donator

Activity: 2772

Merit: 1019

on a side not: the probability of getting struck by lightning during your 80 year long life is 1/3500???

1.0 / (1.0 - pow((1.0-1.0/280000),80)) = 3500

on the bright side: in case that happens to me the probability of me taking a crap at that time is only 1/525600.

molecular

donator

Activity: 2772

Merit: 1019

Quote from: DeathAndTaxes on November 02, 2012, 10:42:42 AM

The number of addresses generated isn't important it is the number of funded addresses. A collision with a spent change address that will never be used again isn't very valuable.

So if we assume 1 trillion active addresses (10 billion users with an average of 100 active funded addresses at any point in time). The odds of finding a collision are ~ 1 in 1*10^65.

If the odds of getting hit by lightning are 1 in 280,000 then the odds of getting stuck by lightning 12 times per year is roughly the same as creating a collision with with one of the 1 trillion active addresses. I don't like lightning comparison because lightning is localized (not randomly distributed). If we use the lottery example the odds of buying lottery tickets for 8 consecutive games and winning the jackpot on all 8 is roughly the same as finding a single collision.

I would point out that the single random collision is unlikely to be very profitable. If there are 1 trillion addresses and all coins are mined the average address will hold ~0.000021 BTC (21 uBTC - microBTC). Another way to look at it is if we assume in this 10 billion scenario Bitcoin has replaced all global money supplies then the collective value of the network would be on the order of ~$60T (2012 dollars). At 1 trillion addresses the value of finding that collision (less likely than winning lottery 8 times in a row) would be ~$60. BTW fun fact that would make the exchange rate $2.857 per uBTC!

thanks, Death&Taxes for doing the math.

What would you suggest to answer to a newbies question: "But wouldn't 2 people accidentally find the same address at some point?"

I still 'd like to be able to answer something like: "It's possible, but it's about as likely as you getting struck by lightning while taking a crap - *pause* - every year for 5 years in a row!". Just don't know.

Let me try to do the math:

probability of getting struck by lightning in any given year: 1/280000.
probability of taking a shit at any given point in time: 1/(60*24*365) = 1/525600 (assuming you take a crap every day and the actual process takes 1 minute)
probability of getting struck by lightning while taking a crap in any given year: 1/(280000*525600) = 1/1.47E11 = 6.8E-12
probability of finding a collision: 1E-65
getting hit by lightning while taking a crap for how many years in a row is equally probable as finding a collision: log(1E-65) / log(6.8E-12) = 5.8

is my math roughly correct?

kjj

legendary

Activity: 1302

Merit: 1026

Quote from: flatfly on November 03, 2012, 06:22:42 AM

Quote from: enmaku on August 29, 2012, 01:04:03 PM

Quote from: Gabi on August 29, 2012, 12:58:22 PM

An address collision would be a SHA256 collision

And if you find a SHA256 collision then bitcoin is the last of our problem

Technically it would be a RIPEMD160(SHA256(SHA256())) collision. Not nitpicking, it's an important distinction given that the last step in that process yields 2^160 possible addresses instead of 2^256. 96 bits of keyspace ain't no joke son.

If I understand this right, does this basically mean that each address actually has tons of private keys (only one of which is known)?

Well, probably. The mapping isn't known to be evenly distributed. But on average, there should be something like 2⁹⁶ keys per address.

P.S. It is RIPEMD-160(SHA-256(pubkey)) . Only two hashes, not three.

flynn

hero member

Activity: 728

Merit: 540

Quote from: flatfly on November 03, 2012, 06:22:42 AM

Quote from: enmaku on August 29, 2012, 01:04:03 PM

Quote from: Gabi on August 29, 2012, 12:58:22 PM

An address collision would be a SHA256 collision

And if you find a SHA256 collision then bitcoin is the last of our problem

Technically it would be a RIPEMD160(SHA256(SHA256())) collision. Not nitpicking, it's an important distinction given that the last step in that process yields 2^160 possible addresses instead of 2^256. 96 bits of keyspace ain't no joke son.

If I understand this right, does this basically mean that each address actually has tons of private keys (only one of which is known)?

That reminds me of this joke :

An 80-years old woman is attending a lecture on astronomy, where the astronomer states that our sun will explode and transform itself into a red giant within 5 billion years.

At the end of the lecture, the old lady asks the astronomer :

- How long did you say our sun will last ?
- 5 billion years
- What a relief ! I thought you said 5 millions !

flatfly

legendary

Activity: 1120

Merit: 1016

090930

Quote from: enmaku on August 29, 2012, 01:04:03 PM

Quote from: Gabi on August 29, 2012, 12:58:22 PM

An address collision would be a SHA256 collision

And if you find a SHA256 collision then bitcoin is the last of our problem

Technically it would be a RIPEMD160(SHA256(SHA256())) collision. Not nitpicking, it's an important distinction given that the last step in that process yields 2^160 possible addresses instead of 2^256. 96 bits of keyspace ain't no joke son.

If I understand this right, does this basically mean that each address actually has tons of private keys (only one of which is known)?

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

The number of addresses generated isn't important it is the number of funded addresses. A collision with a spent change address that will never be used again isn't very valuable.

So if we assume 1 trillion active addresses (10 billion users with an average of 100 active funded addresses at any point in time). The odds of finding a collision are ~ 1 in 1*10^65.

If the odds of getting hit by lightning are 1 in 280,000 then the odds of getting stuck by lightning 12 times per year is roughly the same as creating a collision with with one of the 1 trillion active addresses. I don't like lightning comparison because lightning is localized (not randomly distributed). If we use the lottery example the odds of buying lottery tickets for 8 consecutive games and winning the jackpot on all 8 is roughly the same as finding a single collision.

I would point out that the single random collision is unlikely to be very profitable. If there are 1 trillion addresses and all coins are mined the average address will hold ~0.000021 BTC (21 uBTC - microBTC). Another way to look at it is if we assume in this 10 billion scenario Bitcoin has replaced all global money supplies then the collective value of the network would be on the order of ~$60T (2012 dollars). At 1 trillion addresses the value of finding that collision (less likely than winning lottery 8 times in a row) would be ~$60. BTW fun fact that would make the exchange rate $2.857 per uBTC!

molecular

donator

Activity: 2772

Merit: 1019

Quote from: runeks on October 31, 2012, 08:39:45 PM

Comparatively speaking, your odds of being struck by lightning in a given calendar year are about 1 in 280,000. The odds of winning my local lottery are about 1 in 176,000,000. So finding a collision on your first try is roughly equivalent to being hit by lightning 16,540,000,000,000,000,000,000,000 times per second for an entire year or winning the lottery 830,000,000,000,000,000,000,000,000,000 times.

If you find a collision I would stay indoors and play the lottery.

The highlighted part is wrong. The odds of getting struck by lightning n times is not p*n, it's p^n. Ie. the probability of throwing two sixes in a row with a dice is not 1/12, it's 1/36.
[/quote]

right, so how many lightning strikes per year are we talking to have the same probability of seeing an accidental collision occur between any two generated addresses (assuming 10 billion bitcoin users that each generate 1000 addresses per year)?

runeks

legendary

Activity: 980

Merit: 1008

Quote from: enmaku on August 29, 2012, 12:42:52 PM

Quote from: kokojie on August 29, 2012, 12:30:44 PM

Quote from: maaku on August 29, 2012, 12:29:12 PM

Quote from: kokojie on August 29, 2012, 12:25:40 PM

Quote from: Vladimir on August 29, 2012, 12:24:02 PM

Chance are negligible. If collision occurs with a funded address, attacker you can transfer funds elsewhere.

Chances are still negligible when 1 billion people are using it? also can't I just run some kind of bots, that randomly generate addresses to see if
they have funds in them?

Yes, chances remain negligible. You could run your bot, but it'd be a waste of electricity. Chances are you'd wait the lifetime of the universe before finding a collision.

But probability also says, I could have a success on my first run? isn't it?

Sure, you could absolutely find a success on your first run, but let's apply probability to your scenario.

Let's say there are a billion people using 10 addresses each for 10 billion total addresses.

This means that each address you generate has a (1/2^160)*10,000,000,000 possibility of holding a balance, giving your first attempt a 0.0000000000000000000000000000000000000684% chance of finding a collision on your first attempt.

You are correct in stating that with each try it will either happen or it won't, there is no in-between state, and you're correct in stating that it's possible. It's also bad news for the account holder that a collision would give you control of those funds.

Comparatively speaking, your odds of being struck by lightning in a given calendar year are about 1 in 280,000. The odds of winning my local lottery are about 1 in 176,000,000. So finding a collision on your first try is roughly equivalent to being hit by lightning 16,540,000,000,000,000,000,000,000 times per second for an entire year or winning the lottery 830,000,000,000,000,000,000,000,000,000 times.

If you find a collision I would stay indoors and play the lottery.

The highlighted part is wrong. The odds of getting struck by lightning n times is not p*n, it's p^n. Ie. the probability of throwing two sixes in a row with a dice is not 1/12, it's 1/36.

Quote from: BkkCoins on August 30, 2012, 05:30:57 AM

What I'd like to know is why some letters have higher probability than others? Is that because some letters are more likely on the elliptic curve vs not? If you plug sequences into vanitygen it will give different difficulties depending on letter, eg.

11... > 1A...
1Q... > 1D...
1R... > 1Q... and so on.

I made a table of the first three chars after 1 to help me in a project with fast lookup of difficulty calculation. There are broad classes of probabilities depending on what letters are involved. I guess there is some non 1-1 mapping from 2^160 possible addresses and base 58 representations.

It's not related to elliptic curves, as an address is created from a hash, which is basically just random data.
The reason is that the data that is encoded as base58 (and thus becomes an address) is ||<20 byte RIPEMD160 public key hash>||<4 byte checksum> where || denotes concatenation.

This version byte is 0 on the main network. So the binary data of an address will always start with 0000 0000. Since this data is converted into a large number, the probability that some of the leading zero bits will be included in the leftmost bits of the large number is greater than none of the leading zero bits being included (since there are eight of them). This results in the probability of the number starting with a small digit being greater.

This large number is basically just continually divided by 58, and the *remainder* of this division is then used to determine which character is added to the address string (based on it's index position in the string "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"). This is repeated in a loop until the large number has been reduced to zero. After this *the string is reversed*. This means that the last remainder will be the first character in the address (after the '1' which will always be there because a '1' is prepended to the address for every leading zero-byte, of which there is always one because the "version byte" is always zero on the main network). So since there is a greater probability that the first part of the number is a small digit, there is a greater chance that the first character of the address will be one of the first characters of the string from which characters are selected based on their index position in that string.

paraipan

legendary

Activity: 924

Merit: 1004

Firstbits: 1pirata

Quote from: molecular on October 31, 2012, 03:04:37 PM

I'm not saying collisions anywhere close to likely to occur, but...

Quote from: enmaku on August 29, 2012, 12:26:37 PM

Given your example of 1 billion users at 10 addresses each:

There are 2^160 or about 1,460,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 possible addresses
In your scenario, 1,000,000,000 people are using 10 addresses each for a total of 10,000,000,000 possible addresses
10,000,000,000 / 2^160 should yield the probability of a collision occurring
10,000,000,000 / 2^160 = 0.00000000000000000000000000000000000000684

So the chances of a collision occurring in your scenario are approximately 0.000000000000000000000000000000000000684%

This is the chance of finding a collision when generating 1 address.

The chance of a collision occuring "at all" is higher.

Let's assume (worst case) we use bitcoin for 1E3 years and there are 1E10 people populating earth at any point in time who will generate on average 1E3 addresses during their lifetimes.

What are the chances of a collision to occur under these assumptions?

Hopefully in 1E3 years if a human "stumbles" onto someone else's money they will not touch it, or give it back right away.

"Real is gonna change... just watch"

molecular

donator

Activity: 2772

Merit: 1019

I'm not saying collisions anywhere close to likely to occur, but...

Quote from: enmaku on August 29, 2012, 12:26:37 PM

Given your example of 1 billion users at 10 addresses each:

There are 2^160 or about 1,460,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 possible addresses
In your scenario, 1,000,000,000 people are using 10 addresses each for a total of 10,000,000,000 possible addresses
10,000,000,000 / 2^160 should yield the probability of a collision occurring
10,000,000,000 / 2^160 = 0.00000000000000000000000000000000000000684

So the chances of a collision occurring in your scenario are approximately 0.000000000000000000000000000000000000684%

This is the chance of finding a collision when generating 1 address.

The chance of a collision occuring "at all" is higher.

Let's assume (worst case) we use bitcoin for 1E3 years and there are 1E10 people populating earth at any point in time who will generate on average 1E3 addresses during their lifetimes.

What are the chances of a collision to occur under these assumptions?

flatfly

legendary

Activity: 1120

Merit: 1016

090930

Quote from: kokojie on August 29, 2012, 12:30:44 PM

Quote from: maaku on August 29, 2012, 12:29:12 PM

Quote from: kokojie on August 29, 2012, 12:25:40 PM

Quote from: Vladimir on August 29, 2012, 12:24:02 PM

Chance are negligible. If collision occurs with a funded address, attacker you can transfer funds elsewhere.

Chances are still negligible when 1 billion people are using it? also can't I just run some kind of bots, that randomly generate addresses to see if
they have funds in them?

Yes, chances remain negligible. You could run your bot, but it'd be a waste of electricity. Chances are you'd wait the lifetime of the universe before finding a collision.

But probability also says, I could have a success on my first run? isn't it?

Hey, I started this little pet project with you in mind

mila

sr. member

Activity: 462

Merit: 250

Quote from: Vladimir on August 29, 2012, 12:47:58 PM

You are, however, much better off generating collisions on various deterministic wallets like brainwallets etc... there are plenty of people out there who do not get it why some passwords/keys must be strong.

If lucky you will teach some punks a good lesson BTW.

repeating the obvious: the seed secret for deterministic wallets is the equivalent to password.
not only must it be secret, also difficult to guess.

muyuu

donator

Activity: 980

Merit: 1000

Quote from: kokojie on August 30, 2012, 10:29:25 AM

I'm also wondering how long it would take to crack an address/key pair. For example, let's say I have a lot of hardware at my disposal, and can generate
1 Billion address/key pair per second. How long would it take to crack a specific address? can some one do a math on this one? I would guess not very long.

You guess wrong.

1 billion per second is nothing in an address space of size 2^160. It would take ~ 3.4 E+21 times the age of the Universe in average.

Let Wolfram Alpha do it for you http://www.wolframalpha.com/input/?i=(2%5E160)%20%2F%201e%2B9%20seconds%20to%20years

kokojie

legendary

Activity: 1806

Merit: 1003

I'm also wondering how long it would take to crack an address/key pair. For example, let's say I have a lot of hardware at my disposal, and can generate
1 Billion address/key pair per second. How long would it take to crack a specific address? can some one do a math on this one? I would guess not very long.

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: Boussac on August 30, 2012, 01:47:54 AM

To me a "valid" address is one that has a private key.
The checksum verification of an address does not warrant that a private key exist for that address.

Hence my post above: the point represented by the public key MUST be on the ellptic currve.

In short,
1/you can random type an address with a valid checksum
2/ funds can be sent to that address because miners will get it on the blockchain
3/Nobody will ever spend those funds again because there is no private key for that address: the bitcoins on that address are lost for ever

The network has no method of determining if the private key is "known". It doesn't matter if a private key was known but no longer is (lost), is unknown but theoretically possible, or simply impossible (there is no possible 256 bit number which produces that address).

In any address if the private key is currently unknown the coins can't be spent. If an address confirms to the protocol spec it is valid.

Due to the checksum the probability of entering an valid but incorrect address due to a typo is less than 1 in 4 billion. Realistically it isn't going to happen.

Boussac

legendary

Activity: 1221

Merit: 1025

e-ducat.fr

Quote from: BkkCoins on August 30, 2012, 05:30:57 AM

What I'd like to know is why some letters have higher probability than others? Is that because some letters are more likely on the elliptic curve vs not? If you plug sequences into vanitygen it will give different difficulties depending on letter, eg.

11... > 1A...
1Q... > 1D...
1R... > 1Q... and so on.

I made a table of the first three chars after 1 to help me in a project with fast lookup of difficulty calculation. There are broad classes of probabilities depending on what letters are involved. I guess there is some non 1-1 mapping from 2^160 possible addresses and base 58 representations.

In ECDSA, you start with a large integer as your private key then multiply ("multiplication" here must be understood as a group operator not as arithmetic multiplication) the base point G of your elliptic curve to get the corresponding public key.
Therefore your public key is a point (x,y) on the elliptic curve.

In bitcoin, your address is some hash of you public key.
As a result, if you start from a random string, the likelihood of a collision between said string and a "valid" address depends on how many integers n satisfy the condition "n*G is a point on the curve AND this point hashes to a matching string".
Therefore a brute force algorithm (like vanitygen) might display varying difficulty as a function of the chosen string.

kerogre256

full member

Activity: 161

Merit: 100

There is probability then when you throw ball it will not bounce but go through wall....

BkkCoins

hero member

Activity: 784

Merit: 1009

firstbits:1MinerQ

What I'd like to know is why some letters have higher probability than others? Is that because some letters are more likely on the elliptic curve vs not? If you plug sequences into vanitygen it will give different difficulties depending on letter, eg.

11... > 1A...
1Q... > 1D...
1R... > 1Q... and so on.

I made a table of the first three chars after 1 to help me in a project with fast lookup of difficulty calculation. There are broad classes of probabilities depending on what letters are involved. I guess there is some non 1-1 mapping from 2^160 possible addresses and base 58 representations.

Topic: What are the chances of an address collision? and what happens when it does? - page 3. (Read 22468 times)