An easy way to remember a bitcoin address - page 7.

jl2012

legendary

Activity: 1792

Merit: 1111

Quote from: jl2012 on March 03, 2015, 11:12:41 AM

So a 5-words address would have

1 version bit
22 block height bits
1 to 11 txIndex bits
1 to 11 txOutputIndex bits
at least 20 checksum bits

To convert the bits into the final address, we first serialize version-height-txIndex-txOutputIndex (the "information"). The information should be within 25-35 bits, and the checksum is 20-30 bits.

To derive the first word, we take the first 4 bits from the checksum and the first 7 bits from the information. This is repeated until all information bits are included. The rest will be checksum, if there is any left *

A 6-words address would be encoded similarly, with 3 bits of checksum and 8 bits of information for each word.

* I put the checksum before the information to make each word look more "random". Otherwise, some words would be used more frequently than others, especially the first 2 words, which encode the block height.
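The interleaving described above can be sketched roughly as follows. This is only one reading of the post, not a reference implementation: the function name `interleave` and the use of bit strings are illustrative assumptions, and so is the choice to append the leftover checksum bits at the end of a word once the information bits run out.

```python
def interleave(info: str, checksum: str, words: int = 5) -> list[int]:
    """Pack info and checksum bit strings into 11-bit word indices.

    Each word starts with a few checksum bits (4 for 5 words, 3 for 6 words),
    followed by information bits (7 or 8). Once the information is used up,
    the remaining room is filled with checksum bits (an assumption).
    """
    info_per_word = {5: 7, 6: 8}[words]
    cs_per_word = 11 - info_per_word
    if len(info) + len(checksum) != 11 * words:
        raise ValueError("info + checksum must fill the words exactly")

    indices = []
    i = c = 0  # read positions in info and checksum
    for _ in range(words):
        head = checksum[c:c + cs_per_word]
        c += cs_per_word
        n_info = min(info_per_word, len(info) - i)
        body = info[i:i + n_info]
        i += n_info
        pad = 11 - cs_per_word - n_info  # extra checksum bits when info is short
        tail = checksum[c:c + pad]
        c += pad
        indices.append(int(head + body + tail, 2))
    return indices


# 35 information bits and 20 checksum bits: every word is 4 + 7 bits.
print(interleave("0" * 35, "1" * 20))
# 25 information bits and 30 checksum bits: the last words carry extra checksum.
print(interleave("0" * 25, "1" * 30))
```

Each returned index (0 to 2047) would then select one word from a 2048-word list such as the BIP39 one.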

Nicolas Dorier

hero member

Activity: 714

Merit: 662

Quote from: jl2012 on March 03, 2015, 11:27:36 AM

Quote from: Nicolas Dorier on March 03, 2015, 09:50:14 AM

I suspect I can fit some information about the block in the checksum.

I'm not sure if this is a right way to go. A checksum should be statistically independent from the information. Otherwise, it's not a checksum

Yes, and I think 5 words is not so bad in fact. Short-term human memory holds about 7 items, and our phone numbers are 10 digits.
I don't think I'll be able to fit that in 4 words anyway.

jl2012

legendary

Activity: 1792

Merit: 1111

Quote from: Nicolas Dorier on March 03, 2015, 09:50:14 AM

I suspect I can fit some information about the block in the checksum.

I'm not sure if this is a right way to go. A checksum should be statistically independent from the information. Otherwise, it's not a checksum

jl2012

legendary

Activity: 1792

Merit: 1111

Quote from: Nicolas Dorier on March 03, 2015, 09:50:14 AM

With 16 bits of checksum, it is possible to fit everything in 5 words.
Knowing the block, we know how many transactions it contains, so we know how many bits to use for encoding the transaction index.
Knowing the transaction, we know how many bits to use for encoding the TxOut index.

So if we take 24 bits for the block height (up to block 16 777 216), the transaction index currently takes about 10 bits, and the TxOut index will be 1 bit most of the time (OP_RETURN + change).

So this gives us
24 + 10 + 1 + 16 = 51 bits with a 16-bit checksum, which fits in 5 words (55 bits)

I would like to find a more efficient way of encoding the block though. (instead of hardcoding 24 bits)

good idea. This saves a lot.

You certainly don't need 24 bits for block height. At one block every 10 minutes, it would take >300 years to use them up. All cryptography in its current form should have been broken long before that. 22 bits (70 years) would be enough.
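As a rough check of these figures, here is a small sketch that assumes a constant 10-minute block interval (the helper name `years_covered` is made up for illustration):

```python
MINUTES_PER_YEAR = 60 * 24 * 365.25


def years_covered(height_bits: int, block_minutes: float = 10.0) -> float:
    """Years of blocks, counted from the genesis block, addressable with height_bits."""
    return 2 ** height_bits * block_minutes / MINUTES_PER_YEAR


print(round(years_covered(24)))  # years covered by a 24-bit height
print(round(years_covered(22)))  # years covered by a 22-bit height
```

Under that assumption 24 bits cover roughly 319 years and 22 bits roughly 80 years from the genesis block in 2009, so about 70 years were left at the time of the post.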

The TxOut index is controlled by user so let's assume it to be usually 1 bit (because people want a short address)

I want at least 20 bits of checksum (1 in a million error tolerance)

I prefer to leave 1 version bit for future extension. (otherwise we may need to use a completely different word list for future encoding scheme)

-----------------
So a 5-words address would have

1 version bit
22 block height bits
1 to 11 txIndex bits
1 to 11 txOutputIndex bits
at least 20 checksum bits

The txIndex and txOutputIndex will always use the least possible bits, leaving more room for the checksum.

For example, if a block is known to have only 500 txs, the txIndex will only take 9 bits, so the txOutputIndex may take at most 3 bits. If there are only 3 outputs (2 bits) in the tx, the checksum will have 21 bits.

EDIT: If there are not enough bits left to fully encode txOutputIndex, the earlier outputs are assumed. For example, if there are 14 outputs (4 bits) in the tx but only 2 bits are left for txOutputIndex, we could still encode the first 4 outputs with 5 words.

Similarly, if a block has more than 2048 txs, a 5-words address is still valid if it refers to one of the first 2048 txs in the block and to the first or second output.
------------------
If an address can't fit in 5 words, it would need 6 words anyway. So a 6-words address would have

1 version bit
22 block height bits
1 to 22 txIndex bits
1 to 22 txOutputIndex bits
at least 20 checksum bits

22 txIndex bits (about 4 million txs) should be enough for an 800MB block. If that's still not enough, it could be extended to 7 words in a similar way.

It is also valid to encode a 5-words address in 6-words, for extra checksum security.
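The bit budget in this post can be written down as a short sketch. The function names are illustrative, and the rule that txIndex always keeps at least 1 bit free for txOutputIndex is inferred from the "1 to 11" / "1 to 22" ranges rather than stated explicitly.

```python
VERSION_BITS = 1
HEIGHT_BITS = 22
MIN_CHECKSUM_BITS = 20


def bits_needed(count: int) -> int:
    """Minimum number of bits to index `count` items, never less than 1."""
    return max(1, (count - 1).bit_length())


def layout(tx_count: int, output_count: int, words: int = 5) -> tuple[int, int, int]:
    """Return (txIndex bits, txOutputIndex bits, checksum bits) for an address."""
    total = 11 * words
    budget = total - VERSION_BITS - HEIGHT_BITS - MIN_CHECKSUM_BITS
    # txIndex uses as few bits as possible, leaving at least 1 bit for the output index.
    tx_bits = min(bits_needed(tx_count), budget - 1)
    # The output index takes what it needs, truncated to the bits that remain.
    out_bits = min(bits_needed(output_count), budget - tx_bits)
    checksum_bits = total - VERSION_BITS - HEIGHT_BITS - tx_bits - out_bits
    return tx_bits, out_bits, checksum_bits


print(layout(500, 3))            # the 500-tx, 3-output example
print(layout(1000, 14))          # only 2 bits left for 14 outputs
print(layout(3000, 2))           # more than 2048 txs in the block
print(layout(500, 3, words=6))   # same output as the first case, in 6 words
```

When the index is truncated, only the first 2^bits transactions or outputs are reachable with that word count, which matches the EDIT in the post.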

Nicolas Dorier

hero member

Activity: 714

Merit: 662

With 16 bits of checksum, it is possible to fit everything in 5 words.
Knowing the block, we know how many transactions it contains, so we know how many bits to use for encoding the transaction index.
Knowing the transaction, we know how many bits to use for encoding the TxOut index.

So if we take 24 bits for the block height (up to block 16 777 216), the transaction index currently takes about 10 bits, and the TxOut index will be 1 bit most of the time (OP_RETURN + change).

So this gives us
24 + 10 + 1 + 16 = 51 bits with a 16-bit checksum, which fits in 5 words (55 bits)

I would like to find a more efficient way of encoding the block though. (instead of hardcoding 24 bits)
I suspect I can fit some information about the block in the checksum.

Nicolas Dorier

hero member

Activity: 714

Merit: 662

Quote

Encoding of a private key and of a public key are different concepts. If you make a small mistake in a private key, it is very likely you can recover it by brute force. If you send your bitcoin to a wrong public key, it is totally irreversible.

Case 1: Alice would like to transmit the private key "ABCD1234" to Bob. The private key has no checksum and is carrying 100BTC on the blockchain. However, due to an unstable network, Bob received "ABBD1234". Bob will immediately find that the private key carries no bitcoin and will ask Alice to resubmit. Even if he can't reach Alice again, he can still try to brute force it with the available information.

Case 2: Alice would like to transmit the public key "ABCD1234" to Bob. The public key has no checksum. However, due to an unstable network, Bob received "ABBD1234". Bob sent 100BTC to "ABBD1234" and screwed up.

It makes sense, thanks for your input, will try to make something out of all of that !

jl2012

legendary

Activity: 1792

Merit: 1111

Quote from: Nicolas Dorier on March 03, 2015, 07:32:42 AM

Quote

My proposal will insert bits derived from the block-hash everywhere in the address, so it is not possible to mine a vanity address without discarding a successful block.

So your proposal of adding the blockhash in the checksum would do the trick for preventing vanity addresses ?

And the checksum has to be broken down into pieces and inserted in every 11-bit block to make sure every word is pseudo-random. Vanity addresses are still possible, just by pure luck. No mining of vanity address is possible.

Quote

I see your point with the checksum, I'll try to find a good way of encoding all of that. I came up with the 5 bit from the BIP39 checksum, but such checksum would be less used than a "brain address".

Encoding of a private key and of a public key are different concepts. If you make a small mistake in a private key, it is very likely you can recover it by brute force. If you send your bitcoin to a wrong public key, it is totally irreversible.

Case 1: Alice would like to transmit the private key "ABCD1234" to Bob. The private key has no checksum and is carrying 100BTC on the blockchain. However, due to an unstable network, Bob received "ABBD1234". Bob will immediately find that the private key carries no bitcoin and will ask Alice to resubmit. Even if he can't reach Alice again, he can still try to brute force it with the available information.

Case 2: Alice would like to transmit the public key "ABCD1234" to Bob. The public key has no checksum. However, due to an unstable network, Bob received "ABBD1234". Bob sent 100BTC to "ABBD1234" and screwed up.

Nicolas Dorier

hero member

Activity: 714

Merit: 662

Quote

My proposal will insert bits derived from the block-hash everywhere in the address, so it is not possible to mine a vanity address without discarding a successful block.

So your proposal of adding the blockhash in the checksum would do the trick for preventing vanity addresses ?

I see your point with the checksum, I'll try to find a good way of encoding all of that. I came up with the 5 bit from the BIP39 checksum, but such checksum would be less used than a "brain address".

jl2012

legendary

Activity: 1792

Merit: 1111

Let's say 1% of the world population exchanges a bitcoin address once per day, which is 70,000,000/day

0.01% of them will make a random mistake. So there will be 7000 mistakes/day.

With 5 bits of checksum, there will be 218 irreversible erroneous transactions/day
With 16 bits of checksum, this reduces to about once per 9 days (7000 / 65536 per day)
With 21 bits of checksum, this reduces to once per 300 days
Satoshi uses a 32-bit checksum in bitcoin addresses. That becomes once per 1681 years

Having an address with only 4 words is great, but it gives a false sense of security. If we are going to establish an industry standard, I believe we have a moral responsibility to lay users. We all make mistakes, and I'm sure 0.01% is way underestimated.

(You may argue that a random mistake may just point to a non-existing output but I've already used conservative parameters in my estimation)
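The estimate above is easy to reproduce. The sketch below uses the post's own inputs (a world population of 7 billion, 1% exchanging an address daily, a 0.01% mistake rate) and assumes a random mistake passes an n-bit checksum with probability 1/2^n; the function names are illustrative.

```python
WORLD_POPULATION = 7_000_000_000
EXCHANGES_PER_DAY = WORLD_POPULATION // 100   # 1% exchange an address each day
MISTAKES_PER_DAY = EXCHANGES_PER_DAY // 10_000  # 0.01% of those contain a mistake


def undetected_per_day(checksum_bits: int) -> float:
    """Expected mistakes per day that still pass the checksum."""
    return MISTAKES_PER_DAY / 2 ** checksum_bits


def days_between_undetected(checksum_bits: int) -> float:
    """Average number of days between two undetected mistakes."""
    return 2 ** checksum_bits / MISTAKES_PER_DAY


for bits in (5, 16, 21, 32):
    print(bits, undetected_per_day(bits), days_between_undetected(bits))
```

With these inputs, 5 bits let about 218 mistakes through per day, 16 bits about one every 9 days, 21 bits about one every 300 days, and 32 bits about one every 1681 years (at 365 days per year).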

jl2012

legendary

Activity: 1792

Merit: 1111

Quote from: Nicolas Dorier on March 02, 2015, 01:05:10 PM

Quote

Why won't they do that? Let's say the address for block 987654, txindex 7777, output 888 is "best adult web". This address will be extremely valuable and all miners will try to grab it.

Except that whoever discovers the vanity address also knows the private key. Would you buy the key knowing that someone you don't know owns it?

Miners could sell the block space for the vanity address without knowing the private key. All they need to do is to put their client's transactions in the appropriate positions. If the incentive is big enough, strong miners may even try to orphan other people's block to grab a good address.

My proposal will insert bits derived from the block-hash everywhere in the address, so it is not possible to mine a vanity address without discarding a successful block.

Quote

Do you really think 5 bits of checksum is enough? An address with a random error will have a 3.125% probability of being valid. That's just unacceptable, especially for bitcoin, where getting a refund is totally impossible.

I think it is enough; in the first place, most words in BIP39 are hard to get wrong. But you might be right. I will play a bit with several ways of encoding it, and see if I can find a compromise between size (which I would like to be max 4 words for now) and checksum.
Right now, hhanh00 proposed something with more checksum; I hope to find something more space efficient though.

This is not true. Please don't assume all people are native English speakers. Even for native speakers many words are still very similar

gmaxwell has listed some examples of poorly chosen words in BIP39: [choice, choose], [drip, script, trip], [risk, brisk], [load, road]

some more: [awake, aware, away], [stamp, stand], [price, pride], [steak, stick] (frankly speaking, I can't differentiate between "steak" and "stick" at all)

5 bits is unacceptable. I think 16-bits is really minimum, with an error tolerance of 1/65536. 20-bits will give 1 in a million.

Quote

Just use the blockhash as part of the checksum as I suggested. It's quite obvious.

~~I can't add the 32 bytes, it would result in something nobody can remember. How much bit do you think should mitigate the problem enough ?~~
Ok I understood what you meant, great idea.

Nicolas Dorier

hero member

Activity: 714

Merit: 662

Quote

Why won't they do that? Let's say the address for block 987654, txindex 7777, output 888 is "best adult web". This address will be extremely valuable and all miners will try to grab it.

Except that whoever discovers the vanity address also knows the private key. Would you buy the key knowing that someone you don't know owns it?

Quote

Do you really think 5 bits of checksum is enough? An address with a random error will have a 3.125% probability of being valid. That's just unacceptable, especially for bitcoin, where getting a refund is totally impossible.

I think it is enough; in the first place, most words in BIP39 are hard to get wrong. But you might be right. I will play a bit with several ways of encoding it, and see if I can find a compromise between size (which I would like to be max 4 words for now) and checksum.
Right now, hhanh00 proposed something with more checksum; I hope to find something more space efficient though.

Quote

Just use the blockhash as part of the checksum as I suggested. It's quite obvious.

~~I can't add the 32 bytes, it would result in something nobody can remember. How much bit do you think should mitigate the problem enough ?~~
Ok I understood what you meant, great idea.

jl2012

legendary

Activity: 1792

Merit: 1111

Quote from: Nicolas Dorier on March 01, 2015, 02:34:10 PM

Quote

1. blockchain reorg

This is a problem for the wallet implementation. Whether they are using my spec or not, they need to manage that correctly (it is needed for checking a Partial Merkle Tree); this is orthogonal to my spec.
Do you refer to the fact that a "brain address" might point to an invalid TxOut after a reorg?
One way to mitigate that is to ask for 101 confirmations (coinbase maturity).
However, some services will not want to wait that long, even if my spec asks for it, which might be troublesome. I don't have a better solution though.

Just use the blockhash as part of the checksum as I suggested. It's quite obvious.

Quote

2. Miners may try to fill up a block with garbage for a vanity address

Why would they do that ? there is nothing to win by doing that.

Why won't they do that? Let's say the address for block 987654, txindex 7777, output 888 is "best adult web". This address will be extremely valuable and all miners will try to grab it.

Quote

I will not use your encoding technique.
The reason is that there is no reason to choose the number of words in a "brain address". As opposed to a private key, the fewer the better.
The best way to encode a "Brain Address" is to use the VarInt encoding that all bitcoin wallets already implement (or the less supported CompactVarInt internal to bitcoin core).

Also, encoding the TxIndex is inferior to encoding the PathToLeaf, for 2 reasons: it takes more space, and more than a simple Partial Merkle Tree would be required as proof (Partial Merkle Tree proof checks are already implemented in all SPV wallets).
Code reuse would be maximized and code errors minimized, so adoption by wallet providers should be better.

Moreover, each word represents 11 bits of information.
If we need 23 bits for encoding an address, we need 3 words (33 bits), which leaves 10 bits that serve no purpose. I propose to fit the checksum inside them.
The size of the checksum in bits would be something like Max(5, UnusedBitCount).

This should represent all current payment destinations in 3 or 4 words. And it will go up very, very slowly.


Do you really think 5 bits of checksum is enough? An address with a random error will have a 3.125% probability of being valid. That's just unacceptable, especially for bitcoin, where getting a refund is totally impossible.
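To make the numbers in this exchange concrete, here is one possible reading of the quoted Max(5, UnusedBitCount) rule, offered as a guess rather than the actual spec: choose the fewest 11-bit words that leave at least 5 spare bits and use all spare bits as checksum. The function name and return shape are hypothetical.

```python
def words_and_checksum(payload_bits: int, min_checksum_bits: int = 5) -> tuple[int, int, float]:
    """Return (word count, checksum bits, chance that a random error still validates)."""
    # Ceiling division: smallest word count with room for payload + minimum checksum.
    words = -(-(payload_bits + min_checksum_bits) // 11)
    checksum_bits = 11 * words - payload_bits
    return words, checksum_bits, 2.0 ** -checksum_bits


print(words_and_checksum(23))  # the 23-bit example from the quote
print(words_and_checksum(28))  # worst case: exactly 5 checksum bits
```

In the worst case the checksum is only 5 bits, so a random error validates with probability 1/32, which is the 3.125% objected to above.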

Nicolas Dorier

hero member

Activity: 714

Merit: 662

Quote from: hhanh00 on March 01, 2015, 09:57:51 PM

A bip 32 seed is 128 bit long and takes 12 words. How did you figure out to use only 4?

I did not; I just explained to him that my goal was NOT to encode a private key ;)

hhanh00

sr. member

Activity: 467

Merit: 267

A bip 32 seed is 128 bit long and takes 12 words. How did you figure out to use only 4?

Nicolas Dorier

hero member

Activity: 714

Merit: 662

Quote from: hhanh00 on March 01, 2015, 09:27:50 PM

Quote from: Nicolas Dorier on March 01, 2015, 08:22:29 PM

Initially I wanted to encode BlockHeight,TxIndex,TxoutIndex as you said.
If Alice asks Malicious to resolve BlockHeight,TxIndex,TxoutIndex and Malicious resolves it to address X, then Alice needs a proof that Malicious is not lying.
If you use TxIndex, this proof must be the whole block at BlockHeight (which is big) OR the block header + the list of all transaction ids inside (which is not supported by the bitcoin network).

No, you take the txindex and interpret it as a position in the merkle tree. The tree has all the txs, ordered by index, as its leaves. It's the same value.

Yes you are right, geez, I made it more complicated than it should be.
Also, I was wrong about using VarInt. Since it works only on byte boundaries, we'll lose space with it.
Your proposition for encoding seems better in fact, even though 5-8 words seems a lot to remember. :(

~~Given 300 000 blocks, 1000 tx, 10 TxOut, 256 bit of checksum, I calculated 4 words though.~~

A way of serializing the data similar to VarInt (but with different boundaries) might be better adapted to both early and future addresses (early addresses would be small, and later ones bigger).
I'll try to propose something.

hhanh00

sr. member

Activity: 467

Merit: 267

Quote from: Nicolas Dorier on March 01, 2015, 08:22:29 PM

Initially I wanted to encode BlockHeight,TxIndex,TxoutIndex as you said.
If Alice asks Malicious to resolve BlockHeight,TxIndex,TxoutIndex and Malicious resolves it to address X, then Alice needs a proof that Malicious is not lying.
If you use TxIndex, this proof must be the whole block at BlockHeight (which is big) OR the block header + the list of all transaction ids inside (which is not supported by the bitcoin network).

No, you take the txindex and interpret it as a position in the merkle tree. The tree has all the txs, ordered by index, as its leaves. It's the same value.
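A small sketch of why the txindex and the path to the leaf carry the same information: at each level of the tree, the lowest remaining bit of the index says whether the node is a left or right child, so the index alone is enough to check a Merkle branch. The leaves below are placeholder hashes rather than real txids, and the helper names are made up for illustration; the tree uses double SHA-256 and duplicates the last hash on odd-sized levels, as Bitcoin does.

```python
import hashlib


def dsha(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """All levels of the tree, from the leaves up to the single root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # pad odd levels by repeating the last hash
        levels.append([dsha(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def branch(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Sibling hashes needed to prove the leaf at `index`."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append(level[sibling] if sibling < len(level) else level[index])
        index >>= 1
    return proof


def root_from_branch(leaf: bytes, index: int, proof: list[bytes]) -> bytes:
    """Recompute the root; the bits of `index` give left/right at each level."""
    h = leaf
    for sibling in proof:
        h = dsha(sibling + h) if index & 1 else dsha(h + sibling)
        index >>= 1
    return h


leaves = [dsha(bytes([i])) for i in range(5)]  # placeholder "txids"
levels = merkle_levels(leaves)
root = levels[-1][0]
print(all(root_from_branch(leaves[i], i, branch(levels, i)) == root for i in range(5)))
```

A verifier holding only the block header's Merkle root, the transaction, its index and the branch can therefore confirm the position without downloading the whole block.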

jeffhuys

sr. member

Activity: 252

Merit: 250

Quote

This is called BIP39, and it deals with a private key. There is no way to get a secure key with less than 12 words.
The goal of my proposition is not about encoding a private key, but encoding a payment destination in 3 or 4 words. (BIP70 or address)

Ah, I see. I completely misunderstood, then.

Nicolas Dorier

hero member

Activity: 714

Merit: 662

Quote from: hhanh00 on March 01, 2015, 07:47:56 PM

I'm completely confused by what you are saying now. The txindex = path. How can one be better than the other unless you are adding data.
I guess we will see once you implement it.
The worst that can happen is people losing their coins :P

Initially I wanted to encode BlockHeight,TxIndex,TxoutIndex as you said.
If Alice asks Malicious to resolve BlockHeight,TxIndex,TxoutIndex and Malicious resolves it to address X, then Alice needs a proof that Malicious is not lying.
If you use TxIndex, this proof must be the whole block at BlockHeight (which is big) OR the block header + the list of all transaction ids inside (which is not supported by the bitcoin network).

If instead of TxIndex you encode PathToLeaf, then Alice can ask for the Transaction + its Merkle Tree as proof, which is both compact and already supported by SPV wallets and the bitcoin network.

Quote from: jeffhuys on March 01, 2015, 08:09:25 PM

Very cool what you guys are trying to do.

However, isn't this the wrong way around? Maybe use the brainwallet method: creating a key BASED on a certain seed (that's easy to remember)?

This is called BIP39, and it deals with a private key. There is no way to get a secure key with less than 12 words.
The goal of my proposition is not about encoding a private key, but encoding a payment destination in 3 or 4 words. (BIP70 or address)

jeffhuys

sr. member

Activity: 252

Merit: 250

Very cool what you guys are trying to do.

However, isn't this the wrong way around? Maybe use the brainwallet method: creating a key BASED on a certain seed (that's easy to remember)?

hhanh00

sr. member

Activity: 467

Merit: 267

I'm completely confused by what you are saying now. The txindex = path. How can one be better than the other unless you are adding data.
I guess we will see once you implement it.
The worst that can happen is people losing their coins :P
