Pages:
Author

Topic: An easy way to remember a bitcoin address - page 6. (Read 15158 times)

hero member
Activity: 714
Merit: 662
I see some inconsistency, can you tell me if I reformulate right ?

Quote
Checksum: 54-x-y-z bits
I assume you mean 55. There is 55 bits in 5 words.


So

Checksum = SHA256(Concat(blockhash,scriptPubKey))

raw address = blockheight(22)-txindex(1)-outputindex(1)-checksum(8*32)

Then you truncate checksum to get a raw address of 55 bit. (so you keep only the 33 first bit)

raw address = blockheight(22)-txindex(1)-outputindex(1)-checksum(21)

The checksum meet the minimum size, so you keep 5 words.

Now you define

EncryptionKey = checksum(20)

So we have :
EncryptedAddress = raw address(35) - EncryptionKey

legendary
Activity: 1792
Merit: 1121
To make it more compact, if there is only one tx in a block (i.e. the coinbase), the txindex will take 0 bit. Similarly, if there is only one output in a tx, the outputindex will take 0 bit.

In this case, a 5-word address with 22-bit height and minimum 20-bit checksum may encode up to 212=4096 tx in a block, if the user is referring to the first output.
legendary
Activity: 1792
Merit: 1121
I think there is a better way to encode the address. Assuming a 5 words address, just serialize the following:

Height: x bits
TxIndex: y bits
OutputIndex: z bits
Checksum: 54-x-y-z bits

(If 54-x-y-z < c, the address is invalid and more words are required. c is the minimum checksum size)

The result is called the "raw address".

Take the last c bits of the raw address (the "encryption key")

Use the encryption key to XOR the first 54-c bits of the raw address (the "encrypted address").

The final address is [encrypted address-encryption key-0] (The last 0 is the version bit)

-----------------------
Example: the output in the genesis block:

height: 0000000000000000000000
txindex: 0
outputindex: 0

blockhash: 0x000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
scriptPubKey: 0x4104678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4 cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5fac
SHA256(blockhash-scriptPubKey) = afb4b496b902df5adc89d9d89080f6dc80383dbe6e1067140721ed53e2ccbedb
Checksum = First 30 bits* of SHA256 hash = 101011111011010010110100100101

So the raw address is 0000000000000000000000-0-0-101011111011010010110100100101

Assuming c is 20, the encryption key is 11010010110100100101. It is repeated in order to encrypt the first 34 bits of the raw address

XOR (00000000000000000000-00001010111110) with (11010010110100100101-11010010110100) = 11010010110100100101-11011000001010

Final address = 11010010110100100101-11011000001010-11010010110100100101-0

Decode of the address is trivial.

-----------------------------------------------------------------------

Rationale:

1. This introduces entropy to every bit of the address. Even the raw address of 2 outputs are very similar, the output will be statistically independent

2. This could be easily extended to any size of c and number of words used

3. Since the BIP39 word list sorts alphabetically, putting the version bit at last make sure that the first letter of the last word could cover a-z (thus less error-prone because human tends to recognize vocabulary with the first and last letter: see: http://www.ecenglish.com/learnenglish/lessons/can-you-read). Otherwise it will only cover a-l (0-1023)

4. It looks more tidy than my previous proposal

hero member
Activity: 714
Merit: 662
Quote
I didn't know this. Then why couldn't we just number all transactions consecutively and use it as the index? Therefore, we don't need to worry for inadequate bits for txIndex. Moreover, as most blocks are not full, this must be much more efficient than indexing by height-txIndex.
Because of the size of the proof.
Right now the proof is Transaction + Merkle Block.
With your transaction count, the proof would be Transaction + Merkle Blocks until genesis.
To prevent any tampering, the SPV client would also need to store all those blocks, and check them every time.
It is sure more efficient, but the burden of proof is too big.

Quote
I know the transaction count in merkleblock may not be trustworthy
It is. Without the right count, you can't build the PartialMerkleTree correctly.(you get the wrong shape)
And with the wrong PartialMerkleTree, you end up with the wrong merkle root.
Is it possible to build a valid PartialMerkelTree with the wrong TransactionCount ? I think it is not but sadly I have no proof. But as you said, if I get the wrong transaction, the checksum would catch the error.

Quote
I did it !!! SPV Compatible.
In my code snippet, I assumed that the SPV client know the Transaction before hand. Which is false.
But with a pure SPV, it is still possible to get it:

-Set Bloom Filter to match everything.
-Ask the filtered block
-Get the transaction that interest you.

I don't like this solution because it would ask the SPV to download the whole block.
A better way, with the same problem is to:

-Clear the Bloom Filter
-Ask the whole block
-Get the transaction that interest you.

But as I said, I don't like the idea of a SPV having to download the whole block.

A better way would be to solve the brain address thanks to a third party service, and just ask for the proof along with the resolution.
I don't expect wallet in the future (nor now) will be pure SPV anyway.

I believe the future is centralized bitcoin services (hostable by yourself if you are concerned about privacy), where the server delivers compact proof, and client are responsible of checking them.
I don't believe 100% SPV client will spread on the long term.
legendary
Activity: 1792
Merit: 1121


Moreover, the transaction count is included into the merkleblock.

I didn't know this. Then why couldn't we just number all transactions consecutively and use it as the index? Therefore, we don't need to worry for inadequate bits for txIndex. Moreover, as most blocks are not full, this must be much more efficient than indexing by height-txIndex.

Right now there is about 61,000,000 tx from the genesis block, and is growing by 100,000 tx/day. Assuming a 100x growth, we would have 36,500,000,000 tx in 10 years. This could be encoded with 36bits. Using the height-txIndex index it would take 22+16=38bits, assuming 70,000tx/block. With 10,000x growth, that will be 42bits and 45bits respectively.

I know the transaction count in merkleblock may not be trustworthy. However, as the calculation of checksum involves the concerned scriptPubKey, such attack should not be possible.

However, this will require SPV nodes to keep a record of transaction count in all blocks in the history.
hero member
Activity: 714
Merit: 662
I did it !!! SPV Compatible.

https://github.com/NicolasDorier/NBitcoin/blob/master/NBitcoin/BrainAddress.cs#L110

The Brain Address can be verified just with : Chain Headers + Transaction + MerkleBlock. Smiley

More test vector + better spec later !
hero member
Activity: 714
Merit: 662
It works.
The SPV client will receive : BlockHeader + Partial Merkle Tree + Transaction.

The length of the partial merkle tree tell you the transaction count upper bound, and this is what we need to define how much bit it takes. (Bit Count = TreeDepth)
We don't need the precise number, just the upper bound.

Moreover, the transaction count is included into the merkleblock.
legendary
Activity: 1792
Merit: 1121

For example, if a block have 512 transaction, I'll use only 9 bits for encoding the tx index.
If it has 513, I use 10 bits.


I just find a problem: how could a SPV client count the number of transactions in a block, without downloading the whole block?

Related discussion: https://bitcointalksearch.org/topic/counting-the-number-of-tx-in-a-block-without-downloading-the-whole-block-975648

If this could not be done, your proposal won't work for SPV clients without trusting a centralized service

If this could be done, there is an even more compact way to encode transactions: just number all transactions chronologically. According to blockchain.info

https://blockchain.info/charts/n-transactions-total?timespan=all&showDataPoints=false&daysAverageString=1&show_header=true&scale=0&address=

we have about 61,000,000 transactions. This could be encoded with only 26 bits. Since most blocks are not full, this must be much more efficient.
hero member
Activity: 714
Merit: 662
Quote
Thanks for your coding but I can't read C. I'd like to illustrate with some examples:
Anyway, I'll provide test vectors soon, with a spec better redacted.
I'll post the test vectors here, and try out with yours.

Quote
with up to 32 outputs would be a better trade-off
The way it currently works, it automatically adapt to the size of block/transaction.
For example, if a block have 512 transaction, I'll use only 9 bits for encoding the tx index.
If it has 513, I use 10 bits.
Same principle with TxOut : if the transaction has 2 outputs, I encode it on 1 bit.

The checksum is in fact a little bigger than 16 bits. The reason is that I plan to use the spare bits to fit more checksum.
16 bits would be the minimum up to 26. (depending on the number of spare bit)

I'm playing with all those observations a little bit, post some result here, and if we seem to agree, I'll redact a proper specification on github.

Quote
Why cant we just , be able to register our adresses in the bitcoin wallet , that would then give our adress a unique name , that you could alternatively send to this would be pretty easy to make right ? :
Because it does not fix the problem I try to resolve : storing a payment destination in your head which you can freely give orally to someone you just met, without any third party trust relationship involved (which would be subject to attacks/bug).
legendary
Activity: 1890
Merit: 1086
Ian Knowles - CIYAM Lead Developer
^^^ this is why "ad sig" campaigns suck...
hero member
Activity: 700
Merit: 500
Why cant we just , be able to register our adresses in the bitcoin wallet , that would then give our adress a unique name , that you could alternatively send to this would be pretty easy to make right ? : >
legendary
Activity: 1890
Merit: 1086
Ian Knowles - CIYAM Lead Developer
I would think 16 bits of checksum would be reasonable enough (in which case you'd be able to handle up to 128 outputs which I would think should cover the vast majority of transactions in the blockchain).

But perhaps 18 bits of checksum with up to 32 outputs would be a better trade-off (some blockchain stats would be helpful to know how likely it is that the first occurrence of an address is part of a transaction with X outputs).

You could also shrink the amount of checksum bits according to this (as it isn't so likely to normally occur anyway).

I actually quite like this idea compared to firstbits - hopefully this can be added to a block explorer when you've finalised it.

My own thinking is that this could be handy for remembering a cold storage address when say you are out without your own computer but need to find an address for someone to send to.
legendary
Activity: 1792
Merit: 1121
Example 2: the 4th output (sending to 17SEHdQNxrP64M6aUkDJtoUKAhSMujamhE) of 9abcc1383bcc5eb980dd263837c7bd368d0fd5d1696a89e49dc9e63c82e7ce16

The block has 460 transactions so 9 bits is needed for txindex
The tx has 21 outputs so 5 bits is needed to encode all outputs. However, with only 5 words and 20bits of checksum, only 3 bits is left. That means we could only encode the first 8 outputs.

version: 0
height: 0001010100011111100110 (346086)
txindex: 101111010 (378, or the 379th tx)
outputindex: 011 (3, or the 4th output)

information: 000010101-000111111-001101011-11010011

blockhash: 0x000000000000000000e83a55db617b5db59ce161f8ef4d0989b8c6d1a3e82102
scriptPubKey: 0x76a91446963eb29098f58519f94bd8c3130f5b5166123488ac
SHA256(blockhash-scriptPubKey) = 34e17a91bcccfb02b4eec27d035c3f5c2150675ede100d1be7549f2629b93a8a
Checksum = First 20 bits of SHA256 hash = 0011-0100-1110-0001-0111

With 2-bit checksum and 9-bit information for each word, it becomes:
00(c)
000010101(i)
11(c)
000111111(i)
01(c)
001101011(i)
00(c)
11010011(i)
111000010111(c)

So the 5 words address will be
00000010101-11000111111-01001101011-00110100111-11000010111
i.e. actor-side-estate-crunch-seed
legendary
Activity: 1792
Merit: 1121
Personally I think five words would be easy enough and if the checksum was spread across all five then that should help to prevent the five word sets from looking too similar to others that are from the same block.


In the example above I use 1 checksum + 10 information for each word. Maybe 2+9 is better to make the words look more "random"

With 2-bit checksum and 9-bit information for each word, it becomes:
10(c)
000000000(i)
10(c)
000000000(i)
11(c)
0000000(i)
111011010010110100100101(c)

So the 5 words address will be
10000000000-10000000000-11000000011-10110100101-10100100101
i.e. length-length-scatter-regret-pigeon
legendary
Activity: 1890
Merit: 1086
Ian Knowles - CIYAM Lead Developer
Personally I think five words would be easy enough and if the checksum was spread across all five then that should help to prevent the five word sets from looking too similar to others that are from the same block.
legendary
Activity: 1792
Merit: 1121
I implemented it exactly as you proposed.
https://github.com/NicolasDorier/NBitcoin/blob/master/NBitcoin/BrainAddress.cs

We are already at 5 words though. (16 bit checksum)
I also check, if there is unused bits, that they are all to 0.

I still think 16 bits of checksum is overkill.

If there is a mistake, the info should not be : out of the chain, out of the block, out of the transaction, but also encoded correctly. (correct number of bit for encoding transaction index and txout index)

We need some independent security experts to suggest the appropriate number of checksum bits.

Thanks for your coding but I can't read C. I'd like to illustrate with some examples:

---------------------
The only output in the genesis block:

version: 0
height: 0000000000000000000000
txindex: 0
outputindex: 0

information: 0000000000-0000000000-00000

blockhash: 0x000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
scriptPubKey: 0x4104678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4 cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5fac
SHA256(blockhash-scriptPubKey) = afb4b496b902df5adc89d9d89080f6dc80383dbe6e1067140721ed53e2ccbedb
Checksum = First 30 bits* of SHA256 hash = 1010-1111-1011-0100-1011-0100-1001-01

With 1-bit checksum and 10-bit information for each word**, it becomes:
1(c)
0000000000(i)
0(c)
0000000000(i)
1(c)
00000(i)
011111011010010110100100101(c)

So the 5 words address will be
10000000000-00000000000-10000001111-10110100101-10100100101
i.e. length-abandon-limit-regret-pigeon


* I still assume minimum 20-bit checksum. It could be encoded in 4 words if 16-bit checksum is used.
** Since you are using 16-bit checksum instead of 20-bit as I suggest, there might not be enough checksum bits for each word if 4 checksum bits is used. So I'd like to standardize it to be 1+10 instead of 4+7 as I suggested before.
hero member
Activity: 714
Merit: 662
I implemented it exactly as you proposed.
https://github.com/NicolasDorier/NBitcoin/blob/master/NBitcoin/BrainAddress.cs

We are already at 5 words though. (16 bit checksum)
I also check, if there is unused bits, that they are all to 0.

I still think 16 bits of checksum is overkill.

If there is a mistake, the info should not be : out of the chain, out of the block, out of the transaction, but also encoded correctly. (correct number of bit for encoding transaction index and txout index)
legendary
Activity: 1792
Merit: 1121
It makes sense, the only thing I would fix is the size of the checksum to make it minimum 16, rounded up to fill the unused avalaible bits.
If we have 6 words but actually use 56 bits, then we can add 10 bits to the checksum.

Do you see a way of improving hardcoding the size of the block height ? 22 bits is only ok until block height of 4 million... I agree that we don't care right now, but if we find a nice way to increase without doing a new spec, it would be nice.

Actually I believe we will have some better way to encode before block 4M. It is also likely that there will be significant change in the blockchain structure and this scheme will obsolete. In such case, probably 21 bits is already enough as I expect such change would happen in 30 years.

There is always a big temptation to reduce the number of checksum bits but each bit you cut halves the security.

Wow, just now, the block 346019 has 2064 transactions.
hero member
Activity: 714
Merit: 662
What about having 4 bit that give information about the number of bit of the block height ? (0 = 19, 4 = 23)
stupid idea :p

I'm coding it the way you proposed. We'll see if I find an idea for encoding the block height in a more efficient and general manner. (it is sad that we also use 22 bit even for the first blocks)
hero member
Activity: 714
Merit: 662
It makes sense, the only thing I would fix is the size of the checksum to make it minimum 16, rounded up to fill the unused avalaible bits.
If we have 6 words but actually use 56 bits, then we can add 10 bits to the checksum.

Do you see a way of improving hardcoding the size of the block height ? 22 bits is only ok until block height of 4 million... I agree that we don't care right now, but if we find a nice way to increase without doing a new spec, it would be nice.
Pages:
Jump to: