Pages:
Author

Topic: An easy way to remember a bitcoin address - page 5. (Read 15145 times)

legendary
Activity: 1792
Merit: 1111
Right, I changed that to match your value but there is no need for it. Use tx output = 1 then. The point is that w gets incremented after y was chosen based on w=1.

See my BIP above. After increasing w, you should go back to step 3. In your example, you only go back to step 4
hero member
Activity: 714
Merit: 662
jl2012, I'll create a BIP doc on github, it will be more easy to collaborate this way.
Can you give your github id ? hhanh00 also if you want to participate ? so I can give write rights.
hero member
Activity: 714
Merit: 662
jl2012, I think your spec is correct but can be explained more easily.
I'm trying to implement that, once I am done, I will try to reformulate that without changing the meaning.

I overviewed what you said, hhanh00, I'll find ambiguity by implementing it if there is.
sr. member
Activity: 467
Merit: 267
Right, I changed that to match your value but there is no need for it. Use tx output = 1 then. The point is that w gets incremented after y was chosen based on w=1.
legendary
Activity: 1792
Merit: 1111


Topic Summary
Posted on: Today at 06:23:06 PM Posted by: hhanh00
Insert Quote
(It looks better in markdown)

Using the same values for `x=3`, `c=4` and the blockchain:
I want to encode a tx at index 3, output index = 0
My tx index is 11b. `y3=2` and `z3=1`


As output index=0,
z3=log(0+1)=0

Try again with the correct z3
sr. member
Activity: 467
Merit: 267
(It looks better in markdown)

Using the same values for `x=3`, `c=4` and the blockchain:
I want to encode a tx at index 3, output index = 0
My tx index is 11b. `y3=2` and `z3=1`
Now I go through the encoding steps.
- `w = ceil(3+2+0+4+1)/11 = 1`
- There are 40 tx, so `y1=6`
- `y2=11-1-3-4 = 3`
- `y=min(3,6)=3`
- `y3>y`? no. don't increase `w` but use 3 bits. txindex is encoded as `011`.

Next phase. Let's say there are 2 outputs in that tx.
- `z1=1`
- `z2=11-1-3-4-4 = 0`
- `z=min(1,0) = 0`
- `z3>z` ? yes. increase w. `w=2` and retry
- `z2=22-1-3-4-4 = 11`
- `z=min(1,22)=1`
- `z3>z` ? no. use `z=1`. utxo index is encoded as `0`.
- checksum takes `22-1-3-3-1 = 14` bits

My result is
`1010110xxxx-xxxxxxxxxx0`

Yours is
`10101100110-00110101010`

The beginning is the same. Let's say the checksum is 0110-xxxxxxxxxx
I get `1010110110-xxxxxxxxxx0`

The decoder will think the txindex is yours and fail to validate.
In fact, my address will fail for any value of the checksum because the decoder will look at a different tx than mine.



legendary
Activity: 1792
Merit: 1111
Too much idea. I'd consolidate everything in one post. I hope this will become a BIP.

BIP: xxxx
Title: Mnemonic code for referencing transaction output in the blockchain
Authors: Nicolas Dorier, jl2012
Status: Draft
Type: Standards Track
Created: 2015

==Abstract==

This BIP describes the implementation of a mnemonic code for referencing any transaction output in the blockchain.

==Motivation==

==Generating the mnemonic address==

Definitions:

blockHeight: the height of the referenced block. Genesis block is 0
blockHash: the hash of the referenced block
txIndex: index of the referenced transaction in the block. Index of the first transaction in a block is 0
outputIndex: index of the referenced output in the transaction. Index of the first output in a transaction is 0
scriptPubKey: the scriptPubKey of the referenced output
c: a fixed integer, the minimal number of bits for checksum (to be defined later)
ymin = ceiling(log(txIndex + 1, 2))
zmin = ceiling(log(outputIndex + 1, 2))

Step1: Determine the number of bits and encoding of blockHeight
blockHeight takes x bits and is encoded as follow:
  • For height =< 1,048,575 (0-1111-1111-1111-1111-1111), blockHeight is the height as 21bit interger
  • For 1,048,575 < height =< 8,388,607, blockHeight is Concat(1, height as 23 bit integer), which totally takes 24bit. For example, block 1234567 is 1001-0010-1101-0110-1000-0111
  • For height > 8,388,607, it is undefined and returns error

Step2: Determine the lower bound of words required for the mnemonic address
  • w = ceiling((x + ymin + zmin + c + 1)/11)

Step3: Determine the number of bits for txIndex
txIndex takes y bits:
  • y1 = ceiling(log(total number of transactions in the block, 2))
  • y2 = 11w-1-x-c
  • y = min(y1, y2))
  • If (ymin > y), increase w by 1 and go back to Step 3 (This condition is not possible as the initial value of w must provide enough space for ymin)

Step4: Determine the number of bits for outputIndex
outputIndex takes z bits:
  • z1 = ceiling(log(total number of output in the transaction, 2))
  • z2 = 11w-1-x-y-c
  • z = min(z1, z2)
  • If (zmin > z), increase w by 1 and go back to Step 3

Step5: Calculate the checksum
  • rawChecksum = SHA256(SHA256(Concat(blockHash-txIndex-outputIndex-scriptPubKey)))    (I don't know why but Satoshi never used single SHA256)
  • (for simplicity, the length of txIndex and outputIndex here is same as those determined in Step 3 and 4 respectively)
  • checksum = rawChecksum(11w-1-x-y-z) (the most significant 11w-1-x-y-z bits of rawChecksum)

Step6: Get the rawAddress = Concat(blockHeight-txIndex-outputIndex-checksum)

Step7: Get the encryptionKey = least significant c bits of rawAddress

Step8: Get the finalEncryptionKey
  • If 11w-1-c =< c, finalEncryptionKey = encryptionKey(11w-1-c)
  • If 2c >= 11w-1-c > c, finalEncryptionKey = Concat(EncryptionKey, EncryptionKey(11w-1-2c))
  • If 3c >= 11w-1-c > 2c, finalEncryptionKey = Concat(EncryptionKey, EncryptionKey, EncryptionKey(11w-1-3c))
etc

Step9: Get the encryptedAddress = XOR(rawAddress(11w-1-c),finalEncryptionKey)

Step10: Get the finalAddress = Concat(encryptedAddress,encryptionKey,0)
The last 0 denotes the address version

Step11: Convert finalAddress into mnemonic code following BIP-0039


==Decoding the mnemonic address==

Step1: Determine w = number of words in the mnemonic code

Step2: Convert mnemonic code into finalAddress following BIP-0039

Step3: If the last bit of finalAddress is 0, drop it. If it is 1, return error.

Step4: Get the encryptionKey = least significant c bits of finalAddress

Step5: Get the finalEncryptionKey
  • If 11w-1-c =< c, finalEncryptionKey = encryptionKey(11w-1-c)
  • If 2c >= 11w-1-c > c, finalEncryptionKey = Concat(EncryptionKey, EncryptionKey(11w-1-2c))
  • If 3c >= 11w-1-c > 2c, finalEncryptionKey = Concat(EncryptionKey, EncryptionKey, EncryptionKey(11w-1-3c))
etc

Step6: Get the decryptedAddress = XOR(finalAddress(11w-1-c),finalEncryptionKey)

Step7: Get the rawAddress = Concat(decryptedAddress, encryptionKey)

Step8: Determine the blockHeight
  • If the first bit of rawAddress is 0, take the first 21 bit out of rawAddress. This is the blockHeight
  • If the first bit of rawAddress is 1, drop it. Take the first 23 bit out of rawAddress. This is the blockHeight
  • If the block does not exist, return error

Step9: Determine the txIndex
  • Look up the blockchain for the referenced block and determine the total number of transactions in the block
  • y1 = ceiling(log(total number of transactions in the block, 2))
  • y2 = 11w-1-x-c
  • y = min(y1, y2))
  • Take the first y bits out of rawAddress. This is the txIndex
  • If the transaction does not exist, return error

Step10: Determine the outputIndex
  • Look up the referenced transaction and determine the total number of outputs in the transaction
  • z1 = ceiling(log(total number of outputs in the transaction, 2))
  • z2 = 11w-1-x-y-c
  • z = min(z1, z2))
  • Take the first z bits out of rawAddress. This is the outputIndex
  • If the output does not exist, return error

Step11: Calculate the checksum
  • Determine the blockHash and scriptPubKey
  • rawChecksum = SHA256(SHA256(Concat(blockHash-txIndex-outputIndex-scriptPubKey)))
  • checksum = rawChecksum(11w-1-x-y-z)
  • If checksum equals to the remaining bits of rawAddress, the address is successfully resolved. Otherwise, return error.

==SPV client compatibility==

==Test vectors==

EDIT: I corrected a bug in the steps 3 and 4 of the generation code
legendary
Activity: 1792
Merit: 1111
You are only decoding to one of the possible pairs.
Just walk through the steps with my example if you don't believe me. I could be wrong, so it's good to have another person check. Better yet, post a reference implementation and anyone can verify.

Let me have a better example.

First of all, the number of bits must be a multiple of 11. To make it simple, let me use 22.

let's set x = 3, c = 4

let's say there is an address: 10101100110-00110101010 (w=22)

Since x=3, the block height is 101 = 5

So we look up the blockchain for block 5. Let's assume that the block has 40 transactions.

So y1 = ceiling(log(40,2)) = 6

We also know y2 = 11w-1-x-c = 22-1-3-4=14

So y = min(y1,y2) = 6

So txIndex is the next 6 bits: 011001 = 25

So we retrieve the 26th transaction in the block 5. Let's assume the transaction has 5 outputs.

So z1 = ceiling(log(5,2)) = 3

We also know z2 = 11w-1-x-y-c = 22-1-3-6-4 = 8

So z = min(z1,z2) = 3

So outputIndex is the next 3 bits: 100 = 4.

So the following bits: 011010101 are the checksum

The last bit is version bit and must be 0

So this address refers the the 5th output of the 26th transaction in the block 5. It is the only possible interpretation

=====

EDIT: it may not be very obvious but with w, x, c, and blockhash all determined, y is a fixed value. Similarly, with w, x, y, c, and txid all determined, z is a fixed value.
sr. member
Activity: 467
Merit: 267
You are only decoding to one of the possible pairs.
Just walk through the steps with my example if you don't believe me. I could be wrong, so it's good to have another person check. Better yet, post a reference implementation and anyone can verify.
legendary
Activity: 1792
Merit: 1111
It's ambiguous because given an encoded value, you don't know where the separation between txindex and outputindex is.

To keep things simple, let's say
- x = c = 0
- w = 1
- and 8 bits per word instead of 11.

if i have 0101 0000, is it
- txindex = 0 and outputindex = 1010000 or
- txindex = 01 and outputindex = 010000 or
- txindex = 010 and outputindex = 10000 or
- txindex = 0101 and outputindex = 0000 etc

the checksum introduces another issue. Since it has variable length, where does it start?


There is no ambiguity. Just a repost:

To decode, we can determine w with number of words used, so we will know y2 = 11w-1-x-c. We will also know y1 by the block height. Therefore, we know y=min(y1, y2)

With y confirmed, we can determine z2 = 11w-1-x-y-c, and z1 by retrieving the tx. Therefore, we know z = min(z1, z2)

With x, y, z fixed, the rest will be checksum
legendary
Activity: 1792
Merit: 1111

[UPDATE]One reason to include txindex in the hash : if someone wants to communicate the transaction id by using a "brain address"[/UPDATE]

Then you may also want to include outputindex as different output may have different meaning, even if they use the same scriptPubKey

So checksum = SHA256(blockhash-txindex-outputindex-scriptPubKey)

And I think x could be a VarInt

For height =< 1048576 (0-1111-1111-1111-1111-1111), it is encoded as usual with x = 21

For height > 1048576, the binary code will start as 1, followed by the height as 23 bit integer. Therefore x = 24. For example, block 1234567 is 1001-0010-1101-0110-1000-0111

Assuming 144 block/day and no hardfork to decrease block interval, we can keep x=21 until 2028. By 2028, I believe the block size will be very big and 6-word address will be the norm.

And the whole scheme is valid until 2168. The timestamp in block header will overflow in 2106 so I don't care anything after that.

sr. member
Activity: 467
Merit: 267
It's ambiguous because given an encoded value, you don't know where the separation between txindex and outputindex is.

To keep things simple, let's say
- x = c = 0
- w = 1
- and 8 bits per word instead of 11.

if i have 0101 0000, is it
- txindex = 0 and outputindex = 1010000 or
- txindex = 01 and outputindex = 010000 or
- txindex = 010 and outputindex = 10000 or
- txindex = 0101 and outputindex = 0000 etc

the checksum introduces another issue. Since it has variable length, where does it start?
legendary
Activity: 1792
Merit: 1111
It's 54 because 1 is left for the version bit

To formal define:

Definitions:
The address takes w-words
Height will take x bits, where x is fixed.
Checksum will take at least c bits
TxIndex and OutputIndex are counted starting from 0.

We first set w = 1 (we may have a better initial value if x and c are known)
TxIndex will take y bits, where
  • y1 = ceiling(log(tx count in the block, 2))
  • y2 = 11w-1-x-c
  • y3 = ceiling(log(TxIndex of the interested tx + 1, 2)
  • y = max(y3, min(y1, y2))
OutputIndex will take z bits, where
  • z1 = ceiling(log(output count in the tx, 2))
  • z2 = 11w-1-x-y-c
  • z3 = ceiling(log(OutputIndex of the interested output + 1, 2)
  • z = max(z3, min(z1, z2))
Checksum will take 11w-1-x-y-z bits. If 11w-1-x-y-z < c, increase w by 1 and repeat until we find the smallest solution for w.

In English:

If there is enough space, I'll use y1 bit to encode TxIndex. Otherwise, I'll use y2, given that it's enough to encode my interested tx. Otherwise, we need more words
If there is enough space after y is fixed, I'll use z1 bit to encode OutputIndex. Otherwise, I'll use z2, given that it's enough to encode my interested output. Otherwise, we need more words


Hmmm ... The English description doesn't match the pseudo code (a bug?). Moreover, it looks like the intended algorithm has collisions (different pairs of inputs that lead to the same encoded value pre checksum).


I think it's correct but I can present in a different way:

The address takes w-words
Height will take x bits, where x is fixed.
Checksum will take at least c bits
TxIndex and OutputIndex are counted starting from 0.

Define
y3 = ceiling(log(TxIndex of the interested tx + 1, 2))
z3 = ceiling(log(OutputIndex of the interested output + 1, 2))
w = ceiling((x + y3 + z3 + c + 1)/11)

TxIndex will take y bits, where
  • y1 = ceiling(log(tx count in the block, 2))
  • y2 = 11w-1-x-c
  • y = min(y1, y2))
If (y3 > y), increase w by 1 and start again from the beginning

OutputIndex will take z bits, where
  • z1 = ceiling(log(output count in the tx, 2))
  • z2 = 11w-1-x-y-c
  • z = min(z1, z2)
If (z3 > z), increase w by 1 and start again from the beginning

Checksum will take 11w-1-x-y-z bits.
sr. member
Activity: 467
Merit: 267
It's 54 because 1 is left for the version bit

To formal define:

Definitions:
The address takes w-words
Height will take x bits, where x is fixed.
Checksum will take at least c bits
TxIndex and OutputIndex are counted starting from 0.

We first set w = 1 (we may have a better initial value if x and c are known)
TxIndex will take y bits, where
  • y1 = ceiling(log(tx count in the block, 2))
  • y2 = 11w-1-x-c
  • y3 = ceiling(log(TxIndex of the interested tx + 1, 2)
  • y = max(y3, min(y1, y2))
OutputIndex will take z bits, where
  • z1 = ceiling(log(output count in the tx, 2))
  • z2 = 11w-1-x-y-c
  • z3 = ceiling(log(OutputIndex of the interested output + 1, 2)
  • z = max(z3, min(z1, z2))
Checksum will take 11w-1-x-y-z bits. If 11w-1-x-y-z < c, increase w by 1 and repeat until we find the smallest solution for w.

In English:

If there is enough space, I'll use y1 bit to encode TxIndex. Otherwise, I'll use y2, given that it's enough to encode my interested tx. Otherwise, we need more words
If there is enough space after y is fixed, I'll use z1 bit to encode OutputIndex. Otherwise, I'll use z2, given that it's enough to encode my interested output. Otherwise, we need more words


Hmmm ... The English description doesn't match the pseudo code (a bug?). Moreover, it looks like the intended algorithm has collisions (different pairs of inputs that lead to the same encoded value pre checksum).
hero member
Activity: 714
Merit: 662
ok perfect for me, I prefer this proposal, I'll give it a try.
legendary
Activity: 1792
Merit: 1111
I think this proposal is better than the previous.
The only thing that is strange is :

Quote
FinalAddress = Concat(EncryptedAddress,EncryptionKey,0)
What this final 0 is for ?

Quote
I think there is a better way to encode the address.
Just to be clear, "better" is not "more efficiently" ? but easier to implement right ?

I don't see any flaw right now, I'll sleep on it tonight and implement it if I don't find any problem.

Also you don't include TxIndex in the checksum. I think we should.
If a bit is flipped on the TxIndex, then the checksum will not see it. Agreed that we don't care, since the ScriptPubKey will still be the same.
However, I see no reason why not to include it.

Yes, just easier and tidier.

0 is the address version. We have to spend at least one bit to denote the version. Otherwise, we need an entirely different word list to upgrade the scheme in the future (e.g. due to some major hardfork in the blockchain)
hero member
Activity: 714
Merit: 662
I think this proposal is better than the previous.
The only thing that is strange is :

Quote
FinalAddress = Concat(EncryptedAddress,EncryptionKey,0)
What this final 0 is for ?

Quote
I think there is a better way to encode the address.
Just to be clear, "better" is not "more efficiently" ? but easier to implement right ?

I don't see any flaw right now, I'll sleep on it tonight and implement it if I don't find any problem.

Also you don't include TxIndex in the checksum. I think we should.
If a bit is flipped on the TxIndex, then the checksum will not see it. Agreed that we don't care, since the ScriptPubKey will still have to be the same.
However, I see no reason why not to include it.

[UPDATE]One reason to include txindex in the hash : if someone wants to communicate the transaction id by using a "brain address"[/UPDATE]
legendary
Activity: 1792
Merit: 1111
It's 54 because 1 is left for the version bit

To formal define:

Definitions:
The address takes w-words
Height will take x bits, where x is fixed.
Checksum will take at least c bits
TxIndex and OutputIndex are counted starting from 0.

We first set w = 1 (we may have a better initial value if x and c are known)
TxIndex will take y bits, where
  • y1 = ceiling(log(tx count in the block, 2))
  • y2 = 11w-1-x-c
  • y3 = ceiling(log(TxIndex of the interested tx + 1, 2)
  • y = max(y3, min(y1, y2))
OutputIndex will take z bits, where
  • z1 = ceiling(log(output count in the tx, 2))
  • z2 = 11w-1-x-y-c
  • z3 = ceiling(log(OutputIndex of the interested output + 1, 2)
  • z = max(z3, min(z1, z2))
Checksum will take 11w-1-x-y-z bits. If 11w-1-x-y-z < c, increase w by 1 and repeat until we find the smallest solution for w.

In English:

If there is enough space, I'll use y1 bit to encode TxIndex. Otherwise, I'll use y2, given that it's enough to encode my interested tx. Otherwise, we need more words
If there is enough space after y is fixed, I'll use z1 bit to encode OutputIndex. Otherwise, I'll use z2, given that it's enough to encode my interested output. Otherwise, we need more words


To decode, we can determine w with number of words used, so we will know y2 = 11w-1-x-c. We will also know y1 by the block height. Therefore, y=min(y1, y2)

With y confirmed, we can determine z2 = 11w-1-x-y-c, and z1 by retrieving the tx. Therefore, z = min(z1, z2)

With x, y, z fixed, the rest will be checksum

I don't have the prove but I believe this is the most compact way to present an address

This leaves only 2 parameters to set: x and c.
x=22 for 70 years, 21 for 30 years, and 20 for 10 years
c maybe somewhere between 16-20. I prefer the upper bound.
legendary
Activity: 1792
Merit: 1111
I see some inconsistency, can you tell me if I reformulate right ?

Quote
Checksum: 54-x-y-z bits
I assume you mean 55. There is 55 bits in 5 words.


So

Checksum = SHA256(Concat(blockhash,scriptPubKey))

raw address = blockheight(22)-txindex(1)-outputindex(1)-checksum(8*32)

Then you truncate checksum to get a raw address of 55 bit. (so you keep only the 33 first bit)

raw address = blockheight(22)-txindex(1)-outputindex(1)-checksum(21)

The checksum meet the minimum size, so you keep 5 words.

Now you define

EncryptionKey = checksum(20)

So we have :
EncryptedAddress = raw address(35) - EncryptionKey



1 bit is deducted for version bit

Checksum = SHA256(Concat(blockhash,scriptPubKey))

RawAddress = blockheight(x)-txindex(y)-outputindex(z)-checksum(8*32)

Then you truncate checksum to get a RawAddress of 11w-1 bit.

RawAddress = blockheight(x)-txindex(y)-outputindex(z)-checksum(11w-1-x-y-z)

The checksum has to meet the minimum size c.

Now you define

EncryptionKey = least significant c bits of RawAddress
If 11w-1-c =< c, FinalEncryptionKey = EncryptionKey(11w-1-c)
If 2c >= 11w-1-c > c, FinalEncryptionKey = Concat(EncryptionKey, EncryptionKey(11w-1-2c))
If 3c >= 11w-1-c > 2c, FinalEncryptionKey = Concat(EncryptionKey, EncryptionKey, EncryptionKey(11w-1-3c))
etc

So we have :
EncryptedAddress = XOR(RawAddress(11w-1-c),FinalEncryptionKey)
FinalAddress = Concat(EncryptedAddress,EncryptionKey,0)

To decode, just derive the FinalEncryptionKey with EncryptionKey, and XOR with EncryptedAddress, concat the result with EncryptionKey, and the result will be RawAddress
legendary
Activity: 1792
Merit: 1111
It's 54 because 1 is left for the version bit

To formal define:

Definitions:
The address takes w-words
Height will take x bits, where x is fixed.
Checksum will take at least c bits
TxIndex and OutputIndex are counted starting from 0.

We first set w = 1 (we may have a better initial value if x and c are known)
TxIndex will take y bits, where
  • y1 = ceiling(log(tx count in the block, 2))
  • y2 = 11w-1-x-c
  • y3 = ceiling(log(TxIndex of the interested tx + 1, 2)
  • y = max(y3, min(y1, y2))
OutputIndex will take z bits, where
  • z1 = ceiling(log(output count in the tx, 2))
  • z2 = 11w-1-x-y-c
  • z3 = ceiling(log(OutputIndex of the interested output + 1, 2)
  • z = max(z3, min(z1, z2))
Checksum will take 11w-1-x-y-z bits. If 11w-1-x-y-z < c, increase w by 1 and repeat until we find the smallest solution for w.

In English:

If there is enough space, I'll use y1 bit to encode TxIndex. Otherwise, I'll use y2, given that it's enough to encode my interested tx. Otherwise, we need more words
If there is enough space after y is fixed, I'll use z1 bit to encode OutputIndex. Otherwise, I'll use z2, given that it's enough to encode my interested output. Otherwise, we need more words
Pages:
Jump to: