
Topic: Step by step guide to go from public key to a Bech32 encoded address (Read 1448 times)

sr. member
Activity: 1190
Merit: 469

It does. Bech32 encoding was introduced to let users use the new SegWit outputs, which were a "leap forward in technology".

a leap forward in error checking maybe, but just saying that if you use this address type then we won't make you pay as many fees doesn't mean any great technological achievement. it's kind of arbitrary.

thanks for the comment about bip38 but i'm not willing to accept a modification to it that would use any of the flag byte bits that they have reserved for possible future uses.
legendary
Activity: 3472
Merit: 10611
I don't really see the benefit of new address types that represent the same old public key.
Addresses have never represented public keys, they always represent a script. When a new address is introduced that means a new standard script was introduced (P2PKH > P2SH > P2WPKH/P2WSH > P2TR).
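To make the "addresses represent scripts" point concrete, here is a minimal Python sketch that builds two standard output scripts from one and the same 20-byte key hash. The function names are made up for illustration; the hash is the HASH160 from the guide in this thread, and the opcode byte values are the standard ones.

```python
# One 20-byte HASH160 of a public key, two different standard scripts.
# Function names are illustrative only and not taken from any wallet.
KEY_HASH = bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6")


def p2pkh_script(h160: bytes) -> bytes:
    """Legacy script: OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG."""
    return bytes([0x76, 0xA9, len(h160)]) + h160 + bytes([0x88, 0xAC])


def p2wpkh_script(h160: bytes) -> bytes:
    """SegWit v0 script: OP_0 (witness version) followed by a 20-byte push."""
    return bytes([0x00, len(h160)]) + h160


print(p2pkh_script(KEY_HASH).hex())
print(p2wpkh_script(KEY_HASH).hex())
```

A Base58 address stands for the first script and a Bech32 address for the second, so the two strings describe different outputs even though both commit to the same key hash.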

Quote
they incentivized using bech32 but that doesn't really mean that it's better, or that it represents any kind of leap forward in technology.
It does. Bech32 encoding was introduced to let users use the new SegWit outputs, which were a "leap forward in technology".

Quote
but for example you can't do bip38 with bech32 as that is not defined in the bip38 standard. so if you do try and do it then you're not going by any bip you're making something up that no one else might know how you did it or have to follow if they tried to implement the same thing.  which is kind of a shame because bip38 was pretty cool. but you definitely can't do that with bech32.
It depends on what you are doing. If it is for personal encryption then you can easily make a tiny modification to BIP38 and use a bech32 address. If it is a wallet that people use, again you can add that option and people will both see the algorithm and can always use it to encrypt/decrypt their keys using that wallet.
sr. member
Activity: 1190
Merit: 469
Quote
That's the inherent property of mnemonics that lack address type indication (and a lot more such as derivation path). Existence of different address types is not a problem here.

I don't really see the benefit of new address types that represent the same old public key. they incentivized using bech32 but that doesn't really mean that it's better, or that it represents any kind of leap forward in technology. i'm not worried about error checking because i know how to use a clipboard and qrcode reader/scanner.

but for example you can't do bip38 with bech32 as that is not defined in the bip38 standard. so if you do try and do it then you're not going by any bip you're making something up that no one else might know how you did it or have to follow if they tried to implement the same thing.  which is kind of a shame because bip38 was pretty cool. but you definitely can't do that with bech32.
legendary
Activity: 3472
Merit: 10611
The more address types there are, the greater the chance someone messes something up when trying to create some 3rd party application, or otherwise. If you are a user of this 3rd party application, this will negatively affect you, and if you are a bitcoin user, this has the potential to reflect negatively on bitcoin.
First of all saying "3rd party application" suggests there is a centralized authority that produces bitcoin related applications and anybody else is a third party which doesn't make any sense.
Secondly if a developer is not capable of correct implementation of a simple encoding and by extension address validation, then you have so much more serious things to worry about in all parts of that application than existence of multiple address types and encoding algorithms!

Quote
Also, if you are trying to recover your wallet from its seed, having multiple address types adds complexity to the process, and may make some people erroneously believe there is a problem with their seed when they use their seed backup to generate the wrong address type.
That's the inherent property of mnemonics that lack address type indication (and a lot more such as derivation path). Existence of different address types is not a problem here.
copper member
Activity: 1666
Merit: 1901
i dont understand why they needed to have bech32m if they already have bech32. the more address types that get created i think the worse that is for bitcoin not better. i dont know if anyone else feel that way or not especially though if the address type that is being changed had some type of issue with it that motivated the creation of a related address typing. but i guess they'll keep going. watch out for bech32P anytime soon.
You don't have to worry about anything, in the end you are just copying a string that if nobody had mentioned you wouldn't even know it is a modified encoding scheme! Your wallet does everything behind the scenes for you. If it were necessary we would introduce bech32 a to z.
The more address types there are, the greater the chance someone messes something up when trying to create some 3rd party application, or otherwise. If you are a user of this 3rd party application, this will negatively affect you, and if you are a bitcoin user, this has the potential to reflect negatively on bitcoin.

Also, if you are trying to recover your wallet from its seed, having multiple address types adds complexity to the process, and may make some people erroneously believe there is a problem with their seed when they use their seed backup to generate the wrong address type.
legendary
Activity: 2870
Merit: 7490
The actual checksum used in encoding is the result of "polymod" XORed with a constant. This constant is 0x01 for Bech32 (BIP-173) encoding, which just flips the least significant bit, and 0x2bc830a3 for Bech32m (BIP-350) encoding. The former is used for witness version 0 addresses and the latter for versions 1+. This is basically the only difference that was introduced in BIP-350.

Thanks for the explanation, it's simpler than I thought. I guess I missed it because BIP 173 never uses the word "constant" when it mentions 0x01.



@Coding Enthusiast do you have any plan to create a Bech32m version of this guide, since the BIP has already been created and a few wallets already implement it?

i dont understand why they needed to have bech32m if they already have bech32. the more address types that get created i think the worse that is for bitcoin not better. i dont know if anyone else feel that way or not especially though if the address type that is being changed had some type of issue with it that motivated the creation of a related address typing. but i guess they'll keep going. watch out for bech32P anytime soon.

Here's why. On a side note, most people won't be able to tell the difference between Bech32 and Bech32m, at least unless they know that Bech32m is used from witness version 1 onward, which is indicated by the address prefix bc1p.

Motivation

BIP173 defined a generic checksummed base 32 encoded format called Bech32. It is in use for segregated witness outputs of version 0 (P2WPKH and P2WSH, see BIP141), and other applications.

Bech32 has an unexpected weakness: whenever the final character is a 'p', inserting or deleting any number of 'q' characters immediately preceding it does not invalidate the checksum. This does not affect existing uses of witness version 0 BIP173 addresses due to their restriction to two specific lengths, but may affect future uses and/or other applications using the Bech32 encoding.

This document addresses that by specifying Bech32m, a variant of Bech32 that mitigates this insertion weakness and related issues.
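The weakness quoted above can be reproduced with a short Python sketch. It ports the BIP-173 polymod and checks what happens when 'q' characters are inserted before a final 'p'. The HRP "test", the two-value payload and all helper names are arbitrary choices for this illustration, not anything taken from the BIPs.

```python
# Sketch of the Bech32 insertion weakness; helper names and payloads are made up.
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]
BECH32_CONST = 1
BECH32M_CONST = 0x2BC830A3


def polymod(values):
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def encode(hrp, data, const):
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ const
    checksum = [(pm >> (5 * (5 - i))) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[v] for v in data + checksum)


def is_valid(text, const):
    hrp, _, data_part = text.rpartition("1")
    values = [CHARSET.index(c) for c in data_part]
    return polymod(hrp_expand(hrp) + values) == const


def first_ending_in_p(const):
    # Try 32 small payloads until the encoded string happens to end in 'p'.
    candidates = (encode("test", [0, j], const) for j in range(32))
    return next(s for s in candidates if s.endswith("p"))


def insert_q(text, count):
    # Insert `count` 'q' characters right before the final 'p'.
    return text[:-1] + "q" * count + "p"


old = first_ending_in_p(BECH32_CONST)
new = first_ending_in_p(BECH32M_CONST)
print(old, [is_valid(insert_q(old, n), BECH32_CONST) for n in (1, 2, 3)])
print(new, is_valid(insert_q(new, 1), BECH32M_CONST))
```

With the Bech32 constant the modified strings still pass the check, while the same edit on a Bech32m string is rejected.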
legendary
Activity: 1568
Merit: 6660
The more address types that get created i think the worse that is for bitcoin not better.

It only really matters to developers. Normal users who are just sending and receiving bitcoin will not care much about the address as long as it is accepted at the places they need to pay (so could be a problem in the early months of deployment but as time passes it should largely become a non-issue).
legendary
Activity: 3472
Merit: 10611
i dont understand why they needed to have bech32m if they already have bech32. the more address types that get created i think the worse that is for bitcoin not better. i dont know if anyone else feel that way or not especially though if the address type that is being changed had some type of issue with it that motivated the creation of a related address typing. but i guess they'll keep going. watch out for bech32P anytime soon.
You don't have to worry about anything, in the end you are just copying a string that if nobody had mentioned you wouldn't even know it is a modified encoding scheme! Your wallet does everything behind the scenes for you. If it were necessary we would introduce bech32 a to z.
sr. member
Activity: 1190
Merit: 469
@Coding Enthusiast do you have any plan to create a Bech32m version of this guide, since the BIP has already been created and a few wallets already implement it?

i dont understand why they needed to have bech32m if they already have bech32. the more address types that get created i think the worse that is for bitcoin not better. i dont know if anyone else feel that way or not especially though if the address type that is being changed had some type of issue with it that motivated the creation of a related address typing. but i guess they'll keep going. watch out for bech32P anytime soon.
full member
Activity: 161
Merit: 168
How do I compute the checksum? The bip 173 page has the code in python which I do not understand. Can you explain the process so I could code in Java/Kotlin?

Here it is in Java
https://github.com/MrMaxweII/Bitcoin-Address-Generator/blob/master/src/BTClib3001/Bech32Address.java
legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
How do I compute the checksum? The bip 173 page has the code in python which I do not understand. Can you explain the process so I could code in Java/Kotlin?
I haven't seen a Java/Kotlin implementation, but apart from the C implementation there are JavaScript and my own C# implementations, which are similar languages.

The process is:
"Expand HRP" by converting its N 8-bit characters (base-256) into 2N+1 5-bit values (base-32).
For example bc, which is 0x6263 or 0b01100010_01100011, becomes 0x0303000203.
In this process the highest 3 bits of each octet are placed in the first half of the result and the remaining (low) 5 bits in the second half. In the example above:
b=0x62=0b011_00010 -> high 3 bits 011=0x03 (1st value), low 5 bits 00010=0x02 (4th value)
c=0x63=0b011_00011 -> high 3 bits 011=0x03 (2nd value), low 5 bits 00011=0x03 (5th value)
The two halves are separated with a zero (the value in the middle, here the 3rd, is always 0) -> 0x03 03 00 02 03
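As a minimal Python sketch of the expansion just described (the function name follows the BIP-173 reference code, the rest is illustration only):

```python
def hrp_expand(hrp):
    """Return the 2N+1 five-bit values used as checksum input for an N-character HRP."""
    high = [ord(ch) >> 5 for ch in hrp]    # top 3 bits of each character
    low = [ord(ch) & 0x1F for ch in hrp]   # bottom 5 bits of each character
    return high + [0] + low                # a zero separates the two halves


print(hrp_expand("bc"))               # [3, 3, 0, 2, 3]
print(bytes(hrp_expand("bc")).hex())  # 0303000203
print(hrp_expand("tb"))               # [3, 3, 0, 20, 2]
```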

Now we can compute checksum of [expanded HRP] | [base32 data] | [6x zeros] (note that "|" is concatenation).
The process is best explained by code
Code:
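// Generator coefficients defined in BIP-173
private static readonly uint[] generator =
{
    0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3
};

// data holds 5-bit values (0..31), one per byte
private static uint Polymod(byte[] data)
{
    uint chk = 1;
    foreach (byte b in data)
    {
        // The top 5 bits of the 30-bit state decide which generator values are XORed in
        uint temp = chk >> 25;
        chk = ((chk & 0x1ffffff) << 5) ^ b;
        for (int i = 0; i < 5; i++)
        {
            if (((temp >> i) & 1) == 1)
            {
                chk ^= generator[i];
            }
        }
    }
    return chk;
}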

The actual checksum used in encoding is the result of "polymod" XORed with a constant. This constant is 0x01 for Bech32 (BIP-173) encoding, which just flips the least significant bit, and 0x2bc830a3 for Bech32m (BIP-350) encoding. The former is used for witness version 0 addresses and the latter for versions 1+. This is basically the only difference that was introduced in BIP-350.
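For readers who prefer Python, here is a rough port of the same process with the final XOR constant passed in as a parameter. The helper names are my own; the generator values and the two constants are the ones from BIP-173 and BIP-350, and the input data is the step 5 value from the guide in this thread.

```python
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]
BECH32_CONST = 0x01          # BIP-173, witness version 0
BECH32M_CONST = 0x2BC830A3   # BIP-350, witness version 1 and later


def polymod(values):
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def create_checksum(hrp, data, const):
    # [expanded HRP] | [base32 data] | [6x zeros], then XOR with the constant
    pm = polymod(hrp_expand(hrp) + list(data) + [0] * 6) ^ const
    # Split the 30-bit result into six 5-bit values, most significant first
    return [(pm >> (5 * (5 - i))) & 31 for i in range(6)]


def verify_checksum(hrp, data_with_checksum, const):
    return polymod(hrp_expand(hrp) + list(data_with_checksum)) == const


# Witness version 0 followed by the 32 five-bit values of the key hash
data = list(bytes.fromhex(
    "000e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16"))
checksum = create_checksum("bc", data, BECH32_CONST)
print(bytes(checksum).hex())  # 0c0709110b15, matching step 6 of the guide
```

Verification runs the same polymod over the data with the checksum appended and compares the result with the constant, so a Bech32 string does not verify as Bech32m and vice versa.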
legendary
Activity: 4522
Merit: 3426
How do I compute the checksum? The bip 173 page has the code in python which I do not understand. Can you explain the process so I could code in Java/Kotlin?

The BIP also links to a C version if that helps: https://github.com/sipa/bech32/tree/master/ref/c
newbie
Activity: 1
Merit: 0
How do I compute the checksum? The bip 173 page has the code in python which I do not understand. Can you explain the process so I could code in Java/Kotlin?
newbie
Activity: 2
Merit: 0
Thank you for this! It's now working.

I'm trying to decode a bech32 address tb1qwm3dqje4wc7cs2u9sv39yh2as8ae0ntzqkjunw into the h160 of the public key

Same process steps in reverse:
Map each character to its index in Bech-32 charset after removing hrp (q=0; w=14,...):
Code:
0 14 27 17 13 0 18...
The result is in base-32 (5-bits group) and has to be converted back to base-256 (8-bits or 1 octet/byte group).
Also since this is an address, the first item is the witness version that has to be dropped and evaluated separately.
Code:
14 = 01110
27 = 11011
17 = 10001
13 = 01101
0  = 00000
18 = 10010
25 = 11001
...
Select 8-bits at a time:
Code:
01110110 11100010 11010000 01001011 ...
118      226      208      75 ...
which is  0x76e2d0... in hex and the same result that the site you linked gives.

PS. The additional steps to expand H.R.P and compute and verify checksum are skipped for simplicity but are mandatory.

How do I go from the array of numbers into the hex version of the h160 which according to https://slowli.github.io/bech32-buffer/ should be: 751e76e8199196d454941c45d1b3a323f1433bd6
Try again, when decoding Bech32=tb1qwm3dqje4wc7cs2u9sv39yh2as8ae0ntzqkjunw it returns 0x76e2d04b35763d882b858322525d5d81fb97cd62 as data (hexadecimal).
legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
I'm trying to decode a bech32 address tb1qwm3dqje4wc7cs2u9sv39yh2as8ae0ntzqkjunw into the h160 of the public key

Same process steps in reverse:
Map each character to its index in Bech-32 charset after removing hrp (q=0; w=14,...):
Code:
0 14 27 17 13 0 18...
The result is in base-32 (5-bits group) and has to be converted back to base-256 (8-bits or 1 octet/byte group).
Also since this is an address, the first item is the witness version that has to be dropped and evaluated separately.
Code:
14 = 01110
27 = 11011
17 = 10001
13 = 01101
0  = 00000
18 = 10010
25 = 11001
...
Select 8-bits at a time:
Code:
01110110 11100010 11010000 01001011 ...
118      226      208      75 ...
which is  0x76e2d0... in hex and the same result that the site you linked gives.

PS. The additional steps to expand H.R.P and compute and verify checksum are skipped for simplicity but are mandatory.

How do I go from the array of numbers into the hex version of the h160 which according to https://slowli.github.io/bech32-buffer/ should be: 751e76e8199196d454941c45d1b3a323f1433bd6
Try again, when decoding Bech32=tb1qwm3dqje4wc7cs2u9sv39yh2as8ae0ntzqkjunw it returns 0x76e2d04b35763d882b858322525d5d81fb97cd62 as data (hexadecimal).
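The reverse steps above can be sketched in Python as follows. This is an illustration only: the function names are made up, and like the walkthrough it skips HRP expansion and checksum verification, which a real decoder must perform.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def five_to_eight(values):
    """Regroup 5-bit values into bytes (leftover padding bits are dropped)."""
    acc = 0
    bits = 0
    out = bytearray()
    for v in values:
        acc = (acc << 5) | v
        bits += 5
        if bits >= 8:
            bits -= 8
            out.append((acc >> bits) & 0xFF)
            acc &= (1 << bits) - 1  # keep only the bits not yet emitted
    return bytes(out)


def decode_without_checksum(address):
    # Split at the last '1': the HRP is before it, the data part after it
    hrp, _, data_part = address.lower().rpartition("1")
    values = [CHARSET.index(c) for c in data_part]
    version = values[0]          # first value is the witness version
    program = values[1:-6]       # last 6 values are the checksum
    return hrp, version, five_to_eight(program)


hrp, version, program = decode_without_checksum(
    "tb1qwm3dqje4wc7cs2u9sv39yh2as8ae0ntzqkjunw")
print(hrp, version, program.hex())
# tb 0 76e2d04b35763d882b858322525d5d81fb97cd62
```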
newbie
Activity: 2
Merit: 0
Hey, great post! It's really useful.

I'm trying to decode a bech32 address tb1qwm3dqje4wc7cs2u9sv39yh2as8ae0ntzqkjunw into the h160 of the public key, which is step 3 in your guide, using the reference implementation in python.

I end up with an array of integers: [118, 226, 208, 75, 53, 118, 61, 136, 43, 133, 131, 34, 82, 93, 93, 129, 251, 151, 205, 98] which I think corresponds with step 4 of your guide.

How do I go from the array of numbers into the hex version of the h160 which according to https://slowli.github.io/bech32-buffer/ should be: 751e76e8199196d454941c45d1b3a323f1433bd6

Any help would be appreciated!

legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
What am I doing wrong?

The encoding is Bech-32 not Base-58 and there is no SHA256 in this encoding, there is only playing around with bits. Read BIP-173 for details of how you should compute the checksum.
full member
Activity: 161
Merit: 168
I have now successfully reached step 5.
I have: 000e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16

But I can't get to step 6.
When I calculate: SHA256 (SHA256 (000e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16)) = 103ede5cc41abfd088a368bb40df3d52e5614460521d88daa202917c7f3de88d.

The first 6 bytes I get are: 10 3e de 5c c4 1a
And this is a different number than yours.
What am I doing wrong?
newbie
Activity: 3
Merit: 0
Thank you, with this I could follow everything up to the last soft fork.
Here I explain how a SegWit address in P2WSH format is derived, so that the P2WPKH and P2WSH variants of SegWit are not mixed up.
Note that the guide above generates a P2WPKH address, which is 42 characters long on MainNet, while a P2WSH address is 62 characters long.

https://bitcointalksearch.org/topic/step-by-step-guide-to-go-from-public-key-to-a-p2wsh-bech32-encoded-address-5227953
newbie
Activity: 2
Merit: 0
Here's a simple way to understand it...

Convert the first value from a hex value to an array of bits where: 0=0000, 1=0001, 2=0010, 3=0011, 4=0100, etc
   7    5    1    e    7    6    e ...
0111 0101 0001 1110 0111 0110 1110 ...


Change the spacing of the 1's and zeros so that they are grouped 5 in a set instead of 4:
01110 10100 01111 00111 01101 110 ...

Convert each set of 5 into a hex value:
01110 10100 01111 00111 01101 110...
   0e    14    0f    07    0d    ...   



Thank You Man! You best!
legendary
Activity: 3528
Merit: 4945
3. Perform RIPEMD-160 hashing on the result of SHA-256:
Code:
751e76e8199196d454941c45d1b3a323f1433bd6

4. The result of step 3 is an array of 8-bit unsigned integers (base 2^8=256) and Bech32 encoding converts this to an array of 5-bit unsigned integers (base 2^5=32) so we "squash" the bytes to get:
in hex:
Code:
0e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16
Hey.

Please tell me, how did you get the string "0e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16" from the string "751e76e8199196d454941c45d1b3a323f1433bd6"? What command do you need to execute for this?


Here's a simple way to understand it...

Convert the first value from a hex value to an array of bits where: 0=0000, 1=0001, 2=0010, 3=0011, 4=0100, etc
   7    5    1    e    7    6    e ...
0111 0101 0001 1110 0111 0110 1110 ...


Change the spacing of the 1's and zeros so that they are grouped 5 in a set instead of 4:
01110 10100 01111 00111 01101 110 ...

Convert each set of 5 into a hex value:
01110 10100 01111 00111 01101 110...
   0e    14    0f    07    0d    ...   
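To answer the "what command" part, the regrouping above can be done with a few lines of Python. This sketch follows the explanation literally (write out all bits, cut them into groups of 5); the function name is made up for illustration.

```python
def regroup_bits(data, width=5):
    """Write the bytes as one bit string and cut it into `width`-bit groups."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % width)  # pad the last group with zeros if needed
    return [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]


h160 = bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6")
values = regroup_bits(h160)
print(values[:5])           # [14, 20, 15, 7, 13]
print(bytes(values).hex())  # 0e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16
```

The 20-byte hash has 160 bits, which divides evenly into 32 groups of 5, so no padding is added in this case.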
newbie
Activity: 2
Merit: 0

3. Perform RIPEMD-160 hashing on the result of SHA-256:
Code:
751e76e8199196d454941c45d1b3a323f1433bd6

4. The result of step 3 is an array of 8-bit unsigned integers (base 2^8=256) and Bech32 encoding converts this to an array of 5-bit unsigned integers (base 2^5=32) so we "squash" the bytes to get:
in hex:
Code:
0e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16



Hey.

Please tell me, how did you get the string "0e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16" from the string "751e76e8199196d454941c45d1b3a323f1433bd6"? What command do you need to execute for this?
full member
Activity: 203
Merit: 168
nice.  why not add to bitcoin wiki?
sr. member
Activity: 310
Merit: 727
If you want to play around with this using Python you can check: https://github.com/mcdallas/cryptotools

Example:
Code:
>>> from ECDSA.secp256k1 import CURVE, PrivateKey

>>> private = PrivateKey.random()
>>> private.int()
8034465994996476238286561766373949549982328752707977290709076444881813294372

>>> public = private.to_public()
>>> public
PublicKey(102868560361119050321154887315228169307787313299675114268359376451780341556078, 83001804479408277471207716276761041184203185393579361784723900699449806360826)

>>> public.point in CURVE
True

>>> public.to_address('P2WPKH')
'bc1qh2egksgfejqpktc3kkdtuqqrukrpzzp9lr0phn'
legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
I feel that your description is confusing. You write "array of 5-bit integers", but displaying the results as a hex string implies that it is a string of 8-bit values.
I get what you are talking about but base-16 is just another representation of an array of "numbers", it doesn't really make a difference if I write 14 20 15 with spaces or 0e 14 0f with or without spaces, they are both representing the same set of numbers. The only possible way to clarify things is if I start typing them in binary like this but that's just impossible to read:
Code:
01110 10100 01111 ...


Additionally hex or base-16 is a very easy and convenient way to transfer arrays of "numbers". For instance you can not input each of those "numbers" (14, 20, 15...) one by one in an array when coding, it would take a long time and it is easy to make a mistake. But you can very easily give your code the hexadecimal string representation of it and decode it into the array of "numbers" then treat those "numbers" however you like.

I am going to add both numbers and binary, maybe that helps visualizing it better.

I recommend removing "byte" since the witness version is not a byte. Note that bip-173 also calls it a "byte" when it isn't.
Well, "version byte" is the name of the "0" we are appending to it, I can't just change that name:
https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki#witness-program
https://github.com/bitcoin/bips/blob/master/bip-0142.mediawiki#rationale
legendary
Activity: 4522
Merit: 3426
4. The result of step 3 is an array of 8-bit unsigned integers (base 2^8=256) and Bech32 encoding converts this to an array of 5-bit unsigned integers (base 2^5=32) so we "squash" the bytes to get:

I feel that your description is confusing. You write "array of 5-bit integers", but displaying the results as a hex string implies that it is a string of 8-bit values. I recommend inserting spaces between each value to emphasize that each element is distinct, and perhaps using decimal values to avoid implying that the values could lie outside of the range 0 - 31.

For example:

5. Add the witness version byte in front of the step 4 result (current version is 0):
Code:
000e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16

I recommend removing "byte" since the witness version is not a byte. Note that bip-173 also calls it a "byte" when it isn't.

Quote
5. Add the witness version in front of the step 4 result (current version is 0):
Code:
0 14 20 15 7 13 26 0 25 18 6 11 13 8 21 4 20 3 17 2 29 3 12 29 3 4 15 24 20 6 14 30 22
legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
Great guide and mostly it's easy to understand. But i don't understand the 8th steps, is it encode/decode result from Bech32 characters?

Yes. Note that step7 is the hexadecimal representation of an array of 5-bit integers {0, 14, 20, 15, 7, ..., 11, 21} so 0 is item at index 0 of B32Chars or the letter q and 14 is the character at index 14 or w, 20 is 5 and so on.
 In C♯
Code:
string B32Chars = "qpzry9x8gf2tvdw0s3jn54khce6mua7l";
StringBuilder result = new StringBuilder();
foreach (byte item in step7Array)
{
   result.Append(B32Chars[item]);
}

Basically this:
.join in python implementation (https://github.com/sipa/bech32/blob/master/ref/python/segwit_addr.py#L59)
or the for loop in JavaScript implementation (https://github.com/sipa/bech32/blob/master/ref/javascript/bech32.js#L74-L76)
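The same lookup as a short Python sketch, using the step 7 value from the guide as input (variable names are illustrative):

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

# Step 7 of the guide: each byte of this hex string holds one 5-bit value
step7 = bytes.fromhex(
    "000e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16"
    "0c0709110b15")

# Each value is simply an index into the character set
data_part = "".join(CHARSET[value] for value in step7)
print(data_part)               # qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4
print("bc" + "1" + data_part)  # HRP + separator + data
```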
legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
Bitcoin wiki has a pretty good step-by-step explanation of how to go from a public key to a Base58-encoded address which contains the values for each step of the way[1]. But unfortunately I could not find anything similar for Bech32 encoding. Additionally I found the reference implementations a bit confusing[2]! The information is out there[3] but I feel like having it step-by-step like "[1]" can make it a lot easier, especially for developers. For example during unit testing I was getting a different address (bc1qp63uahgrxged4z5jswyt5dn5v3lzsem6c0qqhg8) for the public key below and I wasn't sure where the bug was coming from; this visualization[4] helped me realize I was appending the version byte before converting the bits instead of after. So hopefully these steps can help someone like me looking for them.


How to create a Bech32 address from a public key:

1. Start with a compressed[5] public key (0x02 or 0x03 followed by the 32-byte X coordinate):
Code:
0279be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798

2. Perform SHA-256 hashing on the public key:
Code:
0f715baf5d4c2ed329785cef29e562f73488c8a2bb9dbc5700b361d54b9b0554

3. Perform RIPEMD-160 hashing on the result of SHA-256:
Code:
751e76e8199196d454941c45d1b3a323f1433bd6

4. The result of step 3 is an array of 8-bit unsigned integers (base 2^8=256) and Bech32 encoding converts this to an array of 5-bit unsigned integers (base 2^5=32) so we "squash" the bytes to get:
in hex:
Code:
0e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16
in numbers:
Code:
14 20 15 07 13 26 00 25 18 06 11 13 08 21 04 20 03 17 02 29 03 12 29 03 04 15 24 20 06 14 30 22
5 bits binary:
Code:
01110 10100 01111 00111 01101 11010 00000 11001 10010 00110 01011 01101 01000 10101 00100 10100 00011 10001 00010 11101 00011 01100 11101 00011 00100 01111 11000 10100 00110 01110 11110 10110

5. Add the witness version byte in front of the step 4 result (current version is 0):
Code:
000e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e16

6. Compute the checksum by using the data from step 5 and the HRP (bc for MainNet and tb for TestNet):
Code:
0c0709110b15

7. Append the checksum to result of step 5 (we now have an array of 5-bit integers):
Code:
000e140f070d1a001912060b0d081504140311021d030c1d03040f1814060e1e160c0709110b15

8. Map each value to its corresponding character in Bech32Chars (qpzry9x8gf2tvdw0s3jn54khce6mua7l) 00 -> q, 0e -> w,...
Code:
qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4

9. A Bech32-encoded address consists of 3 parts: HRP + Separator (1) + Data:
Code:
bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4

The final result from step 9 is the same as example in BIP173[6]
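The nine steps can be strung together in a short Python sketch. It is only an illustration under a few assumptions: the helper names are made up, and RIPEMD-160 is taken from hashlib, which depends on the local OpenSSL build, so the sketch falls back to the step 3 value when that hash is unavailable.

```python
import hashlib

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def polymod(values):
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def to_5bit(data):
    bits = "".join(f"{b:08b}" for b in data)
    bits += "0" * (-len(bits) % 5)
    return [int(bits[i:i + 5], 2) for i in range(0, len(bits), 5)]


def hash160(pubkey):
    sha = hashlib.sha256(pubkey).digest()          # step 2
    return hashlib.new("ripemd160", sha).digest()  # step 3


def p2wpkh_address(key_hash, hrp="bc"):
    data = [0] + to_5bit(key_hash)                        # steps 4 and 5
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ 1    # step 6 (Bech32 constant)
    checksum = [(pm >> (5 * (5 - i))) & 31 for i in range(6)]
    data_part = "".join(CHARSET[v] for v in data + checksum)  # steps 7 and 8
    return hrp + "1" + data_part                          # step 9


# Step 1: the compressed public key from the guide
pubkey = bytes.fromhex(
    "0279be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798")
try:
    key_hash = hash160(pubkey)
except ValueError:
    # Assumption: this OpenSSL build has no RIPEMD-160, so use the step 3 value
    key_hash = bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6")

print(p2wpkh_address(key_hash))  # bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4
```

Passing hrp="tb" to the hypothetical p2wpkh_address helper gives the TestNet form of the same key hash. Note that the witness version is prepended after the 8-to-5 bit conversion, which is exactly the ordering mistake mentioned at the top of the guide.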

References:
[1] https://en.bitcoin.it/wiki/Technical_background_of_version_1_Bitcoin_addresses
[2] https://github.com/sipa/bech32
[3] https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki
[4] https://en.bitcoin.it/w/images/en/4/48/Address_map.jpg
[5] Only compressed public keys are allowed: https://github.com/bitcoin/bips/blob/master/bip-0143.mediawiki#restrictions-on-public-key-type
[6] https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki#examples