Author

Topic: weird bitcoin addresses and bech32 address format prefix (Read 351 times)

hero member
Activity: 630
Merit: 731
Bitcoin g33k
I did forget to reply on this thread, this is solved long time ago. However thank you all.
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
Thanks again. Makes sense now, ok.

Does anybody know if we can use pythons' bit (by ofek) to compute the hash160 of an arbitrary address ? I'd like to see a python example which is very fast and can handle nicely all possible address types.

Hope you don't mind my replying - it's very easy actually:

Code:
import bit
pubkey = '0326fa519713da024a6bc0eb4977b77e82602f66a5b16c7960958af998ee6055a0'
base = b'\x00' + bit.crypto.ripemd160_sha256(bytes.fromhex(pubkey))
bit.format.b58encode_check(base)
# '112Fjgiwnk7WGkaRTz97pDmUUhXNqBpmef'

Don't calculate the checksum manually because all base58check algorithms on Python I checked (pun not intended) all apply it at the end for you, so it will actually make a wrong address if you do that.
legendary
Activity: 3472
Merit: 10611
Yea, my point here is the 3 address has a fixed length, you don't even tell it is multisig, single sig or some other scripts at all. But with bech32, you can easily tell apart a single sig from a multi sig. Right?
That is correct but it is also not any kind of meaningful advantage or disadvantage to be able to tell what locking mechanism is used to lock up those coins.
legendary
Activity: 2380
Merit: 5213
Yea, my point here is the 3 address has a fixed length, you don't even tell it is multisig, single sig or some other scripts at all. But with bech32, you can easily tell apart a single sig from a multi sig. Right?
Right.
Both nested segwit and legacy multi-signature addresses start with 3 and you can't know whether the address is segwit or not if the owner of the address has never make any transaction from that.
An address starting with bc1q is a single-signature address, if it includes 42 characters and is a multi-signature address if it includes 62 characters.
member
Activity: 162
Merit: 65
Adding to the above, bc1p is multisig iirc.

Wrong, bc1p only refer to Taproot address. Multi-sig is just one of many things you could do with Taproot script. And FWIW, bc1p actually use Bech32m[1].

--snip--
So, i think the P2SH "3" addresses at least win this round: until it spends, you don't tell at all whether it is a single sig or multi sig.
I think I am right on this point. What do you think?

Don't forget SH in P2SH and P2WSH refer to Script Hash, so it's possible the spend condition isn't N-of-M signature.

[1] https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki

Yea, my point here is the 3 address has a fixed length, you don't even tell it is multisig, single sig or some other scripts at all. But with bech32, you can easily tell apart a single sig from a multi sig. Right?
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
Adding to the above, bc1p is multisig iirc.

Wrong, bc1p only refer to Taproot address. Multi-sig is just one of many things you could do with Taproot script. And FWIW, bc1p actually use Bech32m[1].

--snip--
So, i think the P2SH "3" addresses at least win this round: until it spends, you don't tell at all whether it is a single sig or multi sig.
I think I am right on this point. What do you think?

Don't forget SH in P2SH and P2WSH refer to Script Hash, so it's possible the spend condition isn't N-of-M signature.

[1] https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki
member
Activity: 162
Merit: 65
Bech32 encoding has 3 parts:
(a) a human readable part that is "bc" for bitcoin (other chars used for other coins/chains)
(b) a separator which is always "1"
(c) the data part that is at least 6 characters long. In bitcoin (SegWit) addresses the first character is the witness version.

Bitcoin's Bech32 addresses are only defined for 2 witness versions and a total of 3 address types:
* P2WPKH that is using witness version 0 and is a short address encoding 20 bytes of data (pubkey hash) and starts with bc1q (q being the witness version)
* P2WSH that is also using witness version 0 and is a longer address encoding 32 bytes of data (script hash) and starts with bc1q (q being the witness version)
* P2TR that is using witness version 1 and is as long as P2WSH address encoding 32 bytes of data (tweaked pubkey) and starts with bc1p (p being the first byte of the data encoding the witness version).

The reason why you find so many addresses starting with bc1p is because Taproot was activated a while ago and people are using it hence the addresses.

Any other address you see that is using a different length or a different witness version (bc1sw50qgdz25j with s being witness version 16) is non-standard and anybody can spend the coins sent to these addresses because there are no consensus rules defined for them yet.

Quote
but no bc1p1 prefixed addresses.
1 is only used as the separator not as part of the bech32 charset for encoding which is why you will never see it as part of the encoded data.
Note: If you see a string like this the human readable part is actually "bc1p" and the second 1 is considered as the separator and everything after that 1 is the data.

Quote
What cirumstance led to this fact that no such bc1p1 prefixed addresses were generated and seen out there ?
A separator was needed, symbols would have made copying harder so "1" was chosen. When it is used as separator it should be removed from the charset. More here.


So, i think the P2SH "3" addresses at least win this round: until it spends, you don't tell at all whether it is a single sig or multi sig.
I think I am right on this point. What do you think?
hero member
Activity: 630
Merit: 731
Bitcoin g33k
Thanks again. Makes sense now, ok.

Does anybody know if we can use pythons' bit (by ofek) to compute the hash160 of an arbitrary address ? I'd like to see a python example which is very fast and can handle nicely all possible address types.
legendary
Activity: 3472
Merit: 10611
could you please explain to me why I can't determine the hash160 of a long P2WSH bc1 address ?
In short P2WSH is encoding a 256-bit hash that is the result of SHA256.

P2WPKH and P2WSH are different.
To create P2WPKH we compute HASH160 of public key (so we have a small 160-bit digest) which is then encoded using Bech32 to get that short address you see.
To create P2WSH we have a "redeem script" which we compute its SHA256 hash (so we have a 256-bit digest) which is then encoded using Bech32 to get that longer address you see.

Quote
7250d91085a77a4568fa4cfd5bebb59f0b9cb3530f8154cd4fab6d28abd548fe
Is this the correct hash160 for the P2WSH address mentioned in the example ?
Yes this is the correct SHA256 digest for that address.

The ice library thing you used doesn't seem to support decoding P2WSH addresses for some reason.
https://github.com/iceland2k14/secp256k1/blob/691e238c4a05c4bc93959fa339df341f24113919/secp256k1.py#L400

The address_to_h160 you called using the P2WSH address should have thrown an exception instead of returning a value since it is meant to decode P2PKH (Base58 encoded) addresses and not anything else.
hero member
Activity: 630
Merit: 731
Bitcoin g33k
@pooya87 or @all:

could you please explain to me why I can't determine the hash160 of a long P2WSH bc1 address ? Here is an example of a short (P2WPKH) bc1 address and a long (P2WSH) one

Quote
P2WPKH:
bc1qyllkgldjcwmeqvd39tt5wsymd9kg595su2tnp8

P2WSH:
bc1qwfgdjyy95aay2686fn74h6a4nu9eev6np7q4fn204dkj3274frlqrskvx0

pycoin library outputs the hash160 correctly (27ff647db2c3b79031b12ad747409b696c8a1690) but only for the short P2WPKH address, no output for the P2WSH. Why?

So I tried iceland2k14/secp256k1
Code:
ice.address_to_h160('bc1qwfgdjyy95aay2686fn74h6a4nu9eev6np7q4fn204dkj3274frlqrskvx0')
which returns
Quote
7687937601fb5a0bf6ed61fb1e69743abf0e7716904fa6dc15de32233fc059b45d7ee22a4987a74 7a9

which doesn't seem to be valid. Icelands' secp256k1 generates an incorrect hash160 for the short bc1qyllkgldjcwmeqvd39tt5wsymd9kg595su2tnp8 address. Seems like secp256k1 is not bech32 capable so better not to use it for such things.
I was able to successfully determine the correct hash160 of the short P2WSH address just by using bech32 python library and a small python program, but I am not sure if the result is correct.
Quote
7250d91085a77a4568fa4cfd5bebb59f0b9cb3530f8154cd4fab6d28abd548fe
Is this the correct hash160 for the P2WSH address mentioned in the example ?

I was trying to use python's bit library, I see there is a function defined for hash160 generation, it's located in bit.crypto
Code:
~/.local/lib/python3.10/site-packages/bit$ cat crypto.py
Quote
from hashlib import new, sha256 as _sha256

from coincurve import PrivateKey as ECPrivateKey, PublicKey as ECPublicKey

def sha256(bytestr):
    return _sha256(bytestr).digest()

def double_sha256(bytestr):
    return _sha256(_sha256(bytestr).digest()).digest()

def double_sha256_checksum(bytestr):
    return double_sha256(bytestr)[:4]

def ripemd160_sha256(bytestr):
    return new('ripemd160', sha256(bytestr)).digest()


hash160 = ripemd160_sha256


but I was not able to utilize it correctly.
legendary
Activity: 3472
Merit: 10611
did you mistype and meant rather "no one can spend the coins" ?
No. For forward compatibility the consensus rules have to assume any transaction spending any of the future witness versions (currently anything >=2) are valid by only performing minimal checks. Otherwise any future witness program has to be activated through a hard fork instead of a soft fork.

If the output script is OP_NUM + a single data push with total size between 4 and 42 it is considered a witness program and the interpreter will only check the following 2 if the OP_NUM is anything except OP_0 and OP_1:
1. Signature script of that input is empty
2. Program doesn't evaluate to OP_FALSE

bc1sw50qgdz25j is OP_16 <751e> and since 0x751e is not equal OP_FALSE anybody can spend any coins sent to this non-standard address without needing to provide any signature or anything since witness version 16 is not yet defined (we only have version 0 and 1).
Although because of it being non-standard, almost all full nodes reject such transactions and user has to contact a miner to include them in a block.
hero member
Activity: 630
Merit: 731
Bitcoin g33k
Any other address you see that is using a different length or a different witness version (bc1sw50qgdz25j with s being witness version 16) is non-standard and anybody can spend the coins sent to these addresses because there are no consensus rules defined for them yet.
did you mistype and meant rather "no one can spend the coins" ?

Thanks a bunch pooya87
legendary
Activity: 3472
Merit: 10611
Bech32 encoding has 3 parts:
(a) a human readable part that is "bc" for bitcoin (other chars used for other coins/chains)
(b) a separator which is always "1"
(c) the data part that is at least 6 characters long. In bitcoin (SegWit) addresses the first character is the witness version.

Bitcoin's Bech32 addresses are only defined for 2 witness versions and a total of 3 address types:
* P2WPKH that is using witness version 0 and is a short address encoding 20 bytes of data (pubkey hash) and starts with bc1q (q being the witness version)
* P2WSH that is also using witness version 0 and is a longer address encoding 32 bytes of data (script hash) and starts with bc1q (q being the witness version)
* P2TR that is using witness version 1 and is as long as P2WSH address encoding 32 bytes of data (tweaked pubkey) and starts with bc1p (p being the first byte of the data encoding the witness version).

The reason why you find so many addresses starting with bc1p is because Taproot was activated a while ago and people are using it hence the addresses.

Any other address you see that is using a different length or a different witness version (bc1sw50qgdz25j with s being witness version 16) is non-standard and anybody can spend the coins sent to these addresses because there are no consensus rules defined for them yet.

Quote
but no bc1p1 prefixed addresses.
1 is only used as the separator not as part of the bech32 charset for encoding which is why you will never see it as part of the encoded data.
Note: If you see a string like this the human readable part is actually "bc1p" and the second 1 is considered as the separator and everything after that 1 is the data.

Quote
What cirumstance led to this fact that no such bc1p1 prefixed addresses were generated and seen out there ?
A separator was needed, symbols would have made copying harder so "1" was chosen. When it is used as separator it should be removed from the charset. More here.
legendary
Activity: 2352
Merit: 6089
bitcoindata.science
There is some discussion related to those addresses here, in this project of LoyceV.

https://bitcointalk.org/index.php?topic=5254914.0;all

If you add up the addresses starting with 1, 3 and bc1q, you'll notice 16 addresses are missing. Those are:
Code:
bc1p23jk6urvv96x2gp3yqszqgpqyqszqgqa6qtuj
bc1p8qsysgrgypgjqufqtgs85gpcyqjzqsqfrw0l9
bc1p8ysyjgrfypfzqu3q9usrqgpeyqnzqfgexpv74
bc1pmfr3p9j00pfxjh0zmgp99y8zftmd3s5pmedqhyptwy6lm87hf5ss52r5n8
bc1pq2kqvpm76ewe20lcacq740p054at9sv7vxs0jn2u0r90af0k633322m7s8v
bc1pqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqs3wf0qm
bc1pv22mcnt30gwvk8g72szz700n4tkkx2qur2adj6pt8hl37hcf9dascxyf42
bc1px5sy2gr9yp8zqm3q2us8wgp4yq4jq0guggdp8
bc1pxcsyvgrxyp8jqmeqtqs8sgpkyq7zq0snaecz5
bc1pxgsyygrzyp9jq6eq2ss8ggpjyq5zq2gqvjed5
bc1pxqsrzgpjyqejqdpqx5srvgphyquzqwgdd7yg9
bc1pxssyggryypxjqmfq2cs8vgp5yqsjq0c760r6g
bc1pxusywgr8ypgzqupqtys8jgphyq4zqgcwqe32u
bc1pxvsyxgrrypxzqmpq25s82gpnypajqlgtqkfun
bc1pxysyzgrpyp9zq63q2vs8xgp3ypdjqhguvkagn
bc1zqyqsywvzqe
I don't know the story behind them, someone has been creating non-standard outputs. See txid 8bb2ce18914cfcb68e21686362b879396c2c27b51f1ec4be25c064f48f848f2d for most of them.


They are very few, and problably those funds are lost. Probably wrong generated addresseS?
copper member
Activity: 2856
Merit: 3071
https://bit.ly/387FXHi lightning theory
Adding to the above, bc1p is multisig iirc.

Thanks for the link, I already have this tab opened in my browser and red it. Unfortunately I wasn't able to answer my questions hence I did put them onto this thread.

Was there any other question? I read the link and found "1" is excluded from the address because it's excluded from base58 - that was done for readability I think as "o" was also removed.

Also blockchain explorers not picking up new addresses is because their developers are lazy. 5 years down the line and they've still not worked out how to do it...
hero member
Activity: 630
Merit: 731
Bitcoin g33k
Thanks for the link, I already have this tab opened in my browser and red it. Unfortunately I wasn't able to answer my questions hence I did put them onto this thread.
legendary
Activity: 3668
Merit: 6382
Looking for campaign manager? Contact icopress!
until now I thought that all bech32 addresses always start with the prefix 'bc1q'

Actually bech32 addresses start with bc1 and the next character comes from the witness version.
More of your questions may also get answered after you read this: https://en.bitcoin.it/wiki/BIP_0173
hero member
Activity: 630
Merit: 731
Bitcoin g33k
Hello everybody,

stumbled over a few weird addresses and looking for an explanation. As you see in the block explorer, the addresses had incoming transactions and UTXOs because the coins were never spent.

Quote

Question 1)
how does it come such short and valid bech32 addresses exist ?



until now I thought that all bech32 addresses always start with the prefix 'bc1q'. The mentioned examples above don't fit into this scheme as we see. On many info pages 'bc1' is mentioned as prefix. Out of curiosity, I looked at all the funded addresses on the blockchain to see what exist. There are thousands of addresses with prefix 'bc1p' and I immediately recognize that they have more characters than the usual ones with bc1q.

Here some examples:

Code:
bc1p000374fnuxfvhps6dvtut65g8kq7flhyy9zpltz6v07m8auxw8xsxmxr4q
bc1p000cavtlnpt9796j5rqzuc97mvr0z7dv7n63mvvxl9eeyqp6unrsjr8278
[...]
bc1pyu76wn0njsllkz39y2tkq4328k438rgzjvft2c26ucnq2gqle45qp2v8dy
bc1pzsh8nhmushx3gxyvgq2xwue8tqsdznhgtz2amtyr35anxhsrf97qay78he
bc1pzzzzmqpddscxwteulsaqha5t0sddvsuldcgzh096kg2gtfanxyds4gh798
bc1q00000002ltfnxz6lt9g655akfz0lm6k9wva2rm
bc1q0000005259fzmctd5xp7afs07dec5kxmp3fwgw
bc1q000000k2k2g4quxj7j3ddcz6kq69xa0xw8p2xx
bc1q00006zht44ezvmwjdymj9fnvcm0fmlwhdup55q
bc1q00007snl2y8ad3etca6mwzl86jvmk7av6u0772
bc1q0000et27ajp0vtyx5cqem3ngsjd447adp4wcu2
bc1q0000kvvqam2as2c3g26n7g3an596vuqnke64ew
bc1q0000qnrfhr5lf5a6v4042cwmdlkec9cpp7fr4m
bc1q0000sje0xgys2trk9972tudl4l220tyqttcr72
bc1q0000syek2pcnm5mupfg8uvew2lchgqyjswcyvf8d8exre3kw4assak2zgl
bc1q0000vcc9ypqa2ddlhj9vvz9rs5zpfzpr5ws4zn4uzkk28xezxh5qr55xjs
bc1q0000xx2wqshearravpm53nyy66e5fgu8erjxl3
bc1q0002chp04efzmcj7lv0tmem33v7ag3pv6rjv5z
bc1q0002gkkxx5mscfwtr0mthqnxyuwy0kgelgezpk
bc1q0002lrgfa8tkcz3ezspg2de4ss52n0sv72uxc0
bc1q0002q8zqngk6vv7pf594n4yrts8cgj7gw0w90n
[...]

Question 2)
I guess the longer addresses are P2WSH and the short ones are P2WPKH ?



Question 3)
how can it be that so many 'bc1p' addresses exist but this prefix is not mentioned at all in any known blockchain info site?



what's also interesting to see is the fact that there are many bc1p0 and bc1p2 and bc1p3 ...etc. addresses but no bc1p1 prefixed addresses.

Question 4)
Why? Although the chart '1' is allowed in the base58 charset. What cirumstance led to this fact that no such bc1p1 prefixed addresses were generated and seen out there ?

Jump to: