
Topic: What kinds of encoding/decoding for scriptSig of p2pkh? (Read 201 times)

brand new
Activity: 0
Merit: 0
Thank you everyone for contributing because it solved my problem in no time.
legendary
Activity: 3472
Merit: 10611
Thanks for all ur help.

BTW, ECDSA (particularly secp256k1 that is for bitcoin) ...is there any good tutorial that I can read and study about?
I've googled, and I see several tutorials, but there is not much for a beginner though...

If you know of any resources for ECDSA that explain it more easily and with some implementation code, please let me know.

Thanks,
You have to start by understanding what Elliptic Curve Cryptography is before reading about signature algorithms. Check out this page[1], which makes it relatively easy to understand. You can also read the wikipedia article about ECC[2]. Then you can read about ECDSA here[3], which has some pseudocode too.
For actual implementation of ECDSA and ECSDSA (Schnorr) for bitcoin on curve secp256k1 you can see the library that bitcoin core uses here[4]

[1] https://blog.cloudflare.com/a-relatively-easy-to-understand-primer-on-elliptic-curve-cryptography/
[2] https://en.wikipedia.org/wiki/Elliptic-curve_cryptography
[3] https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm
[4] https://github.com/bitcoin-core/secp256k1
newbie
Activity: 29
Merit: 3
Thanks for all ur help.

BTW, ECDSA (particularly secp256k1 that is for bitcoin) ...is there any good tutorial that I can read and study about?
I've googled, and I see several tutorials, but there is not much for a beginner though...

If you know of any resources for ECDSA that explain it more easily and with some implementation code, please let me know.

Thanks,
legendary
Activity: 3472
Merit: 10611
I also have seen someone starts 0x47.....Any difference?
BTW, Is there any technical detailed doc about this signature encoding/decoding so I can take look at?
As it was said, these numbers are the size. You can read more about bitcoin scripts on the bitcoin wiki: https://en.bitcoin.it/wiki/Script
The data is pushed to the stack using a length/size OP code which you can find in the Constants table; the second row, named N/A, covers any length from 1 byte to 75 bytes.
For example, to push 1 byte we use 0x01, to push 60 bytes we use 0x3c, and so on. To push 80 bytes we use 0x4c50, where 0x4c is OP_PUSHDATA1.

Think of these as some sort of commands that the computer reads to decide what to do, one byte at a time. It works by starting at the first byte of the script: it reads 1 byte and decides what to do next. For example, when the first byte is 0x48 it knows that it has to read 72 raw bytes, so it does that. The next single byte it reads is 0x21, so again it knows it has to read 33 raw bytes. Each item it reads is pushed onto a "stack", i.e. a first-in last-out array; however, it doesn't know what these bytes are. It has to continue evaluating the script, and the following bytes (the OP codes) can interpret these bytes any way they want.
For example the last byte that the interpreter reads is 0xac or OP_CHECKSIG which pops 2 items from that stack and evaluates them as public key and signature respectively.
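To make the byte-by-byte reading concrete, here is a minimal sketch in Python. The function name parse_pushes is made up for this illustration, and it only handles the plain length opcodes (0x01 to 0x4b) and OP_PUSHDATA1, which is all a P2PKH scriptSig needs. It is not how any real node implements its interpreter.

```python
def parse_pushes(script_hex):
    """Split a script that consists only of data pushes into the pushed items."""
    script = bytes.fromhex(script_hex)
    items = []
    i = 0
    while i < len(script):
        opcode = script[i]
        i += 1
        if 1 <= opcode <= 75:
            size = opcode          # the opcode itself is the length
        elif opcode == 0x4c:       # OP_PUSHDATA1: the next byte is the length
            size = script[i]
            i += 1
        else:
            raise ValueError(f"not a simple push opcode: 0x{opcode:02x}")
        if i + size > len(script):
            raise ValueError("push runs past the end of the script")
        items.append(script[i:i + size])
        i += size
    return items


# The scriptSig quoted in this thread, split up only for readability.
script_sig = (
    "48"
    "3045022100a3658e7cedeab2800add38516aa711de2f98259df55319a92a2fa73cbaf15d94"
    "022049ed36b8c83b386e2ae4bef82328848d06e8f0a783a25203b573942f7a5c8c48"
    "01"
    "21"
    "02296038d0cba420126d7dc75fe7edc8f5604a5ef6874b034ac65b761a73faf503"
)
signature, pubkey = parse_pushes(script_sig)
print(len(signature), len(pubkey))  # 72 33
```

The parser returns raw byte strings and attaches no meaning to them, which matches the point above: only the later OP codes decide how each item is interpreted.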

You can't assume the first item is a signature even if it is pretty much always true. Take this tx for example: https://live.blockcypher.com/btc-testnet/tx/0895e97e9c4ce7ebe04e15e0835bb0788053fbfdbbb2f3f25f81631687d7b857/
The first item is a public key, the second is a signature and the third is the redeem script that is
OP_DUP OP_HASH160 OP_EQUALVERIFY OP_SWAP OP_CHECKSIG
Because of the OP_SWAP the two items are "swapped" before entering CheckSig.
legendary
Activity: 1918
Merit: 1728
Ok..let me make sure I understood correctly.

Say this is an example..

483045022100a3658e7cedeab2800add38516aa711de2f98259df55319a92a2fa73cbaf15d94022049ed36b8c83b386e2ae4bef82328848d06e8f0a783a25203b573942f7a5c8c48012102296038d0cba420126d7dc75fe7edc8f5604a5ef6874b034ac65b761a73faf503
~~
~
My understanding is correct, isn't it?

Thanks,

Yes! That's correct.

~~
I also have seen someone starts 0x47.....Any difference?
BTW, Is there any technical detailed doc about this signature encoding/decoding so I can take look at?

Thanks,

The ECDSA algorithm takes two inputs: the digest (double-SHA256 hash) of the transaction serialization and the private key. It then produces 2 values as output - r and s. r and s are usually 32-byte values. Depending upon the values of r and s, we need to follow certain rules while encoding the raw transaction. BIP-66 has defined the rules for strict DER-encoding of signatures. Due to these rules, the length and bytes of the signature may differ from transaction to transaction. Some of the rules that affect the encoding are:

1. Value of S - The value of s as yielded by the ECDSA algorithm can be very large. But under the current standardness rules, its value should be no greater than: 0x7FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5D576E7357A4501DDFE92F46681B20A0 (half of n). If it is higher than that, it is required to be converted into low s (s'). In order to find low s, we have to subtract s from n (the order parameter of the secp256k1 curve). Hence,

Code:
s' = n - s 
s' = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141 - s
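The same conversion as a small Python sketch. The names N, HALF_N and to_low_s are just picked for this example; the two constants are the values quoted above.

```python
# Order n of the secp256k1 group, and half of it (the largest allowed low s).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HALF_N = N // 2


def to_low_s(s):
    """Return s unchanged if it is already low, otherwise n - s."""
    if not 1 <= s < N:
        raise ValueError("s must be in the range [1, n-1]")
    return N - s if s > HALF_N else s


print(hex(HALF_N))
print(to_low_s(N - 1))  # 1
```

Both (r, s) and (r, n - s) are valid signatures for the same message and key, which is why the swap is allowed at all.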


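2. Leading null byte when R would read as negative - This rule directly affects the length of the signature. DER integers are signed, so if the most significant bit of the 32-byte r is set (first byte 0x80 or higher), r would be interpreted as a negative number. To prevent that, we prepend a null byte, i.e. 0x00, to it. This makes the length of r equal to 33 bytes. The same rule applies to s, but a low s never has its top bit set.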

The format of DER-encoding (in hexadecimal) is: 30 + length of signature + 02 + length of R + R + 02 + length of S + S

If R as 32-byte signed int is negative, signature will look like this:
Code:
3045022100XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0220YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY

where 0x45 (69 in dec) is the length of signature
0x21 (33 in dec) is the length of R
XX...denotes 32 bytes of R
0x20 (32 in dec) is the length of S
YY...denotes 32 bytes of S

If R as 32-byte signed int is positive, signature will look like this:
Code:
30440220XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0220YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY

where 0x44 (68 in dec) is the length of signature
0x20 (32 in dec) is the length of R
XX...denotes 32 bytes of R
0x20 (32 in dec) is the length of S
YY...denotes 32 bytes of S
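The two layouts above can be reproduced with a short Python sketch. der_int and der_signature are names invented for this example, and it assumes r and s are positive integers below 2^256 so that single-byte lengths are enough. The r and s values are taken from the scriptSig quoted elsewhere in this thread.

```python
def der_int(value):
    """Encode a positive integer as a DER INTEGER (tag 0x02)."""
    raw = value.to_bytes((value.bit_length() + 7) // 8 or 1, "big")
    if raw[0] & 0x80:
        # The top bit is set, so prepend 0x00 to keep the number positive.
        raw = b"\x00" + raw
    return b"\x02" + bytes([len(raw)]) + raw


def der_signature(r, s):
    """Encode (r, s) as 30 <len> 02 <len r> <r> 02 <len s> <s>."""
    body = der_int(r) + der_int(s)
    return b"\x30" + bytes([len(body)]) + body


r = 0xa3658e7cedeab2800add38516aa711de2f98259df55319a92a2fa73cbaf15d94
s = 0x49ed36b8c83b386e2ae4bef82328848d06e8f0a783a25203b573942f7a5c8c48
sig = der_signature(r, s)
print(len(sig), sig.hex())
```

Here r starts with 0xa3, so it gets the extra 0x00 and the signature comes out at 71 bytes (0x30, 0x45, then 69 bytes), which is the first layout shown above.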

The signature is then followed by an additional byte known as the sighash flag, which denotes which part of the transaction is signed. The most common flag is 0x01 (SIGHASH_ALL), which means that all the inputs and outputs are signed. This is then followed by one more byte which denotes the length of the public key. If the public key is compressed, its length is 33 bytes, which is denoted by 0x21, followed by the public key itself.

The signature together with its sighash byte is preceded by a byte which denotes the length of that data push (the public key has its own length byte, as described above). It can be either 0x48 or 0x47 depending upon the value of R. Hence, the whole serialization will look like this:

Code:
If R as 32-byte signed integer is negative:
483045022100XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0220YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY0121ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

If R as 32-byte signed integer is positive:
4730440220XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0220YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY0121ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

where ZZ.. denotes 33 bytes of compressed public key (note: public key can be uncompressed as well)
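Putting the pieces together, a scriptSig builder could look roughly like the sketch below. push_data and build_p2pkh_script_sig are hypothetical helper names, and the inputs are assumed to be an already DER-encoded signature and a serialized public key, both short enough for a one-byte push.

```python
def push_data(data):
    """Prefix data with its one-byte length (valid for 1 to 75 bytes)."""
    if not 1 <= len(data) <= 75:
        raise ValueError("this sketch only handles pushes of 1 to 75 bytes")
    return bytes([len(data)]) + data


def build_p2pkh_script_sig(der_sig, pubkey, sighash_type=0x01):
    """<len> <DER signature + sighash byte> <len> <public key>"""
    return push_data(der_sig + bytes([sighash_type])) + push_data(pubkey)


der_sig = bytes.fromhex(
    "3045022100a3658e7cedeab2800add38516aa711de2f98259df55319a92a2fa73cbaf15d94"
    "022049ed36b8c83b386e2ae4bef82328848d06e8f0a783a25203b573942f7a5c8c48"
)
pubkey = bytes.fromhex(
    "02296038d0cba420126d7dc75fe7edc8f5604a5ef6874b034ac65b761a73faf503"
)
script_sig = build_p2pkh_script_sig(der_sig, pubkey)
print(script_sig.hex())
```

With the signature and public key from the example in this thread, the output is the same hex string the original poster was asking about, starting with 48 and with 0121 between the signature and the key.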
newbie
Activity: 29
Merit: 3
Ok..let me make sure I understood correctly.

Say this is an example..

483045022100a3658e7cedeab2800add38516aa711de2f98259df55319a92a2fa73cbaf15d94022049ed36b8c83b386e2ae4bef82328848d06e8f0a783a25203b573942f7a5c8c48012102296038d0cba420126d7dc75fe7edc8f5604a5ef6874b034ac65b761a73faf503

0x48 -> 72 bytes
So 3045022100a3658e7cedeab2800add38516aa711de2f98259df55319a92a2fa73cbaf15d94022049ed36b8c83b386e2ae4bef82328848d06e8f0a783a25203b573942f7a5c8c4801
-> This is DER encoded signature.
0x21 -> 33 bytes
So 02296038d0cba420126d7dc75fe7edc8f5604a5ef6874b034ac65b761a73faf503
-> This is public key. In this case, it starts 02..so this is compressed state of the public key.

My understanding is correct, isn't it?

Thanks,
legendary
Activity: 3038
Merit: 4418
Crypto Swap Exchange
I also have seen someone starts 0x47.....Any difference?
Yes. 0x47 and 0x21 define the size of the data to be pushed. 0x47 is the hex of 71 and 0x48 is the hex of 72, which indicates a signature (including its sighash byte) with a size of either 71 bytes or 72 bytes respectively.
newbie
Activity: 29
Merit: 3
I looked at this field of payload...there r asm and hex ... It looks like signature and public key r encoded to store here...
Once u have signature and public key, then what kinds of encoding is used to get that hex one?
Quote
Is this just simply ToHex(ToBytes(BobSigned + BobPublicKey))?? I don't think so though...
You should think in terms of raw bytes then encode the final result to hex if you want to look at it. There are 2 parts in a P2PKH signature script:
1. Signature (r and s) which is encoded using an encoding called DER
2. Public key which is encoded as 33 or 65 bytes depending on its compressed state.
These two are placed inside a script as "data" and each data has to be pushed to the stack so they are preceded with a length: 0x48 <signature> 0x21 <public key>

I also have seen someone starts 0x47.....Any difference?
BTW, Is there any technical detailed doc about this signature encoding/decoding so I can take look at?

Thanks,
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
Is this just simply ToHex(ToBytes(BobSigned + BobPublicKey))?? I don't think so though...

Alright, so that means you're asking which parts of the raw transaction need to be signed to get from a raw transaction to a signed transaction.

First we must make the sighash which we get by running SHA256d on a certain structure we assemble from the raw transaction called a "preimage"

0100000001449d45bbbfe7fc93bbe649bb7b6106b248a15da5dbd6fdc9bdfc7efede83235e010000001976a914f86f0bc0a2232970ccdf4569815db500f126836188acffffffff014062b007000000001976a914f86f0bc0a2232970ccdf4569815db500f126836188ac0000000001000000

It is almost the same as the raw transaction, except that we put the locking script (preceded by its length, 0x19) where the empty sigscript (length 00) is supposed to be, and at the end we append the sighash type as a 4-byte little-endian value that tells how the transaction should be signed, which in this case is 1 (SIGHASH_ALL).

So we SHA256(SHA256()) the above hex to get the sighash.
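In Python this hashing step only needs hashlib. The sketch below rebuilds the preimage hex from above in labelled pieces and hashes it twice; sha256d is just a helper name chosen for this example.

```python
import hashlib


def sha256d(data):
    """SHA256 applied twice, returning the raw 32-byte digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


locking_script = "76a914f86f0bc0a2232970ccdf4569815db500f126836188ac"
preimage = bytes.fromhex(
    "01000000"                      # version
    "01"                            # number of inputs
    "449d45bbbfe7fc93bbe649bb7b6106b248a15da5dbd6fdc9bdfc7efede83235e"
    "01000000"                      # output number being spent
    "19" + locking_script +         # locking script in place of the sigscript
    "ffffffff"                      # sequence
    "01"                            # number of outputs
    "4062b00700000000"              # amount
    "19" + locking_script +         # locking script of the new output
    "00000000"                      # locktime
    "01000000"                      # sighash type (SIGHASH_ALL)
)
sighash = sha256d(preimage)
print(sighash.hex())
```

The 32-byte result is what gets passed to the ECDSA signing function together with the private key.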

After this we run ECDSA to sign the sighash using the private key to get our DER-encoded signature.

Then we can insert our DER signature, the sighash type byte, the public key and the needed length bytes into the raw transaction.

So as you can see we don't sign the entire transaction just the sighash hashed from certain parts of the transaction.

The sigScript hex is as I described in my first post.
legendary
Activity: 3472
Merit: 10611
I looked at this field of payload...there r asm and hex ... It looks like signature and public key r encoded to store here...
Once u have signature and public key, then what kinds of encoding is used to get that hex one?
Quote
Is this just simply ToHex(ToBytes(BobSigned + BobPublicKey))?? I don't think so though...
You should think in terms of raw bytes then encode the final result to hex if you want to look at it. There are 2 parts in a P2PKH signature script:
1. Signature (r and s) which is encoded using an encoding called DER
2. Public key which is encoded as 33 or 65 bytes depending on its compressed state.
These two are placed inside a script as "data" and each data has to be pushed to the stack so they are preceded with a length: 0x48 <signature> 0x21 <public key>
newbie
Activity: 29
Merit: 3
Ok...my question was...
Here is my understanding, BTW,

According to https://developer.bitcoin.org/devguide/transactions.html,

"As illustrated in the figure above, the data Bob signs includes the txid and output index of the previous transaction, the previous output’s pubkey script, the pubkey script Bob creates which will let the next recipient spend this transaction’s output, and the amount of satoshis to spend to the next recipient. In essence, the entire transaction is signed except for any signature scripts, which hold the full public keys and secp256k1 signatures.

After putting his signature and public key in the signature script..."

So basically, in this case,

Bob.Sign(txID + output index of the previous transaction + previous pubkey script + next pubkey script + amount to spend to next one .., I believe like it says above ..the entire transaction ).....so...one that I am using is to return base64 encoded signed message.(say this stored in BobSigned variable)..then somehow I guess....I would like to create a function named scriptSig so that it should return scriptSig hex value..so..like... scriptSig(BobSigned,BobPublicKey) should return something like....."483045022100a3658e7cedeab2800add38516aa711de2f98259df55319a92a2fa73cbaf15d94022049ed36b8c83b386e2ae4bef82328848d06e8f0a783a25203b573942f7a5c8c48012102296038d0cba420126d7dc75fe7edc8f5604a5ef6874b034ac65b761a73faf503" .....

Is this just simply ToHex(ToBytes(BobSigned + BobPublicKey))?? I don't think so though...

Thanks,

legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
Raw transactions are encoded into hex as a serialization format to compactly represent them. The hex format of a P2PKH transaction has byte fields for:

- the version
- the number of inputs
- For each input:
-- the transaction ID, & output number
-- the length of scriptSig
-- the scriptSig itself (empty until the transaction is signed)
-- a bunch of bytes called "sequence"
- the number of outputs
- For each output:
-- the amount to send
-- the length of the locking script
-- the locking script
- the locktime.

An example will greatly aid the presentation of what a raw P2PKH transaction is composed of, like the one located at https://medium.com/@bitaps.com/exploring-bitcoin-signing-the-p2pkh-input-b8b4d5c4809c :

They have a raw transaction that looks like this:



01000000

01

449d45bbbfe7fc93bbe649bb7b6106b248a15da5dbd6fdc9bdfc7efede83235e0100000000ffffffff

01

4062b007000000001976a914f86f0bc0a2232970ccdf4569815db500f126836188ac

00000000

This is the raw transaction for a 1 P2PKH input, 1 P2PKH output transaction.

We have a version of 1, the number of inputs which is also 1, the transaction ID of the input we want to spend (it looks like it's written backwards but more on that later) as well as the output number within that transaction which, along with the transaction ID, fully identifies the input. Then we have the sigscript length which is 0, because we didn't sign the transaction yet. Our sequence number is 0xffffffff which disables Locktime [more on that below]. Then there's the number of outputs which is 1, followed by the amount to spend, and this particular value is hex for 1.29 BTC (129,000,000 satoshis, stored little-endian). Then we have the length of our locking script (0x19, 25 bytes) and the locking script itself (OP_DUP OP_HASH160... but in hex), and finally we have the Locktime value which is completely zero here because it's turned off.
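To check the field-by-field reading, here is a rough Python parser for this one transaction. parse_simple_tx is a made-up name, and the sketch assumes a legacy (non-segwit) transaction in which every count and script length fits in a single byte, which is true for this example but not in general.

```python
import struct


def parse_simple_tx(raw):
    """Parse a legacy transaction whose counts and script lengths fit in one byte."""
    pos = 0

    def take(n):
        nonlocal pos
        chunk = raw[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("transaction is truncated")
        pos += n
        return chunk

    tx = {"version": struct.unpack("<I", take(4))[0], "inputs": [], "outputs": []}
    for _ in range(take(1)[0]):
        txid = take(32)[::-1].hex()               # shown in the usual reversed order
        index = struct.unpack("<I", take(4))[0]
        script_sig = take(take(1)[0])             # empty before signing
        sequence = struct.unpack("<I", take(4))[0]
        tx["inputs"].append(
            {"txid": txid, "index": index, "script_sig": script_sig, "sequence": sequence}
        )
    for _ in range(take(1)[0]):
        value = struct.unpack("<Q", take(8))[0]   # amount in satoshis
        script = take(take(1)[0])
        tx["outputs"].append({"value": value, "script": script})
    tx["locktime"] = struct.unpack("<I", take(4))[0]
    return tx


raw_tx = bytes.fromhex(
    "01000000" "01"
    "449d45bbbfe7fc93bbe649bb7b6106b248a15da5dbd6fdc9bdfc7efede83235e"
    "01000000" "00" "ffffffff"
    "01"
    "4062b00700000000" "19" "76a914f86f0bc0a2232970ccdf4569815db500f126836188ac"
    "00000000"
)
tx = parse_simple_tx(raw_tx)
print(tx["version"], tx["inputs"][0]["index"], tx["outputs"][0]["value"])  # 1 1 129000000
```

A real parser would read the counts and lengths as variable-length integers instead of single bytes.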

The reason why the above is relevant is because after we derive the sighash from it and sign the transaction, we fill in the sigScript part of the raw transaction that we left empty.

So what used to be 00 is now:

6b

483045022100e15a8ead9013d1de55e71f195c9dc613483f07c8a0692a2144ffa90506436822022062bc9466b9e1941037fc23e1cfadf24c8833f96942beb8f4340df60d506f784b

01

2103969a4ac9b1521cfae44a929a614193b0467a20e0a15973cae9ba1efb9627d830

Let's break this down:

The sigscript length is now 107 bytes (0x6b). The next byte, 0x48, is a push length of 72, and it covers our 71-byte DER-encoded signature with the r and s values and whatnot, plus our one-byte sighash type, which here is 01 or SIGHASH_ALL. This is followed by the length of the public key (0x21, 33 bytes) and the public key itself.
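Going the other way, the r and s values can be pulled back out of the pushed signature with a few lines of Python. parse_der_signature is a name invented for this sketch; it assumes the short single-byte length form and does not enforce all of the strict DER rules from BIP-66.

```python
def parse_der_signature(der):
    """Return (r, s) from a 30 <len> 02 <len r> <r> 02 <len s> <s> signature."""
    if len(der) < 8 or der[0] != 0x30 or der[1] != len(der) - 2:
        raise ValueError("bad DER sequence header")
    pos = 2
    values = []
    for _ in range(2):
        if der[pos] != 0x02:
            raise ValueError("expected a DER INTEGER tag (0x02)")
        length = der[pos + 1]
        values.append(int.from_bytes(der[pos + 2:pos + 2 + length], "big"))
        pos += 2 + length
    if pos != len(der):
        raise ValueError("trailing bytes after s")
    return values[0], values[1]


# The 72 bytes pushed by 0x48 in the example above: DER signature + sighash byte.
pushed = bytes.fromhex(
    "3045022100e15a8ead9013d1de55e71f195c9dc613483f07c8a0692a2144ffa90506436822"
    "022062bc9466b9e1941037fc23e1cfadf24c8833f96942beb8f4340df60d506f784b"
    "01"
)
der, sighash_type = pushed[:-1], pushed[-1]
r, s = parse_der_signature(der)
print(len(der), sighash_type)  # 71 1
print(hex(r))
print(hex(s))
```

Note that the sighash byte is stripped before decoding, since it is not part of the DER structure.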

The reason why many of the values in the raw transaction (not the sigscript though) look "reversed" is because they are stored in little-endian format, so if the true value is 0x12345678, we will actually store it in the hex as 78563412. That is what you can see in the transaction ID, version, amount and output numbers.
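A quick way to see the byte order for yourself in Python, using only built-in integer methods and the amount field from the example transaction:

```python
# A 4-byte value written out little-endian, least significant byte first.
value = 0x12345678
stored = value.to_bytes(4, "little")
print(stored.hex())  # 78563412

# The 8-byte amount field from the example transaction, read back as an integer.
amount_field = bytes.fromhex("4062b00700000000")
satoshis = int.from_bytes(amount_field, "little")
print(satoshis)  # 129000000
```

129000000 satoshis is the 1.29 BTC mentioned above.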

newbie
Activity: 29
Merit: 3
Hello,

When a P2PKH coin is spent, signature and public key are stored in scriptSig...if I am not mistaken. I looked at this field of payload...there r asm and hex ... It looks like signature and public key r encoded to store here...

Once u have signature and public key, then what kinds of encoding is used to get that hex one?

Thanks