Question on the scriptSig and scriptPubKey

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: gmaxwell on May 06, 2014, 03:40:45 PM

Quote from: DeathAndTaxes on May 06, 2014, 09:30:33 AM

restrict Bitcoin keys to a subset of secp256k1 keys (i.e. Bitcoin keys must be odd or they are invalid),

And seriously disrupt all the kinds of clever derivation schemes that now exist, e.g. blinding for reality keys, etc. I'm glad Bitcoin was not hyperoptimized in that particular way.

Do you have a link/example? I'm always willing to learn, but not knowing the effect of the constraints is why I suggested two alternatives.

gmaxwell

staff

Activity: 4326

Merit: 8951

Quote from: DeathAndTaxes on May 06, 2014, 09:30:33 AM

restrict Bitcoin keys to a subset of secp256k1 keys (i.e. Bitcoin keys must be odd or they are invalid),

And seriously disrupt all the kinds of clever derivation schemes that now exist, e.g. blinding for reality keys, etc. I'm glad Bitcoin was not hyperoptimized in that particular way.

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: fbueller on May 06, 2014, 07:03:31 AM

The scripting system & wallet would be far more complicated if it had to anticipate if the public key was compressed or not.

Well the post was about how Bitcoin could have been done, not necessarily how it could be changed. There was no reason for it to support uncompressed keys to begin with. Without them everything would be simpler and smaller (including key recovery). As TN pointed out even for compressed keys it still can be either even or odd and there are three ways to handle it a) include a flag to indicate if the key is even/odd, b) restrict Bitcoin keys to a subset of secp256k1 keys (i.e. Bitcoin keys must be odd or they are invalid), c) try both potential keys.
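The even/odd ambiguity can be made concrete with a minimal Python sketch. It assumes the standard secp256k1 curve equation y^2 = x^3 + 7 and the usual convention that a compressed key carries a prefix byte of 0x02 (even y) or 0x03 (odd y), which corresponds to option a) above. The helper name `decompress` is hypothetical.

```python
# secp256k1 field prime; the curve is y^2 = x^3 + 7 (mod P)
P = 2**256 - 2**32 - 977

# x and y coordinates of the secp256k1 generator, used here only as a known point
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

def decompress(prefix, x):
    """Rebuild (x, y) from a compressed key: prefix 0x02 means even y, 0x03 odd y."""
    y_squared = (pow(x, 3, P) + 7) % P
    # P % 4 == 3, so a square root (if one exists) is y_squared^((P+1)/4) mod P
    y = pow(y_squared, (P + 1) // 4, P)
    if pow(y, 2, P) != y_squared:
        raise ValueError("x is not the coordinate of a curve point")
    if (y & 1) != (prefix & 1):
        y = P - y  # the other root always has the opposite parity
    return x, y

even_key = decompress(0x02, GX)
odd_key = decompress(0x03, GX)
```

Both results are valid curve points with the same x, which is why a one-bit flag, a restriction to one parity, or trying both candidates is needed.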

Quote

This ensures that the state of the scriptSig and scriptPubKey that lead to a successful redeeming transaction (which can be anticipated by other people when doing verification) is captured by signing that data. Input scripts have no concept of other input's scripts.

That is correct, this IS how Bitcoin works, but it also makes the signing more complex and is the root cause of a large portion of the txid malleability. It isn't required for each key to sign a unique subset of the same tx in order to ensure that "scriptSig and scriptPubKey that lead to a successful redeeming transaction (which can be anticipated by other people when doing verification) is captured by signing that data".

You are right in that Bitcoin uses placeholders to construct a modified version of the transaction for each input to sign. The modified form is hashed and each input signs a different hash because they are each signing a subset of the full transaction. Then the placeholders are replaced with the actual signatures (and redundant pubkeys). Finally yet another hash is taken to serve as the txid.

My point is all of that is just complexity for complexity's sake. There is no gain compared to how just about any other digital signature system works. Take GPG for example: you have a message which needs to be signed by one or more entities. The message in its entirety is hashed only once. If you wanted to index these messages, this single canonical hash would make a perfect "message id". This single hash is then signed by one or more keys, and the signatures are simply appended to the signed message.

Message
Sig_0 of H(Message) using key_0
Sig_1 of H(Message) using key_1
....
Sig_n of H(Message) using key_n

A transaction is simply a message in an agreed upon format used to communicate the transfer of value.

Construct the transaction message in a canonical form. One example would be
Header
ListOfInputs
ListOfOutputs

Treat it just like a message and take a single hash of it. This single hash is the "txid" and the only digest signed by all signatures.

Sig_0 of TxId using key(s) required for input_0
Sig_1 of TxId using key(s) required for input_1
....
Sig_n of TxId using key(s) required for input_n

Now append the signatures to the rest of the transaction.

Header
List of Inputs
List of Outputs
List of Signatures

To verify remove the signatures from the end of the message (tx).

Header
List of Inputs
List of Outputs

Verify the tx has valid form, meets magic numbers of protocol, outputs are unspent, etc.
Verify the TxId is correct by hashing the message (tx).
Take the stack of signatures and verify each one using the sig, the pubkey (explicit or recovered) and the TxId.
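A minimal Python sketch of the scheme described above, under loud assumptions: the length-prefixed serialization is a toy format rather than any real wire format, and `toy_sign` / `toy_verify` use HMAC purely as a stand-in for a real public-key signature such as ECDSA. All function names are hypothetical.

```python
import hashlib
import hmac

def _field(data):
    """Length-prefix one field so the serialization is unambiguous."""
    return len(data).to_bytes(4, "big") + data

def canonical(header, inputs, outputs):
    """Serialize header, inputs and outputs in one fixed order (toy format)."""
    out = _field(header)
    for group in (inputs, outputs):
        out += len(group).to_bytes(4, "big")
        out += b"".join(_field(item) for item in group)
    return out

def tx_id(header, inputs, outputs):
    """One hash of the signature-free transaction: both the id and the signed digest."""
    return hashlib.sha256(canonical(header, inputs, outputs)).digest()

def toy_sign(key, digest):
    # Stand-in only: HMAC is symmetric, a real system would use ECDSA here
    return hmac.new(key, digest, hashlib.sha256).digest()

def toy_verify(key, digest, sig):
    return hmac.compare_digest(toy_sign(key, digest), sig)

def sign_tx(header, inputs, outputs, keys):
    """Every key signs the same digest; signatures are appended, not embedded."""
    digest = tx_id(header, inputs, outputs)
    return digest, [toy_sign(key, digest) for key in keys]

def verify_tx(header, inputs, outputs, sigs, keys):
    """Strip the signatures, rehash the rest, check each signature against that hash."""
    digest = tx_id(header, inputs, outputs)
    return len(sigs) == len(keys) and all(
        toy_verify(key, digest, sig) for key, sig in zip(keys, sigs))

header = b"v1"
inputs = [b"prev_a:0", b"prev_b:1"]
outputs = [b"addr_x:50"]
keys = [b"key for input 0", b"key for input 1"]
txid, sigs = sign_tx(header, inputs, outputs, keys)
```

Because the signatures never enter the hashed body, re-encoding a signature cannot change `txid` in this sketch.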

So I am not suggesting we change Bitcoin but if there is something I am missing let me know. Satoshi was a genius at the high level (forcing consensus using proof of work) but he made some questionable choices at the nuts and bolts level. This makes transactions more complex, harder to ensure they are immutable, and larger than necessary. For that added complexity, size, and risk I don't see anything in return. If there is a specific detail/example which requires the convoluted system Satoshi came up with regarding signatures please let me know.

Quote

It makes sense that the script should contain all and everything which is needed to verify the transaction was valid.

The PubKey can be deterministically recovered. We have unit tests for valid wallet operations which are far more complex than key recovery. I would point out the signed message system in Bitcoin does use key recovery. This is how you can verify a message with just the message and signature (and optionally the address but not PubKey for added transparency). What is a transaction other than a message about the transfer of value?

TN pointed out the true "issue", which is that it is a computing vs. space tradeoff. Transactions would be smaller (less storage, lower bandwidth requirements, less propagation delay) but they would take longer to verify. Computing power has grown faster than end user bandwidth, and the range of computing power is much smaller than the range of available bandwidths (think ARM processor vs highest XEON & dialup/sat vs gigabit fiber). Still it is a valid debate to have.

Anyways it is water under the bridge now. I harbor no illusions that Bitcoin will ever be changed this radically, as the consensus system makes it very conservative about optional breaking changes. Still it would be an interesting area for an altcoin (of course that assumes they are actually about innovation, not just stale copies of the original).

TierNolan

legendary

Activity: 1232

Merit: 1094

Quote from: fbueller on May 06, 2014, 07:03:31 AM

I'm rather certain that Bitcoin-qt's wallet searches for relevant transactions by checking each tx's scriptPubKey for a 20 byte hash that matches a hashed public key or script in the wallet

Pretty sure too. I think it also scans for P2SH hashes too. (assuming OP_CHECKSIG).

Quote

I wonder if asking people to reconstruct the public key from the signature lead to problematic scenarios of clients not getting it right (compressed keys?), as they could reconstruct it, see it does not hash to the one in the scriptPubKey, and fail validation? It makes sense that the script should contain all and everything which is needed to verify the transaction was valid.

The script is

OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

and spent with

<sig> <pubKey>

There was a thread on it here.

The script would be

OP_GENPUBKEY OP_HASH160 <pubKeyHash> OP_EQUALVERIFY

and spent with

<sig>

(This assumes OP_GENPUBKEY leaves the signature on the stack)

The sig type encodes information required to determine which public key was used. There are 4 possible keys (presumably, odd/even and compressed/uncompressed).

When the wallet is scanning the block chain, it can still scan for the address hex value, so it is still fast.

Recovering the key requires an EC point multiply. This would increase (maybe double) signature verification time.
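To illustrate the point multiplications involved, here is a minimal pure-Python sketch of ECDSA public key recovery on secp256k1 (Python 3.8+ for the modular inverse via `pow`). It is not constant time and is not meant for real use. The fixed nonce, the example private key, and the function names are assumptions made for the demonstration, and only the parity bit of R is handled, not the compressed/uncompressed distinction.

```python
import hashlib

# secp256k1 domain parameters
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P

def mul(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def sign(z, d, k):
    """ECDSA over digest z with private key d and nonce k; also returns the parity of R.y."""
    rx, ry = mul(k, G)
    r = rx % N  # assumes rx < N, which holds except with negligible probability
    s = pow(k, -1, N) * (z + r * d) % N
    return r, s, ry & 1

def recover(z, r, s, y_is_odd):
    """Recover Q = r^-1 * (s*R - z*G), where R has x = r and the stated parity."""
    y = pow((pow(r, 3, P) + 7) % P, (P + 1) // 4, P)
    if (y & 1) != y_is_odd:
        y = P - y
    R = (r, y)
    return mul(pow(r, -1, N), add(mul(s, R), mul((-z) % N, G)))

d = 0x1E99423A4ED27608A15A2616A2B0E9E52CED330AC530EDCC32C8FFC6A526AEDD  # example key
k = 0x123456789ABCDEF  # fixed nonce for the demo only; never reuse or hardcode in practice
z = int.from_bytes(hashlib.sha256(b"transaction bytes").digest(), "big")

public_key = mul(d, G)
r, s, parity = sign(z, d, k)
recovered = recover(z, r, s, parity)
other_candidate = recover(z, r, s, parity ^ 1)
```

Recovery here costs three scalar multiplications on top of the square root, and the wrong parity yields a different, unrelated key, which is why the signature must carry a hint about which candidate to use.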

fbueller

sr. member

Activity: 412

Merit: 287

Quote from: DeathAndTaxes on May 04, 2014, 10:42:19 AM

Quote from: telepatheic on May 04, 2014, 08:05:26 AM

See this wiki page for more details, the data which is signed is effectively:

Code:

SHA256(SHA256(modified_transaction))

The modified transaction is very complicated to construct and removes the signature and public key and inputs that are not being signed.

It would have been far simpler to do something like.

Step 1) Construct entire transaction (minus signature) in a canonical form
Step 2) Hash the entire transaction. This becomes the tx_id as well as the digest for the signature
Step 3) Sign the hash in step 2 with the private key(s) and append to signature body.

You would end up with something like:
tx header
in[n] (list of inputs)
out[n] (list of outputs)
sign[n] (list of signatures)

The message digest (the tailored transaction) which is signed when adding a signature to an input is unique for that input. The message is the current transaction, with all inputs scriptSigs set to empty but the one you're trying to sign, which must contain the scriptSig, and the scriptPubKey (or redeemScript in P2SH). Since each hash is different this could not be used to identify the transaction.

This ensures that the state of the scriptSig and scriptPubKey that lead to a successful redeeming transaction (which can be anticipated by other people when doing verification) is captured by signing that data. Input scripts have no concept of other input's scripts.

I'm rather certain that Bitcoin-qt's wallet searches for relevant transactions by checking each tx's scriptPubKey for a 20 byte hash that matches a hashed public key or script in the wallet; I wonder if asking people to reconstruct the public key from the signature lead to problematic scenarios of clients not getting it right (compressed keys?), as they could reconstruct it, see it does not hash to the one in the scriptPubKey, and fail validation? It makes sense that the script should contain all and everything which is needed to verify the transaction was valid.

The scripting system & wallet would be far more complicated if it had to anticipate if the public key was compressed or not.

TierNolan

legendary

Activity: 1232

Merit: 1094

Quote from: gmaxwell on May 04, 2014, 08:10:48 PM

Quote from: TierNolan on May 04, 2014, 07:02:15 PM

The basic signing process should just use hash(transaction without inputs | hash_type) as the signing hash. The signing hash should be used to refer to the previous transaction.

Oh no, that wouldn't be good in general, at least unless you could opt out of it.

You would still be able to select a hash_type.

By "without inputs", I meant without the input scripts. Copying the scriptPubKey to the input script is related to OP_CODESEPARATOR and isn't actually of any benefit for standard transactions, right?

Quote

Consider, You pay Alice. The transaction isn't confirming because your fees were not competitive. So you double spend its inputs in a new transaction with better fees in order to achieve atomic exclusion. Oops: Moments after your replacement transaction a prior payer, Peggy, poses a parallel payment and in your present position this is no perk since her payments were paired: price and pubkey parroted. Preclusion prevented by a profusion of parallel property, both payments are processed and Alice, pleased with her profit, parts leaving you peevish.

Heh, though there is some loss of clarity due to the alliteration.

gmaxwell

staff

Activity: 4326

Merit: 8951

Quote from: TierNolan on May 04, 2014, 07:02:15 PM

The basic signing process should just use hash(transaction without inputs | hash_type) as the signing hash. The signing hash should be used to refer to the previous transaction.

Oh no, that wouldn't be good in general, at least unless you could opt out of it.

Consider, You pay Alice. The transaction isn't confirming because your fees were not competitive. So you double spend its inputs in a new transaction with better fees in order to achieve atomic exclusion. Oops: Moments after your replacement transaction a prior payer, Peggy, poses a parallel payment and in your present position this is no perk since her payments were paired: price and pubkey parroted. Preclusion prevented by a profusion of parallel property, both payments are processed and Alice, pleased with her profit, parts leaving you peevish.

There certainly are cases where it would be good to be able to mask the inputs (generally where you're doing something interesting and you'd be absolutely sure to never reuse a public key as part of your protocol), but in the common case the additional precision of control is very important, not just to guard against stupidity but to avoid suffering losses due to the inconsistency which is inherent in a distributed system.
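The replay risk in the Alice and Peggy scenario can be sketched in Python as follows. This is a toy model under stated assumptions: `signing_hash` is a hypothetical helper, coins are (txid, index) pairs, and outputs are (value, pubkey) pairs with made-up values.

```python
import hashlib

def signing_hash(coins, outputs, commit_to_inputs):
    """Toy signing digest; optionally commits to which coins are being spent."""
    body = b""
    if commit_to_inputs:
        body += b"".join(txid + index.to_bytes(4, "little") for txid, index in coins)
    body += b"".join(value.to_bytes(8, "little") + pubkey for value, pubkey in outputs)
    return hashlib.sha256(body).digest()

your_coin = (bytes.fromhex("aa" * 32), 0)    # the coin you originally tried to spend
peggys_coin = (bytes.fromhex("bb" * 32), 1)  # Peggy's later payment: same pubkey, same amount
to_alice = [(50_000, b"alice-pubkey")]

# With inputs masked, a signature made for one coin also fits the other
masked_same = (signing_hash([your_coin], to_alice, False)
               == signing_hash([peggys_coin], to_alice, False))

# When the digest commits to the coin being spent, the signature cannot be moved
committed_same = (signing_hash([your_coin], to_alice, True)
                  == signing_hash([peggys_coin], to_alice, True))
```

In this model `masked_same` is true, so the original signature could be attached to a spend of Peggy's payment even after you double spent the first coin.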

And fengshu, it's generally preferred that people not bump old threads.

TierNolan

legendary

Activity: 1232

Merit: 1094

Quote from: DeathAndTaxes on May 04, 2014, 06:32:08 PM

Or if, like any other digital signature system, the entire message (in this case the complete tx minus the signature(s)) was hashed and signed, then there would be no difference between the tx id (hash) and the digest of the signature (the exact same hash).

There are some benefits in being able to blank parts of the transaction out.

Malleability itself could have been fixed, if the tx-id of the input wasn't included in the signing process.

Having said that, I broadly agree.

The basic signing process should just use hash(transaction without inputs | hash_type) as the signing hash. The signing hash should be used to refer to the previous transaction.

The extra complexities could be added with different hash_types, if absolutely necessary.

Everything, including the signatures, should be included in the hash for the block merkle root though. That makes it much easier for archiving purposes and initial download.

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: TierNolan on May 04, 2014, 05:50:53 PM

It already works that way, except the sign[n] values are added to the inputs after signing.

Which is pointless, but in reality that isn't correct. Part of the inputs ARE included and part of the inputs are placeholders, and then this convoluted mess is arranged into a modified transaction and then signed. Then after the fact the signature is dumped back into where the placeholders are.

Quote

If the tx-id wasn't affected by the transaction inputs, then malleability would not be an issue.

Or if, like any other digital signature system, the entire message (in this case the complete tx minus the signature(s)) was hashed and signed, then there would be no difference between the tx id (hash) and the digest of the signature (the exact same hash).

TierNolan

legendary

Activity: 1232

Merit: 1094

Quote from: telepatheic on May 04, 2014, 05:59:46 PM

It isn't that simple, the current input, with the signature and anything before the last OP_CODESEPARATOR removed, are signed (In fact even that is a gross simplification). See https://en.bitcoin.it/wiki/OP_CHECKSIG

For transaction malleability, that doesn't matter. If the script for the output being sent doesn't have an OP_CODESEPARATOR in it, then you can ignore that effect.

Assuming the spender is spending standard transaction coins, there won't be any OP_CODESEPARATORS to deal with.

With normal transactions, to get the signature hash

- you blank out all the inputs
- copy the scriptPubKey of the output you are spending into the input you are signing
- add the hash_type to the end of the transaction (expanded to 4 bytes)

Everything that is signed is locked-down.

The problem is the "blank all the inputs" step: it means that the input scripts don't affect the signing hash, but they do affect the txid hash.

If the tx-id wasn't affected by the transaction inputs, then malleability would not be an issue.

You send funds to a particular tx-id, and then the transaction is changed, and so the funds are credited to a different transaction output. This breaks refund transactions which assumed a particular tx-id.
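The three steps above, and the resulting gap between the signing hash and the tx-id, can be sketched in Python. This is a simplified model, assuming only the SIGHASH_ALL case (hash_type 1), no OP_CODESEPARATOR handling, and counts and script lengths below 253 so that a one-byte length is enough. The dictionary layout and function names are hypothetical.

```python
import hashlib
import struct

def hash256(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def varint(n):
    assert n < 0xFD  # single-byte form only, enough for this sketch
    return bytes([n])

def serialize(tx):
    """Legacy-style layout: version, inputs, outputs, locktime."""
    out = struct.pack("<I", tx["version"]) + varint(len(tx["inputs"]))
    for txin in tx["inputs"]:
        out += txin["prev_txid"] + struct.pack("<I", txin["index"])
        out += varint(len(txin["script"])) + txin["script"]
        out += struct.pack("<I", txin["sequence"])
    out += varint(len(tx["outputs"]))
    for txout in tx["outputs"]:
        out += struct.pack("<q", txout["value"])
        out += varint(len(txout["script"])) + txout["script"]
    return out + struct.pack("<I", tx["locktime"])

def signature_hash(tx, input_index, script_pubkey, hash_type=1):
    inputs = [dict(txin, script=b"") for txin in tx["inputs"]]  # blank all input scripts
    inputs[input_index]["script"] = script_pubkey               # copy in the spent scriptPubKey
    stripped = dict(tx, inputs=inputs)
    return hash256(serialize(stripped) + struct.pack("<I", hash_type))  # 4-byte hash_type

def txid(tx):
    return hash256(serialize(tx))

script_pubkey = bytes.fromhex("76a914" + "11" * 20 + "88ac")  # pay-to-pubkey-hash template
base = {
    "version": 1,
    "inputs": [{"prev_txid": bytes.fromhex("aa" * 32), "index": 0,
                "script": b"", "sequence": 0xFFFFFFFF}],
    "outputs": [{"value": 50_000, "script": script_pubkey}],
    "locktime": 0,
}
# Two stand-in scriptSigs, e.g. two encodings of the same signature
tx_a = dict(base, inputs=[dict(base["inputs"][0], script=b"\x01\x02\x03")])
tx_b = dict(base, inputs=[dict(base["inputs"][0], script=b"\x00\x01\x02\x03")])

same_signing_hash = signature_hash(tx_a, 0, script_pubkey) == signature_hash(tx_b, 0, script_pubkey)
same_txid = txid(tx_a) == txid(tx_b)
```

The two variants share a signing hash but not a tx-id, which is the malleability problem in miniature.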

telepatheic

jr. member

Activity: 56

Merit: 1

Quote from: TierNolan on May 04, 2014, 05:50:53 PM

The inputs aren't signed (since they are set to zero length arrays), so you can add the signatures to the transaction without breaking the signature.

It isn't that simple, the current input, with the signature and anything before the last OP_CODESEPARATOR removed, are signed (In fact even that is a gross simplification). See https://en.bitcoin.it/wiki/OP_CHECKSIG

TierNolan

legendary

Activity: 1232

Merit: 1094

Quote from: DeathAndTaxes on May 04, 2014, 10:42:19 AM

Sadly it is probably one of the worst decisions Satoshi made. The complexity adds nothing and is one of the root causes for transaction malleability. Not really sure what Satoshi was trying to achieve. Normally the signature is OUTSIDE of the payload and trying to stick it back inside the input adds no value.

The signature is outside the payload. You wouldn't be able to sign it otherwise.

To calculate hash for signing, you set all the input scripts to length zero (and copy some other stuff around) and then get the hash of the result.

This "locks" the transaction in place and prevents any changes without breaking the signature.

The inputs aren't signed (since they are set to zero length arrays), so you can add the signatures to the transaction without breaking the signature.

The tx-id hash depends on the signed part of the transaction and the added signatures.

Malleability is caused by the fact that you can encode the signature in many ways.

In pseudocode:

Code:

signing hash = Hash(transaction with inputs deleted)

signature = sign(signing hash, private key)

final transaction = transaction with signature added to the inputs

tx-id = hash(final transaction)

The signature is basically two numbers. It would be like encoding 123, 456 as 0123, 0456. They both represent the same pair of numbers, so are both valid signatures.
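The 123, 456 analogy can be run directly as a small Python sketch. The fixed-width encoding below is a toy format chosen for the illustration, not the DER encoding Bitcoin actually uses, and the helper names are hypothetical.

```python
import hashlib

def encode(r, s, width):
    """Encode two numbers as fixed-width big-endian fields (toy format, not DER)."""
    return r.to_bytes(width, "big") + s.to_bytes(width, "big")

def decode(blob):
    half = len(blob) // 2
    return int.from_bytes(blob[:half], "big"), int.from_bytes(blob[half:], "big")

signature = (123, 456)
compact = encode(*signature, width=2)
padded = encode(*signature, width=3)  # same numbers, one extra leading zero byte each

body = b"signed part of the transaction"
txid_compact = hashlib.sha256(body + compact).hexdigest()
txid_padded = hashlib.sha256(body + padded).hexdigest()
```

Both blobs decode to the same pair of numbers, yet hashing them together with the body gives two different ids.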

The ideal solution would be to use the signing hash[ * ] to refer to previous inputs and the (current) tx-id just for computing the merkle tree in the blocks.

[ * ] The signing hash is the hash of the transaction with all inputs set to zero

Quote from: DeathAndTaxes on May 04, 2014, 10:42:19 AM

You would end up with something like:
tx header
in[n] (list of inputs)
out[n] (list of outputs)
sign[n] (list of signatures)

To verify:
Step 1) Remove the signatures from the tx body and save.
Step 2) Hash the remaining tx. This is the tx id and the digest for the signature verification.
Step 3) Verify each of the signatures using the pubkey(s) and the transaction hash.

It already works that way, except the sign[n] values are added to the inputs after signing.

Nicolas Dorier

hero member

Activity: 714

Merit: 662

Quote from: telepatheic on May 04, 2014, 03:02:26 PM

Your code looks really good. What unit tests do you use (how do you make sure you don't break compatibility with bitcoin core) ?

I ported the unit tests of bitcoin core, along with their own data driven tests. (Actually, I had to implement some bugs of the core implementation to not break compatibility, like the openssl bug I documented in the ScriptEvaluationContext.CheckSig method)
Clone my project, I gave the category "Core" to the tests coming from the core implementation.

In fact I ported their tests before implementing them.

My Node Server implementation is not 100% done yet, but you can start to talk with the network.
The crypto part is entirely ported though.

telepatheic

jr. member

Activity: 56

Merit: 1

Your code looks really good. What unit tests do you use (how do you make sure you don't break compatibility with bitcoin core) ?

Nicolas Dorier

hero member

Activity: 714

Merit: 662

I posted an article yesterday where I explain that.
Please take a look and, if you like it, vote: http://www.codeproject.com/Articles/768412/NBitcoin-The-most-complete-Bitcoin-port-Part-Crypt

Satoshi's code is somewhat hard to read for someone not used to C++ development.
My port is easier to understand: https://github.com/NicolasDorier/NBitcoin (Script class)
A signature is represented by the type TransactionSignature.

The process to determine the hash that you need to sign with your key is specified in the Script.SignatureHash method.
https://github.com/NicolasDorier/NBitcoin/blob/master/NBitcoin/Script.cs#L358

telepatheic

jr. member

Activity: 56

Merit: 1

Reading Satoshi's original code is extremely insightful. The script.cpp file has basically not changed since Satoshi wrote it. Everyone has been too scared to suggest changing it to something logical. I've yet to see any real use of the scripting functionality beyond multi-signatures. There must have been some big idea behind it but nobody knows what, even Gavin doesn't have a clue why Satoshi wrote it like he did.

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: telepatheic on May 04, 2014, 08:05:26 AM

See this wiki page for more details, the data which is signed is effectively:

Code:

SHA256(SHA256(modified_transaction))

The modified transaction is very complicated to construct and removes the signature and public key and inputs that are not being signed.

Sadly it is probably one of the worst decisions Satoshi made. The complexity adds nothing and is one of the root causes for transaction malleability. Not really sure what Satoshi was trying to achieve. Normally the signature is OUTSIDE of the payload and trying to stick it back inside the input adds no value.

It would have been far simpler to do something like.

Step 1) Construct entire transaction (minus signature) in a canonical form
Step 2) Hash the entire transaction. This becomes the tx_id as well as the digest for the signature
Step 3) Sign the hash in step 2 with the private key(s) and append to signature body.

You would end up with something like:
tx header
in[n] (list of inputs)
out[n] (list of outputs)
sign[n] (list of signatures)

To verify:
Step 1) Remove the signatures from the tx body and save.
Step 2) Hash the remaining tx. This is the tx id and the digest for the signature verification.
Step 3) Verify each of the signatures using the pubkey(s) and the transaction hash.

Honestly I have no idea what Satoshi was trying to accomplish with the overly complicated mess that is Bitcoin tx signatures but given a few other questionable decisions (using uncompressed pubkeys, non-canonical signatures, including pubkey in inputs when they could be reconstructed from the signature, etc) I believe as smart as Satoshi was ECDSA wasn't his strong suit. He used it but he wasn't an expert at it.

telepatheic

jr. member

Activity: 56

Merit: 1

See this wiki page for more details, the data which is signed is effectively:

Code:

SHA256(SHA256(modified_transaction))

The modified transaction is very complicated to construct and removes the signature and public key and inputs that are not being signed.
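As a small Python illustration of the hashing step only: the input bytes below are a placeholder, and building the real modified transaction is the complicated part that this sketch does not attempt. The helper name `hash256` is an assumption.

```python
import hashlib

def hash256(data):
    """SHA256 applied twice, as in SHA256(SHA256(modified_transaction))."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

modified_transaction = b"placeholder bytes standing in for the modified transaction"
digest = hash256(modified_transaction)
```

The resulting 32-byte `digest` is the value that the ECDSA signature is computed over.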

fengshu

newbie

Activity: 14

Merit: 0

Quote from: Gavin Andresen on February 07, 2013, 10:38:03 PM

Quote from: BitcoinScholar on February 07, 2013, 08:42:07 PM

The other component is an ECDSA signature over a hash of a simplified version of the transaction.

The magic of public key crypto is that you can give somebody your public key, some data, and a signature, and they can be certain that:

a) that particular signature could only have been created by somebody that has the private key that corresponds to the public key
b) the data hasn't been changed in any way

They don't need to know the private key-- you keep it secret.

The "hash over..." bit is the way digital signatures work-- you sign a hash of the data, and not the data itself, because the hash is much smaller.

The "...simplified version of the transaction" bit is complicated. The data signed is the transaction minus all its scriptSig signatures, plus (almost always) the previous transaction's scriptPubKey. See the OP_CHECKSIG page on the wiki for all the gory details.

Does the signature in the transaction look like this:
ECDSASignature(Hash(Transaction-scriptSig)+PreTransaction_scriptPubKey)

?

BitcoinScholar

newbie

Activity: 32

Merit: 0

I understand the scriptSig and scriptPubKey mostly now. What happens in a transaction is that a private key goes through the process of ultimately proving that the address the BTC was "sent" to matches. This initially "signs" it, but then it must be verified with the public key. The public key then also verifies the signature and provides further evidence that the signature is valid. This is now open to go to the scriptPubKey portion of the script, which ultimately just specifies which address is the recipient of the BTC or signature. Within the output is also the quantity sent. Then when this is received by the new owner they have to go through the same input process, prove possession of the address storing the value, etc., thus the system acts as a series of electronic signatures.

Looking at the necessary things for the input portion of a transaction, I see that they are 1) the previous tx, 2) the index and 3) the scriptSig. I understand the scriptSig now (in a basic way, I believe) and the index, which just refers to the specific output of the tx in question. I don't understand, however, how the "previous tx" represents the previous transaction (sounds strange, but let me explain). Does the previous tx represent a hash of the finished signature of the previous tx? And how is this used as an essential part of the input?

I have a few little theories. Maybe the "previous tx" section of an input is just a hash of the referenced output address? Again, maybe it's just a hash of the previous tx signature? But then, I think, scriptPubKey only gives the output address. If scriptPubKey gives more than just the output address I think it would explain this last piece of the puzzle and I'd understand the basics of the script process. My impression is that scriptPubKey just contains the value and the output address.
