Topic: Best way to extract addresses from sigScript and PkScript? (Read 1611 times)

member
Activity: 116
Merit: 10
Yes, exactly.

Understanding this is the key to making sense of compressed keys and WIFs too.

I understand compressed keys and WIFs ;)

Anyway, I've decided not to do a simple memcmp, but rather something more generic:

Code:
        internal static Hash160 GetAddress(byte[] aScript)
        {
            Hash160 hash = null;
            ParseState parseState = ParseState.OP_ANY;
            ScriptParser parser = new ScriptParser(aScript, delegate(Opcode aOpcode, byte[] aData)
            {
                switch (parseState)
                {
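                    case ParseState.OP_ANY:
                        // OP_HASH160 starts a hash-based template: a 20-byte push must follow.
                        if (aOpcode == Opcode.OP_HASH160)
                            parseState = ParseState.OP_20;

                        // A direct push of 33 or 65 bytes is treated as a candidate public key.
                        if ((aOpcode == (Opcode)33) || (aOpcode == (Opcode)65))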
                        {
                            if (hash != null)
                                return false;

                            if (Utils.VerifyPublicKey(aData, aData.Length) != 1)
                                return false;

                            hash = new Hash160(Utils.RIPEMD160SHA256(aData));
                        }
                        break;

                    case ParseState.OP_20:
                        if (aOpcode != (Opcode)20)
                            return false;

                        if (hash != null)
                            return false;

                        hash = new Hash160(aData);
                        parseState = ParseState.OP_EQUALVERIFY;
                        break;

                    case ParseState.OP_EQUALVERIFY:
                        // The hash must be followed by OP_EQUAL (P2SH form) or OP_EQUALVERIFY (pubkey-hash form).
                        if (aOpcode != Opcode.OP_EQUAL && aOpcode != Opcode.OP_EQUALVERIFY)
                            return false;

                        parseState = ParseState.OP_ANY;
                        break;
                    default:
                        return false;
                }
                return true;
            });

            if (parser.Parse())
            {
                return hash;
            }

            return null;
        }

kjj
legendary
Activity: 1302
Merit: 1026
In that case, why do you bother extracting the pubkey?  The satoshi client just compares the script to a stored copy of the script that matches a key.

So it's building up: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

and just matching on that?

Yes, exactly.

Understanding this is the key to making sense of compressed keys and WIFs too.
legendary
Activity: 2053
Merit: 1356
aka tonikt
In that case, why do you bother extracting the pubkey?  The satoshi client just compares the script to a stored copy of the script that matches a key.

So it's building up: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

and just matching on that?

Yes, but in the earliest blocks a different script form was used: the output pushed the entire public key, followed by OP_CHECKSIG, instead of its hash.
member
Activity: 116
Merit: 10
In that case, why do you bother extracting the pubkey?  The satoshi client just compares the script to a stored copy of the script that matches a key.

So it's building up: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

and just matching on that?
kjj
legendary
Activity: 1302
Merit: 1026
If you aren't doing a full scripting engine, you should make templates and search them until you find a hit.

jl2012 gives an example of the "standard" transaction template.

Also, see https://bitcointalksearch.org/topic/m.1348297

I have a full script engine, but I'm writing an SPV node now that isn't doing full script evaluation. It just wants to extract the addresses to test against the local wallet. I guess it doesn't really matter too much if I get false positives, since they'll be filtered out by the bloom filter / local db anyway.

In that case, why do you bother extracting the pubkey?  The satoshi client just compares the script to a stored copy of the script that matches a key.
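
The script comparison described here can be sketched roughly as follows in Python. This is only an illustration under the assumption that the wallet already holds the 20-byte hash160 of each of its keys; `p2pkh_script` and `is_mine` are made-up names, not functions from the Satoshi client.

```python
def p2pkh_script(key_hash: bytes) -> bytes:
    """Build OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG."""
    if len(key_hash) != 20:
        raise ValueError("expected a 20-byte hash160")
    # 0x76 = OP_DUP, 0xa9 = OP_HASH160, 0x14 = push 20 bytes,
    # 0x88 = OP_EQUALVERIFY, 0xac = OP_CHECKSIG
    return bytes([0x76, 0xA9, 0x14]) + key_hash + bytes([0x88, 0xAC])


def is_mine(pk_script: bytes, wallet_hashes) -> bool:
    """True if pk_script is byte-for-byte the P2PKH script of a wallet key hash."""
    wallet_scripts = {p2pkh_script(h) for h in wallet_hashes}
    return pk_script in wallet_scripts
```

Nothing is extracted from the script in this approach: an output either matches one of the precomputed scripts exactly or it is ignored. A real wallet would presumably build the set of scripts once rather than on every call.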
legendary
Activity: 2053
Merit: 1356
aka tonikt
I guess you only need to analyze the pk_scripts of unspent outputs; the inputs just reference earlier outputs by tx id (who cares).
The txout scripts are in general pretty straightforward.
Recent ones contain a hash with a few opcodes around it.
Older ones contained the entire public key.
member
Activity: 116
Merit: 10
If you aren't doing a full scripting engine, you should make templates and search them until you find a hit.

jl2012 gives an example of the "standard" transaction template.

Also, see https://bitcointalksearch.org/topic/m.1348297

I have a full script engine, but I'm writing an SPV node now that isn't doing full script evaluation. It just wants to extract the addresses to test against the local wallet. I guess it doesn't really matter too much if I get false positives, since they'll be filtered out by the bloom filter / local db anyway.

legendary
Activity: 2053
Merit: 1356
aka tonikt
Hi everyone,

Is there a best practice method of extracting addresses from the sigScript and PkScript?

For example, is it safe to assume that data pushes of 20 bytes are addresses and 33 / 65 (if starting with 0x02 / 0x03 for 33 and 0x04 for 65) are public keys?

I know that there are transactions that don't have addresses, like this one: https://blockchain.info/tx/a4bfa8ab6435ae5f25dae9d89e4eb67dfa94283ca751f393c1ddc5a837bbc31b

But are there transactions that have more than one? Do I have to run the scripts in order to find out which one is actually used?

I also read somewhere that it's possible to get the public key from the signature or is that not the case?
Ultimately there is no such thing as "the address".
Yes, you can do these things and they will work in 90+% of cases, but never in all of them.
If you can sell it to a miner, you can introduce a tx that creates a new address format, whatever you can think of.
I guess you would then get to name it ;)

As for extracting the address from the signature: it is somewhat possible, though from what I recall the result may be a wrong address. So it only makes sense if you already have the right one to check against.
legendary
Activity: 1120
Merit: 1152
Hi everyone,

Is there a best practice method of extracting addresses from the sigScript and PkScript?

For example, is it safe to assume that data pushes of 20 bytes are addresses and 33 / 65 (if starting with 0x02 / 0x03 for 33 and 0x04 for 65) are public keys?

I know that there are transactions that don't have addresses, like this one: https://blockchain.info/tx/a4bfa8ab6435ae5f25dae9d89e4eb67dfa94283ca751f393c1ddc5a837bbc31b

But are there transactions that have more than one? Do I have to run the scripts in order to find out which one is actually used?

I also read somewhere that it's possible to get the public key from the signature or is that not the case?

I have a pull-req that might help you: https://github.com/bitcoin/bitcoin/pull/2830

Basically it adds a "decodescript" RPC call that takes a script and decodes it into human-readable form. That includes the opcodes as well as the address itself, either pay-to-pubkey-hash (standard addresses) or P2SH (or non-standard if the script doesn't match a standard address form). It also lets you calculate the P2SH address that would correspond to that script. If you don't understand what I mean by that statement, read up on P2SH.

The bigger question though is why exactly do you need to do this? Are you trying to write a library?
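
On the decoding step itself, the following Python sketch gives a rough idea of what turning a raw script into human-readable form involves. It is not the code from the pull request: the function name `disassemble` and the deliberately tiny opcode table are assumptions for illustration, and anything outside that table is printed as a raw hex byte.

```python
OPCODE_NAMES = {
    0x00: "OP_0",
    0x76: "OP_DUP",
    0x87: "OP_EQUAL",
    0x88: "OP_EQUALVERIFY",
    0xA9: "OP_HASH160",
    0xAC: "OP_CHECKSIG",
    0xAE: "OP_CHECKMULTISIG",
}

# OP_PUSHDATA1/2/4: the push length follows in 1, 2 or 4 little-endian bytes.
PUSHDATA_WIDTH = {0x4C: 1, 0x4D: 2, 0x4E: 4}


def disassemble(script: bytes) -> str:
    """Return opcodes by name and pushed data as hex, separated by spaces."""
    parts = []
    i = 0
    while i < len(script):
        op = script[i]
        i += 1
        if 0x01 <= op <= 0x4B:
            n = op  # direct push of `op` bytes
        elif op in PUSHDATA_WIDTH:
            width = PUSHDATA_WIDTH[op]
            if i + width > len(script):
                raise ValueError("truncated push length")
            n = int.from_bytes(script[i:i + width], "little")
            i += width
        else:
            parts.append(OPCODE_NAMES.get(op, "0x%02x" % op))
            continue
        data = script[i:i + n]
        if len(data) != n:
            raise ValueError("truncated push data")
        parts.append(data.hex())
        i += n
    return " ".join(parts)
```

Walking the pushes this way, rather than scanning for 20-, 33- or 65-byte runs, keeps data that merely looks like a hash or key inside a longer push from being misread.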
kjj
legendary
Activity: 1302
Merit: 1026
If you aren't doing a full scripting engine, you should make templates and search them until you find a hit.

jl2012 gives an example of the "standard" transaction template.

Also, see https://bitcointalksearch.org/topic/m.1348297
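
A minimal Python sketch of the template idea, assuming only four common output forms are of interest. The table layout and the name `classify` are illustrative choices, not taken from any client, and the pay-to-pubkey entries check only the length of the pushed key, not whether it is a valid point.

```python
# (name, fixed prefix, payload length, fixed suffix)
TEMPLATES = [
    ("p2pkh", bytes.fromhex("76a914"), 20, bytes.fromhex("88ac")),
    ("p2sh", bytes.fromhex("a914"), 20, bytes.fromhex("87")),
    ("p2pk-compressed", bytes.fromhex("21"), 33, bytes.fromhex("ac")),
    ("p2pk-uncompressed", bytes.fromhex("41"), 65, bytes.fromhex("ac")),
]


def classify(script: bytes):
    """Return (template name, payload bytes) for the first hit, else None."""
    for name, prefix, n, suffix in TEMPLATES:
        if (len(script) == len(prefix) + n + len(suffix)
                and script.startswith(prefix)
                and script.endswith(suffix)):
            return name, script[len(prefix):len(prefix) + n]
    return None
```

Scripts that match no template simply return `None`, which fits the case where unrecognised outputs are ignored rather than evaluated.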
member
Activity: 116
Merit: 10

In perl regex

^76a914(.{40})88ac$

Great, now there's another thing I have to figure out. Thanks though. :)
legendary
Activity: 1792
Merit: 1111
Certainly it's not safe to assume that. Why don't you just use a regular expression to extract the address?

What would be the criteria for a regular expression then? There's 20 bytes in an address, all of them random. :)



In perl regex

^76a914(.{40})88ac$
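
For comparison, a rough Python equivalent of this pattern, applied to the hex encoding of a pk_script. The name `p2pkh_hash_hex` is made up for illustration, and the sketch restricts the captured group to hex digits instead of `.`, which is a small tightening of the original rather than part of it.

```python
import re

# 76 = OP_DUP, a9 = OP_HASH160, 14 = push 20 bytes (40 hex digits),
# 88 = OP_EQUALVERIFY, ac = OP_CHECKSIG
P2PKH_HEX = re.compile(r"76a914([0-9a-f]{40})88ac", re.IGNORECASE)


def p2pkh_hash_hex(script_hex: str):
    """Return the 40-hex-digit hash160 if the whole script matches, else None."""
    m = P2PKH_HEX.fullmatch(script_hex)
    return m.group(1).lower() if m else None
```

The fixed opcodes around the hash are what make the match meaningful; the 20 bytes themselves can be anything.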
member
Activity: 116
Merit: 10
Certainly it's not safe to assume that. Why don't you just use a regular expression to extract the address?

What would be the criteria for a regular expression then? There's 20 bytes in an address, all of them random. :)

legendary
Activity: 1792
Merit: 1111
Hi everyone,

Is there a best practice method of extracting addresses from the sigScript and PkScript?

For example, is it safe to assume that data pushes of 20 bytes are addresses and 33 / 65 (if starting with 0x02 / 0x03 for 33 and 0x04 for 65) are public keys?

I know that there are transactions that don't have addresses, like this one: https://blockchain.info/tx/a4bfa8ab6435ae5f25dae9d89e4eb67dfa94283ca751f393c1ddc5a837bbc31b

But are there transactions that have more than one? Do I have to run the scripts in order to find out which one is actually used?

I also read somewhere that it's possible to get the public key from the signature or is that not the case?

Certainly it's not safe to assume that. Why don't you just use a regular expression to extract the address?
member
Activity: 116
Merit: 10
Hi everyone,

Is there a best practice method of extracting addresses from the sigScript and PkScript?

For example, is it safe to assume that data pushes of 20 bytes are addresses and 33 / 65 (if starting with 0x02 / 0x03 for 33 and 0x04 for 65) are public keys?

I know that there are transactions that don't have addresses, like this one: https://blockchain.info/tx/a4bfa8ab6435ae5f25dae9d89e4eb67dfa94283ca751f393c1ddc5a837bbc31b

But are there transactions that have more than one? Do I have to run the scripts in order to find out which one is actually used?

I also read somewhere that it's possible to get the public key from the signature or is that not the case?