Author

Topic: World's fastest and simplest block parser for those who (only) need all HASH160 (Read 337 times)

legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
In fact it surprises me how there are any brainwallets left that *haven't* been broken into. A GPU cluster can break into a lot of things nowadays.
That's easy: create something long enough, and it will never be cracked. Add your phone number or something else that's unique to only you and it becomes a lot harder to find with "generic" brute-forcing.
Of course, this is still not recommended and I'm pretty sure many people would lose their money if they attempt this. But I'm equally sure I'd be able to create one. Then again, what's the point if the string to remember becomes longer than a 12 seed word phrase?
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
BTCW:

Have you used this to create a full list of hash160s, in all blocks?

Or at least the ones from the mid-2010s time frame. Since the blockchain only started filling up with blocks from 2017 onwards, and brainwallets stopped being created so much before that, I think the majority of brainwallets can be captured by processing the blocks from that time period.

In fact it surprises me how there are any brainwallets left that *haven't* been broken into. A GPU cluster can break into a lot of things nowadays.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Code:
cat mempool.dat | xxd -p | tr -d '\n' | tr -d '\r'| sed 's/0100000000000000[0-9a-fA-F]\{4\}0000000000000100000000010/0100000000010/g' | sed 's/00000000000000000\([1-2]\)000000/\n0\1000000/g' | sed 's/0\{9,30\}\n/00000000\n/g' > mempool.hex
If anyone wants to play with it without running Bitcoin Core: download mempool.hex. If there's demand, I can do regular updates. I created this dump a few minutes ago from my pruned Bitcoin Core.
copper member
Activity: 193
Merit: 255
Click "+Merit" top-right corner
For those interested in mempool analysis (I am), I created a new "bash pipe monster" that isn't pretty but gets the job done.

First, in Bitcoin Core (fully synced and indexed), simply "savemempool", and it will spit out a file called mempool.dat

Along the lines, pun intended, of the OP, this will make it human readable:

Code:
cat mempool.dat | xxd -p | tr -d '\n' | tr -d '\r'| sed 's/0100000000000000[0-9a-fA-F]\{4\}0000000000000100000000010/0100000000010/g' | sed 's/00000000000000000\([1-2]\)000000/\n0\1000000/g' | sed 's/0\{9,30\}\n/00000000\n/g' > mempool.hex

It should work on both Windows and *nix. The output, mempool.hex, will be a text file of all mempool transactions (a snapshot according to your node), in which every row is a Bitcoin raw transaction in serialized hexadecimal code.

It's not 100% perfect; some rows are empty, but ~99% of the rows are transactions that https://live.blockcypher.com/btc/decodetx/ think are OK and/or have already been broadcast.

My last round gave me ~42k candidate transactions (one per row) and the size of mempool.dat was close to 100 MB and the size of the resulting mempool.hex almost double.

It is good enough for my purposes (currently: signature analysis). Do with it as you wish! (Don't ask me to explain the regex; it took some severe trial and error. It helped when I realized 16 zeros in a row are a separator between transactions [undocumented], and it allowed me to insert a newline character at the right spot.)





 

legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
On my VirtualBox with Ubuntu, the sort behavior is, strangely, the exact opposite; without a flag, it uses 100% RAM, which is not cool either.
That is what I read indeed: it's supposed to use all RAM. But I observed the opposite, so it may vary from system to system.



I found this:
‘--parallel=n’

Set the number of sorts run in parallel to n. By default, n is set to the number of available processors, but limited to 8, as there are diminishing performance gains after that. Note also that using n threads increases the memory usage by a factor of log n.
I can't find better source, because log 1=0 so it doesn't make sense as a factor. But the increased memory consumption does make sense. I'm not sure how much data this product produces, but for my other projects less RAM means less writing to the (rather slow) HDD, so the total speed may improve by using less threads for sort.
copper member
Activity: 193
Merit: 255
Click "+Merit" top-right corner
For all .dat files in one go, start with "cat blk* |" instead
I often use sort on large data sets, and can recommend to change the sort command to sort -u -S60% (assuming you have at least 60% of your RAM available). This largely reduces the need to write /tmp-files. Without this, sort uses only about 0.1% of the system's memory.
Depending on how big the final list is, this may or may not matter for performance.

Nice addition. Yes! Do use the -S flag. (On my VirtualBox with Ubuntu, the sort behavior is, strangely, the exact opposite; without a flag, it uses 100% RAM, which is not cool either. 60% is a sweet spot indeed.)
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
For all .dat files in one go, start with "cat blk* |" instead
I often use sort on large data sets, and can recommend to change the sort command to sort -u -S60% (assuming you have at least 60% of your RAM available). This largely reduces the need to write /tmp-files. Without this, sort uses only about 0.1% of the system's memory.
Depending on how big the final list is, this may or may not matter for performance.
copper member
Activity: 193
Merit: 255
Click "+Merit" top-right corner
Minor additions

These are the regex I've fiddled with for the three most common address formats.

Legacy // P2PKH // "1-addresses":
Code:
cat blk* | xxd -p | grep -oE '1976a914[0-9a-f]{40}88ac' | sed 's/1976a914//;s/88ac//' | sort -u > 1.hex

Wrapped Segwit // P2WPKH-P2SH // "3-addresses"
Code:
cat blk* | xxd -p | grep -oE '17a914[0-9a-f]{40}87' | sed 's/17a914//;s/87//' | sort -u > 3.hex

Native Segwit // P2WPKH // Bech32 // "bc1-addresses"
Code:
cat blk* | xxd -p | grep -oE '0000160014[0-9a-f]{40}' | sed 's/0000160014//' | sort -u > bc1.hex

We talked about false positives before. I think P2WPKH-P2SH ("3-addresses") may be somewhat problematic using this method, as we can't know from a simple regex search that they are exactly P2WPKH-P2SH; "3-adresses" may be containers for several other address types. In other words: False positives are expected, but in a brute-force attack, not a show-stopper, since these will cancel themselves in the steps that follow after Brainflayer says yes.

I am very well aware there are several other address types. None of them should be of interest if you're into Brainflayer stuff though? I mean, you could dump all Taproot tx for example, but what would be the use case?

To illustrate a possible variation, changing the prefix/suffix strings to capture all tx hashes is straightforward. I'll leave it to you to create a perfect regex. (Using old tx hashes as private keys has happened hundreds of times before; however weird it sounds and cryptographically idiotic, humans are humans...)
copper member
Activity: 906
Merit: 2258
I found another false positive: you can create a public key, that will contain those bytes. It can be unused, then it is easy, and just equivalent to a stack push. But you can also mine it, and after 2^48 operations, you will get those 6 bytes, where they should be.
Code:
02 1976A914108F77AA04C12E58C6C9213FAF20F16668C74AED88AC09B95C709EE0
02 1976A914108F77AA04C12E58C6C9213FAF20F16668C74AED88AC09B95C709EE1
02 1976A914108F77AA04C12E58C6C9213FAF20F16668C74AED88AC09B95C709EE7
02 1976A914108F77AA04C12E58C6C9213FAF20F16668C74AED88AC09B95C709EE8
02 1976A914108F77AA04C12E58C6C9213FAF20F16668C74AED88AC09B95C709EEA
02 1976A914108F77AA04C12E58C6C9213FAF20F16668C74AED88AC09B95C709EED
02 1976A914108F77AA04C12E58C6C9213FAF20F16668C74AED88AC09B95C709EEE
And then, you can have for example 2-of-3 multisig, where this key is unused, but pushed on the stack. Or you can mine it as a "vanity public key", and then use it.

Edit: another one, probably faster than mining some public key: you can mine transaction hash, like it was done here: https://mempool.space/tx/000000000fdf0c619cd8e0d512c7e2c0da5a5808e60f12f1e0d01522d2986a51

Then, doing 2^48 SHA-256 operations is probably faster than checking 2^48 ECDSA public keys. In general, if you have any place, where you can put some 256-bit value, you can create a false positive.
copper member
Activity: 193
Merit: 255
Click "+Merit" top-right corner
Assuming you're in the terminal and the ~/blocks folder, just:

Code:
cat blk00000.dat| xxd -p | grep -oE '1976a914[0-9a-f]{40}88ac' | sed 's/1976a914//;s/88ac//' | sort -u > P2PKH-unique00000.txt

It shouldn't take longer than 1-2 seconds per .dat file, even on an old budget computer.

I tried the command and the result seems to be correct from quick check. It's also fast, although i use SSD and my computer isn't very old either. And since i'm not familiar with blk file structure, do you think there's possibility of false positive?

Yeah, about the title. I think this is the world's fastest and simplest solution; I will, of course, change it when/if proven wrong.

Although i bet some low level programmer would take your statement as challenge to build something faster Tongue.

1. One source of false positives I can think of would be if the pattern [ s ] appears elsewhere, such as in pushed arbitrary data (OP_RETURN). From a purely statistical perspective, the chance is virtually 0. Unless some people and/or miners use such for something (maybe Layer 2 stuff, or simply to f* around). I hypothesize false positives on the actual mainnet blockchain are exceptionally rare. Still, I haven't taken the time to quantify the exact number (and I would be surprised if it wasn't very, very close to or exactly 0).

2. The blk files simply the blocks as they come, the tx data bundled into ~130 MB "units" for convenience/standard, and saved as raw bytecode to be space-efficient (using all 256 bits (0x00-0xFF) instead of limiting itself to the number of an alphabet (hexadecimal is only 16 "letters", duh!). That's why they look funny, and maybe get you to think "this gotta be some weird code", if you open them as they are without first converting to something human-readable.

3. Way above my pay grade, but of course, I am all for it! Low-level-geeks, give it a try! I will credit you.
copper member
Activity: 906
Merit: 2258
Quote
And since i'm not familiar with blk file structure, do you think there's possibility of false positive?
Of course. The only thing that happens is that it tries to find "1976a914<20_bytes>88ac" in raw bytes. That means, if you put such value on a stack, it will be reported, even if it is not the real address. The same will happen if you wrap it inside OP_RETURN.

You can try running it on regtest, for example with a transaction similar to this one:
Code:
decoderawtransaction 02000000000101af811dc710445dc189131791c37757783ebd83ff1f73140b5322db9875c585ee0000000000fdffffff0200f2052a0100000005827701198700000000000000001b6a1976a914108f77aa04c12e58c6c9213faf20f16668c74aed88ac02473044022038aa002c72932ce9081943804ad4d61f31affea22a5f6c840da8c5cb7296fe920220316ad5fc8130cc662e3d920ff8118258314ad308ee04f7e80715dfb4683a68280121038469881ff2b56c969b3abc74ed3863cc6836f3cabd107fbe20aeafb61a5a471700000000
{
  "txid": "b60ccfab381f272e6ff8adf761ea8ecf26955527e49423a146d1dd4426f634d6",
  "hash": "d683804ede8342c1658a8ddb502026d09fb8ff3659583f3d5edee60c351385f5",
  "version": 2,
  "size": 210,
  "vsize": 129,
  "weight": 513,
  "locktime": 0,
  "vin": [
    {
      "txid": "ee85c57598db22530b14731fff83bd3e785777c391171389c15d4410c71d81af",
      "vout": 0,
      "scriptSig": {
        "asm": "",
        "hex": ""
      },
      "txinwitness": [
        "3044022038aa002c72932ce9081943804ad4d61f31affea22a5f6c840da8c5cb7296fe920220316ad5fc8130cc662e3d920ff8118258314ad308ee04f7e80715dfb4683a682801",
        "038469881ff2b56c969b3abc74ed3863cc6836f3cabd107fbe20aeafb61a5a4717"
      ],
      "sequence": 4294967293
    }
  ],
  "vout": [
    {
      "value": 50.00000000,
      "n": 0,
      "scriptPubKey": {
        "asm": "OP_SIZE OP_NIP 25 OP_EQUAL",
        "desc": "raw(8277011987)#gm6fdvfy",
        "hex": "8277011987",
        "type": "nonstandard"
      }
    },
    {
      "value": 0.00000000,
      "n": 1,
      "scriptPubKey": {
        "asm": "OP_RETURN 76a914108f77aa04c12e58c6c9213faf20f16668c74aed88ac",
        "desc": "raw(6a1976a914108f77aa04c12e58c6c9213faf20f16668c74aed88ac)#kgcumtgs",
        "hex": "6a1976a914108f77aa04c12e58c6c9213faf20f16668c74aed88ac",
        "type": "nulldata"
      }
    }
  ]
}

Or maybe that one:
Code:
decoderawtransaction 0200000001d634f62644ddd146a12394e427559526cf8eea61f7adf86f2e271f38abcf0cb6000000001a1976a914108f77aa04c12e58c6c9213faf20f16668c74aed88acfdffffff0100f2052a01000000160014108f77aa04c12e58c6c9213faf20f16668c74aed00000000
{
  "txid": "51a53ab132edf078112d98c48a4d3df64061f7689f362a75b5a25712ebc25777",
  "hash": "51a53ab132edf078112d98c48a4d3df64061f7689f362a75b5a25712ebc25777",
  "version": 2,
  "size": 108,
  "vsize": 108,
  "weight": 432,
  "locktime": 0,
  "vin": [
    {
      "txid": "b60ccfab381f272e6ff8adf761ea8ecf26955527e49423a146d1dd4426f634d6",
      "vout": 0,
      "scriptSig": {
        "asm": "76a914108f77aa04c12e58c6c9213faf20f16668c74aed88ac",
        "hex": "1976a914108f77aa04c12e58c6c9213faf20f16668c74aed88ac"
      },
      "sequence": 4294967293
    }
  ],
  "vout": [
    {
      "value": 50.00000000,
      "n": 0,
      "scriptPubKey": {
        "asm": "0 108f77aa04c12e58c6c9213faf20f16668c74aed",
        "desc": "addr(bcrt1qzz8h02sycyh933kfyyl67g83ve5vwjhd82v7z9)#0w4jaj80",
        "hex": "0014108f77aa04c12e58c6c9213faf20f16668c74aed",
        "address": "bcrt1qzz8h02sycyh933kfyyl67g83ve5vwjhd82v7z9",
        "type": "witness_v0_keyhash"
      }
    }
  ]
}
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
Assuming you're in the terminal and the ~/blocks folder, just:

Code:
cat blk00000.dat| xxd -p | grep -oE '1976a914[0-9a-f]{40}88ac' | sed 's/1976a914//;s/88ac//' | sort -u > P2PKH-unique00000.txt

It shouldn't take longer than 1-2 seconds per .dat file, even on an old budget computer.

I tried the command and the result seems to be correct from quick check. It's also fast, although i use SSD and my computer isn't very old either. And since i'm not familiar with blk file structure, do you think there's possibility of false positive?

Yeah, about the title. I think this is the world's fastest and simplest solution; I will, of course, change it when/if proven wrong.

Although i bet some low level programmer would take your statement as challenge to build something faster Tongue.
copper member
Activity: 193
Merit: 255
Click "+Merit" top-right corner
And we want all hash 160 strings ever used for what?

Ah, sorry. So-called Brainwallets were popular in the mid-2010s. The tool I linked in the OP, Brainflayer (whose author is lurking here Smiley ), was popular for snatching poorly protected addresses generated by a simple SHA256(passphrase).

People still seem to spend a lot of time and effort making a complete HASH160 list - which is what Brainflayer needs as input - but it doesn't tell you how exactly, and the attack is then very similar to what you do with hashcat and leaked, and hashed passwords.

Some were very successful with this vector a decade ago, and judging by how often the HASH160 question pops up, I assume some people aren't ready to give up just yet.

So you'd want to make it easier for thieves to steal from people? Not that thieves don't know how to obtain all rmd160 hashes of all bitcoin addresses ever used, but you mentioning brainflayer specifically is not morally cool! While teaching something new.

Anyways I hope people use standard wallets such as Core, electrum etc, to generate their private key/ addresses. Because using manual human input as a seed/passphrase to generate keys is just simply stupid.


Hey, I am firmly against theft and illegal activities. Playing with Brainflayer and other types of pen testing are essential for the greater goal: constantly improving security.

Please check out "Mr Brainflayer's" presentation. He says and does it much better than I do.

https://www.youtube.com/watch?v=foil0hzl4Pg

We are white hatters and want the overall crypto ecosystem good.
copper member
Activity: 1330
Merit: 899
🖤😏
And we want all hash 160 strings ever used for what?

Ah, sorry. So-called Brainwallets were popular in the mid-2010s. The tool I linked in the OP, Brainflayer (whose author is lurking here Smiley ), was popular for snatching poorly protected addresses generated by a simple SHA256(passphrase).

People still seem to spend a lot of time and effort making a complete HASH160 list - which is what Brainflayer needs as input - but it doesn't tell you how exactly, and the attack is then very similar to what you do with hashcat and leaked, and hashed passwords.

Some were very successful with this vector a decade ago, and judging by how often the HASH160 question pops up, I assume some people aren't ready to give up just yet.

So you'd want to make it easier for thieves to steal from people? Not that thieves don't know how to obtain all rmd160 hashes of all bitcoin addresses ever used, but you mentioning brainflayer specifically is not morally cool! While teaching something new.

Anyways I hope people use standard wallets such as Core, electrum etc, to generate their private key/ addresses. Because using manual human input as a seed/passphrase to generate keys is just simply stupid.
copper member
Activity: 193
Merit: 255
Click "+Merit" top-right corner
And we want all hash 160 strings ever used for what?

Ah, sorry. So-called Brainwallets were popular in the mid-2010s. The tool I linked in the OP, Brainflayer (whose author is lurking here Smiley ), was popular for snatching poorly protected addresses generated by a simple SHA256(passphrase).

People still seem to spend a lot of time and effort making a complete HASH160 list - which is what Brainflayer needs as input - but it doesn't tell you how exactly, and the attack is then very similar to what you do with hashcat and leaked, and hashed passwords.

Some were very successful with this vector a decade ago, and judging by how often the HASH160 question pops up, I assume some people aren't ready to give up just yet.
copper member
Activity: 1330
Merit: 899
🖤😏
And we want all hash 160 strings ever used for what?
copper member
Activity: 193
Merit: 255
Click "+Merit" top-right corner
The question never really dies; since Brainflayer in 2016, the dream seems to live forever, so to speak: "How can I get all HASH160 ever seen on the blockchain in one file?"

There are confusingly many different solutions here and on Github (and shadier places). One more advanced than the other. I went in the other direction, thinking, "It can't be that difficult."

It isn't. It's much easier than you think. Assuming you have a fully synced Bitcoin Core local installation (the .blk files on your hard drive, downloaded with txindex=1). Also, that you are on *nix (or have Ubuntu on a VM or something - more than good enough), this needs no extra downloads, no dependencies, not much extra HDD space, very little CPU and RAM, and no GPU.

Assuming you're in the terminal and the ~/blocks folder, just:

Code:
cat blk00000.dat| xxd -p | grep -oE '1976a914[0-9a-f]{40}88ac' | sed 's/1976a914//;s/88ac//' | sort -u > P2PKH-unique00000.txt

It shouldn't take longer than 1-2 seconds per .dat file, even on an old budget computer.

That's it. Done.

Some comments:

  • For all .dat files in one go, start with "cat blk* |" instead (and remove the numbers in the output file name). I opted for the first file only as the first proof of concept because there is zero waiting until you know it works.
  • What does it do? Briefly and on the fly - and in this order: Reads the blk file[ s ], changes format from binary data to human readable hexadecimal (the blockchain files have no deeper code to crack than that), searches for the P2PKH pattern, selects the "middle part," which is the HASH160, sorts all hits alphabetically and removes duplicates, forgets all else and writes the result to one text file.
  • Want P2SH-P2WPK and native Segwit too ("3" and "bc1q" addresses), maybe even Taproot? No problem, change the prefix and suffix bytes accordingly (very easy to find). I recommend saving these to separate text files since you can't tell "naked" HASH160:es from each other, but... whatever floats your boat.


Yeah, about the title. I think this is the world's fastest and simplest solution; I will, of course, change it when/if proven wrong.

Happy HASH160!
Jump to: