Pages:
Author

Topic: VanitySearch (Yet another address prefix finder) - page 5. (Read 32966 times)

copper member
Activity: 1330
Merit: 899
🖤😏
If I want to run the vanitysearch program to search for private keys from 1 to a specific number and output the results, how can I do that?
Base Key: 0000000000000000000000000000000000000000000000000000000000000001

Difficulty: 1208925819614629174706176
Search: 11111111111 [Compressed]
Start Sun Apr  9 10:13:28 2023
Base Key: 0000000000000000000000000000000000000000000000000000000000000001
Number of CPU thread: 12

Don't use address prefix because it will slow down your search, when you give a prefix of an address the program will generate private keys, derives their public keys, hashes the public key twice and then performs another double hash on the rmd160 hash to derive the address and then checks to see if the prefix matches with your desired prefix.
You should find a tool that searches for rmd160 prefix. I haven't seen any tool searching for such prefixes, they either search for the whole hash or just slow you down by searching for prefix match on addresses. Why are you using CPU? it's a waste of time, you should use GPU only.

jr. member
Activity: 40
Merit: 3
If I want to run the vanitysearch program to search for private keys from 1 to a specific number and output the results, how can I do that?
Base Key: 0000000000000000000000000000000000000000000000000000000000000001

Difficulty: 1208925819614629174706176
Search: 11111111111 [Compressed]
Start Sun Apr  9 10:13:28 2023
Base Key: 0000000000000000000000000000000000000000000000000000000000000001
Number of CPU thread: 12

[/quote]
copper member
Activity: 60
Merit: 2
None of those are Nostr vanity generators. AFAIK, nobody has made one yet, which is not unreasonable when you consider that Nostr is a very recent invention.

[for the record, VanitySearch cannot find Taproot addresses, but it should be as simple as checking for an input prefix that is either bc1q or bc1p to allow that. I was able to use VanitySearch to generate arbitrary prefixes not starting with 1, 3, etc, in my experimental builds.]

This is kind of what I was thinking, is that with some non-standard switches or inputs or something, it could be "convinced" to output something, maybe even just in hex or some other output that can be converted into something that work with nostr?  Pardon my ignorance on it, but is it as simple as just going to npub1 instead of bc1q or similar?  I'm learning a lot, but I'm not well versed on address types and how they interact with the systems.
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
I did try to look through this to see if it has been answered, but attempting to find a GPU accelerated nostr vanity address generator, and this is as close as I've found so far.  Obviously not the same thing but, just trying to figure out if there is a branch of this that could do that, or something else made to do so?  Using CPU only is brutal.
You could go to project development : https://bitcointalk.org/index.php?board=12.0 there are some windows GPU versions, some are not open source, so do your research before using.

None of those are Nostr vanity generators. AFAIK, nobody has made one yet, which is not unreasonable when you consider that Nostr is a very recent invention.

[for the record, VanitySearch cannot find Taproot addresses, but it should be as simple as checking for an input prefix that is either bc1q or bc1p to allow that. I was able to use VanitySearch to generate arbitrary prefixes not starting with 1, 3, etc, in my experimental builds.]
copper member
Activity: 1330
Merit: 899
🖤😏
I did try to look through this to see if it has been answered, but attempting to find a GPU accelerated nostr vanity address generator, and this is as close as I've found so far.  Obviously not the same thing but, just trying to figure out if there is a branch of this that could do that, or something else made to do so?  Using CPU only is brutal.
You could go to project development : https://bitcointalk.org/index.php?board=12.0 there are some windows GPU versions, some are not open source, so do your research before using.
copper member
Activity: 60
Merit: 2
I did try to look through this to see if it has been answered, but attempting to find a GPU accelerated nostr vanity address generator, and this is as close as I've found so far.  Obviously not the same thing but, just trying to figure out if there is a branch of this that could do that, or something else made to do so?  Using CPU only is brutal.
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
This is what I ended up settling with:

Code:
static bool char16_is_match(RegexNode *node, const char *orig, const char *cur,
                          const char **next)
{
        Char16Node char16 = node->chr16;
        int result;
        __m128i xmm0, xmm1, xmm2;

        *next = cur+16;
        xmm0 = _mm_loadu_si128((__m128i*) char16.chr);
        xmm1 = _mm_loadu_si128((__m128i*) cur);
        xmm2 = _mm_cmpeq_epi8(xmm0, xmm1);
        result = xmm2[0] & xmm2[1] & xmm2[2] & xmm2[3] & xmm2[4] & xmm2[5] & xmm2[6]
            & xmm2[7] & xmm2[8] & xmm2[9] & xmm2[10] & xmm2[11] & xmm2[12]
            & xmm2[13] & xmm2[14] & xmm2[15];

        return result;
}

The ugly AND operators are the direct result of Intel not having any kind of "AND bytes/words/dwords/quadwords inside XMM register" instruction. Indeed, there are no packed bitwise instructions operating on single SIMD registers at all - the kinds of instructions are there are mostly designed with video decoding algorithms in mind.

There's no real performance improvement from using this over the old code.

Yes, I know there is MOVMSKPS, but that is for floats, so it hardly suits byte inputs. Not only would I have to restrict inputs to 4 chars at a time, I would also have to pad them with zeroes which will negatively affect performance.

Hence this code has been pushed right now to the repo, but nothing calls it - this is despite coverage tests saying it got hit 16 million times (over the span of a few minutes) whereas the char_is_match other function would be (and still is) called several hundreds of millions of times.

So the only real performance improvement I pushed in this latest commit is to disable the UTF8 character matching and treat everything as ASCII - that did increase Mkey performance on 8x "Intel(R) Xeon(R) CPU E31240 @ 3.30GHz" by about 0.1MKey/s (up from 0.9 MKey/s average - VanitySearch reports large speeds for the first few seconds but then it comes down as the estimates become more accurate).

Meanwhile the non-regex mode (also in my fork) runs at 2.20MKey/s once the dust is settled - roughly the same as original VanitySearch.
newbie
Activity: 17
Merit: 7
Quote
Is this part of the base standard, or a C++11/14/17/20 extension?

hmm its actually in std::experimental so i guess that means its not officially in the standard yet. it was originally a third-party library that i had used in an old version of a program of mine, then a few years ago it got promoted to get absorbed into the c++ standard library. not sure if its on track to become official with c++23 or not. anyway i just know i really liked how relatively simple it made writing vectorized code 😎
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
Quote
maybe put some SSE or AVX in there somehow

https://en.cppreference.com/w/cpp/experimental/simd

A zero-overhead abstraction for the high-level language you are already using is so much nicer than spending a bunch of time writing your own architecture-specific routines in assembly or compiler intrinsics (std::simd itself being a simple template library implemented using intrinsics). Generally speaking at least

This is very useful, thanks!

Is this part of the base standard, or a C++11/14/17/20 extension?

Meanwhile I have already made an optimized matcher for consecutive character matches:

edited for garbage code formatting pasted from my phone

Code:
static bool char16_is_match(RegexNode *node, const char *orig, const char *cur, const char **next)
{
    Char16Node char16 = node->chr16;
    int result;
    __asm__ __volatile__ (
        "movdqa xmm1, [%2] \n"
        "pcmpeqb xmm0, xmm1 \n"
        "movmskps %0, xmm0 \n"
        : "=r"(result)
        : "r" (char16.chr), "r" (cur)
        : "xmm0", "xmm1"
    );
    *next = cur+16;
    return result;
}

This actually only uses SSE2. I'm not sure why JLP compiled with SSSE3, is he using some of those newer instructions in his SECP256k1 class? Huh

Anyway, I haven't benchmarked this yet, but the way this is supposed to work, is that the regular expression is already represented as a linked list like this:


^1tryme.*

start --> char --> char --> char --> char --> char --> char --> quant
                                                                                              v
                                                                                             any

so you can actually go down in some of these nodes if necessary.

Now instead of comparing each of these char nodes as is presently done, which is inefficient, we can take advantage of the fact that XMM registers can hold up to 16 char values. So we can combine sets of 2-16 char nodes into a single char16 node, and pad it (as well as the input string!) with NULs if necessary :

start --> char16 --> quant
                                 v
                               any

This has roughly the same overhead as a single "char" old-school comparison. Considering that the "char" node match function is called hundreds of millions of times in the span of a few seconds, owing to the fact that most people will usually search for just a certain pattern of strings, this will translate to 6-8x less of these calls being made depending on the regex string.
newbie
Activity: 17
Merit: 7
Quote
maybe put some SSE or AVX in there somehow

https://en.cppreference.com/w/cpp/experimental/simd

A zero-overhead abstraction for the high-level language you are already using is so much nicer than spending a bunch of time writing your own architecture-specific routines in assembly or compiler intrinsics (std::simd itself being a simple template library implemented using intrinsics). Generally speaking at least
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
WOW you are really true code master man, how long do you think will this take you to have a working GPU build?

Well, I wouldn't call myself that.  Wink

citb0in actually asked me that question already, and we both thought it would take max. a week or two. This was in the beginning of January. But unfortunately (among other things) I discovered that adding the GPU support wasn't as simple as changing a few lines of code (and I refunded him his money last night).

So as it stands, there's only a working CPU build - just now I pushed the code that will enable you to run queries like "./VanitySearch ^1abc" to search for 1abc prefix at the beginning, and also you could already do ./VanitySearch 1abc$ to search for 1abc at the end of the address, and some other things I will list below.

Wildcard searches with *, +, and {m,n} work but they are not very useful as current hardware can only reasonably search for 6-7 characters at a time:

(difficulty estimation is currently wrong for {m,n} and will report much longer time than actual)


Character classes and quantifier example:

./VanitySearch -t 8 '^1[fF][iI].{2,4}[sS][hH]'

Code:
 time ./VanitySearch -t 8 '^1[fF][iI].{2,4}[sS][hH]'
VanitySearch v1.19
Benchmarking regex matching speed of prefix "^1[fF][iI].{2,4}[sS][hH]" (case sensitive) for 1 second, please wait... done
Difficulty: 9225725494
Search: ^1[fF][iI].{2,4}[sS][hH] [Compressed]
Start Tue Feb  7 12:12:39 2023
Base Key: CA21EDE5142917948E79FEFD98EBBB2E485ECA54E1C814B482C39A2C8B8D5B37
Number of CPU thread: 8

PubAddress: 1FiF7Sh6Ei8ezjMASNdED1ixyTZ9HY4DbP
Priv (WIF): p2pkh:Kxsq34CD5UBCqhRMZi4WNH1oQjnr3FHjzQqS2kEDb6CN3EpZYauo
Priv (HEX): 0x317261FB480D6C92F777B4F3B18176433CAED4D61E8A3E507A0414E3A6BCCAD4

PubAddress: 1FiZ9bmpedPk1ksHRtUtZEDZ2n7fNz68LP
Priv (WIF): p2pkh:L3zdVyH3kWaRtzJ4eLEc6FwcbrzBRpVgoPZjCC3Gq1d2ZCEENuM6
Priv (HEX): 0xCA21EDE5142917948E79FEFD98EBBB31485ECA54E1C814B782C39A2C8B8D6ACE
[0.20 Mkey/s][GPU 0.00 Mkey/s][Total 2^18.63][Prob 0.0%][50% in 08:45:41][Found 2]
PubAddress: 1FidbgsHsR9B2iEu5KkfYxRE7SPjadWSiS
Priv (WIF): p2pkh:L12uW3KSdEkLQzPutC9E6cpGJe6w9R9udHi4tLCrVEJKkCeikoJJ
Priv (HEX): 0x71CA8897DFCA52F8649A489E5C34216517D724D6E9710FE937FEE419F019AA9F
[0.19 Mkey/s][GPU 0.00 Mkey/s][Total 2^19.55][Prob 0.0%][50% in 09:15:06][Found 3]  ^C

real    0m8.447s
user    0m51.466s
sys     0m0.075s

Note: Make sure you put the regex in single quotes '' otherwise bash might mess them up!

Anyway, I'm going to try to optimize the CPU build first before working on the GPU (regex mode is currently 2x slower than non-regex mode), maybe put some SSE or AVX in there somehow.
copper member
Activity: 1330
Merit: 899
🖤😏
I am still working on my VanitySearch build by the way.

Over the past few days, I made the CPU core of the regex support work (It just needs a few lines of code change to allow you to lock the search to prefix or suffix, but that will be done in a few days), but apparently, there's a ton of work necessary to be done on the GPU side.

[Yes, I hate CUDA - what an abomination to debug - but if nobody else is going to do it, why let the idea of regex search go down the drain?]

EDIT: GPUEngine must be almost completely rewritten, including making a regex matcher in CUDA. Apparently, nobody has ever done this before (or at least finished it) so whatever I'm going to implement here might possibly be the first time someone attempts that.

Why does it need to be rewritten?

Because the GPUEngine was based on wildcards. It would basically take the ? and * and then calculate all the different possible combinations of the prefix, and the result would be a gigantic array which is indexed by a unique prefix ID, where ago find a match for an address, you'd run a string comparison on all of the corresponding prefixes, so see if any match.

There is two ways to insert regex at hat point:

1- shoehorn the other regex characters inside the GPUEngine (but this approach really only works with quantifier metacharacters), or
2) break the input string down into a tree, where each regex character corresponds to some part of the tree, and match all those parts in parallel.

2) is the approach I chose to take because it means I can use the existing CPU regex library as a reference for how to match stuff.

Either way, we're on track to make the first regex-enables VanitySearch, eventually.

PS. There is no longer a GPU performance hit, but it just seems to be sitting there idle ever since I ripped out the (pointless for this build, IMO) prefix input's anyway.
WOW you are really true code master man, how long do you think will this take you to have a working GPU build?
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
I am still working on my VanitySearch build by the way.

Over the past few days, I made the CPU core of the regex support work (It just needs a few lines of code change to allow you to lock the search to prefix or suffix, but that will be done in a few days), but apparently, there's a ton of work necessary to be done on the GPU side.

[Yes, I hate CUDA - what an abomination to debug - but if nobody else is going to do it, why let the idea of regex search go down the drain?]

EDIT: GPUEngine must be almost completely rewritten, including making a regex matcher in CUDA. Apparently, nobody has ever done this before (or at least finished it) so whatever I'm going to implement here might possibly be the first time someone attempts that.

Why does it need to be rewritten?

Because the GPUEngine was based on wildcards. It would basically take the ? and * and then calculate all the different possible combinations of the prefix, and the result would be a gigantic array which is indexed by a unique prefix ID, where ago find a match for an address, you'd run a string comparison on all of the corresponding prefixes, so see if any match.

There is two ways to insert regex at hat point:

1- shoehorn the other regex characters inside the GPUEngine (but this approach really only works with quantifier metacharacters), or
2) break the input string down into a tree, where each regex character corresponds to some part of the tree, and match all those parts in parallel.

2) is the approach I chose to take because it means I can use the existing CPU regex library as a reference for how to match stuff.

Either way, we're on track to make the first regex-enables VanitySearch, eventually.

PS. There is no longer a GPU performance hit, but it just seems to be sitting there idle ever since I ripped out the (pointless for this build, IMO) prefix input's anyway.
legendary
Activity: 2618
Merit: 6452
Self-proclaimed Genius
May I?

I have a few questions if you don't mind.

1: how do we know if a custom address could exist and is valid?
As long as the address has a valid checksum, it's valid.
The checksum is the last 4Bytes of the address after decoding it in base58;
To be valid, it should match the first 4Bytes of the SHA256D hash of the rest of the characters (w/o the checksum).

In fact, even without the private key, you can make valid custom address that "could exist".
Here's a tool for example: https://gobittest.appspot.com/ProofOfBurn

Quote from: digaran
2: if I already know a vanity address's private key/ key range, would it be easier to generate that address if I could set a specific range for the tool to search only in that range? Would that reduce the difficulty and the time needed?
Yes, but why search if you already know the private key?

Take note that similar vanity addresses like 1banana12345...... and 1banana12344...... have no correlation to their private key's ranges.
Addresses are just the base58 encoding of the HASH160 (sha256 and then ripemd160) of the public key which is pseudorandom, plus some additional data.
Knowing the former's private key wont trivialize the search for the latter's private key.
copper member
Activity: 1330
Merit: 899
🖤😏
I have a few questions if you don't mind.

1: how do we know if a custom address could exist and is valid?
2: if I already know a vanity address's private key/ key range, would it be easier to generate that address if I could set a specific range for the tool to search only in that range? Would that reduce the difficulty and the time needed?
full member
Activity: 1232
Merit: 242
Shooters Shoot...
Quote
How did you input these prefixes? Using the command-line, or another way?

From input file, the original -i  flag; 1 per line

Good to know. Although it would still be handy to read prefixes from the command line too: VanitySearch 1abcd bc1qabcd for example.
Easy to do, but I wouldn't wanna put 32 million prefixes on cmd line LOL.

Just keep parsing over the CMD line...
hero member
Activity: 1554
Merit: 880
Notify wallet transaction @txnNotifierBot
Thanks for this, i just discovered this thread and eventually generate mine with native segwit wallet address (check my profile). Well it's was generated almost in instant since i only use 3 character in reference with my username. Smiley
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
Quote
How did you input these prefixes? Using the command-line, or another way?

From input file, the original -i  flag; 1 per line

Good to know. Although it would still be handy to read prefixes from the command line too: VanitySearch 1abcd bc1qabcd for example.
full member
Activity: 1232
Merit: 242
Shooters Shoot...
Quote
How did you input these prefixes? Using the command-line, or another way?

From input file, the original -i  flag; 1 per line
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
Ultimately, it will not only allow prefix searches, but also suffix searches and also for vanity characters in an arbitrary part of the address.

Does your version still support full address search?

I didn't modify that particular section so that should work just fine.

Quote
also, current program can search up to 32 million prefixes simultaneously (I haven't tried more than 32 million).

How did you input these prefixes? Using the command-line, or another way?

Original VanitySearch only reads the first prefix on the command line, but I tweaked it to read all of them.

Side note: Regex matching on GPU seems to work, but I am going to revert all the changes I made to GPU because there's an alternate way of matching strings using the "|" operator, which makes everything I made in that section obsolete (and is probably the cause of the 5x GPU slowdown anyway).
Pages:
Jump to: