Any ideas?
How do you know if Point X prefix = 02?
You can do this:
if (isOdd == 0) { // 02
    _GetHash160Comp(px, isOdd, (uint8_t *)h);
    CHECK_POINT(h, incr, 0, true);
}
In addition, you will have to calculate the Y coordinate when adding points. Look at my mod 12; I removed everything unnecessary from the GPU code there.
I also changed the conditions in GPUEngine.cu so that the ComputeKeys() code is executed. ComputeKeysComp() is not executed, because the Y coordinate is needed.
Measurements showed that computing the Y coordinate as well is cheaper than calculating RIPEMD-160 twice.
Everything has already been checked; you only need to add a condition, or loop using Spin. That way I gained 6.3% in speed, with #define NB_SPIN 32.
You also need to multiply the increment index by the number of Spin iterations and add Load256(sx, px); Load256(sy, py);
Post the code and I'll check it.
Thank you for your input, but I think you missed the point: I asked about processing only the expected public keys from the start. Your proposed solution is equivalent to my second attempt in the _GetHash160Comp function.
Let me give you a scenario so you understand what I mean.
Let's assume the 66-bit private key range 3fa62700000000000:3fa627fffffffffff, so you have to scan 17592186044416 (2^44) private keys and generate a public key for each one, right? Now let's assume, for the sake of argument, that the private key lies 75% of the way through the keyspace, and that the public key which is hashed to obtain the BTC address starts with "02b7" (the compressed key is: 02b79ba3ab8ca1fd1399e27ce5bf337819ba34320653c7528084a6b52118c17b86).
Now let's assume that, once you compute all the public keys for the private key range, they split evenly between the prefixes 02 and 03. What if, based on that, you filter from the start, so that 50% of the keys are no longer stored and you store/load only the ones you want? Theoretically you would compute fewer keys, so the speed should double.
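The key count quoted for this range can be checked with a few lines of C++. This is only a sketch: the function and constant names are made up for illustration, and it relies on the assumption that both bounds share the same leading hex digit, so only their low 64 bits are needed.

```cpp
#include <cassert>
#include <cstdint>

// Number of keys in the inclusive range [startLow64, endLow64].
// Assumption: both bounds are identical above bit 63, so the high bits cancel
// and the low 64 bits of each bound are enough to get the size.
uint64_t rangeSizeInclusive(uint64_t startLow64, uint64_t endLow64) {
    assert(endLow64 >= startLow64);
    return endLow64 - startLow64 + 1;
}

// Low 64 bits of the two bounds from the scenario
// (3fa62700000000000 and 3fa627fffffffffff with the leading digit 3 dropped).
const uint64_t kStartLow64 = 0xfa62700000000000ULL;
const uint64_t kEndLow64   = 0xfa627fffffffffffULL;
```

Calling rangeSizeInclusive(kStartLow64, kEndLow64) gives 2^44, which is the 17592186044416 keys mentioned in the scenario.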
getGPUStartingKeys code:
int prefix02Count = 0; // Counter for keys starting with '02' (for debug only)
int prefix03Count = 0; // Counter for keys starting with '03' (for debug only)
for (int i = 0; i < nbThread; i++) {
    tRangeEnd2.Set(&tRangeStart2);
    tRangeEnd2.Add(&tRangeDiff);
    if (rKey <= 0)
        keys[i].Set(&tRangeStart2);
    else
        keys[i].Rand(&tRangeEnd2);
    tRangeStart2.Add(&tRangeDiff);
    Int k(keys + i);
    k.Add((uint64_t)(groupSize / 2)); // Starting key is at the middle of the group
    //p[i] = secp->ComputePublicKey(&k); // original line: compute the public key from the private key and store it in the p array
    Point pubKey = secp->ComputePublicKey(&k); // Compute the public key
    // Extract compressed public key bytes
    unsigned char publicKeyBytes[33];
    secp->GetPubKeyBytes(true, pubKey, publicKeyBytes);
    // Check the prefix of the public key
    if (publicKeyBytes[0] == 0x02) {
        prefix02Count++;
        p[i] = pubKey; // here we store in the array only the keys we want
        //std::string pubKeyAddr = secp->GetPublicKeyHex(true, p[i]);
        //printf("Public key %d: %s\n", i, pubKeyAddr.c_str()); // for debugging
    } else if (publicKeyBytes[0] == 0x03) {
        prefix03Count++;
    }
}
// Calculate percentages
//double totalKeys = nbThread; //for debug only
//double percentage02 = (prefix02Count / totalKeys) * 100.0;
//double percentage03 = (prefix03Count / totalKeys) * 100.0;
//printf("Total number of keys generated: %d\n", nbThread);
//printf("Percentage of keys starting with '02': %.2f%%\n", percentage02);
//printf("Percentage of keys starting with '03': %.2f%%\n", percentage03);
FindKeyGPU code:
...
getGPUStartingKeys(tRangeStart, tRangeEnd, g->GetGroupSize(), nbThread, keys, p);
ok = g->SetKeys(p); //will set only the keys we stored in p
....
How do you know if PubKey prefix = 02?
I think it's a waste of time to guess whether the prefix is 02 or 03. Whatever the script is, it must go through all the private keys. It is impossible to accelerate the search this way. The keys can be filtered, but filtering is not acceleration.
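The "filtering is not acceleration" argument can be put into a rough per-key cost model. The sketch below is a simplification with hypothetical names and relative, made-up cost units. It assumes each key costs one point derivation (needed just to learn the prefix) plus one hashing step, and that the filter only lets you skip the hashing step.

```cpp
#include <cassert>

// Simplified per-key cost model; all costs are relative, hypothetical units.
//   ecCost       : cost of deriving one public key point (required to know its prefix)
//   hashCost     : cost of hashing one public key (SHA-256 + RIPEMD-160)
//   keepFraction : fraction of keys that pass the prefix filter (0.5 for "02 only")
// Returns how many times faster the filtered run is than the unfiltered one.
double filterSpeedup(double ecCost, double hashCost, double keepFraction) {
    assert(ecCost >= 0.0 && hashCost > 0.0);
    assert(keepFraction > 0.0 && keepFraction <= 1.0);
    double unfiltered = ecCost + hashCost;               // every key is derived and hashed
    double filtered   = ecCost + keepFraction * hashCost; // every key is derived, some are hashed
    return unfiltered / filtered;
}
```

Under this model, filterSpeedup(ecCost, hashCost, 0.5) reaches 2 only when ecCost is 0. As soon as deriving the point costs anything, skipping half of the hashes gives less than double the speed.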
Sorry mate, but you don't seem to understand what I asked. Read my post again.
I started studying this program in 2020. Now I will try to explain to you what you are doing wrong.
1. The getGPUStartingKeys function builds an array of points with X and Y coordinates. There is no need to check them against the prefixes 02 and 03 in this function, because later, in the GPU code, when any point is added to the coordinates generated here, the new points may have the prefix 03 (odd Y), and you won't even know it. You need to filter in the GPU code itself. For this reason, you won't be able to add a new cmd argument.
2. There is no need to reduce nbThread to the number of filtered keys (by 50%); the remaining threads are just filled with zeros. The entire Point array p must be transferred to the GPU.
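Point 1 can be illustrated on a toy curve. The sketch below uses y^2 = x^3 + 7 over the tiny field F_17 (the same equation as secp256k1, but a field chosen only so the numbers stay readable) with the point (15, 13). All names are hypothetical and nothing here is taken from the program itself. It only shows that the prefix of a starting point says nothing about the prefix of the points obtained by adding to it.

```cpp
#include <cassert>
#include <cstdint>

// Toy curve y^2 = x^3 + 7 over F_17. Illustration only, not secp256k1.
const int64_t P = 17;

struct ToyPoint { int64_t x, y; };

// Reduce into [0, P), also for negative inputs.
int64_t modP(int64_t v) { return ((v % P) + P) % P; }

// Modular inverse via Fermat's little theorem: v^(P-2) mod P.
int64_t invP(int64_t v) {
    int64_t result = 1, base = modP(v);
    assert(base != 0);
    for (int64_t e = P - 2; e > 0; e >>= 1) {
        if (e & 1) result = modP(result * base);
        base = modP(base * base);
    }
    return result;
}

// Affine addition a + b.
// Assumes neither input is the point at infinity and that a != -b.
ToyPoint addPoints(ToyPoint a, ToyPoint b) {
    int64_t slope;
    if (a.x == b.x && a.y == b.y)
        slope = modP(3 * a.x * a.x * invP(2 * a.y));  // tangent slope (doubling)
    else
        slope = modP((b.y - a.y) * invP(b.x - a.x));  // chord slope
    int64_t x3 = modP(slope * slope - a.x - b.x);
    int64_t y3 = modP(slope * (a.x - x3) - a.y);
    return {x3, y3};
}

// Compressed-key prefix: 0x02 for even Y, 0x03 for odd Y.
int prefixOf(ToyPoint pt) { return 0x02 + static_cast<int>(pt.y & 1); }
```

Starting from G = (15, 13), which has prefix 03, addPoints(G, G) gives (2, 10) with prefix 02, and adding G once more gives (8, 3) with prefix 03 again. The prefix changes from one point to the next, so a filter applied only to the starting points does not carry over to the points derived from them.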
I suggested that you check the parity of the Y coordinate in the GPU code: uint8_t isOdd = (uint8_t)(py[0] & 1); It's simple.
It is not entirely clear what you want to increase further. This is the limit.
I know that piece of code: uint8_t isOdd = (uint8_t)(py[0] & 1). It depends on the parity of the Y coordinate: 0 means Y is even and 1 means Y is odd. The value is then used on this line, where the byte permutation is done: publicKeyBytes[0] = __byte_perm(x32[7], 0x2 + isOdd, 0x4321);
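To see what that permutation produces, here is a host-side sketch in plain C++ that emulates the CUDA __byte_perm intrinsic for this selector. The helper names are hypothetical. It also assumes that py[0] is the least significant limb of Y and that x32[7] holds the most significant 32 bits of X, as the quoted lines suggest.

```cpp
#include <cassert>
#include <cstdint>

// Host-side emulation of CUDA's __byte_perm(x, y, s):
// bytes 0-3 of the pool come from x, bytes 4-7 from y, and the low 3 bits
// of each selector nibble pick the pool byte for the matching result byte.
uint32_t bytePermHost(uint32_t x, uint32_t y, uint32_t s) {
    uint64_t pool = (static_cast<uint64_t>(y) << 32) | x;
    uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        uint32_t selector = (s >> (4 * i)) & 0x7;
        uint32_t byte = static_cast<uint32_t>((pool >> (8 * selector)) & 0xFF);
        result |= byte << (8 * i);
    }
    return result;
}

// First 32-bit word of the compressed public key: the prefix byte
// (0x02 or 0x03) followed by the top three bytes of X.
uint32_t compressedFirstWord(uint32_t xTopWord, uint64_t yLowLimb) {
    uint8_t isOdd = static_cast<uint8_t>(yLowLimb & 1);
    return bytePermHost(xTopWord, 0x2u + isOdd, 0x4321);
}
```

For the key in the scenario, X starts with b79ba3ab and Y is even, so compressedFirstWord(0xb79ba3ab, yLowLimb) with an even yLowLimb yields 0x02b79ba3. Those are the first four bytes of the compressed key 02b79ba3ab8c...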
What I want to increase further is the speed of computation.
Even with 16 x RTX 4090, I get only 76.8 Gk/s; it is useless to scan puzzle 66 at this speed.
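To put that rate into perspective, here is a back-of-the-envelope sketch with hypothetical helper names. It assumes that an n-bit puzzle range holds 2^(n-1) keys (the top bit is always set) and that the rate stays constant.

```cpp
#include <cassert>
#include <cmath>

// Seconds needed to test every key of an n-bit puzzle range at a fixed rate.
// Assumption: the range holds 2^(n-1) keys.
double fullScanSeconds(int puzzleBits, double keysPerSecond) {
    assert(puzzleBits >= 1 && keysPerSecond > 0.0);
    return std::ldexp(1.0, puzzleBits - 1) / keysPerSecond;
}

// Convert seconds to years (365.25 days per year).
double secondsToYears(double seconds) {
    return seconds / (365.25 * 24.0 * 3600.0);
}
```

secondsToYears(fullScanSeconds(66, 76.8e9)) comes out at a little over 15 years for a full scan of the 66-bit range, and about half of that on average if the key is equally likely to be anywhere in the range.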