Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 74. (Read 444043 times)

legendary
Activity: 1470
Merit: 1114
cpuminer-opt-3.6.3 released

git: https://github.com/JayDDee/cpuminer-opt

tarball: https://drive.google.com/file/d/0B0lVSGQYLJIZaXBWbS1xeXBKYkk/view?usp=sharing

Windows binaries: https://drive.google.com/file/d/0B0lVSGQYLJIZRFUtb2p1V3FCbzQ/view?usp=sharing

Fixed all known issues with SHA support on AMD Ryzen CPUs. There are still no Windows binaries with SHA support.

SHA support on AMD Ryzen CPUs requires gcc version 5 or higher and openssl 1.1
or higher. Additional compile options may also be required such as "-march=znver1" or "-msha".
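
As a rough way to check whether a given build and CPU line up, here is a minimal sketch. It relies on two facts: GCC and Clang predefine `__SHA__` when SHA code generation is enabled (for example by `-msha`), and the CPU reports the SHA extensions in CPUID leaf 7, subleaf 0, EBX bit 29. The function names are made up for illustration and are not part of cpuminer-opt.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* 1 if this file was compiled with SHA extensions enabled, else 0. */
int built_with_sha(void)
{
#if defined(__SHA__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if the running CPU reports the SHA extensions, else 0.
   On non-x86 targets this simply returns 0. */
int cpu_has_sha(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid_max(0, NULL) < 7)
        return 0;                       /* leaf 7 not available */
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return (int)((ebx >> 29) & 1u);     /* EBX bit 29 = SHA */
#else
    return 0;
#endif
}

/* Hypothetical helper: name the sha256 path such a build would end up on. */
const char *sha_path(void)
{
    if (built_with_sha() && cpu_has_sha())
        return "hardware SHA";
    if (built_with_sha())
        return "SHA build, but this CPU lacks SHA";
    return "software sha256";
}
```

A build made without `-msha` (or a `-march` that implies it) reports "software sha256" even on a Ryzen, which matches the miner's distinction between "CPU features" and "SW features" in its startup banner.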

SHA support only improves algos that use sha256. The following algos have SHA support:
m7m, lbry, sha256t, myr-gr, skein.
legendary
Activity: 1260
Merit: 1046
I believe I have found the root cause of the problems with m7m and SHA. The sph final function
re-initializes the context while the openssl version does not. The sph code in m7m took advantage of
this silent init while openssl was left with stale data.

I have also found that sha256t works at suprnova, but not at yiimp. This looks like a pool issue.

All issues with SHA should be fixed in the next release.
Congratulations, good work :-).
legendary
Activity: 1470
Merit: 1114
I believe I have found the root cause of the problems with m7m and SHA. The sph final function
re-initializes the context while the openssl version does not. The sph code in m7m took advantage of
this silent init while openssl was left with stale data.
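
To make the failure mode concrete, here is a minimal sketch with a toy streaming hash (FNV-1a) standing in for sha256. It is not the sph or openssl code. The two `final` variants only model the behaviour described above, and all names are invented for the example.

```c
#include <stdint.h>
#include <stddef.h>

/* Toy streaming hash context (FNV-1a, 32 bit). */
typedef struct { uint32_t state; } toy_ctx;

void toy_init(toy_ctx *c) { c->state = 2166136261u; }

void toy_update(toy_ctx *c, const void *data, size_t len)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        c->state ^= p[i];
        c->state *= 16777619u;
    }
}

/* Models the sph behaviour: return the digest, then silently re-init. */
uint32_t toy_final_reinit(toy_ctx *c)
{
    uint32_t digest = c->state;
    toy_init(c);
    return digest;
}

/* Models the openssl behaviour: return the digest, leave the state alone. */
uint32_t toy_final_plain(toy_ctx *c)
{
    return c->state;
}

/* Hash two messages back to back with one context and no init in between.
   Returns the digest of the second message as the caller would see it. */
uint32_t second_digest(int reinit_on_final,
                       const char *m1, size_t n1,
                       const char *m2, size_t n2)
{
    toy_ctx c;
    toy_init(&c);
    toy_update(&c, m1, n1);
    if (reinit_on_final)
        toy_final_reinit(&c);
    else
        toy_final_plain(&c);
    toy_update(&c, m2, n2);             /* no explicit init before reuse */
    return reinit_on_final ? toy_final_reinit(&c) : toy_final_plain(&c);
}
```

With the re-initializing `final`, the second digest equals a fresh hash of the second message. With the plain `final`, the second digest also covers the first message, so the result is wrong unless the caller adds an explicit init.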

I have also found that sha256t works at suprnova, but not at yiimp. This looks like a pool issue.

All issues with SHA should be fixed in the next release.
legendary
Activity: 1470
Merit: 1114
No, I have not been able to figure out how to compile it myself. I used a pre-compiled version from a helpful person via YouTube.

https://www.youtube.com/watch?v=TA-dvvumwGU

^ the download link is in the video description

There are no issues with xevan on Ryzen.

The only issues are with algos that use sha256 when compiled with SHA support. The prebuilt binaries referred
to in the video are mine. There is no binary compiled with SHA support.
sr. member
Activity: 1246
Merit: 274
I was able to mine with a Ryzen 1700X using v3.6.1 on the Xevan algorithm. It worked quite well based on my limited experience with CPU mining. I ran it for 13 1/2 hours and had only 1 rejected share out of 2003 total.

Did you build it from source?

I should have done that before releasing. M7m fails for me using openssl. I have something to work with now.

Excellent! I'm afraid figuring that one out might be far beyond my knowledge.

No, I have not been able to figure out how to compile it myself. I used a pre-compiled version from a helpful person via YouTube.

https://www.youtube.com/watch?v=TA-dvvumwGU

^ the download link is in the video description
newbie
Activity: 25
Merit: 0
I was able to mine with a Ryzen 1700X using v3.6.1 on the Xevan algorithm. It worked quite well based on my limited experience with CPU mining. I ran it for 13 1/2 hours and had only 1 rejected share out of 2003 total.

Did you build it from source?

I should have done that before releasing. M7m fails for me using openssl. I have something to work with now.

Excellent! I'm afraid figuring that one out might be far beyond my knowledge.
hero member
Activity: 561
Merit: 500
What is the best configuration for the Core i7 2630QM?
sr. member
Activity: 1246
Merit: 274
I was able to mine with a Ryzen 1700X using v3.6.1 on the Xevan algorithm. It worked quite well based on my limited experience with CPU mining. I ran it for 13 1/2 hours and had only 1 rejected share out of 2003 total.
legendary
Activity: 1470
Merit: 1114
You could remove the __SHA__ check and compile it with openssl sha256 instead of sph_256. It'll still run on other hardware just not accelerated.

I should have done that before releasing. M7m fails for me using openssl. I have something to work with now.
newbie
Activity: 25
Merit: 0
You could remove the __SHA__ check and compile it with openssl sha256 instead of sph_256. It'll still run on other hardware just not accelerated.
legendary
Activity: 1470
Merit: 1114
Yes, non-sha compiles perfectly.

So the SHA code works on some algos but not on m7m. This is going to be a lot of work, and I'm stuck
without a Ryzen of my own to test with.
newbie
Activity: 25
Merit: 0
Yes, non-sha compiles perfectly.
legendary
Activity: 1470
Merit: 1114
I guess I wasn't reading it correctly. Anyways I found another algo I forgot about, skein.
It also uses SHA but it doesn't work in 3.6.2.

I think you misunderstood my question. I want to know whether a non-SHA compile works on Ryzen
mining m7m, assuming the SHA build still fails.

I reviewed the code again and I don't see any difference in the SHA code vs the non-SHA code
after making the changes in the edit of my previous post.

Here is the fix for algo/skein/skein.c

Code:
9c9
< #if defined __SHA__
---
> #if defined (SHA_NI)
17c17
< #if defined __SHA__
---
> #if defined (SHA_NI)
29c29
< #if defined __SHA__
---
> #if defined (SHA_NI)
45c45
< #if defined __SHA__
---
> #if defined (SHA_NI)
newbie
Activity: 25
Merit: 0
It's hard to make sense of your pull request. This is the only error I found:

Code:
-    SHA256_CTX         ctx_fsha256;
+    SHA256_CTX         ctxf_sha256;  

The non-SHA code works for me on 6700K, can you confirm it works on Ryzen?

Yes, all the others I've tried so far (except sha256t) work on Ryzen. The changes I made are here: https://github.com/JayDDee/cpuminer-opt/pull/8/files

Edit: I still get the rejects after those changes. I haven't really dug into it outside of getting it to compile but those changes should have gotten it to work.
legendary
Activity: 1470
Merit: 1114
joblo, I submitted a couple of pull requests to get it to compile on Ryzen. m7m is rejecting shares when built for Ryzen.

Code:
cpuminer -a m7m -o stratum+tcp://xmg.suprnova.cc:7128 -u x -p x --cpu-priority 0 --api-bind 127.0.0.1:5500 --cpu-affinity 0x2 -D

         **********  cpuminer-opt 3.6.2  ***********
     A CPU miner with multi algo support and optimized for CPUs
     with AES_NI and AVX extensions.
     BTC donation address: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT
     Forked from TPruvot's cpuminer-multi with credits
     to Lucas Jones, elmad, palmd, djm34, pooler, ig0tik3d,
     Wolf0, Jeff Garzik and Optiminer.

CPU: AMD Ryzen 7 1800X Eight-Core Processor
CPU features: SSE2 AES AVX AVX2 SHA
SW built on Apr 16 2017 with GCC 6.3.0
SW features: SSE2 AES AVX AVX2 SHA
Algo features: SSE2 AES AVX SHA
Start mining with SSE2 AES AVX SHA

[2017-04-16 14:23:45] Binding process to cpu mask 2
[2017-04-16 14:23:45] 1 miner threads started, using 'm7m' algorithm.
[2017-04-16 14:23:45] Starting Stratum on stratum+tcp://xmg.suprnova.cc:7128
[2017-04-16 14:23:45] Binding thread 0 to cpu mask 2
[2017-04-16 14:23:45] Stratum session id: deadbeefcafebabe6e55180000000000
[2017-04-16 14:23:45] Stratum difficulty set to 8
[2017-04-16 14:23:48] stratum extranonce subscribe timed out
[2017-04-16 14:23:48] DEBUG: job_id='3fb1' extranonce2=00000000 ntime=aec4f358
[2017-04-16 14:23:48] m7m block 1283282, diff 4.488
[2017-04-16 14:23:55] CPU #0: 131.07 kH, 20.98 kH/s
[2017-04-16 14:24:06] DEBUG: [0 thread] Found share!
data   04000000ce6b8cd52940cbb5c79f0afe74ca0b2d4c5733152a0bdc87614fa230a43cbf7c6919c3c5e047a5ddfd618350d0cb63c87fec6e40538d324958ed2911645013d5aec4f358c709391c60670500
hash   a246467dddbf887e5967eaada783eadbe98042a325e7fdba764274c74c050000
target 000000000000000000000000000000000000000000000000000000e0ff1f0000
[2017-04-16 14:24:06] CPU #0: 223.07 kH, 20.95 kH/s
[2017-04-16 14:24:06] Rejected 1/1 (100.0%), 223.07 kH, 20.95 kH/s
[2017-04-16 14:24:06] reject reason: low difficulty share of 0.000016677225528600396
[2017-04-16 14:24:06] factor reduced to : 0.67

It's hard to make sense of your pull request. This is the only error I found:

Code:
-    SHA256_CTX         ctx_fsha256;
+    SHA256_CTX         ctxf_sha256;  

The non-SHA code works for me on 6700K, can you confirm it works on Ryzen?

Edit: OK, I found 2 more bugs, one of them a run-time bug that would break the hash.

Code:
17c17
< #if defined __SHA__
---
> #if defined (SHA_NI)
188c188
<     SHA256_CTX         ctxf_sha256;
---
>     SHA256_CTX         ctx_fsha256;
273c273
<         SHA256_Update(  &ctxf_sha256, bdata, bytes );
---
>         SHA256_Update(  &ctxf_sha256, bdata_p64, bytes );


Do you still get rejects with these fixed?
newbie
Activity: 25
Merit: 0
joblo, I submitted a couple of pull requests to get it to compile on Ryzen. m7m is rejecting shares when built for Ryzen.

Code:
cpuminer -a m7m -o stratum+tcp://xmg.suprnova.cc:7128 -u x -p x --cpu-priority 0 --api-bind 127.0.0.1:5500 --cpu-affinity 0x2 -D

         **********  cpuminer-opt 3.6.2  ***********
     A CPU miner with multi algo support and optimized for CPUs
     with AES_NI and AVX extensions.
     BTC donation address: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT
     Forked from TPruvot's cpuminer-multi with credits
     to Lucas Jones, elmad, palmd, djm34, pooler, ig0tik3d,
     Wolf0, Jeff Garzik and Optiminer.

CPU: AMD Ryzen 7 1800X Eight-Core Processor
CPU features: SSE2 AES AVX AVX2 SHA
SW built on Apr 16 2017 with GCC 6.3.0
SW features: SSE2 AES AVX AVX2 SHA
Algo features: SSE2 AES AVX SHA
Start mining with SSE2 AES AVX SHA

[2017-04-16 14:23:45] Binding process to cpu mask 2
[2017-04-16 14:23:45] 1 miner threads started, using 'm7m' algorithm.
[2017-04-16 14:23:45] Starting Stratum on stratum+tcp://xmg.suprnova.cc:7128
[2017-04-16 14:23:45] Binding thread 0 to cpu mask 2
[2017-04-16 14:23:45] Stratum session id: deadbeefcafebabe6e55180000000000
[2017-04-16 14:23:45] Stratum difficulty set to 8
[2017-04-16 14:23:48] stratum extranonce subscribe timed out
[2017-04-16 14:23:48] DEBUG: job_id='3fb1' extranonce2=00000000 ntime=aec4f358
[2017-04-16 14:23:48] m7m block 1283282, diff 4.488
[2017-04-16 14:23:55] CPU #0: 131.07 kH, 20.98 kH/s
[2017-04-16 14:24:06] DEBUG: [0 thread] Found share!
data   04000000ce6b8cd52940cbb5c79f0afe74ca0b2d4c5733152a0bdc87614fa230a43cbf7c6919c3c5e047a5ddfd618350d0cb63c87fec6e40538d324958ed2911645013d5aec4f358c709391c60670500
hash   a246467dddbf887e5967eaada783eadbe98042a325e7fdba764274c74c050000
target 000000000000000000000000000000000000000000000000000000e0ff1f0000
[2017-04-16 14:24:06] CPU #0: 223.07 kH, 20.95 kH/s
[2017-04-16 14:24:06] Rejected 1/1 (100.0%), 223.07 kH, 20.95 kH/s
[2017-04-16 14:24:06] reject reason: low difficulty share of 0.000016677225528600396
[2017-04-16 14:24:06] factor reduced to : 0.67
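
For reading the `hash` and `target` lines in a log like this, here is a minimal sketch of the comparison a miner makes before submitting. It assumes both values are dumped as raw little-endian bytes, which is suggested by both strings ending in zero bytes. The helper names are invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_val(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

/* Parse exactly 64 hex characters into 32 bytes. Returns 0 on success. */
int parse_hex32(const char *hex, uint8_t out[32])
{
    if (strlen(hex) != 64)
        return -1;
    for (int i = 0; i < 32; i++) {
        int hi = hex_val(hex[2 * i]);
        int lo = hex_val(hex[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    return 0;
}

/* Compare two 256-bit little-endian numbers.
   Returns 1 if hash <= target, otherwise 0. */
int hash_meets_target(const uint8_t hash[32], const uint8_t target[32])
{
    for (int i = 31; i >= 0; i--) {     /* most significant byte is last */
        if (hash[i] < target[i]) return 1;
        if (hash[i] > target[i]) return 0;
    }
    return 1;
}
```

Under that byte-order assumption, the hash in the log is below the target, which fits the miner printing "Found share!". The pool's "low difficulty share" reply would then mean the pool computed a different hash for the same data.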

Edit: groestl, dmd-gr, deep, cryptonight, lbry working on Ryzen.

sha256t not working.
legendary
Activity: 1470
Merit: 1114
cpuminer-opt-3.6.2 is released.

SHA acceleration is now supported on AMD Ryzen CPUs when compiled from source,
  Windows binaries not yet available.
Fixed groestl algo.
Fixed dmd-gr (Diamond) algo.
Fixed lbry compile error on Ryzen.
Added SHA support to m7m algo.
Hodl support for CPUs without AES has been removed, use legacy version.

See RELEASE_NOTES for new compile instructions for Ryzen.

Source code:

git: https://github.com/JayDDee/cpuminer-opt

tarball: https://drive.google.com/file/d/0B0lVSGQYLJIZV01jaFV5enpKcWs/view?usp=sharing

Windows binaries

https://drive.google.com/file/d/0B0lVSGQYLJIZZUdkVGcwdkUyUjg/view?usp=sharing

legendary
Activity: 1470
Merit: 1114
Thanks for the testing.

I found another issue: I forgot to implement SHA for m7m. I will do that, along with the changes in my previous post,
and release 3.6.2.
newbie
Activity: 25
Merit: 0
joblo,

sph512 is definitely faster.
Code:
cpuminer-openssl512 -a lbry -o stratum+tcp://yiimp.ccminer.org:3334 -u x --cpu-priority 0 --cpu-affinity 0xAAAA --api-bind 127.0.0.1:5008

         **********  cpuminer-opt 3.6.1  ***********
     A CPU miner with multi algo support and optimized for CPUs
     with AES_NI and AVX extensions.
     BTC donation address: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT
     Forked from TPruvot's cpuminer-multi with credits
     to Lucas Jones, elmad, palmd, djm34, pooler, ig0tik3d,
     Wolf0, Jeff Garzik and Optiminer.

CPU: AMD Ryzen 7 1800X Eight-Core Processor
CPU features: SSE2 AES AVX AVX2 SHA
SW built on Apr 15 2017 with GCC 6.3.0
SW features: SSE2 AES AVX AVX2 SHA
Algo features: SSE2 SHA
Start mining with SSE2 SHA

[2017-04-15 19:29:06] Binding process to cpu mask aaaa
[2017-04-15 19:29:06] Starting Stratum on stratum+tcp://yiimp.ccminer.org:3334
[2017-04-15 19:29:06] 8 miner threads started, using 'lbry' algorithm.
[2017-04-15 19:29:07] Stratum difficulty set to 32
[2017-04-15 19:29:16] lbry block 159116, diff 59631.536
[2017-04-15 19:29:16] CPU #5: 131.07 kH, 603.76 kH/s
[2017-04-15 19:29:16] CPU #2: 131.07 kH, 603.76 kH/s
[2017-04-15 19:29:16] CPU #3: 131.07 kH, 603.76 kH/s
[2017-04-15 19:29:16] CPU #6: 131.07 kH, 603.77 kH/s
[2017-04-15 19:29:16] CPU #7: 131.07 kH, 603.77 kH/s
[2017-04-15 19:29:16] CPU #1: 131.07 kH, 599.60 kH/s
[2017-04-15 19:29:16] CPU #0: 131.07 kH, 599.60 kH/s
[2017-04-15 19:29:16] CPU #4: 131.07 kH, 592.80 kH/s
[2017-04-15 19:29:42] CPU #2: 15.52 MH, 602.03 kH/s
[2017-04-15 19:29:42] CPU #6: 15.43 MH, 598.43 kH/s
[2017-04-15 19:29:42] CPU #7: 15.48 MH, 600.32 kH/s
[2017-04-15 19:29:42] CPU #3: 15.51 MH, 601.51 kH/s
[2017-04-15 19:29:42] CPU #5: 15.48 MH, 600.38 kH/s
[2017-04-15 19:29:42] CPU #0: 15.38 MH, 596.39 kH/s
[2017-04-15 19:29:42] CPU #4: 15.15 MH, 587.79 kH/s
[2017-04-15 19:29:42] CPU #1: 15.43 MH, 598.59 kH/s
[2017-04-15 19:29:54] CPU #4: 7437.82 kH, 590.21 kH/s
[2017-04-15 19:29:55] Accepted 1/1 (100%), 115.67 MH, 4787.86 kH/s

cpuminer-sph512 -a lbry -o stratum+tcp://yiimp.ccminer.org:3334 -u x --cpu-priority 0 --cpu-affinity 0xAAAA --api-bind 127.0.0.1:5008

         **********  cpuminer-opt 3.6.1  ***********
     A CPU miner with multi algo support and optimized for CPUs
     with AES_NI and AVX extensions.
     BTC donation address: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT
     Forked from TPruvot's cpuminer-multi with credits
     to Lucas Jones, elmad, palmd, djm34, pooler, ig0tik3d,
     Wolf0, Jeff Garzik and Optiminer.

CPU: AMD Ryzen 7 1800X Eight-Core Processor
CPU features: SSE2 AES AVX AVX2 SHA
SW built on Apr 15 2017 with GCC 6.3.0
SW features: SSE2 AES AVX AVX2 SHA
Algo features: SSE2 SHA
Start mining with SSE2 SHA

[2017-04-15 19:30:30] Binding process to cpu mask aaaa
[2017-04-15 19:30:30] Starting Stratum on stratum+tcp://yiimp.ccminer.org:3334
[2017-04-15 19:30:30] 8 miner threads started, using 'lbry' algorithm.
[2017-04-15 19:30:31] Stratum difficulty set to 32
[2017-04-15 19:30:49] lbry block 159117, diff 59236.756
[2017-04-15 19:30:51] CPU #1: 131.07 kH, 610.82 kH/s
[2017-04-15 19:30:51] CPU #2: 131.07 kH, 607.97 kH/s
[2017-04-15 19:30:51] CPU #7: 131.07 kH, 607.97 kH/s
[2017-04-15 19:30:51] CPU #0: 131.07 kH, 603.76 kH/s
[2017-04-15 19:30:51] CPU #5: 131.07 kH, 600.98 kH/s
[2017-04-15 19:30:51] CPU #6: 131.07 kH, 592.81 kH/s
[2017-04-15 19:30:51] CPU #3: 131.07 kH, 587.48 kH/s
[2017-04-15 19:30:51] CPU #4: 131.07 kH, 556.23 kH/s
[2017-04-15 19:31:27] CPU #6: 22.13 MH, 606.61 kH/s
[2017-04-15 19:31:27] CPU #5: 22.28 MH, 610.39 kH/s
[2017-04-15 19:31:27] CPU #2: 22.28 MH, 610.54 kH/s
[2017-04-15 19:31:27] CPU #3: 22.20 MH, 608.49 kH/s
[2017-04-15 19:31:27] CPU #0: 22.10 MH, 605.54 kH/s
[2017-04-15 19:31:27] CPU #4: 21.82 MH, 598.28 kH/s
[2017-04-15 19:31:27] CPU #7: 22.30 MH, 611.05 kH/s
[2017-04-15 19:31:27] CPU #1: 22.32 MH, 611.60 kH/s
[2017-04-15 19:31:28] lbry block 159118, diff 63018.528
[2017-04-15 19:31:28] CPU #4: 596.54 kH, 599.39 kH/s
[2017-04-15 19:31:28] CPU #5: 609.97 kH, 611.66 kH/s
[2017-04-15 19:31:28] CPU #6: 606.28 kH, 607.65 kH/s
[2017-04-15 19:31:28] CPU #3: 608.95 kH, 611.24 kH/s
[2017-04-15 19:31:28] CPU #7: 608.56 kH, 611.78 kH/s
[2017-04-15 19:31:28] CPU #2: 607.96 kH, 609.95 kH/s
[2017-04-15 19:31:28] CPU #0: 603.01 kH, 605.59 kH/s
[2017-04-15 19:31:28] CPU #1: 606.08 kH, 609.90 kH/s
[2017-04-15 19:31:48] CPU #4: 11.87 MH, 608.14 kH/s
[2017-04-15 19:31:48] CPU #2: 11.93 MH, 611.25 kH/s
[2017-04-15 19:31:48] CPU #3: 11.91 MH, 610.20 kH/s
[2017-04-15 19:31:48] CPU #5: 11.95 MH, 612.39 kH/s
[2017-04-15 19:31:48] CPU #0: 11.88 MH, 608.82 kH/s
[2017-04-15 19:31:48] CPU #7: 11.90 MH, 609.79 kH/s
[2017-04-15 19:31:48] CPU #1: 11.78 MH, 603.48 kH/s
[2017-04-15 19:31:48] CPU #6: 11.83 MH, 606.27 kH/s
[2017-04-15 19:32:09] CPU #5: 12.89 MH, 612.63 kH/s
[2017-04-15 19:32:09] CPU #6: 12.80 MH, 608.19 kH/s
[2017-04-15 19:32:09] CPU #0: 12.85 MH, 610.72 kH/s
[2017-04-15 19:32:09] CPU #2: 12.87 MH, 611.55 kH/s
[2017-04-15 19:32:09] CPU #4: 12.62 MH, 599.64 kH/s
[2017-04-15 19:32:09] CPU #1: 12.74 MH, 605.55 kH/s
[2017-04-15 19:32:09] CPU #7: 12.88 MH, 611.98 kH/s
[2017-04-15 19:32:09] CPU #3: 12.84 MH, 610.24 kH/s
[2017-04-15 19:32:30] CPU #1: 12.74 MH, 605.29 kH/s
[2017-04-15 19:32:30] CPU #5: 12.72 MH, 604.45 kH/s
[2017-04-15 19:32:30] CPU #6: 12.87 MH, 611.55 kH/s
[2017-04-15 19:32:30] CPU #7: 12.80 MH, 608.10 kH/s
[2017-04-15 19:32:30] CPU #4: 12.83 MH, 609.77 kH/s
[2017-04-15 19:32:30] CPU #2: 12.86 MH, 611.22 kH/s
[2017-04-15 19:32:30] CPU #0: 12.77 MH, 606.66 kH/s
[2017-04-15 19:32:30] CPU #3: 12.84 MH, 610.10 kH/s
[2017-04-15 19:33:15] lbry block 159119, diff 66103.887
[2017-04-15 19:33:15] CPU #1: 26.64 MH, 590.88 kH/s
[2017-04-15 19:33:15] CPU #6: 27.45 MH, 608.92 kH/s
[2017-04-15 19:33:15] CPU #3: 27.38 MH, 607.36 kH/s
[2017-04-15 19:33:15] CPU #0: 27.35 MH, 606.72 kH/s
[2017-04-15 19:33:15] CPU #5: 27.41 MH, 607.97 kH/s
[2017-04-15 19:33:15] CPU #7: 27.32 MH, 605.98 kH/s
[2017-04-15 19:33:15] CPU #4: 27.45 MH, 608.83 kH/s
[2017-04-15 19:33:15] CPU #2: 26.68 MH, 591.83 kH/s
[2017-04-15 19:33:25] CPU #6: 5977.56 kH, 606.24 kH/s
[2017-04-15 19:33:25] Accepted 1/1 (100%), 196.22 MH, 4825.79 kH/s
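
To put a number on the difference, here is a small sketch that compares the totals from the two "Accepted" lines above: 4787.86 kH/s for the cpuminer-openssl512 run and 4825.79 kH/s for the cpuminer-sph512 run. The function name is invented for the example.

```c
/* Relative gain of `candidate` over `baseline`, in percent. */
double percent_gain(double baseline, double candidate)
{
    return (candidate - baseline) / baseline * 100.0;
}
```

`percent_gain(4787.86, 4825.79)` comes out at roughly 0.8 percent in favour of sph512. Each run covers only one accepted share and the runs differ in length, so this is a rough indication, not a benchmark.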

The hodl.cpp fix that I used:
Code:
int scanhash_hodl( int threadNumber, struct work* work, uint32_t max_nonce,
                   uint64_t *hashes_done )
{
    unsigned char *mainMemoryPsuedoRandomData = hodl_scratchbuf;
    uint32_t *pdata = work->data;
    uint32_t *ptarget = work->target;

    //retrieve target
    std::stringstream s;
    for (int i = 7; i>=0; i--)
      s << strprintf("%08x", ptarget[i]);

    //retrieve previous hash
    std::stringstream p;
    for (int i = 0; i < 8; i++)
      p << strprintf("%08x", swab32(pdata[8 - i]));

    //retrieve merkleroot
    std::stringstream m;
    for (int i = 0; i < 8; i++)
      m << strprintf("%08x", swab32(pdata[16 - i]));

    CBlock pblock;
    pblock.SetNull();

    pblock.nVersion=swab32(pdata[0]);
    pblock.nNonce=swab32(pdata[19]);
    pblock.nTime=swab32(pdata[17]);
    pblock.nBits=swab32(pdata[18]);
    pblock.hashPrevBlock=uint256S(p.str());
    pblock.hashMerkleRoot=uint256S(m.str());
    uint256 hashTarget=uint256S(s.str());
    int collisions=0;
    uint256 hash;

//Begin AES Search
        //Allocate temporary memory
                uint32_t cacheMemorySize = (1<<L2CACHE_TARGET); //2^12 = 4096 bytes
                uint32_t comparisonSize=(1<<(PSUEDORANDOM_DATA_SIZE-L2CACHE_TARGET)); //2^(30-12) = 256K
                unsigned char *cacheMemoryOperatingData;
                unsigned char *cacheMemoryOperatingData2;
                cacheMemoryOperatingData=new unsigned char[cacheMemorySize+16];
                cacheMemoryOperatingData2=new unsigned char[cacheMemorySize];
                //Create references to data as 32 bit arrays
                uint32_t* cacheMemoryOperatingData32 = (uint32_t*)cacheMemoryOperatingData;
                uint32_t* cacheMemoryOperatingData322 = (uint32_t*)cacheMemoryOperatingData2;

                //Search for pattern in psuedorandom data
                unsigned char key[32] = {0};
                unsigned char iv[AES_BLOCK_SIZE];
                int outlen1, outlen2;

                //Iterate over the data
//                int searchNumber=comparisonSize/totalThreads;
                int searchNumber = comparisonSize / opt_n_threads;
                int startLoc=threadNumber*searchNumber;
                EVP_CIPHER_CTX* ctx = NULL;
                  for(int32_t k = startLoc;k<startLoc+searchNumber && !work_restart[threadNumber].restart;k++){
                    //copy data to first l2 cache
                    memcpy((char*)&cacheMemoryOperatingData[0], (char*)&mainMemoryPsuedoRandomData[k*cacheMemorySize], cacheMemorySize);
                    for(int j=0;j                        //use last 4 bytes of first cache as next location
                        uint32_t nextLocation = cacheMemoryOperatingData32[(cacheMemorySize/4)-1]%comparisonSize;
                        //Copy data from indicated location to second l2 cache -
                        memcpy((char*)&cacheMemoryOperatingData2[0], (char*)&mainMemoryPsuedoRandomData[nextLocation*cacheMemorySize], cacheMemorySize);
                        //XOR location data into second cache
                        for(uint32_t i = 0; i < cacheMemorySize/4; i++)
                            cacheMemoryOperatingData322[i] = cacheMemoryOperatingData32[i] ^ cacheMemoryOperatingData322[i];
                        memcpy(key,(unsigned char*)&cacheMemoryOperatingData2[cacheMemorySize-32],32);
                        memcpy(iv,(unsigned char*)&cacheMemoryOperatingData2[cacheMemorySize-AES_BLOCK_SIZE],AES_BLOCK_SIZE);
                        EVP_EncryptInit(ctx, EVP_aes_256_cbc(), key, iv);
                        EVP_EncryptUpdate(ctx, cacheMemoryOperatingData, &outlen1, cacheMemoryOperatingData2, cacheMemorySize);
                        EVP_EncryptFinal(ctx, cacheMemoryOperatingData + outlen1, &outlen2);
                        EVP_CIPHER_CTX_free(ctx);
                    }
                    //use last X bits as solution
                    uint32_t solution=cacheMemoryOperatingData32[(cacheMemorySize/4)-1]%comparisonSize;
                    if(solution<1000){
                        uint32_t proofOfCalculation=cacheMemoryOperatingData32[(cacheMemorySize/4)-2];
                        pblock.nStartLocation = k;
                        pblock.nFinalCalculation = proofOfCalculation;
                        hash = Hash(BEGIN(pblock.nVersion), END(pblock.nFinalCalculation));
                        collisions++;
                        if (UintToArith256(hash) <= UintToArith256(hashTarget) && !work_restart[threadNumber].restart){
                            pdata[21] = swab32(pblock.nFinalCalculation);
                            pdata[20] = swab32(pblock.nStartLocation);
                            *hashes_done = collisions;
                            //free memory
                            delete [] cacheMemoryOperatingData;
                            delete [] cacheMemoryOperatingData2;
                            return 1;
                        }
                    }
                  }

    //free memory
    delete [] cacheMemoryOperatingData;
    delete [] cacheMemoryOperatingData2;
    *hashes_done = collisions;
    return 0;
}
legendary
Activity: 1470
Merit: 1114

The compile errors are mostly related to openssl 1.1.x. lbry and hodl are the only algos that prevent it from compiling.

Code:
algo/lbry.c:52:4: error: unknown type name 'sph_sha512_context'
    sph_sha512_context      ctx_sha512 __attribute__ ((aligned (64)));
My fix was to implement an openssl compliant sha512 instance. And fiddle with the

Code:
algo/hodl/hodl.cpp:98:18: error: aggregate 'EVP_CIPHER_CTX ctx' has incomplete type and cannot be defined
   EVP_CIPHER_CTX ctx;
My fix was to instantiate the context as a pointer and pass it to each function as is and to change the cleanup routine to EVP_CIPHER_CTX_free(ctx).
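
The "incomplete type" error comes from openssl 1.1 making `EVP_CIPHER_CTX` opaque: callers can only hold a pointer to it. Here is a minimal sketch of that pattern using a made-up `toy_cipher_ctx` type in place of the real one, so it builds without openssl. With the real library the pointer would come from `EVP_CIPHER_CTX_new()` and be released once with `EVP_CIPHER_CTX_free()`.

```c
#include <stdlib.h>

/* Public view: only a declaration, so the size is unknown to callers.
   Writing `toy_cipher_ctx ctx;` here would be an incomplete-type error. */
struct toy_cipher_ctx;
typedef struct toy_cipher_ctx toy_cipher_ctx;

toy_cipher_ctx *toy_ctx_new(void);
void toy_ctx_step(toy_cipher_ctx *ctx);
int toy_ctx_rounds(const toy_cipher_ctx *ctx);
void toy_ctx_free(toy_cipher_ctx *ctx);

/* Caller side: allocate once, reuse across iterations, free once. */
int use_context(int iterations)
{
    toy_cipher_ctx *ctx = toy_ctx_new();
    if (ctx == NULL)
        return -1;
    for (int i = 0; i < iterations; i++)
        toy_ctx_step(ctx);              /* context stays valid in the loop */
    int rounds = toy_ctx_rounds(ctx);
    toy_ctx_free(ctx);                  /* single cleanup after the loop */
    return rounds;
}

/* Library side, normally hidden in another file. */
struct toy_cipher_ctx { int rounds; };

toy_cipher_ctx *toy_ctx_new(void)
{
    return calloc(1, sizeof(toy_cipher_ctx));
}

void toy_ctx_step(toy_cipher_ctx *ctx) { ctx->rounds++; }

int toy_ctx_rounds(const toy_cipher_ctx *ctx) { return ctx->rounds; }

void toy_ctx_free(toy_cipher_ctx *ctx) { free(ctx); }
```

The point of the sketch is the lifetime: the context is created before the loop and freed after it, so every iteration works on a valid context.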

The compile log is at https://github.com/coinbutter/cpuminer-opt/blob/master/Compile%20log.txt because it's too long to post here. Some of the other changes I made mostly for my own convenience. I didn't change any of the min definitions in the uploaded code.

In other news, groestl works in 3.6.1 if the AES portions are commented out. Have you found out anything about DMD not working? I noticed that in cpuminer-multi groestlcoin gets "SHA256(sctx->job.coinbase, (int) sctx->job.coinbase_size, merkle_root);" on line 1697 of cpu-miner.c and DMD gets the default of "sha256d(merkle_root, sctx->job.coinbase, (int) sctx->job.coinbase_size);" on line 1706.

Good stuff, found 2 new bugs and one incompatibility.

Lbry: coding error on my part, sph-sha2.h is still needed for sha512. I have found that the sph implementation of SHA
is faster than openssl (prior to HW SHA) and SHA512 doesn't have HW support. Unless your testing shows openssl SHA512
is faster than sph SHA512 I'll stick with the sph version in 3.6.2.

Hodl: Your compile log had no errors so it didn't help. The error suggests something missing in openssl, maybe an #include issue.
I'm tempted to relegate non-aes hodl to the legacy version, it's simpler than trying to fix it.

Groestl: The AES version is broken in 3.6.1, fix already coded for 3.6.2.

Dmd: Good catch, I missed that completely, simple to fix, will split groestl and dmd-gr in 3.6.2.

I will also remove the SHA_NI hook and just rely on the builtin __SHA__ macro.

Edit: will definitely relegate non-aes hodl to the legacy version. It's the only c++ code in the miner and removing
it will make things simpler.