Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 145.

legendary
Activity: 1470
Merit: 1114
Joblo ...

OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least.  The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....

1) Comment out or remove the macro definitions for min and max in miner.h
2) add a local definition of the min macro to decred.c

After that I was able to get it to compile on one of my Haswell systems.  Still having trouble compiling on an older westmere based system due to some AES256CBC complaints.

Bob

Good work. I'll make the change proactively.

The AES256CBC problem may be related to AVX code in hodl-wolf. IIRC either Nehalem or Westmere have AES but not AVX. I may have
to tighten up the checking to force it to use the unoptimized version. Did you do a native compile? Have you tried corei7?
"gcc -Q -march=native --help=target" will tell you which arch is the default for native.
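As a complement to that command, a small C helper can report which instruction-set macros the compiler predefined for the chosen -march. This is only a sketch: `__SSE2__`, `__AES__`, `__AVX__` and `__AVX2__` are GCC's predefined macros, but `compiled_features` is a made-up name and not part of cpuminer-opt.

```c
#include <string.h>

/* Hypothetical helper, not part of cpuminer-opt: builds a string listing
   the instruction-set macros the compiler predefined for this build. */
static const char *compiled_features(void)
{
    static char buf[64];
    buf[0] = '\0';            /* reset so repeated calls do not append twice */
#ifdef __SSE2__
    strcat(buf, "SSE2 ");
#endif
#ifdef __AES__
    strcat(buf, "AES ");
#endif
#ifdef __AVX__
    strcat(buf, "AVX ");
#endif
#ifdef __AVX2__
    strcat(buf, "AVX2 ");
#endif
    return buf;
}
```

Printing the result from a build made with -march=westmere would be expected to list SSE2 and AES but not AVX, assuming the GCC version in use knows that arch name.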

Westmere supports AES-NI but not AVX.  Nehalem doesn't support either.

I've successfully compiled for all the AVX platforms on my laptop - haswell corei5 but can't compile with march=westmere or with native on my dev virtual machine which is running on some older servers (Dual Westmere Hex-core) that I wanted to test on.  setting march=haswell on the older VM works fine and compiles haswell optimized code (which can't run locally obviously).

Appears to be some sort of a conflict in the capabilities check on the HODL AES code.

You're right, I only check for AES_NI, not AVX. This may affect some other algos that also have AVX code mixed in with AES.
If I can identify which ones are pure AES I can make a distinction otherwise I'll have to use non-AES code unless the CPU also
supports AVX.

I don't have the necessary HW to test but if you don't mind doing a little more work it would help a lot. There are
three groups of AES code. There is code used only by hodl, code only used by cryptonight and code shared among many algos
including x11. Those three should cover the entire spectrum of AES optimized code. Those that work on your Westmere can have
AES enabled without AVX. Those like Hodl will require a CPU with AVX before AES can be enabled.

It just occurred to me that you probably did a native compile. Do you know what arch the compiler mapped that to? A Windows
user reported success using the corei7 build on a Nehalem CPU. If yours is different you could try -march=corei7.

This will raise my confidence in the fix since I can't test it on the right HW.

Might have gotten lost in the thread above, but I was able to compile with AVX level features on the Westmere-based system; they obviously just don't get detected or work.  The errors only seem to occur when I set the march to westmere or lower.

I've been looking through the code for hodl to try to figure out what seems to be causing the problem. It primarily seems to be from different versions of the SHA256CBC algorithm that it is attempting to compile in simultaneously. 

In going through though, I've come across a question for you with regards to your capabilities tests --- you seem to be excluding a lot of the AES_NI optimized code by wrapping it in the AVX segment even though there don't seem to be any AVX instructions in those code segments.  I haven't had a chance to look through it thoroughly, but on a quick scan the only part of wolf's code that utilizes AVX instructions is the SHA512 function used to generate the scratchpad.  The rest of the code should be able to be under #ifndef NO_AES_NI.
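The suggestion above, gating the AES code and the AVX code separately, could look roughly like this. It is a sketch under assumptions: `NO_AES_NI` is the macro named in the post, `__AES__` and `__AVX__` are GCC's predefined macros, and the `USE_*` macros and the two functions are hypothetical stand-ins, not the actual hodl-wolf symbols.

```c
/* Sketch only: the names below are hypothetical, not the hodl-wolf code. */

/* The AES path depends on AES-NI alone. */
#if defined(__AES__) && !defined(NO_AES_NI)
  #define USE_AES_PATH 1
#else
  #define USE_AES_PATH 0
#endif

/* Only the SHA512 scratchpad generator depends on AVX. */
#if defined(__AVX__)
  #define USE_AVX_SHA512 1
#else
  #define USE_AVX_SHA512 0
#endif

/* Report which variant of each piece was compiled in. */
static const char *aes_path_name(void)
{
    return USE_AES_PATH ? "aes-ni" : "generic";
}

static const char *sha512_path_name(void)
{
    return USE_AVX_SHA512 ? "avx" : "generic";
}
```

With the two checks independent, a Westmere build would keep the AES-NI path and fall back only for the SHA512 part, instead of losing both.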

Bob

Can you post the actual errors? I've been speculating it was an AVX issue. If you're seeing multiple definitions I may
be going in the wrong direction.

newbie
Activity: 14
Merit: 0
Joblo ---

I flattened the code in algo/hodl/aes.c and algo/hodl/hodl-wolf.c to remove the "non-AVX" code versions for everything but the SHA512 function at the top of hodl-wolf.c and the code now compiles and runs for -march=westmere.

For cpuminer-corei7.exe from your download, mining HODL to nicehash with 12 threads isolated to the six cores on one CPU, I am getting performance in the 120-130 H/s range.

For cpuminer-westmere.exe, which I compiled with the above modifications and ran with the same configuration on the other CPU in my server, I am seeing 240-250 H/s and it indicates AES optimizations ARE enabled.

newbie
Activity: 14
Merit: 0
Joblo ...

OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least.  The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....

1) Comment out or remove the macro definitions for min and max in miner.h
2) add a local definition of the min macro to decred.c

After that I was able to get it to compile on one of my Haswell systems.  Still having trouble compiling on an older westmere based system due to some AES256CBC complaints.

Bob

Good work. I'll make the change proactively.

The AES256CBC problem may be related to AVX code in hodl-wolf. IIRC either Nehalem or Westmere have AES but not AVX. I may have
to tighten up the checking to force it to use the unoptimized version. Did you do a native compile? Have you tried corei7?
"gcc -Q -march=native --help=target" will tell you which arch is the default for native.

Westmere supports AES-NI but not AVX.  Nehalem doesn't support either.

I've successfully compiled for all the AVX platforms on my laptop - haswell corei5 but can't compile with march=westmere or with native on my dev virtual machine which is running on some older servers (Dual Westmere Hex-core) that I wanted to test on.  setting march=haswell on the older VM works fine and compiles haswell optimized code (which can't run locally obviously).

Appears to be some sort of a conflict in the capabilities check on the HODL AES code.

You're right, I only check for AES_NI, not AVX. This may affect some other algos that also have AVX code mixed in with AES.
If I can identify which ones are pure AES I can make a distinction otherwise I'll have to use non-AES code unless the CPU also
supports AVX.

I don't have the necessary HW to test but if you don't mind doing a little more work it would help a lot. There are
three groups of AES code. There is code used only by hodl, code only used by cryptonight and code shared among many algos
including x11. Those three should cover the entire spectrum of AES optimized code. Those that work on your Westmere can have
AES enabled without AVX. Those like Hodl will require a CPU with AVX before AES can be enabled.

It just occurred to me that you probably did a native compile. Do you know what arch the compiler mapped that to? A Windows
user reported success using the corei7 build on a Nehalem CPU. If yours is different you could try -march=corei7.

This will raise my confidence in the fix since I can't test it on the right HW.

Might have gotten lost in the thread above, but I was able to compile with AVX level features on the Westmere-based system; they obviously just don't get detected or work.  The errors only seem to occur when I set the march to westmere or lower.

I've been looking through the code for hodl to try to figure out what seems to be causing the problem. It primarily seems to be from different versions of the SHA256CBC algorithm that it is attempting to compile in simultaneously. 

In going through though, I've come across a question for you with regards to your capabilities tests --- you seem to be excluding a lot of the AES_NI optimized code by wrapping it in the AVX segment even though there don't seem to be any AVX instructions in those code segments.  I haven't had a chance to look through it thoroughly, but on a quick scan the only part of wolf's code that utilizes AVX instructions is the SHA512 function used to generate the scratchpad.  The rest of the code should be able to be under #ifndef NO_AES_NI.

Bob
legendary
Activity: 1470
Merit: 1114
Joblo ...

OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least.  The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....

1) Comment out or remove the macro definitions for min and max in miner.h
2) add a local definition of the min macro to decred.c

After that I was able to get it to compile on one of my Haswell systems.  Still having trouble compiling on an older westmere based system due to some AES256CBC complaints.

Bob

Good work. I'll make the change proactively.

The AES256CBC problem may be related to AVX code in hodl-wolf. IIRC either Nehalem or Westmere have AES but not AVX. I may have
to tighten up the checking to force it to use the unoptimized version. Did you do a native compile? Have you tried corei7?
"gcc -Q -march=native --help=target" will tell you which arch is the default for native.

Westmere supports AES-NI but not AVX.  Nehalem doesn't support either.

I've successfully compiled for all the AVX platforms on my laptop - haswell corei5 but can't compile with march=westmere or with native on my dev virtual machine which is running on some older servers (Dual Westmere Hex-core) that I wanted to test on.  setting march=haswell on the older VM works fine and compiles haswell optimized code (which can't run locally obviously).

Appears to be some sort of a conflict in the capabilities check on the HODL AES code.

You're right, I only check for AES_NI, not AVX. This may affect some other algos that also have AVX code mixed in with AES.
If I can identify which ones are pure AES I can make a distinction otherwise I'll have to use non-AES code unless the CPU also
supports AVX.

I don't have the necessary HW to test but if you don't mind doing a little more work it would help a lot. There are
three groups of AES code. There is code used only by hodl, code only used by cryptonight and code shared among many algos
including x11. Those three should cover the entire spectrum of AES optimized code. Those that work on your Westmere can have
AES enabled without AVX. Those like Hodl will require a CPU with AVX before AES can be enabled.

It just occurred to me that you probably did a native compile. Do you know what arch the compiler mapped that to? A Windows
user reported success using the corei7 build on a Nehalem CPU. If yours is different you could try -march=corei7.

This will raise my confidence in the fix since I can't test it on the right HW.
legendary
Activity: 1470
Merit: 1114
I have an update on supporting cryptonight at nicehash.

I implemented the changes and they seem to work and they don't break other pools so there was no need to
implement pool-specific code.

My test results on Nicehash are erratic, possibly a pool issue. I was initially submitting 20-25% rejects but that seems
to have stopped. The latest session is up to 36 accepts @ 100%, and counting.

I also experienced periods of extremely frequent thread hashrate output from one or 2 threads, around 100 per second, showing a hash count
of 1 with a normal hashrate. This occurred twice at startup and I killed it. It also happened mid session and cleared itself.
This is not associated with the rejects, I still submit valid shares but they show a lower than normal hashrate.

This is what it looks like:

Code:
[2016-08-25 12:23:28] CPU #0: 1 H, 72.57 H/s
[2016-08-25 12:23:28] CPU #1: 1 H, 56.63 H/s
[2016-08-25 12:23:28] CPU #0: 1 H, 55.92 H/s
[2016-08-25 12:23:28] CPU #1: 1 H, 64.27 H/s
[2016-08-25 12:23:28] CPU #0: 1 H, 67.63 H/s
[2016-08-25 12:23:28] CPU #1: 1 H, 54.73 H/s
[2016-08-25 12:23:28] CPU #0: 1 H, 55.19 H/s
[2016-08-25 12:23:28] CPU #1: 1 H, 71.66 H/s
[2016-08-25 12:23:28] CPU #0: 1 H, 69.21 H/s

More testing to do. 
legendary
Activity: 1470
Merit: 1114
@joblo ---

Mining LYRA2RE to NiceHash:


CPU: Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
CPU features: SSE2 AES
SW built on Aug 24 2016 with GCC 4.8.3
SW features: SSE2
Algo features: SSE2 AES AVX AVX2
AES not available, starting mining with SSE2 optimizations...

Mining HODL to NiceHash:

CPU: Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
CPU features: SSE2 AES
SW built on Aug 24 2016 with GCC 4.8.3
SW features: SSE2
Algo features: SSE2 AES AVX AVX2
AES not available, starting mining with SSE2 optimizations...



Excellent, thanks.
newbie
Activity: 14
Merit: 0
@joblo ---

Mining LYRA2RE to NiceHash:


CPU: Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
CPU features: SSE2 AES
SW built on Aug 24 2016 with GCC 4.8.3
SW features: SSE2
Algo features: SSE2 AES AVX AVX2
AES not available, starting mining with SSE2 optimizations...

Mining HODL to NiceHash:

CPU: Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
CPU features: SSE2 AES
SW built on Aug 24 2016 with GCC 4.8.3
SW features: SSE2
Algo features: SSE2 AES AVX AVX2
AES not available, starting mining with SSE2 optimizations...

legendary
Activity: 1470
Merit: 1114
corei7          nehalem, westmere

I did a test on Nehalem. cpuminer-corei7.exe works well for m7m and cryptonight.
Joblo, if you need another test on my CPU, just PM me if I don't see your request in the thread within 24h.

Thanks for testing. Could you post the capabilities check output when the miner starts? I'd like to confirm
it is correct.
legendary
Activity: 1470
Merit: 1114
@joblo --- I'd be happy to test the AES w/o AVX configurations --- do you have a patch file or a link to the downloaded files with the patches?



Thanks for the offer.

I assume you have a Westmere CPU, and you use Windows. You can test with the current cpuminer-corei7 binary. As another user
discovered, the Hodl algo failed to compile on his Westmere, so I don't suggest you attempt hodl. I also have a fix coming shortly for that algo.

Cryptonight also has AES code that may or may not work on a Westmere. There is another block of AES code shared among
many algos including x11. Both of these would be good tests. Don't test cryptonight on Nicehash yet, I'm still working on that.

If you did test either or both of them please include in your report whether the miner chose to use the AES code:

Code:
Start mining with SSE2 AES AVX

It will display both AES and AVX or neither. At this time I'm treating them as a single architecture level. If your tests
prove there is AES code that does not also contain AVX I can split them up and improve performance on Westmere on
some algos. Otherwise Westmere will have to dumb down and use the SSE2 code only, as had to be done with Hodl.
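To make the proposed split concrete, here is a minimal sketch of a runtime decision that treats AES and AVX as separate capabilities. The bit positions (ECX bit 25 for AES-NI, bit 28 for AVX in CPUID leaf 1) are from the Intel manuals; `opt_level`, `pick_level` and the `algo_aes_needs_avx` flag are hypothetical names for illustration, not the miner's actual code.

```c
#include <stdbool.h>

/* Feature bits in ECX of CPUID leaf 1. */
#define CPUID1_ECX_AES (1u << 25)
#define CPUID1_ECX_AVX (1u << 28)

/* Hypothetical optimization levels, for illustration only. */
typedef enum { LEVEL_SSE2, LEVEL_AES, LEVEL_AES_AVX } opt_level;

/* ecx: the ECX value returned by CPUID leaf 1.
   algo_aes_needs_avx: true if the algo's AES code also contains AVX
   instructions (the Hodl case), false if it is pure AES. */
static opt_level pick_level(unsigned int ecx, bool algo_aes_needs_avx)
{
    bool aes = (ecx & CPUID1_ECX_AES) != 0;
    bool avx = (ecx & CPUID1_ECX_AVX) != 0;

    if (aes && avx)
        return LEVEL_AES_AVX;
    if (aes && !algo_aes_needs_avx)
        return LEVEL_AES;      /* Westmere with a pure-AES algo */
    return LEVEL_SSE2;         /* everything else falls back to SSE2 */
}
```

On GCC the ECX value can be read with `__get_cpuid` from `<cpuid.h>`. A complete AVX check would also have to confirm OS support (OSXSAVE plus XGETBV), which this sketch leaves out.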
newbie
Activity: 14
Merit: 0
@joblo --- I'd be happy to test the AES w/o AVX configurations --- do you have a patch file or a link to the downloaded files with the patches?

sr. member
Activity: 292
Merit: 250
@Joblo would it be cool with you if I mine to your BTC donation address at nicehash as donation time for your work? I'll mine one full day per week as donation. Every other user can do the same as well.

That's a very generous and motivating offer.

I took a quick look at the mod text you linked. With that and the Nicehash source code there should be no problem
implementing it.

I'll dig into it more later today and provide an update.

I just started mining to your BTC address, just don't expect a lot since I'm only running 2 CPUs. Hope other users will donate some CPU time to you as well.

https://www.nicehash.com/index.jsp?p=miners&addr=12tdvfF7KmAsihBXQXynT6E6th2c2pByTT
legendary
Activity: 3668
Merit: 6382
corei7          nehalem, westmere

I did a test on Nehalem. cpuminer-corei7.exe works well for m7m and cryptonight.
Joblo, if you need another test on my CPU, just PM me if I don't see your request in the thread within 24h.
legendary
Activity: 1470
Merit: 1114
How much do you get on the G1840 celeron? (Cryptonight)


I don't have one to test with but I don't expect much, it doesn't even have AES_NI.
legendary
Activity: 1470
Merit: 1114
@Joblo would it be cool with you if I mine to your BTC donation address at nicehash as donation time for your work? I'll mine one full day per week as donation. Every other user can do the same as well.

That's a very generous and motivating offer.

I took a quick look at the mod text you linked. With that and the Nicehash source code there should be no problem
implementing it.

I'll dig into it more later today and provide an update.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
How much do you get on the G1840 celeron? (Cryptonight)
sr. member
Activity: 292
Merit: 250
@Joblo would it be cool with you if I mine to your BTC donation address at nicehash as donation time for your work? I'll mine one full day per week as donation. Every other user can do the same as well.
hero member
Activity: 700
Merit: 500
Nicehash just released cryptonight mining, however you have to use their custom cpuminer, which is a lot slower. Is it possible to make cpuminer-opt compatible with nicehash?

https://www.nicehash.com/index.jsp?p=news&id=99
https://github.com/nicehash/Specifications/blob/master/NiceHash_CryptoNight_modification_v1.0.txt

For reference (i5-2500k 3-threads, linux)
cpuminer-opt-3.4.3 => 174 H/s
cpuminer-nicehash => 110 H/s

thumbs up
sr. member
Activity: 292
Merit: 250
Nicehash just released cryptonight mining, however you have to use their custom cpuminer, which is a lot slower. Is it possible to make cpuminer-opt compatible with nicehash?

https://www.nicehash.com/index.jsp?p=news&id=99
https://github.com/nicehash/Specifications/blob/master/NiceHash_CryptoNight_modification_v1.0.txt

For reference (i5-2500k 3-threads, linux)
cpuminer-opt-3.4.3 => 174 H/s
cpuminer-nicehash => 110 H/s
legendary
Activity: 1470
Merit: 1114
I have fixes for the two compile problems with hodl on Westmere CPUs. Westmere will now use the unoptimized
hodl function. I have also fixed the min/max duplication by making local definitions where required, instead of a global
definition.
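For anyone patching an older source tree by hand, a local definition of this kind might look like the following. It is a sketch only: the `#ifndef` guard and the `clamp_len` caller are assumptions added for illustration, not the code that went into the release.

```c
/* Local definition, placed in the one .c file that uses it rather than
   in a shared header, so it cannot collide with min/max in C++ code. */
#ifndef min
#define min(a, b) ((a) < (b) ? (a) : (b))
#endif

/* Hypothetical caller for illustration: clamp a length to a buffer size. */
static unsigned int clamp_len(unsigned int want, unsigned int cap)
{
    return min(want, cap);
}
```

Note that a macro like this evaluates its arguments twice, so it should not be passed expressions with side effects.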

I'd like to wait for more test results before building a new release in case more problems are reported, particularly with the
Windows binaries.
legendary
Activity: 1470
Merit: 1114
Joblo ...

OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least.  The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....

1) Comment out or remove the macro definitions for min and max in miner.h
2) add a local definition of the min macro to decred.c

After that I was able to get it to compile on one of my Haswell systems.  Still having trouble compiling on an older westmere based system due to some AES256CBC complaints.

Bob

Good work. I'll make the change proactively.

The AES256CBC problem may be related to AVX code in hodl-wolf. IIRC either Nehalem or Westmere have AES but not AVX. I may have
to tighten up the checking to force it to use the unoptimized version. Did you do a native compile? Have you tried corei7?
"gcc -Q -march=native --help=target" will tell you which arch is the default for native.

Westmere supports AES-NI but not AVX.  Nehalem doesn't support either.

I've successfully compiled for all the AVX platforms on my laptop - haswell corei5 but can't compile with march=westmere or with native on my dev virtual machine which is running on some older servers (Dual Westmere Hex-core) that I wanted to test on.  setting march=haswell on the older VM works fine and compiles haswell optimized code (which can't run locally obviously).

Appears to be some sort of a conflict in the capabilities check on the HODL AES code.

You're right, I only check for AES_NI, not AVX. This may affect some other algos that also have AVX code mixed in with AES.
If I can identify which ones are pure AES I can make a distinction otherwise I'll have to use non-AES code unless the CPU also
supports AVX.

I don't have the necessary HW to test but if you don't mind doing a little more work it would help a lot. There are
three groups of AES code. There is code used only by hodl, code only used by cryptonight and code shared among many algos
including x11. Those three should cover the entire spectrum of AES optimized code. Those that work on your Westmere can have
AES enabled without AVX. Those like Hodl will require a CPU with AVX before AES can be enabled.