Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 146. (Read 444067 times)

newbie
Activity: 14
Merit: 0
Joblo ...

OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least.  The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....

1) Comment out or remove the macro definitions for min and max in miner.h
2) add a local definition of the min macro to decred.c

After that I was able to get it to compile on one of my Haswell systems.  Still having trouble compiling on an older westmere based system due to some AES256CBC complaints.

Bob

Good work. I'll make the change proactively.

The AES256CBC problem may be related to AVX code in hodl-wolf. IIRC either Nehalem or Westmere have AES but not AVX. I may have
to tighten up the checking to force it to use the unoptimized version. Did you do a native compile? Have you tried corei7?
"gcc -Q -march=native --help=target" will tell you which arch is the default for native.

Westmere supports AES-NI but not AVX.  Nehalem doesn't support either.

I've successfully compiled for all the AVX platforms on my laptop (Haswell Core i5), but I can't compile with -march=westmere or -march=native on my dev virtual machine, which runs on some older servers (dual Westmere hex-core) that I wanted to test on.  Setting -march=haswell on the older VM works fine and compiles Haswell-optimized code (which obviously can't run locally).

Appears to be some sort of a conflict in the capabilities check on the HODL AES code.
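
The feature split described above (Westmere has AES-NI without AVX, Nehalem has neither) suggests gating the optimized HODL path on both features rather than on AES alone. A minimal sketch follows, assuming the optimized path really does need both. The names hodl_variant_t, select_hodl_variant and compiled_hodl_variant are hypothetical and are not taken from cpuminer-opt. __AES__ and __AVX__ are the macros GCC predefines when the selected target enables those instruction sets.

```c
/* Hypothetical sketch: pick the HODL code path from CPU features.
   Assumption: the optimized path needs BOTH AES-NI and AVX. */
typedef enum { HODL_GENERIC = 0, HODL_AES_AVX = 1 } hodl_variant_t;

/* Pure decision function: optimized path only when both features exist. */
hodl_variant_t select_hodl_variant(int has_aes, int has_avx)
{
    return (has_aes && has_avx) ? HODL_AES_AVX : HODL_GENERIC;
}

/* Compile-time view: GCC predefines __AES__ / __AVX__ when the chosen
   -march (or -maes / -mavx) enables those instruction sets. */
hodl_variant_t compiled_hodl_variant(void)
{
#if defined(__AES__)
    const int has_aes = 1;
#else
    const int has_aes = 0;
#endif
#if defined(__AVX__)
    const int has_avx = 1;
#else
    const int has_avx = 0;
#endif
    return select_hodl_variant(has_aes, has_avx);
}
```

A Westmere target never defines __AVX__, so under this sketch it would always fall back to the generic path, whether or not AES is enabled.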
legendary
Activity: 1470
Merit: 1114
Joblo ...

OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least.  The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....

1) Comment out or remove the macro definitions for min and max in miner.h
2) add a local definition of the min macro to decred.c

After that I was able to get it to compile on one of my Haswell systems.  Still having trouble compiling on an older westmere based system due to some AES256CBC complaints.

Bob

Good work. I'll make the change proactively.

The AES256CBC problem may be related to AVX code in hodl-wolf. IIRC either Nehalem or Westmere have AES but not AVX. I may have
to tighten up the checking to force it to use the unoptimized version. Did you do a native compile? Have you tried corei7?
"gcc -Q -march=native --help=target" will tell you which arch is the default for native.
legendary
Activity: 1470
Merit: 1114
when it makes 10.3 khash/sec yours makes 8.3 khash/sec

also i don't have compiling or programming skills unfortunately :(

I tried the core-avx-i build on my haswell and got the same performance as a native build.
Your CPU is definitely underperforming with cpuminer-opt but I have no clue why. Unless
someone else can reproduce your poor results and can provide more data there's nothing more
I can do. I suggest you use whatever works best for you.
newbie
Activity: 14
Merit: 0
Joblo ...

OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least.  The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....

1) Comment out or remove the macro definitions for min and max in miner.h
2) add a local definition of the min macro to decred.c
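
The two steps above can be sketched as follows. This is an illustration only: the macro body is the conventional form and is an assumption, since the post does not show the actual definition, and clamp_len is a hypothetical helper standing in for whatever decred.c uses min for.

```c
#include <stddef.h>

/* Step 1 (in miner.h): the shared definitions are commented out or
 * removed, so C++ units that include miner.h no longer see macros
 * named min and max. */

/* Step 2 (in decred.c): a file-local definition.
 * Conventional form, assumed; not copied from the source tree. */
#ifndef min
#define min(a, b) ((a) < (b) ? (a) : (b))
#endif

/* Hypothetical caller: cap a requested length at a buffer's capacity. */
size_t clamp_len(size_t requested, size_t capacity)
{
    return min(requested, capacity);
}
```

Note that a macro of this form evaluates its arguments twice, so it should not be called with expressions that have side effects.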

After that I was able to get it to compile on one of my Haswell systems.  Still having trouble compiling on an older westmere based system due to some AES256CBC complaints.

Bob
sr. member
Activity: 462
Merit: 250
Arianee:Smart-link Connecting Owners,Assets,Brands
when it makes 10.3 khash/sec yours makes 8.3 khash/sec

also i don't have compiling or programming skills unfortunately :(
legendary
Activity: 1470
Merit: 1114
http://prntscr.com/c9yfon          this is neoscrypt cpuminer screen of ghostlander
http://prntscr.com/c9ygeh         this is your miner

cpu is i5 3337u

i've used your cpuminer-core-avx-i and 3.4.3 version

so i did it good.

I can't see your images, too many scripts want to run in my browser. Please post the numbers.

This is not a code problem because I compiled both myself on the same CPU. You used a precompiled binary that was
clearly identified as a test release and requested information from users. Had you done so I wouldn't have wasted my
time chasing down a slower fork.

If you go back a few posts and read the release announcement then provide some useful data I'll look at it.
You could start with comparing core-avx-i with corei7-avx. If you can compile your own native, even better.
sr. member
Activity: 462
Merit: 250
http://prntscr.com/c9yfon          this is neoscrypt cpuminer screen of ghostlander
http://prntscr.com/c9ygeh         this is your miner

cpu is i5 3337u

i've used your cpuminer-core-avx-i and 3.4.3 version

so i did it good.
legendary
Activity: 1470
Merit: 1114
i pushed your last versions to https://github.com/tpruvot/cpuminer-opt/tree/upstream

hmage is dead ? :p

Thanks. I'm hoping to get going with git soon. I haven't found any more AVX2 quick kills, so if the Windows
binaries release doesn't have too many problems I'll have some time to explore git in more detail.
legendary
Activity: 1470
Merit: 1114
neoscrypt is slow when compared to neoscrypt cpuminer from ghostlander: https://github.com/ghostlander/cpuminer-neoscrypt

can you take that miner as base for neoscrypt algo?

also can you implement aes_ni avx features for cryptolight ?

I'll look into the ghostlander fork of neoscrypt.

I don't know that cryptolight even works as I am unaware of any coin that uses it. I would need a way to test.
I would also have to examine the code to see if the cryptonight optimisations can be ported to cryptolight.
It's not at the top of my priority list.

aeon coin uses it

https://bitcointalksearch.org/topic/ann-aeon-2019-09-27-upgrade-to-version-01300-asap-hf1146200-oct-25-641696

https://coinmarketcap.com/currencies/aeon/


WTF, ghostlander neoscrypt is half the speed of cpuminer-opt. Do your homework before making silly requests.
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
sr. member
Activity: 462
Merit: 250
neoscrypt is slow when compared to neoscrypt cpuminer from ghostlander: https://github.com/ghostlander/cpuminer-neoscrypt

can you take that miner as base for neoscrypt algo?

also can you implement aes_ni avx features for cryptolight ?

I'll look into the ghostlander fork of neoscrypt.

I don't know that cryptolight even works as I am unaware of any coin that uses it. I would need a way to test.
I would also have to examine the code to see if the cryptonight optimisations can be ported to cryptolight.
It's not at the top of my priority list.

aeon coin uses it

https://bitcointalksearch.org/topic/ann-aeon-2019-09-27-upgrade-to-version-01300-asap-hf1146200-oct-25-641696

https://coinmarketcap.com/currencies/aeon/

legendary
Activity: 1470
Merit: 1114
neoscrypt is slow when compared to neoscrypt cpuminer from ghostlander: https://github.com/ghostlander/cpuminer-neoscrypt

can you take that miner as base for neoscrypt algo?

also can you implement aes_ni avx features for cryptolight ?

I'll look into the ghostlander fork of neoscrypt.

I don't know that cryptolight even works as I am unaware of any coin that uses it. I would need a way to test.
I would also have to examine the code to see if the cryptonight optimisations can be ported to cryptolight.
It's not at the top of my priority list.
legendary
Activity: 1246
Merit: 1011
cpuminer-opt-3.4.3 is available for download. It includes faster m7m on most CPUs and Windows binaries.

Source code:

https://drive.google.com/file/d/0B0lVSGQYLJIZM0RJZVZSUnpCR0k/view?usp=sharing

Windows binaries

https://drive.google.com/file/d/0B0lVSGQYLJIZRlVsc3FEVWhYU0U/view?usp=sharing

All supported architectures have separate binaries, see README.txt for details.

Compiling was done on an i7-4790K (Haswell). AMD amdfam10 failed to compile due to AVX inconsistencies.
AMD btver1 appears to have been compiled without AES and AVX.

As this is the first release with pre-built portable Windows binaries there may be some problems. There are also
some specific questions I have that users may be able to answer. When reporting problems please provide all relevant
information such as CPU architecture, commands used, compile environment, error messages and any other information
that may be useful.

Specific questions:

I was not able to compile for Broadwell/Skylake on my Haswell. Does a native compile on these CPUs perform better than a
core-avx2 compile?

AMD performance is expected to be poor with the pre-built binaries. I suspect compiling for AMD on an Intel CPU may not
produce the optimum code. AMD users that can compile their own can confirm whether this is the case.

The major code optimisations involve AES, AVX and AVX2. The architecture that introduced these individual features should see
the biggest incremental improvement. I would like to know how much of a performance penalty exists if users were forced to
use a lesser compile. For example how much slower is Ivybridge using the corei7-avx vs a native compile or the core-avx-i build.

Thank you very very much ;)
sr. member
Activity: 462
Merit: 250
neoscrypt is slow when compared to neoscrypt cpuminer from ghostlander: https://github.com/ghostlander/cpuminer-neoscrypt

can you take that miner as base for neoscrypt algo?

also can you implement aes_ni avx features for cryptolight ?
legendary
Activity: 1470
Merit: 1114
cpuminer-opt-3.4.3 is available for download. It includes faster m7m on most CPUs and Windows binaries.

Source code:

https://drive.google.com/file/d/0B0lVSGQYLJIZM0RJZVZSUnpCR0k/view?usp=sharing

Windows binaries

https://drive.google.com/file/d/0B0lVSGQYLJIZRlVsc3FEVWhYU0U/view?usp=sharing

All supported architectures have separate binaries, see README.txt for details.

Compiling was done on an i7-4790K (Haswell). AMD amdfam10 failed to compile due to AVX inconsistencies.
AMD btver1 appears to have been compiled without AES and AVX.

As this is the first release with pre-built portable Windows binaries there may be some problems. There are also
some specific questions I have that users may be able to answer. When reporting problems please provide all relevant
information such as CPU architecture, commands used, compile environment, error messages and any other information
that may be useful.

Specific questions:

I was not able to compile for Broadwell/Skylake on my Haswell. Does a native compile on these CPUs perform better than a
core-avx2 compile?

AMD performance is expected to be poor with the pre-built binaries. I suspect compiling for AMD on an Intel CPU may not
produce the optimum code. AMD users that can compile their own can confirm whether this is the case.

The major code optimisations involve AES, AVX and AVX2. The architecture that introduced these individual features should see
the biggest incremental improvement. I would like to know how much of a performance penalty exists if users were forced to
use a lesser compile. For example how much slower is Ivybridge using the corei7-avx vs a native compile or the core-avx-i build.
legendary
Activity: 1470
Merit: 1114
Here is a list of CPU architectures I will attempt to compile for. The build machine is a haswell i7-4790K
running Windows 8.1. When v3.4.3 is released shortly I would appreciate some testing of the different
builds, for both compatibility and performance.

broadwell       skylake
core-avx2       haswell
core-avx-i      ivy
corei7-avx      sandy
corei7          nehalem, westmere
core2
amdfam10        A4, A6, A8
bdver1          FX
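
For reference, the target list above can be written as a small lookup table, for example in a helper that reports which pre-built binary matches a CPU family. This is a sketch only: march_target, targets and cpus_for_march are hypothetical names, and the mappings are copied from the list as posted rather than checked against GCC.

```c
#include <stddef.h>
#include <string.h>

struct march_target {
    const char *march;  /* value passed to gcc -march= */
    const char *cpus;   /* CPU families named in the post ("" if none given) */
};

/* Entries copied from the posted list, in the same order. */
const struct march_target targets[] = {
    { "broadwell",  "skylake" },
    { "core-avx2",  "haswell" },
    { "core-avx-i", "ivy" },
    { "corei7-avx", "sandy" },
    { "corei7",     "nehalem, westmere" },
    { "core2",      "" },
    { "amdfam10",   "A4, A6, A8" },
    { "bdver1",     "FX" },
};

/* Return the CPU families listed for a -march value, or NULL if unknown. */
const char *cpus_for_march(const char *march)
{
    size_t i;
    for (i = 0; i < sizeof targets / sizeof targets[0]; i++)
        if (strcmp(targets[i].march, march) == 0)
            return targets[i].cpus;
    return NULL;
}
```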
legendary
Activity: 1470
Merit: 1114
newbie
Activity: 14
Merit: 0
Joblo ----

Playing around with this again after a while away.  Had to redo my windows build environment and having some compiling issues. 

Using MSYS2 with mingw-w64 that currently installs GCC 6.1.0 and associated tools.  It appears that there are some MIN/MAX Macro issues with building under 6.1.0, specifically on the HODL code:

Code:
gcc -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing  -I. -Iyes/include -Wno-pointer-sign -Wno-pointer-to-int-cast  -Wl,--stack,10485760 -Icompat/pthreads -O3 -march=native -Wall  -Iyes/include -MT algo/cpuminer-hmq1725.o -MD -MP -MF algo/.deps/cpuminer-hmq1725.Tpo -c -o algo/cpuminer-hmq1725.o `test -f 'algo/hmq1725.c' || echo './'`algo/hmq1725.c
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing  -I. -Iyes/include  -O3 -march=native -Wall -std=gnu++11 -fpermissive -MT algo/hodl/cpuminer-hodl.o -MD -MP -MF algo/hodl/.deps/cpuminer-hodl.Tpo -c -o algo/hodl/cpuminer-hodl.o `test -f 'algo/hodl/hodl.cpp' || echo './'`algo/hodl/hodl.cpp
In file included from C:/msys64/mingw64/include/c++/6.1.0/bits/char_traits.h:39:0,
                 from C:/msys64/mingw64/include/c++/6.1.0/string:40,
                 from algo/hodl/hodl_uint256.h:13,
                 from algo/hodl/hodl.cpp:3:
C:/msys64/mingw64/include/c++/6.1.0/bits/stl_algobase.h:243:56: error: macro "min" passed 3 arguments, but takes just 2
     min(const _Tp& __a, const _Tp& __b, _Compare __comp)
                                                        ^
C:/msys64/mingw64/include/c++/6.1.0/bits/stl_algobase.h:265:56: error: macro "max" passed 3 arguments, but takes just 2
     max(const _Tp& __a, const _Tp& __b, _Compare __comp)
                                                        ^
In file included from C:/msys64/mingw64/include/c++/6.1.0/bits/stl_algo.h:60:0,
                 from C:/msys64/mingw64/include/c++/6.1.0/algorithm:62,
                 from algo/hodl/serialize.h:13,
                 from algo/hodl/block.h:9,
                 from algo/hodl/hodl.cpp:5:
C:/msys64/mingw64/include/c++/6.1.0/bits/algorithmfwd.h:362:41: error: macro "max" passed 3 arguments, but takes just 2
     max(const _Tp&, const _Tp&, _Compare);
                                         ^
C:/msys64/mingw64/include/c++/6.1.0/bits/algorithmfwd.h:375:41: error: macro "min" passed 3 arguments, but takes just 2
     min(const _Tp&, const _Tp&, _Compare);
                                         ^
C:/msys64/mingw64/include/c++/6.1.0/bits/algorithmfwd.h:403:30: error: macro "min" requires 2 arguments, but only 1 given
     min(initializer_list<_Tp>);
                              ^
C:/msys64/mingw64/include/c++/6.1.0/bits/algorithmfwd.h:413:30: error: macro "max" requires 2 arguments, but only 1 given
     max(initializer_list<_Tp>);
                              ^
In file included from C:/msys64/mingw64/include/c++/6.1.0/bits/uniform_int_dist.h:35:0,
                 from C:/msys64/mingw64/include/c++/6.1.0/bits/stl_algo.h:66,
                 from C:/msys64/mingw64/include/c++/6.1.0/algorithm:62,
                 from algo/hodl/serialize.h:13,
                 from algo/hodl/block.h:9,
                 from algo/hodl/hodl.cpp:5:

Any thoughts?
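
The errors above arise because a function-like macro named min or max is expanded wherever that name is followed by an opening parenthesis, including the declarations of min and max inside the libstdc++ headers, which take one or three parameters. One possible workaround, sketched below, is to hide the macros from C++ translation units in the shared header. This is an assumption and not a change taken from the cpuminer-opt source; the MINER_MINMAX_H guard name and the smaller_of helper are hypothetical.

```c
/* Hypothetical fragment of a header shared by C and C++ sources. */
#ifndef MINER_MINMAX_H
#define MINER_MINMAX_H

#ifndef __cplusplus
/* C only: C++ units skip these and use std::min / std::max instead. */
#ifndef min
#define min(a, b) ((a) < (b) ? (a) : (b))
#endif
#ifndef max
#define max(a, b) ((a) > (b) ? (a) : (b))
#endif
#endif /* !__cplusplus */

#endif /* MINER_MINMAX_H */

/* Example C caller that still sees the macro. */
int smaller_of(int a, int b)
{
    return min(a, b);
}
```

Because __cplusplus is defined only by C++ compilers, hodl.cpp would never see the macros, while the C sources keep working unchanged.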
hero member
Activity: 700
Merit: 500
i would love to see ivy bridge (i5-3330) and AMD A6-6400K versions of the windows bin, though the AMD part might be tricky

if someone knows a good writeup on compiling on windows with mingw (tried to compile but failed) im willing to compile the latest versions for those targets and make them available on github.

i have gone ahead and created a github repo with an exact copy of the sourcecode for easier deployment on my rigs, everyone feel free to use it while there is no official repo (https://github.com/felixbrucker/cpuminer-opt).

cheers
legendary
Activity: 3668
Merit: 6382
Maybe OP remembers the problems I had a few months ago
...

The new 3.4.1 works on my Nehalem too!
I finally have a miner which hopefully doesn't have the bugs the old one had (frequent disconnects from pools). Thank you.