Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner

sr. member
Activity: 312
Merit: 250
New in v3.1.15

  - unified build procedure fixed
     - build.sh now works for CPUs with and without AES_NI
     - it is no longer necessary to add "-DNO_AES_NI" CFLAG to the
       configure command when building for CPUs without AES_NI.
     - The system will automatically compile for the correct architecture

Thanks!

Strange thing - when I compile with
Code:
./autogen.sh && ./configure CFLAGS="-DNO_AES_NI -O3 -march=btver1" --with-curl --with-crypto && make
on AMD Sempron 145, all works like a charm.

When I use ./build.sh I get an error in compile.

Will try on AMD Phenom II X4 940 and see whether I can reproduce it.

Edit: ./build.sh also fails on AMD Phenom II X4 940
Code:
make[2]: *** [algo/echo/aes_ni/cpuminer-hash.o] Error 1
make[2]: *** Waiting for unfinished jobs....
mv -f algo/groestl/sse2/.deps/cpuminer-grso-asm.Tpo algo/groestl/sse2/.deps/cpuminer-grso-asm.Po
mv -f algo/argon2/ar2/.deps/cpuminer-opt.Tpo algo/argon2/ar2/.deps/cpuminer-opt.Po
mv -f algo/argon2/ar2/.deps/cpuminer-ar2-scrypt-jane.Tpo algo/argon2/ar2/.deps/cpuminer-ar2-scrypt-jane.Po
make[2]: Leaving directory `/home/urban/cpuminer-opt-3.1.15'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/*****/cpuminer-opt-3.1.15'
make: *** [all] Error 2
legendary
Activity: 1470
Merit: 1114
cpuminer-opt now supports 31 algorithms on CPUs with at least SSE2
capabilities including Intel Core2 and AMD equivalent.

In addition 13 algorithms have optimizations to take advantage of
CPUs with AES_NI for even greater performance, including the Intel
Core-i 2xxx and AMD equivalent.

See the first post of this thread and the README.md file for details.

It is currently available as source code that compiles
on Linux. Windows source and binary support is planned.

        cpuminer-opt v3.1.15 is available for download.

https://drive.google.com/file/d/0B0lVSGQYLJIZdnI3SG9jNmZNRHM/view?usp=sharing

All users are encouraged to upgrade.

New in v3.1.15

  - unified build procedure fixed
     - build.sh now works for CPUs with and without AES_NI
     - it is no longer necessary to add "-DNO_AES_NI" CFLAG to the
       configure command when building for CPUs without AES_NI.
     - The system will automatically compile for the correct architecture
legendary
Activity: 1470
Merit: 1114
I don't expect any performance improvements out of the box.

My bad, we were talking about this from different angles. Performance-wise there is no good reason for this. This will _not_ improve performance at all, in fact, the code that is in gcc's implementation of unordered_map and in boost is very similar and will likely behave exactly the same.

The only difference is that users won't need to install boost, and on some platforms like Windows boost doesn't come pre-packaged. This harms eventual plans to port this to Windows.

Not requiring boost means people won't have to deal with 210 pages of this -- http://stackoverflow.com/search?q=boost+visual-studio

So, to recap:
 * Performance wise -- there's no difference.
 * Maintenance wise -- there is.

Sorry for misunderstanding.

Many thanks, I appreciate your patience with my stubbornness and picky questions. It's all clear now.
member
Activity: 83
Merit: 10
I don't expect any performance improvements out of the box.

My bad, we were talking about this from different angles. Performance-wise there is no good reason for this. This will _not_ improve performance at all, in fact, the code that is in gcc's implementation of unordered_map and in boost is very similar and will likely behave exactly the same.

The only difference is that users won't need to install boost, and on some platforms like Windows boost doesn't come pre-packaged. This harms eventual plans to port this to Windows.

Not requiring boost means people won't have to deal with 210 pages of this -- http://stackoverflow.com/search?q=boost+visual-studio

So, to recap:
 * Performance wise -- there's no difference.
 * Maintenance wise -- there is.

Sorry for misunderstanding.
legendary
Activity: 1470
Merit: 1114
I don't understand your point. A new compiler feature was added experimentally

Lemme stop you there, GCC supports C++11 since 4.8.1 as feature-complete and stable, not experimental.

https://gcc.gnu.org/projects/cxx-status.html#cxx11

They do not change defaults for another reason -- a lot of code will break if you upgrade to a newer version of the standard, not because it's experimental.


I'm still not sure it's the right thing to do. So c++11 is not the default until 6.1 over concerns it may break existing code. You have shown it is not the case with cpuminer-opt so there is little risk. The fact the code is agnostic is good because it can still work on c++98 (not sure about this because I believe it requires code changes between versions). These are all reasons it's not a bad thing but I'm having a hard time coming up with any reasons it's a good thing.

I don't expect any performance improvements out of the box, and if there are any to be realized it would require code changes that would break compatibility. It's forward thinking but the migration could be done at any time, such as after c++11 v6.1 is released. If it had been done with the first hodl release it would have eliminated the need to install packages but now that hodl is in the wild with the libboost dependencies it's no longer an issue.

Can you help me out here? Why is it a good thing?
member
Activity: 83
Merit: 10
I don't understand your point. A new compiler feature was added experimentally

Lemme stop you there, GCC supports C++11 since 4.8.1 as feature-complete and stable, not experimental.

https://gcc.gnu.org/projects/cxx-status.html#cxx11

They do not change defaults for another reason -- a lot of code will break if you upgrade to a newer version of the standard, not because it's experimental.

I did (march=core2, not real HW), and it did. One of us has a big blind spot. I've looked at groestl-version.h many times and there is no reference to NO_AES_NI. I had previously commented them out and hard coded VAES because that's all that was being used.

Yes, I was wrong, sorry. Probably a leftover change that isn't needed anymore.

PS: On this forum there's no need for doing linebreaks yourself — it looks weird when the window size is smaller than your linebreaks, and different font rendering systems and different installed fonts will end up with words being different sizes than what you see on your computer (yay compatibility!). Here's how it looks for me — http://i.imgur.com/BRx5I0r.png — note the hanging 'to' and 'the'.
legendary
Activity: 1470
Merit: 1114
I saw the compiler flag but I'd rather stay with the default to ensure compatibility. If it produces a measurable
performance gain I could be persuaded otherwise.

The defaults vary between versions of compiler. On GCC 5.x, it will compile fine because it's default there, by explicitly stating what language version you're using you're actually ensuring compatibility of your code with the compiler (it won't try to treat 'auto' as a keyword, which is new in C++14, for example, and default in 5.3.0, also inline semantics are incompatible between C89 and C99).

The message you were seeing about unordered_map was applicable to that language version you were using, it does not apply to other language version and it goes away if you explicitly specify newer C++11 rather than default C++98 (12 years of difference) in gcc 4.9.

I don't understand your point. A new compiler feature was added experimentally and required a non-default option to
enable it. This is to ensure any incompatibilities with legacy code can be avoided. When the app has migrated to
use the new feature it would set that option to enable the compiler to use the new feature. After some transition
period the new feature would be made default on the assumption that all apps have migrated. Based on that
using the default (experimental feature disabled) should be safer. Why not?  

Edit: You mentioned the message is due to the language version I'm using. That supports my point. I'm running the
latest LTS version of mint and the feature is experimental. Older distros may be incompatible and I want to
remain compatible with older distros using older compilers. I can't assume that Centos 6, or Centos 5 (yes still
supported) have updated the compiler to support the new feature.

All refs to NO_AES_NI in groestl-version.h are commented out. Are you confusing it with hash-groestl.c which indeed
needs to block out AES code in order to compile on core2?

Try compiling with -march=core2 and you'll see it won't compile.

Maybe the location I placed that include isn't the best.

I did (march=core2, not real HW), and it did. One of us has a big blind spot. I've looked at groestl-version.h many
times and there is no reference to NO_AES_NI. I had previously commented them out and hard coded VAES because
that's all that was being used.
member
Activity: 83
Merit: 10
I saw the compiler flag but I'd rather stay with the default to ensure compatibility. If it produces a measurable
performance gain I could be persuaded otherwise.

The defaults vary between versions of compiler. Even on latest GCC 5.3 it's not default and it won't be default for some time (the next major release after 5.3 changes it to C++11), there are still distributions that come with gcc 4.7.

By explicitly stating what language version you're using you're actually ensuring compatibility of your code with the compiler (it won't try to treat 'auto' as a keyword, which is new in C++14, for example, also inline semantics are incompatible between C89 and C99).

The message you were seeing about unordered_map was applicable to that language version you were using (C++98), it does not apply to other language version and it goes away if you explicitly specify newer C++11 rather than default C++98 (12 years of difference) in gcc 4.7 or newer.

It's up to you anyway, I'll just keep it in my fork, maybe people will find it easier to start using.

Don't forget to add checks for boost in configure.ac so its absence is detected at configure time.

All refs to NO_AES_NI in groestl-version.h are commented out. Are you confusing it with hash-groestl.c which indeed
needs to block out AES code in order to compile on core2?

Try compiling with -march=core2 and you'll see it won't compile.

Maybe the location I placed that include isn't the best.
legendary
Activity: 1470
Merit: 1114
I'm hesitant to change boost because the compiler flagged it as experimental.
I'm not sure I want to go there yet.

What version of compiler? Did you put -std=gnu++11 compiler flag into CXXFLAGS? That's important.

Adding #include "miner.h" does nothing in groestl-version.h, everything is hard coded.
It does when you try to compile for -march=core2 or on core2 with -march=native, without the include groestl won't have NO_AES_NI when needed and it won't compile on core2.

To simplify testing I do a -march=core2 pass after every change, and then -march=native pass.

I saw the compiler flag but I'd rather stay with the default to ensure compatibility. If it produces a measurable
performance gain I could be persuaded otherwise.

All refs to NO_AES_NI in groestl-version.h are commented out. Are you confusing it with hash-groestl.c which indeed
needs to block out AES code in order to compile on core2?
member
Activity: 83
Merit: 10
I'm hesitant to change boost because the compiler flagged it as experimental.
I'm not sure I want to go there yet.

What version of compiler? Did you put -std=gnu++11 compiler flag into CXXFLAGS? That's important.

Adding #include "miner.h" does nothing in groestl-version.h, everything is hard coded.
It does when you try to compile for -march=core2 or on core2 with -march=native, without the include groestl won't have NO_AES_NI when needed and it won't compile on core2.

To simplify testing I do a -march=core2 pass after every change, and then -march=native pass.
legendary
Activity: 1470
Merit: 1114
The problem is now understood and a fix has been coded and tested.  The fix will be in the next
release but since the workaround is to use the old build procedure I will not rush the next release
for this issue. v3.1.14 is still a better release than v3.1.13.

There is a problem in v3.1.14 with the new compile procedure for CPUs that do not have
AES_NI. Please continue to use -DNO_AES_NI on the configure command line if your CPU does
not have AES_NI.



cpuminer-opt v3.1.14 released.

https://drive.google.com/file/d/0B0lVSGQYLJIZaE5DYXA4SHl2WVk/view?usp=sharing

New in v3.1.14

Algos
     - cryptonight algo is now supported on CPUs without AES_NI.
       All algos now support both CPU architectures.
     - jane added as an alias for scryptjane with default N-factor 16

Build enhancements, see details in README.md (thanks to hmage)
     - build.sh now works for CPUs with and without AES_NI
     - it is no longer necessary to add -DNO_AES_NI CFLAG to
       configure command when building for CPUs without AES_NI.

     Note: Compiling requires some additional libraries not included
     in the default installation of most Linux distributions: libboost-dev,
     libboost-system-dev, libboost-thread-dev.

UI enhancements
     - enhanced checks for CPU architecture, SW build and algo for
       AES_NI and SSE2 capabilities.
     - a warning is displayed if mining an untested algo.

Code cleanup
     - removed a few more compiler warnings
     - removed some dead code

Algo gate enhancements (for devs)
     - replaced algo specific null gate functions with generic null
       functions


legendary
Activity: 1470
Merit: 1114
There is a problem in v3.1.14 with the new compile procedure for CPUs that do not have
AES_NI. Please continue to use -DNO_AES_NI on the configure command line if your CPU does
not have AES_NI.

Yeah, you added the NO_AES_NI define inside the VisualC-specific section of miner.h  Cool

DOH! That explains everything. Thanks for noticing that stupidity.

I found the bugs in has_sse2, yes there were two of them, wrong reg & wrong field.

I'll look over your other stuff and fully understand it before I implement it. I will learn more
by hand coding it. I also need the practice.
 
Eliminating dependencies is good. You should pass these along to Epsylon3 as he may be interested
in porting them to his fork.


member
Activity: 83
Merit: 10
There is a problem in v3.1.14 with the new compile procedure for CPUs that do not have
AES_NI. Please continue to use -DNO_AES_NI on the configure command line if your CPU does
not have AES_NI.

Yeah, you added the NO_AES_NI define inside the VisualC-specific section of miner.h  Cool

Here's a diff against v3.1.14 to make it work on core2:
 * https://github.com/hmage/cpuminer-opt/commit/94f503824a852335770c0ee5789bbd7511224a37
 * https://github.com/hmage/cpuminer-opt/commit/94f503824a852335770c0ee5789bbd7511224a37.diff -- raw diff

You also no longer need to carry around compat/curl-for-windows/openssl, since you are including system openssl headers anyway:
 * https://github.com/hmage/cpuminer-opt/commit/b0c0a7ab9b1d812075de90052b7f6ecb58d8a2cf
 * https://github.com/hmage/cpuminer-opt/commit/b0c0a7ab9b1d812075de90052b7f6ecb58d8a2cf.diff -- raw diff

You no longer need boost to compile hodlcoin, the code uses unordered_map, which is part of C++11 standard and gcc supports it since 4.7:
 * https://github.com/hmage/cpuminer-opt/commit/73eae16f3afa1b1f32ab4264c705b1e6bbbd8bbc
 * https://github.com/hmage/cpuminer-opt/commit/73eae16f3afa1b1f32ab4264c705b1e6bbbd8bbc.diff -- raw diff
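
As a rough illustration of the change described above, here is a minimal sketch of a hash map that relies only on the C++11 standard library instead of boost. The `block_index` type and the `remember`/`lookup` helpers are hypothetical names made up for this example and are not taken from the hodl code; the sketch assumes the build passes `-std=gnu++11` (or newer) to the compiler.

```cpp
// Sketch only: block_index, remember() and lookup() are hypothetical names,
// not code from hodl or cpuminer-opt.
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>   // C++11 standard header, no boost package needed

#if __cplusplus < 201103L
#error "Build with -std=gnu++11 (or newer) so std::unordered_map is available"
#endif

// Hypothetical lookup table keyed by a hash string.
typedef std::unordered_map<std::string, std::uint32_t> block_index;

// Insert or overwrite the value stored for a key.
inline void remember(block_index &idx, const std::string &key,
                     std::uint32_t height) {
    assert(!key.empty());  // precondition chosen for this example only
    idx[key] = height;
}

// Return the stored value, or `fallback` when the key is unknown.
inline std::uint32_t lookup(const block_index &idx, const std::string &key,
                            std::uint32_t fallback) {
    block_index::const_iterator it = idx.find(key);
    return it == idx.end() ? fallback : it->second;
}
```

The container interface is the same as the boost one for this kind of use, so in a case like this the switch is mostly a matter of changing the include and the namespace.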

To properly report that SW and CPU support AES and SSE2, you need to fix has_sse2() first:
 * https://github.com/hmage/cpuminer-opt/commit/d867ad08c7dac01f8c3735b0e65afb0242a4566f
 * https://github.com/hmage/cpuminer-opt/commit/d867ad08c7dac01f8c3735b0e65afb0242a4566f.diff -- raw diff

Then go fix the check_cpu_capability() to actually report the proper values:
 * https://github.com/hmage/cpuminer-opt/commit/c869830ef2446d863019e284437382681da259ab
 * https://github.com/hmage/cpuminer-opt/commit/c869830ef2446d863019e284437382681da259ab.diff -- raw diff

To avoid problems with misapplying the patches (I guess you applied my previous patch manually, therefore you put NO_AES_NI define in wrong place in miner.h), I recommend using the standard patch utility to automate the process:
Code:
cd cpuminer-opt
wget https://github.com/hmage/cpuminer-opt/commit/94f503824a852335770c0ee5789bbd7511224a37.diff
wget https://github.com/hmage/cpuminer-opt/commit/b0c0a7ab9b1d812075de90052b7f6ecb58d8a2cf.diff
wget https://github.com/hmage/cpuminer-opt/commit/73eae16f3afa1b1f32ab4264c705b1e6bbbd8bbc.diff
wget https://github.com/hmage/cpuminer-opt/commit/d867ad08c7dac01f8c3735b0e65afb0242a4566f.diff
wget https://github.com/hmage/cpuminer-opt/commit/c869830ef2446d863019e284437382681da259ab.diff
patch -p1 -i 94f503824a852335770c0ee5789bbd7511224a37.diff
patch -p1 -i b0c0a7ab9b1d812075de90052b7f6ecb58d8a2cf.diff
patch -p1 -i 73eae16f3afa1b1f32ab4264c705b1e6bbbd8bbc.diff
patch -p1 -i d867ad08c7dac01f8c3735b0e65afb0242a4566f.diff
patch -p1 -i c869830ef2446d863019e284437382681da259ab.diff

This way it will apply exactly as it is on my side, assuming you didn't make any changes nearby.
legendary
Activity: 1470
Merit: 1114
There is a problem in v3.1.14 with the new compile procedure for CPUs that do not have
AES_NI. Please continue to use -DNO_AES_NI on the configure command line if your CPU does
not have AES_NI.



cpuminer-opt v3.1.14 released.

https://drive.google.com/file/d/0B0lVSGQYLJIZaE5DYXA4SHl2WVk/view?usp=sharing

New in v3.1.14

Algos
     - cryptonight algo is now supported on CPUs without AES_NI.
       All algos now support both CPU architectures.
     - jane added as an alias for scryptjane with default N-factor 16

Build enhancements, see details in README.md (thanks to hmage)
     - build.sh now works for CPUs with and without AES_NI
     - it is no longer necessary to add -DNO_AES_NI CFLAG to
       configure command when building for CPUs without AES_NI.

     Note: Compiling requires some additional libraries not included
     in the default installation of most Linux distributions: libboost-dev,
     libboost-system-dev, libboost-thread-dev.

UI enhancements
     - enhanced checks for CPU architecture, SW build and algo for
       AES_NI and SSE2 capabilities.
     - a warning is displayed if mining an untested algo.

Code cleanup
     - removed a few more compiler warnings
     - removed some dead code

Algo gate enhancements (for devs)
     - replaced algo specific null gate functions with generic null
       functions


member
Activity: 83
Merit: 10
I had problems with has_sse2 last time I tried that, maybe I'll succeed this time.

has_sse2() was broken. Had to repair it. It was looking at wrong cpuid field.
legendary
Activity: 1470
Merit: 1114
Hmmm, I should retest, maybe I actually used "-march=native -DNO_AES_NI". I'm hoping I'm right
because it enables detection of an AES_NI cross compiled build being run on a non-AES_NI CPU. Although
very unlikely to occur it also adds a safety net in case bugs creep in, and that kind of design was part of my
past professional life.

Edit: if __AES__ doesn't reflect the HW arch I can go back to has_aes_ni() from cpuid.

You're probably confusing two things:
 * The processor the binary was built for
 * The processor the binary is running on

These two things are similar but separate:
 * -march sets the _target_ processor the binary will be built for. The binary will error with "illegal instruction" if you run it on older CPUs.
 * cpuid returns the _current_ processor it's running under, so you can code your program to dynamically switch between implementations based on what CPU is currently running the binary.

__AES__ means that the _target_ processor supports AES -- that is a macro, not a variable, it doesn't change between runs. It reflects the hardware architecture of the processor the binary is _designed_ to be run on. Trying to execute AES instructions on core2 will give you "illegal instruction" error.

You need to check that _both_ the binary and the CPU support AES before actually running the AES code. You need to check for the __AES__ flag and get the cpuid capability.

Same goes for SSE2, you need to pull it from cpuid and check that __SSE2__ is defined.

I've rewritten the check_cpu_capability() to reflect what I mean:
https://github.com/hmage/cpuminer-opt/commit/30ef05f18b538655440667c5106e4967a27ccb51?diff=split

I understood the concepts but not the implementation. I had problems with has_sse2 last time I tried that, maybe I'll
succeed this time.
member
Activity: 83
Merit: 10
Hmmm, I should retest, maybe I actually used "-march=native -DNO_AES_NI". I'm hoping I'm right
because it enables detection of an AES_NI cross compiled build being run on a non-AES_NI CPU. Although
very unlikely to occur it also adds a safety net in case bugs creep in, and that kind of design was part of my
past professional life.

Edit: if __AES__ doesn't reflect the HW arch I can go back to has_aes_ni() from cpuid.

You're probably confusing two things:
 * The processor the binary was built for
 * The processor the binary is running on

These two things are similar but separate:
 * -march sets the _target_ processor the binary will be built for. The binary will error with "illegal instruction" if you run it on older CPUs.
 * cpuid returns the _current_ processor it's running under, so you can code your program to dynamically switch between implementations based on what CPU is currently running the binary.

__AES__ means that the _target_ processor supports AES -- that is a macro, not a variable, it doesn't change between runs. It reflects the hardware architecture of the processor the binary is _designed_ to be run on. Trying to execute AES instructions on core2 will give you "illegal instruction" error.

You need to check that _both_ the binary and the CPU support AES before actually running the AES code. You need to check for the __AES__ flag and get the cpuid capability.

Same goes for SSE2, you need to pull it from cpuid and check that __SSE2__ is defined.

I've rewritten the check_cpu_capability() to reflect what I mean:
https://github.com/hmage/cpuminer-opt/commit/30ef05f18b538655440667c5106e4967a27ccb51?diff=split
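
The two checks described above could be combined roughly like this. This is a sketch under assumptions, not the code from the linked commit: it assumes GCC or Clang on x86 (for `<cpuid.h>` and `__get_cpuid`), and all helper names such as `built_with_aes()` and `aes_usable()` are hypothetical.

```cpp
// Sketch only: helper names are hypothetical, not taken from cpuminer-opt.
#include <cassert>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   // GCC/Clang wrapper around the CPUID instruction
#endif

// Compile-time side: was this binary built for a target with the feature?
inline bool built_with_aes() {
#ifdef __AES__
    return true;
#else
    return false;
#endif
}

inline bool built_with_sse2() {
#ifdef __SSE2__
    return true;
#else
    return false;
#endif
}

// Run-time side: does the CPU we are running on report the feature?
// Reads CPUID leaf 1 and tests `mask` in ECX (in_ecx) or EDX (!in_ecx).
inline bool cpu_feature(bool in_ecx, unsigned mask) {
    assert(mask != 0);  // a zero mask would never match anything
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;   // leaf 1 not supported
    return ((in_ecx ? ecx : edx) & mask) != 0;
#else
    return false;       // not an x86 CPU, so neither feature applies
#endif
}

inline bool cpu_has_aes()  { return cpu_feature(true,  1u << 25); } // CPUID.1:ECX bit 25
inline bool cpu_has_sse2() { return cpu_feature(false, 1u << 26); } // CPUID.1:EDX bit 26

// Take the optimized path only when both the build and the CPU agree.
inline bool aes_usable()  { return built_with_aes()  && cpu_has_aes();  }
inline bool sse2_usable() { return built_with_sse2() && cpu_has_sse2(); }
```

A miner could call `aes_usable()` once at startup and refuse to run, or fall back to another code path, when it returns false.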
legendary
Activity: 1470
Merit: 1114
I think there is. Even with "-march=core2 -DNO_AES_NI" the __AES__ var is still defined.

It depends on -march:
Code:
hmage@dhmd:~$ echo '' | gcc -E -dD -march=core2 -|grep __AES__|wc -l
0
hmage@dhmd:~$ echo '' | gcc -E -dD -march=haswell -|grep __AES__|wc -l
1

If you're seeing __AES__ macro defined when you're using -march=core2, then you have wrong build flags or unclean build.

Hmmm, I should retest, maybe I actually used "-march=native -DNO_AES_NI". I'm hoping I'm right
because it enables detection of an AES_NI cross compiled build being run on a non-AES_NI CPU. Although
very unlikely to occur it also adds a safety net in case bugs creep in, and that kind of design was part of my
past professional life.

Edit: if __AES__ doesn't reflect the HW arch I can go back to has_aes_ni() from cpuid.
member
Activity: 83
Merit: 10
That seems very worthwhile to do if it gets Windows working.

Fixing LTO to work on GCC/Linux has nothing to do with making it work for MSVC/Windows. If anything, I don't have Windows here to test so it might be actually breaking Windows build.

I think there is. Even with "-march=core2 -DNO_AES_NI" the __AES__ var is still defined.

It depends on -march:
Code:
hmage@dhmd:~$ echo '' | gcc -E -dD -march=core2 -|grep __AES__|wc -l
0
hmage@dhmd:~$ echo '' | gcc -E -dD -march=haswell -|grep __AES__|wc -l
1

If you're seeing __AES__ macro defined when you're using -march=core2, then you have wrong build flags or unclean build.
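
One way to see the same thing from inside the program is to report which target macros the compiler defined for the current build. A small sketch, with `target_features()` as a hypothetical helper name; compiling it once with `-march=core2` and once with `-march=haswell` should give different results, in line with the gcc output above.

```cpp
// Sketch only: target_features() is a hypothetical helper name.
#include <cassert>
#include <string>

// Lists the instruction-set macros the compiler predefined for this build.
// The result is fixed at compile time and does not depend on the CPU the
// binary later runs on.
inline std::string target_features() {
    std::string out;
#ifdef __SSE2__
    out += "SSE2 ";
#endif
#ifdef __AES__
    out += "AES ";
#endif
#ifdef __AVX2__
    out += "AVX2 ";
#endif
    if (out.empty())
        out = "none";
    assert(!out.empty());  // always returns something printable
    return out;
}
```

Printing this string at startup next to the cpuid results makes a mismatch between build flags and hardware easy to spot.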