Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 163. (Read 444131 times)

legendary
Activity: 1470
Merit: 1114
Curious. How often does it occur? I have seen this very infrequently but have seen it before. Restarting the miner
usually fixes it. If it's happening more often that's a new issue.

Nice observation about the 2 GB RAM usage. HOdl uses a 1 GB buffer so somehow there were 2 allocated.
You didn't have two instances running at the same time did you? That wouldn't explain the rejects but would
explain the RAM usage.

In my opinion it occurs rather infrequently, but often enough to be annoying. No, I didn't run 2 instances of the miner.

That's great info. If one miner instance can allocate two buffers it's a mutex issue and I know exactly where.

I've got a better solution. Take the buffer alloc out of the threads (don't know why it was there in the first place)
and move it to main.
sr. member
Activity: 292
Merit: 250
Curious. How often does it occur? I have seen this very infrequently but have seen it before. Restarting the miner
usually fixes it. If it's happening more often that's a new issue.

Nice observation about the 2 GB RAM usage. HOdl uses a 1 GB buffer so somehow there were 2 allocated.
You didn't have two instances running at the same time did you? That wouldn't explain the rejects but would
explain the RAM usage.

In my opinion it occurs rather infrequently, but often enough to be annoying. No, I didn't run 2 instances of the miner.
legendary
Activity: 1470
Merit: 1114
@joblo I think I have found a bug. Using the Windows build from cryptomining-blog, I get rejects ("share above target") and noticed that memory usage spikes to 2 GB of RAM. When it works normally only 1 GB of RAM is used. The problem occurs occasionally at miner launch; restarting can get it to work.



Curious. How often does it occur? I have seen this very infrequently but have seen it before. Restarting the miner
usually fixes it. If it's happening more often that's a new issue.

Nice observation about the 2 GB RAM usage. HOdl uses a 1 GB buffer so somehow there were 2 allocated.
You didn't have two instances running at the same time did you? That wouldn't explain the rejects but would
explain the RAM usage.
sr. member
Activity: 292
Merit: 250
@joblo I think I have found a bug. Using the Windows build from cryptomining-blog, I get rejects ("share above target") and noticed that memory usage spikes to 2 GB of RAM. When it works normally only 1 GB of RAM is used. The problem occurs occasionally at miner launch; restarting can get it to work.

legendary
Activity: 1470
Merit: 1114

So, in this particular case, moving the assignments to the left will solve the issue:
Code:
CFLAGS="-O3 -march=native -Wall" CXXFLAGS="$CFLAGS -std=gnu++11" ./configure --with-curl


Thanks I'll make the change.
member
Activity: 83
Merit: 10
Thanks. I can't take credit for the HOdl optimizations, that's Wolf0's work, but yes it is hand optimized.

Regarding the propagation of CFLAGS I assumed that the first part (assigning CFLAGS) would run before assigning
CXXFLAGS. Is that incorrect? Do I need to set CFLAGS outside of the configure command line?

You are not assigning CFLAGS in the shell if you give it as command line argument. And $CFLAGS expansion is done by the shell, not configure.

Short explanation:
Code:
echo A=1 B=1
shell runs "echo" with arguments "A=1" and "B=1". You'll get "A=1 B=1" as program's output. (with newline)

Code:
A=1 B=1 echo
shell sets environment variables "A=1" and "B=1", then runs "echo" with no arguments. You'll get empty string as program's output. (with newline)

This is a big difference.

Most unix shells expand environment variables wherever a dollar sign appears, and they do it before the command runs. This is where the behaviour differs from what you might intuitively expect.

Code:
echo A=1 B=$A
shell, before executing this, expands $A into contents of that environment variable. But since A is not set, it replaces it with nothing, so shell runs "echo" with arguments "A=1" and "B=". You'll get "A=1 B=" as program's output. (with newline)

Code:
A=1 B=$A echo
Shell, before executing this, sets environment variable "A=1", then expands $A into 1, then sets environment variable "B=1", then runs "echo" with no arguments. You'll get empty string as program's output. (with newline)

"configure" is coded to read CFLAGS and CXXFLAGS from environment variables if they are set, since it is common to set them as a separate phase in package managers before building (ubuntu, debian) and in some environments like gentoo it is common to have them set globally.

It is also possible to override the environment's CFLAGS and CXXFLAGS by passing custom values as command line arguments to configure.

So, in this particular case, moving the assignments to the left will solve the issue:
Code:
CFLAGS="-O3 -march=native -Wall" CXXFLAGS="$CFLAGS -std=gnu++11" ./configure --with-curl
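The difference between the two placements can be checked without running configure at all. Below is a minimal sketch, assuming bash; `probe` is a hypothetical stand-in for ./configure that only prints the CXXFLAGS it finds in its environment and the arguments it received. The flags are shortened to "-O3" for readability.

```shell
#!/usr/bin/env bash
# Make sure inherited values do not skew the demonstration.
unset CFLAGS CXXFLAGS

# Hypothetical stand-in for ./configure: prints its environment's CXXFLAGS
# and its command line arguments.
probe='printf "env CXXFLAGS=[%s] args=[%s]\n" "$CXXFLAGS" "$*"'

# Assignments AFTER the command name are plain arguments.
# $CFLAGS is expanded by the calling shell first, and it is unset there.
after=$(bash -c "$probe" probe CFLAGS="-O3" CXXFLAGS="$CFLAGS -std=gnu++11")

# Assignments BEFORE the command name go into the command's environment.
# bash applies them left to right, so $CFLAGS is already "-O3" here.
before=$(CFLAGS="-O3" CXXFLAGS="$CFLAGS -std=gnu++11" bash -c "$probe" probe)

echo "$after"
echo "$before"
```

In the first call CXXFLAGS arrives as an argument with the optimization flags missing. In the second call nothing is passed as an argument and the values reach the program through its environment. Other shells may process prefix assignments differently, so treat the left to right behaviour as a bash assumption.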
legendary
Activity: 1470
Merit: 1114
member
Activity: 83
Merit: 10
Thanks, you may have noticed that I implemented unordered_map as you suggested a while back.

Yeap, I just needed to be patient ;)

There is an (albeit small) issue with C++ files not being optimized:

Code:
./configure --with-curl CFLAGS="-O3 -march=native -Wall" CXXFLAGS="$CFLAGS -std=gnu++11"

Unless you previously set CFLAGS bash variable, $CFLAGS becomes nothing, so you're effectively calling this:
Code:
./configure --with-curl CFLAGS="-O3 -march=native -Wall" CXXFLAGS=" -std=gnu++11"

As a result, c++ files are compiled without any optimizations:
Code:
hmage@dhmd:~/src/cpuminer-opt$ ./build.sh 2>&1 | fgrep g++
checking for g++... g++
checking whether g++ accepts -g... yes
checking dependency style of g++... gcc3
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing -I./compat/jansson -I. -Iyes/include  -std=gnu++11 -MT crypto/cpuminer-magimath.o -MD -MP -MF crypto/.deps/cpuminer-magimath.Tpo -c -o crypto/cpuminer-magimath.o `test -f 'crypto/magimath.cpp' || echo './'`crypto/magimath.cpp
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing -I./compat/jansson -I. -Iyes/include  -std=gnu++11 -MT algo/hodl/cpuminer-hodl.o -MD -MP -MF algo/hodl/.deps/cpuminer-hodl.Tpo -c -o algo/hodl/cpuminer-hodl.o `test -f 'algo/hodl/hodl.cpp' || echo './'`algo/hodl/hodl.cpp
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing -I./compat/jansson -I. -Iyes/include  -std=gnu++11 -MT algo/hodl/cpuminer-hodl_arith_uint256.o -MD -MP -MF algo/hodl/.deps/cpuminer-hodl_arith_uint256.Tpo -c -o algo/hodl/cpuminer-hodl_arith_uint256.o `test -f 'algo/hodl/hodl_arith_uint256.cpp' || echo './'`algo/hodl/hodl_arith_uint256.cpp
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing -I./compat/jansson -I. -Iyes/include  -std=gnu++11 -MT algo/hodl/cpuminer-hodl_uint256.o -MD -MP -MF algo/hodl/.deps/cpuminer-hodl_uint256.Tpo -c -o algo/hodl/cpuminer-hodl_uint256.o `test -f 'algo/hodl/hodl_uint256.cpp' || echo './'`algo/hodl/hodl_uint256.cpp
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing -I./compat/jansson -I. -Iyes/include  -std=gnu++11 -MT algo/hodl/cpuminer-hash.o -MD -MP -MF algo/hodl/.deps/cpuminer-hash.Tpo -c -o algo/hodl/cpuminer-hash.o `test -f 'algo/hodl/hash.cpp' || echo './'`algo/hodl/hash.cpp
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing -I./compat/jansson -I. -Iyes/include  -std=gnu++11 -MT algo/hodl/cpuminer-hmac_sha512.o -MD -MP -MF algo/hodl/.deps/cpuminer-hmac_sha512.Tpo -c -o algo/hodl/cpuminer-hmac_sha512.o `test -f 'algo/hodl/hmac_sha512.cpp' || echo './'`algo/hodl/hmac_sha512.cpp
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing -I./compat/jansson -I. -Iyes/include  -std=gnu++11 -MT algo/hodl/cpuminer-sha256.o -MD -MP -MF algo/hodl/.deps/cpuminer-sha256.Tpo -c -o algo/hodl/cpuminer-sha256.o `test -f 'algo/hodl/sha256.cpp' || echo './'`algo/hodl/sha256.cpp
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing -I./compat/jansson -I. -Iyes/include  -std=gnu++11 -MT algo/hodl/cpuminer-sha512.o -MD -MP -MF algo/hodl/.deps/cpuminer-sha512.Tpo -c -o algo/hodl/cpuminer-sha512.o `test -f 'algo/hodl/sha512.cpp' || echo './'`algo/hodl/sha512.cpp
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing -I./compat/jansson -I. -Iyes/include  -std=gnu++11 -MT algo/hodl/cpuminer-utilstrencodings.o -MD -MP -MF algo/hodl/.deps/cpuminer-utilstrencodings.Tpo -c -o algo/hodl/cpuminer-utilstrencodings.o `test -f 'algo/hodl/utilstrencodings.cpp' || echo './'`algo/hodl/utilstrencodings.cpp
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing -I./compat/jansson -I. -Iyes/include  -std=gnu++11 -MT cpuminer-uint256.o -MD -MP -MF .deps/cpuminer-uint256.Tpo -c -o cpuminer-uint256.o `test -f 'uint256.cpp' || echo './'`uint256.cpp

To fix this, you need to either duplicate contents of CFLAGS manually in CXXFLAGS, or use shell variables (the way I do it in my fork):
Code:
CFLAGS="-O3 -march=native -Wall" CXXFLAGS="$CFLAGS -std=gnu++11" ./configure --with-curl

bash expands $CFLAGS to the value assigned earlier on the same line, and configure then picks up CFLAGS and CXXFLAGS from its environment and uses them accordingly.

The problem is marginal, because it affects only hodl and magimath, and hodl is hand-optimized into AES anyway.

I did not benchmark the difference in performance yet, but plan to -- when back from work.
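A quick way to confirm whether a build is affected is to count the g++ compile lines that carry no optimization level. This is only a sketch, assuming bash and GNU-style grep. `count_unoptimized` is a hypothetical helper, and the two sample lines are shortened, made-up versions of the log output above.

```shell
#!/usr/bin/env bash
# Read a build log on stdin and print how many g++ compile lines
# lack an optimization flag such as -O1, -O2, -O3, -Os, -Og or -Ofast.
count_unoptimized() {
    grep -E '^g\+\+ ' | { grep -Evc -- ' -O([1-3sg]|fast)( |$)' || true; }
}

# Hypothetical, shortened compile lines for illustration.
log_bad='g++ -DHAVE_CONFIG_H -I. -std=gnu++11 -c -o uint256.o uint256.cpp'
log_good='g++ -DHAVE_CONFIG_H -I. -O3 -march=native -Wall -std=gnu++11 -c -o uint256.o uint256.cpp'

bad=$(printf '%s\n' "$log_bad" | count_unoptimized)
good=$(printf '%s\n' "$log_good" | count_unoptimized)

echo "unoptimized lines in bad log:  $bad"
echo "unoptimized lines in good log: $good"
```

Against a real build it could be fed with `./build.sh 2>&1 | count_unoptimized`. The "checking for g++" lines from configure are skipped because only lines that start with "g++ " are counted.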
legendary
Activity: 1470
Merit: 1114
@joblo cryptomining-blog have updated their original article for the 3.3.2 windows release.

http://cryptomining-blog.com/7900-windows-binaries-for-the-cpuminer-opt-3-3-cpu-miner/

Thanks, OP is updated with the link.
legendary
Activity: 1470
Merit: 1114
cpuminer-opt v3.3.2 is released.

https://drive.google.com/open?id=0B0lVSGQYLJIZczlQLV93RnUzV1U

- Fixed low difficulty shares mining hodl
- Changed minimum CPU requirement for AES optimized mining, now requires
  a CPU with AES and AVX1. This excludes first generation AES CPUs like
  Nehalem and similar AMD. These CPUs can still mine using SSE2 optimizations.
- Updated build.sh and README.md

Windows users who don't like to compile their own can wait for Cryptomining Blog
to provide precompiled Windows executables. I will post the link in the OP when
it becomes available.

Incorporated your changes into the git repository -- https://github.com/hmage/cpuminer-opt

As of now, there are no more source-level changes between my fork and yours -- https://github.com/hmage/cpuminer-opt/compare/upstream...master

Thanks, you may have noticed that I implemented unordered_map as you suggested a while back.
sr. member
Activity: 292
Merit: 250
@joblo cryptomining-blog have updated their original article for the 3.3.2 windows release.

http://cryptomining-blog.com/7900-windows-binaries-for-the-cpuminer-opt-3-3-cpu-miner/
member
Activity: 83
Merit: 10
cpuminer-opt v3.3.2 is released.

https://drive.google.com/open?id=0B0lVSGQYLJIZczlQLV93RnUzV1U

- Fixed low difficulty shares mining hodl
- Changed minimum CPU requirement for AES optimized mining, now requires
  a CPU with AES and AVX1. This excludes first generation AES CPUs like
  Nehalem and similar AMD. These CPUs can still mine using SSE2 optimizations.
- Updated build.sh and README.md

Windows users who don't like to compile their own can wait for Cryptomining Blog
to provide precompiled Windows executables. I will post the link in the OP when
it becomes available.

Incorporated your changes into the git repository -- https://github.com/hmage/cpuminer-opt

As of now, there are no more source-level changes between my fork and yours -- https://github.com/hmage/cpuminer-opt/compare/upstream...master
legendary
Activity: 1470
Merit: 1114
cpuminer-opt v3.3.2 is released.

https://drive.google.com/open?id=0B0lVSGQYLJIZczlQLV93RnUzV1U

- Fixed low difficulty shares mining hodl
- Changed minimum CPU requirement for AES optimized mining, now requires
  a CPU with AES and AVX1. This excludes first generation AES CPUs like
  Nehalem and similar AMD. These CPUs can still mine using SSE2 optimizations.
- Updated build.sh and README.md

Windows users who don't like to compile their own can wait for Cryptomining Blog
to provide precompiled Windows executables. I will post the link in the OP when
it becomes available.
legendary
Activity: 1470
Merit: 1114
AMD FX7600 (SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, SSE4A, XOP, AVX, FMA, FMA4, AES)

cpuminer v3.3.1 - 15-17 khash
Wolf0 - 33-35 khash

I will definitely follow up for v3.3.3, need to get the other fixes out first. Thanks.

This is precisely the issue I'm trying to solve, where to find the best miner for a specific
algo. I want cpuminer-opt to be the one stop shop for all CPU mining.

Thanks for that tip.
sr. member
Activity: 312
Merit: 250
AMD FX7600 (SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, SSE4A, XOP, AVX, FMA, FMA4, AES)

cpuminer v3.3.1 - 15-17 khash
Wolf0 - 33-35 khash
legendary
Activity: 1470
Merit: 1114
joblo, today I decided to test the built-in Magi XMG M7M algo and to my surprise the miner says the algo is not AES-NI.

Magi has had AES-NI support in cpuminer for at least a year, by Wolf0.

Here is the github link:
https://github.com/wolf9466/wolf-m7m-cpuminer

How does your performance compare?
sr. member
Activity: 312
Merit: 250
joblo, today I decided to test the built-in Magi XMG M7M algo and to my surprise the miner says the algo is not AES-NI.

Magi has had AES-NI support in cpuminer for at least a year, by Wolf0.

Here is the github link:
https://github.com/wolf9466/wolf-m7m-cpuminer
legendary
Activity: 1470
Merit: 1114
I have an update on a couple of issues that have existed for a while but are now better understood.

1. The definition of AES

The miner's AES optimized code also includes AVX instructions that are not included in the first generation
of AES CPUs, however the miner's CPU capabilities check uses the pure definition of AES (without AVX).
This results in the following message being displayed on startup on CPUs with AES but not AVX:

Code:
Rebuild with "-march=native" for better performance.

This is not an error, the miner will work, but not at AES performance levels. This message should be ignored
for Nehalem series CPUs and similar AMD. These CPUs can't mine using the miner's AES & AVX code.

In other words the miner will mine at the best rate possible for that CPU in spite of the suggestion otherwise.

The minimum requirement for AES mining will be changed to Sandybridge to conform with the actual limitations
of the AES mining code and the capabilities check will be changed to also check for AVX support in the CPU.

Update: I have changed the terminology to be more precise. It will now be called AES-AVX1 to more clearly
exclude CPUs that have AES but not AVX1. Also the rebuild warning will not be displayed for those CPUs.
Documentation will also be updated.

2. Low difficulty shares mining hodl

Since v3.2 hodl mining has started producing low difficulty shares at a rate of 1 or 2%. Prior to that there were
virtually no rejects. I may have found the problem and am testing a fix. At this point the first 170 shares have
been valid.

Update: I am optimistic the fix works. The first test run produced no low diff shares in 268 accepts

3. Hodl support on Windows for non AES-AVX1 CPUs. This is a more challenging issue and will not be fixed
in the next release.
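The combined check described in item 1 can be illustrated from the shell. This is a rough sketch and not the miner's own code, which does its detection in C. `has_flag` and `classify_cpu` are hypothetical helpers, the two flag lists are shortened examples, and the flag names assume the Linux /proc/cpuinfo spelling.

```shell
#!/usr/bin/env bash
# Succeed if the space-separated flag list $1 contains the exact flag $2.
has_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Pick a code path label: AES alone is not enough, AVX must be present too.
classify_cpu() {
    if has_flag "$1" aes && has_flag "$1" avx; then
        echo "AES-AVX1"
    elif has_flag "$1" sse2; then
        echo "SSE2"
    else
        echo "unsupported"
    fi
}

# Hypothetical flag lists, shortened for illustration.
westmere_like="fpu sse sse2 ssse3 sse4_1 sse4_2 aes"
sandybridge_like="fpu sse sse2 ssse3 sse4_1 sse4_2 aes avx"

classify_cpu "$westmere_like"
classify_cpu "$sandybridge_like"

# On Linux, classify the local CPU from the first "flags" line.
if [ -r /proc/cpuinfo ]; then
    local_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 || true)
    classify_cpu "$local_flags"
fi
```

A CPU with AES but no AVX falls through to the SSE2 path, which matches the behaviour described above for first generation AES CPUs.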

I will build a new release when these 2 issues are solved.

In the meantime users should ignore the warning to rebuild for faster performance and hodl users with Linux
can use v3.1.18 if the rejects are unacceptable. Hodl miners on Windows will have to live with the rejects until
the fix is released.

Note to Cryptomining Blog: You might want to drop the Westmere build for the next release as it seems to have
no practical use. A core2 build would expand the range of supported CPUs and would also work on Westmere.

Update:

It appears both issues are fixed. The fix to hodl was more of a trial-and-error process so I want to test longer
to raise the confidence level. I don't know why it broke and I don't know why the fix works.

The AES issue, on the other hand, is now well understood so I am confident in the fix although I don't have the
resources to test it properly.

An interesting contrast in processes.

V3.3.2 coming in a matter of hours.
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
the block version is only for the gbt mode, which is only compatible with bitcoin as far as i know... (remains from original cpuminer)

gbt rules vary too much per coin to handle in a multi-purpose miner
sr. member
Activity: 292
Merit: 250
I think this is your problem.

[2016-06-01 13:25:05] Unrecognized block version: 4

I think there is a new block version for hodl and believe there is a bounty for someone to write the code to support it.

Until then you'll have to mine in a pool.

Ok thanks.