
Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 2. (Read 444040 times)

legendary
Activity: 1470
Merit: 1114
Scam warning

A user is posting fake links to cpuminer-opt. Don't download.

The only real cpuminer-opt is here and only here:

https://github.com/JayDDee/cpuminer-opt

Thanks for the heads up. I am sure the link they are posting has a download filled with all kinds of holiday goodies intended to make his/her holidays more festive. It's a good reminder to always double check things before you click them, because even the best of us get caught slipping sometimes.

The POS tried to copy my ANN but couldn't even do that right. A real winner.

I reported it to a mod and it seems to have been deleted.
legendary
Activity: 1049
Merit: 1001
Thanks for the heads up. I am sure the link they are posting has a download filled with all kinds of holiday goodies intended to make his/her holidays more festive. It's a good reminder to always double check things before you click them, because even the best of us get caught slipping sometimes.
legendary
Activity: 1470
Merit: 1114
Scam warning

A user is posting fake links to cpuminer-opt. Don't download.

The only real cpuminer-opt is here and only here:

https://github.com/JayDDee/cpuminer-opt
legendary
Activity: 1470
Merit: 1114
Which version of AVX2 would you like to see? I think I have twenty of your previous versions benched, up to version 3.10.2, for AVX2 on this CPU.

Just use the latest release compiled for avx2. That will provide the most direct comparison. If you have Windows
it's already compiled for you. With Linux just compile with "-march=skylake" instead of "-march=native".

You can confirm that the SW features only list AVX2 but the CPU still lists AVX512.
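
A minimal sketch of that distinction, not code from cpuminer-opt. The compiler's predefined macros show what the build was allowed to use (what -march selected), while the GCC/Clang builtin __builtin_cpu_supports shows what the CPU itself reports. All function names here are made up for illustration, and the runtime check only works on x86 with GCC or Clang.

```c
#include <stdio.h>

/* 1 if this binary was compiled with AVX2 enabled (e.g. -march=skylake). */
int built_with_avx2(void)
{
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}

/* 1 if this binary was compiled with AVX512F enabled. */
int built_with_avx512f(void)
{
#ifdef __AVX512F__
    return 1;
#else
    return 0;
#endif
}

/* 1 if the CPU we are running on reports AVX2, whatever the build used. */
int cpu_has_avx2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return 0; /* not detectable this way on other compilers/architectures */
#endif
}

/* 1 if the CPU we are running on reports AVX512F. */
int cpu_has_avx512f(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx512f") ? 1 : 0;
#else
    return 0;
#endif
}

/* Print both views side by side (hypothetical format, not the miner's). */
void print_feature_report(void)
{
    printf("CPU: AVX2=%d AVX512F=%d\n", cpu_has_avx2(), cpu_has_avx512f());
    printf("SW:  AVX2=%d AVX512F=%d\n", built_with_avx2(), built_with_avx512f());
}
```

Built with -march=skylake and run on a 7820X, the SW line would be expected to show AVX512F=0 while the CPU line still shows AVX512F=1.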
sr. member
Activity: 703
Merit: 272

Here are my results so far with a 7820X... i've only benched for the pools shown.


Thanks for posting. It would be nice to compare with AVX2.


Which version of AVX2 would you like to see? I think I have twenty of your previous versions benched, up to version 3.10.2, for AVX2 on this CPU.
legendary
Activity: 1470
Merit: 1114

Here are my results so far with a 7820X... i've only benched for the pools shown.


Thanks for posting. It would be nice to compare with AVX2.

I'm seeing generally around a 30% increase in most X algos, as they are a mix of optimized
and unoptimized hash functions. Algos like lyra2v3, which are 100% optimized, are getting
nearly double.

It's too bad CPUs don't have a chance with those algos anymore.
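
The gap between roughly 30% and nearly double fits simple Amdahl-style arithmetic. A rough sketch, assuming the optimized functions run about 2x faster (taken from "nearly double") and that about half of the hashing time is spent in them. The 50% share is an illustrative guess, not a measurement, and overall_speedup is a made-up helper, not part of cpuminer-opt.

```c
#include <assert.h>

/*
 * Overall speedup of a chained algo when only part of its run time is
 * spent in functions that got faster.
 *   optimized_fraction: share of the original time in optimized functions (0..1)
 *   kernel_speedup:     how much faster those functions became (e.g. 2.0)
 */
double overall_speedup(double optimized_fraction, double kernel_speedup)
{
    assert(optimized_fraction >= 0.0 && optimized_fraction <= 1.0);
    assert(kernel_speedup > 0.0);
    return 1.0 / ((1.0 - optimized_fraction)
                  + optimized_fraction / kernel_speedup);
}
```

With these assumed numbers, overall_speedup(0.5, 2.0) is 1 / 0.75, about 1.33, which is in line with the 30% seen on the mixed X algos, while overall_speedup(1.0, 2.0) gives the full 2.0 for a 100% optimized algo.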
sr. member
Activity: 703
Merit: 272
cpuminer-opt-3.10.1 was just released. It fixes some bugs that can cause generally poor performance
without reporting any errors. All users should upgrade.

https://github.com/JayDDee/cpuminer-opt/releases

AVX512


Here are my results so far with a 7820X... i've only benched for the pools shown.

ps.. i'm mining ethash on two VII also at the same time




and here are the programs currently benched

legendary
Activity: 1470
Merit: 1114
The previous optimization request got me thinking. It raised concerns similar to
another request I resisted and raises an interesting question.

How far should a miner go for optimizing performance?

Should it modify system configuration?

Should the miner be required to run with root/admin privileges?

Two cases illustrate the issue. The first is the previous request: the miner makes a system
configuration change that will affect all applications, and it can't restore the original config itself.

The other case is huge pages. Huge pages requires system configuration changes as well
but only to enable the feature. It does not affect applications that don't explicitly use it.
But it requires the miner to be run by an administrator on Windows.

My opinion is these features may be appropriate on a dedicated mining system but maybe not
for a typical desktop PC.

The ideal would be to handle both environments transparently, but that takes a lot of work.

Automated config changes that affect everything and aren't automatically reversed are
completely unacceptable, IMO. If manual intervention is required to "undo", it should also
be required to "do".

My only concern is with the automation of the change and lack of automated reversal.
That has a simple solution. Don't do it in the miner.

HW prefetch changes should be done manually by the user before starting to mine, and then
undone when no longer mining.  It's completely up to the user which algos to use it with
and requires no complex logic in the miner.

Huge pages is not so risky but does have the issue of requiring the miner to be run by admin.
My other concern is the lack of transparency.

Huge pages should be completely transparent. The system should be smart enough to allocate
huge pages for large datasets. I don't see why any application changes should be required,
it should all happen behind the scenes in malloc. And it shouldn't require root/admin.

My stubbornness on this point may be part of the issue.
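
For what it's worth, Linux gets fairly close to this with transparent huge pages. A minimal sketch, assuming Linux with /sys/kernel/mm/transparent_hugepage/enabled set to "madvise" or "always". alloc_dataset and free_dataset are made-up names, and this is not how cpuminer-opt allocates memory. It needs no root, but the kernel treats the request as a hint only, and it does not address the Windows case.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Allocate a large anonymous buffer and ask the kernel to back it with
 * huge pages if it can. Returns NULL on failure.
 */
void *alloc_dataset(size_t bytes)
{
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
#ifdef MADV_HUGEPAGE
    /* Hint only: if it fails we simply keep the normal 4K pages. */
    (void)madvise(p, bytes, MADV_HUGEPAGE);
#endif
    return p;
}

/* Release a buffer obtained from alloc_dataset(). */
void free_dataset(void *p, size_t bytes)
{
    if (p != NULL)
        munmap(p, bytes);
}
```

Whether huge pages were actually used can be checked afterwards in the AnonHugePages line of /proc/meminfo.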

Both of these optimizations could help some algos and hurt others, so they have to be set for
each algo individually. With nearly 100 algos that's a huge task.

So aside from the technical concerns I don't know if it's worth the work.

Comments are welcome.





legendary
Activity: 1470
Merit: 1114
Good evening
Can MSR be implemented in your cpuminer-opt?
https://xmrig.com/docs/miner/randomx-optimization-guide/msr
Or is it just about RandomX.
Thank you

It looks interesting but I have lots of questions about it. I'm deep into AVX512 right now
so I'll follow up later.

It might be specific to RandomX (and probably cryptonight) because they were both designed
with specific cache usage in mind.

I assume the technique is to disable next line prefetching, which assumes sequential access.
RandomX won't need the next line due to its randomness, so it's a waste to prefetch it.

Edit:

It appears this optimization is specific to certain algorithms and could negatively impact others.
To implement it would require using it only on selected algos. The algos currently benefitting
are not supported by cpuminer-opt. It would be a lot of work to analyze which supported algos
might be helped.

I'm also concerned about the system impact. This kind of optimization may be appropriate for a
dedicated mining system but not for a multi purpose desktop. Changing the prefetch configuration
has system wide effect and will affect other applications positively or negatively, even when not mining.

There is no graceful way to undo the changes. Miners don't usually exit gracefully; Ctrl C is
the standard exit, or sometimes a crash. This would leave the system prefetch configuration
modified and would require manually restoring it.

I think I'll pass.

legendary
Activity: 1470
Merit: 1114
cpuminer-opt-3.10.1 was just released. It fixes some bugs that can cause generally poor performance
without reporting any errors. All users should upgrade.

https://github.com/JayDDee/cpuminer-opt/releases

AVX512 for blake2b, nist5, quark, tribus.

More broken lane fixes, fixed buffer overflow in skein AVX512, fixed
quark invalid shares AVX2.

Only the highest ranking feature in a class is listed at startup, lower ranking
features are available but no longer listed.

Edit: v3.10.3 is out with more AVX512
legendary
Activity: 1470
Merit: 1114
I previously asked if someone would be kind enough to do a test on a Ryzen 3xxx

Ryzen 5 3600 @ 4.2GHz (CPU Core Ratio - 42x, PBO is disabled) GCC 9.2.1:
Code:
blake2s:
        znver1  231.46 MH/s
        znver2  238.08 MH/s
          avx2  236.11 MH/s
           avx  236.09 MH/s

sha256t:
        znver1   61.44 MH/s
        znver2   61.69 MH/s
          avx2   46.25 MH/s
           avx   46.26 MH/s


Many thanks. They're not quite the results I expected; I was hoping AVX2 would be better.
SHA is clearly the winner over AVX2. That was expected given the AVX2 results.

I see no need for separate znver1 and znver2 packages; there is only a slight improvement
for AVX and AVX2.

I also see no need to override SHA until Intel CPUs with SHA become mainstream
with Icelake.
legendary
Activity: 2317
Merit: 2318
I previously asked if someone would be kind enough to do a test on a Ryzen 3xxx

Ryzen 5 3600 @ 4.2GHz (CPU Core Ratio - 42x, PBO is disabled) GCC 9.2.1:
Code:
blake2s:
        znver1  231.46 MH/s
        znver2  238.08 MH/s
          avx2  236.11 MH/s
           avx  236.09 MH/s

sha256t:
        znver1   61.44 MH/s
        znver2   61.69 MH/s
          avx2   46.25 MH/s
           avx   46.26 MH/s


legendary
Activity: 1470
Merit: 1114
I previously asked if someone would be kind enough to do a test on a Ryzen 3xxx
to compare AVX2 vs AVX performance. With Ryzen 1xxx AVX2 was often slower than AVX.

The results will help me decide how to deliver Windows binaries for Ryzen and whether
AVX2 should override SHA.

Currently only Ryzen has SHA so it's simple, use it if it's there because AVX2 is slow.
It gets more complicated when Intel releases Icelake with SHA for the desktop. AVX2 is
faster than SHA on Intel CPUs.
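
To make the question concrete, here is a sketch of what vendor dependent prioritization could look like. It is purely illustrative: the enum names and choose_sha256_impl are hypothetical, not cpuminer-opt code, and the ordering just encodes the two statements above (SHA first on Ryzen, AVX2 first on Intel).

```c
typedef enum { VENDOR_AMD, VENDOR_INTEL } cpu_vendor;
typedef enum { IMPL_BASELINE, IMPL_AVX2, IMPL_SHA } sha256_impl;

/*
 * Pick a SHA-256 code path from the features the CPU reports.
 * has_sha / has_avx2 are non-zero when the feature is present.
 */
sha256_impl choose_sha256_impl(cpu_vendor vendor, int has_sha, int has_avx2)
{
    if (vendor == VENDOR_INTEL) {
        /* Assumed from the post: AVX2 beats SHA on Intel, so it overrides. */
        if (has_avx2) return IMPL_AVX2;
        if (has_sha)  return IMPL_SHA;
    } else {
        /* Ryzen: use SHA whenever it is there because AVX2 is slow. */
        if (has_sha)  return IMPL_SHA;
        if (has_avx2) return IMPL_AVX2;
    }
    return IMPL_BASELINE; /* neither feature available */
}
```

The test results requested below decide whether the AMD branch should stay as it is for the 3xxx series.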

Which is faster on Ryzen 3xxx and does the new znver2 compile arch make a difference?

Requirements:

Any Ryzen or TR CPU from the 3xxx series.
A recent Linux distro.

Goal:

Compare AVX2 vs AVX performance on Ryzen 3000 series CPUs using blake2s algo.
Compare AVX2 vs SHA performance on Ryzen 3000 series CPUs using sha256t algo.
Determine if the new znver2 compile arch has an effect on the results.
Determine if Intel and Ryzen need to prioritize features differently.

Procedure:

1. Compile separate builds for znver1, znver2, avx2 and avx.

Code:
./autogen.sh
CFLAGS="-O3 -march=znver1 -Wall" ./configure --with-curl
make -j 4
mv cpuminer cpuminer-znver1
make clean
CFLAGS="-O3 -march=znver2 -Wall" ./configure --with-curl
make -j 4
mv cpuminer cpuminer-znver2
make clean
CFLAGS="-O3 -march=core-avx2 -maes -Wall" ./configure --with-curl
make -j 4
mv cpuminer cpuminer-avx2
make clean
CFLAGS="-O3 -march=corei7-avx -maes -Wall" ./configure --with-curl
make -j 4
mv cpuminer cpuminer-avx

2. Do a blake2s benchmark on each build. 5 minutes each should be enough
to produce a stable hash rate.

Code:
./cpuminer-znver1 -a blake2s --benchmark --hash-meter
./cpuminer-znver2 -a blake2s --benchmark --hash-meter
./cpuminer-avx2 -a blake2s --benchmark --hash-meter
./cpuminer-avx -a blake2s --benchmark --hash-meter

3. Repeat the tests with sha256t.

4. Post your results including CPU model, GCC version and the stable total hash rate
for each test.

Thanks in advance, the results will help ensure optimum performance on Ryzen CPUs.
legendary
Activity: 1470
Merit: 1114
cpuminer-opt now supports AVX-512. It comes after many delays: AVX-512 was supposed to be
generally available 3 years ago with Intel Cannon Lake.

AVX-512 is currently available on Intel Skylake-X and the newly released Cascadelake-X
CPUs. It is also available on Icelake but only for mobile CPUs.

It looks like AVX512 will finally be released for mainstream desktops in 2020.
I'm not aware of plans to add AVX512 to AMD Ryzen CPUs.

Algos will be optimized gradually over the next few releases. First up are argon2d, blake2s,
keccak, keccakc, skein and skein2.

https://github.com/JayDDee/cpuminer-opt/releases/tag/v3.10.0
legendary
Activity: 1470
Merit: 1114
Some notes about peculiarities using GCC 9 that affect cpuminer-opt
and may be of interest to developers.

1. It produces more warnings about array bounds, found some violations
in cpuminer-opt that will be fixed in the next release.

2. It no longer includes AES in "-march=core-avx2", need to add aes
manually: "-march=core-avx2 -maes".  

3. It doesn't rebuild Makefile.in after removing a source file from Makefile.am.
The compiler still looked for the deleted file. It was necessary to edit Makefile.in
manually to remove all references to the deleted file. Will follow up.

Edit: I was missing automake, didn't need it until I changed Makefile.am
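
Regarding point 2, one way to catch a build that lost AES is to test the compiler's predefined macros. A small sketch, assuming GCC or Clang, which define __AVX2__ and __AES__ when those instruction sets are enabled. avx2_build_missing_aes is a made-up helper name, not something in cpuminer-opt.

```c
/*
 * Returns 1 if this file was compiled with AVX2 enabled but without AES,
 * which is what "-march=core-avx2" alone produces with GCC 9 according
 * to the note above. Returns 0 otherwise.
 */
int avx2_build_missing_aes(void)
{
#if defined(__AVX2__) && !defined(__AES__)
    return 1;
#else
    return 0;
#endif
}
```

A build with "-march=core-avx2 -maes" should return 0 here regardless of the GCC version.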


For the time being I will continue to use GCC 7 for development and production of
the Windows binaries.

legendary
Activity: 1470
Merit: 1114
AVX512 is coming soon.

I've been waiting over a year for a reasonably priced CPU with AVX512. With
the price drops for Cascade Lake X I was thinking of getting one.
Instead I got an Ice Lake laptop for less than the cheapest Cascade Lake X CPU.
But I won't be using it to mine, just develop.

Ice Lake also has VAES which can also speed up algos that use AES.

I'm setting it up now, had a problem with the NVME SSD being recognized by Ubuntu.
Got an external SSD and all was fine. I would have liked to use the built-in SSD but I have
a feeling the BIOS is preventing it. Some tips suggest changing it to AHCI but there are
no BIOS options for the SSD. Another possibility is a lack of SATA support. I may revisit that
issue later, the external SSD works fine and it leaves the system otherwise untouched.
If anyone has some ideas I'll check them out.

AVX512 will be rolled out gradually over the next several releases with single function algos
optimized first and the longer chained algos later. VAES will come much later.

Stay tuned.
sr. member
Activity: 703
Merit: 272
@joblo
Any chance of getting verushash algo added?
legendary
Activity: 1470
Merit: 1114
While testing some blake2s code for x25x in v3.9.11 I noticed some peculiar results.
I made some changes to the AVX code which improve performance on my Skylake
but the same changes slowed my Ryzen 1700.

Ryzen ver1 is known to have poor AVX2 performance but I have no idea about
ver2.

If anyone has done some comparison testing of Ryzen AVX vs AVX2, or would like to
do some testing please post your results. It will help me decide how to proceed
particularly with the Windows binaries package.

I would like to know whether AMD has improved AVX2 in zen2 and whether the compiler
makes a difference. znver2 is supported starting in GCC 9.

Blake2s is a good test algo for AVX and AVX2. There's no profit to be made mining blake2s
with a CPU but a benchmark test will do.

./cpuminer -a blake2s --benchmark --hash-meter

TIA.
sr. member
Activity: 445
Merit: 255
Thank you very much for all the work you have done with cpuminer. Funny, I was thinking you were a young programmer motivated by improving his cpu architecture understanding and close to machine coding skills. Well your example shows me I am not too old to improve my coding skill and may motivate me to do this.

