
Topic: [ANN] cpuminer-opt v3.14.2, open source optimized multi-algo CPU miner - page 4. (Read 9705 times)

full member
Activity: 1372
Merit: 216
Info about GCC support for znver4...

https://www.phoronix.com/news/AMD-Zen-4-Znver4-GCC-Enable

Quote
So what this amounts to at this point is getting -march=native working for Zen 4, honoring -march=znver4, and then over the Znver3 target just flipping on AVX512F, AVX512DQ, AVX512IFMA, AVX512CD, AVX512BW, AVX512VL, AVX512BF16, AVX512VBMI, AVX512VBMI2, GFNI, AVX512VNNI, AVX512BITALG, and AVX512VPOPCNTDQ.

The current advice for compiling is to use -march=znver3 and add the AVX512 extensions manually. For the prebuilt Windows binaries, the avx512-sha-vaes build will provide near-optimal performance.
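
A minimal sketch of what "add the AVX512 extensions manually" could look like as a compiler flag string. The extension list is taken from the Phoronix quote above. The function name `znver4_cflags` is hypothetical, and the sketch assumes each GCC option is `-m` followed by the lowercase extension name. It is not taken from the cpuminer-opt build scripts.

```python
# Extensions named in the Phoronix quote as the znver4 additions over znver3.
ZNVER4_EXTRAS = [
    "AVX512F", "AVX512DQ", "AVX512IFMA", "AVX512CD", "AVX512BW",
    "AVX512VL", "AVX512BF16", "AVX512VBMI", "AVX512VBMI2", "GFNI",
    "AVX512VNNI", "AVX512BITALG", "AVX512VPOPCNTDQ",
]


def znver4_cflags(extras=ZNVER4_EXTRAS, base="znver3"):
    """Return a CFLAGS string: -march=<base> plus one -m<ext> flag per extension.

    Assumption: each GCC option is "-m" followed by the lowercase extension name.
    """
    flags = [f"-march={base}"] + [f"-m{name.lower()}" for name in extras]
    return " ".join(flags)


if __name__ == "__main__":
    # Print the full flag string for pasting into a CFLAGS setting.
    print(znver4_cflags())
```

Passing a shorter list, for example `znver4_cflags(["AVX512F", "AVX512DQ", "AVX512BW", "AVX512VL"])`, yields a smaller flag set if only some extensions are wanted.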
sr. member
Activity: 703
Merit: 272
OK Mate ...

This week :)

Hope You are well.

#crysx #cwi

All is well. cpuminer-opt development is winding down; I'm running out of ideas. I have one more release coming that adds some new optimizations
and propagates existing optimizations to another algo or two that don't yet have ASICs. After that I don't know.

If you want to talk, send me an email or PM.

----------

Before the next release I'd still like some data for zen4 in case the builds need to be tweaked.


Here's my 7950X data for the AVX2, AVX512, and AVX512-sha-vaes builds, plus screenshots from CPU-Z and Core Temp. The 7950X settings are all stock... no overclocks.

https://imgur.com/a/YvaVUQa
full member
Activity: 1372
Merit: 216
OK Mate ...

This week :)

Hope You are well.

#crysx #cwi

All is well. cpuminer-opt development is winding down; I'm running out of ideas. I have one more release coming that adds some new optimizations
and propagates existing optimizations to another algo or two that don't yet have ASICs. After that I don't know.

If you want to talk, send me an email or PM.

----------

Before the next release I'd still like some data for zen4 in case the builds need to be tweaked.
legendary
Activity: 2814
Merit: 1091
--- ChainWorks Industries ---
Hi Mate ...

Let's chat!

Skype is good for Me - I would like to catch up as soon as you are available.

#crysx #cwi

No Skype here, just PM or email.

OK Mate ...

This week :)

Hope You are well.

#crysx #cwi
full member
Activity: 1372
Merit: 216
Hi Mate ...

Let's chat!

Skype is good for Me - I would like to catch up as soon as you are available.

#crysx #cwi

No Skype here, just PM or email.
legendary
Activity: 2814
Merit: 1091
--- ChainWorks Industries ---
@JayDDee

Has anyone benched any of the AMD 7000 series processors yet with cpuminer-opt?

I plan on picking up a 7950 or 7900 within the next couple of days.

Are there any results you want to see?

Thanks for the offer. I hope to get one before the end of the year.

The big addition to 7000 is AVX512 but I haven't found any specifics about which AVX512 extensions are included. VNNI is the only one I've seen
mentioned because it's for AI and AI is hot right now, but it's of no use in cpuminer-opt. I assume DQ, BW & VL are included to be compatible with the
Intel Skylake-AVX-512 I use as a baseline reference.

The first thing would be to confirm the AVX512-SHA-VAES Windows build works on Ryzen 7000. Also confirm build options for GCC
(Linux or Windows) because zen4 isn't supported as of GCC-12. I've added some notes to the Wiki: https://github.com/JayDDee/cpuminer-opt/wiki/Compiling-from-source

I have read that Ryzen's implementation of AVX512 is "double pumped", which I assume is the same as the AVX2 implementation from Zen1.
This will limit the gains of using larger vectors, as was the case with Zen1 AVX2. The reason given this time was to reduce the clock penalty of
using larger vectors. It would be interesting to compare with and without AVX512. In some cases AVX2 may be faster.

The cache is larger so cache-dependent algos may support more threads before cache overflow starts reducing performance.

Also keep a close eye on temperatures; I have read that zen4 is difficult to cool.

Edit: additional request from a Linux user: cat /proc/cpuinfo to get the list of AVX512 extensions.


Hi Mate ...

Let's chat!

Skype is good for Me - I would like to catch up as soon as you are available.

#crysx #cwi
sr. member
Activity: 703
Merit: 272

Here are the results.

https://imgur.com/a/YvaVUQa

Interesting, but it seems focused on profitability, which we all know is bad news. Most of those algos are a waste of time.
I was hoping for comparisons of AVX512 vs AVX2. Is that a 7950 or 7900? Some of the results are confusing.

The avx512-sha-vaes build was not used on all algos; do you know why?

Can you provide the miner's startup messages showing the CPU features?

Edit: here's some data from my i9-9940x 14/28 core CPU for comparison of AVX512.

Code:
             9940x       79?0
keccak        248Mh/s    309Mh/s     25%
bmw512        336Mh/s    471Mh/s     40%
skein2        172Mh/s    256Mh/s     48%
blake2s       1.1Gh/s    1.6Gh/s     45%
average gain   42%

I can get you a list for just those two later... it's bedtime.
full member
Activity: 1372
Merit: 216

Here are the results.

https://imgur.com/a/YvaVUQa

Interesting, but it seems focused on profitability, which we all know is bad news. Most of those algos are a waste of time.
I was hoping for comparisons of AVX512 vs AVX2. Is that a 7950 or 7900? Some of the results are confusing.

The avx512-sha-vaes build was not used on all algos; do you know why?

Can you provide the miner's startup messages showing the CPU features?

Edit: here's some data from my i9-9940x 14/28 core CPU for comparison of AVX512.

Code:
             9940x       79?0
keccak        248Mh/s    309Mh/s     25%
bmw512        336Mh/s    471Mh/s     40%
skein2        172Mh/s    256Mh/s     48%
blake2s       1.1Gh/s    1.6Gh/s     45%
average gain   42%
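
For anyone wanting to redo the comparison with their own numbers, here is a rough sketch of the percentage-gain arithmetic. The hashrates are copied from the table as posted (rounded), with blake2s converted to Mh/s; the names `RATES_MHS`, `gain_percent`, and `gains` are hypothetical. Because the inputs are rounded, recomputed figures can differ by a few points from the posted percentages, which were presumably derived from unrounded hashrates.

```python
# Hashrates copied from the posted table, in Mh/s: (9940x, 79?0).
# blake2s was posted in Gh/s and is converted here (1.1 Gh/s -> 1100 Mh/s).
RATES_MHS = {
    "keccak": (248.0, 309.0),
    "bmw512": (336.0, 471.0),
    "skein2": (172.0, 256.0),
    "blake2s": (1100.0, 1600.0),
}


def gain_percent(baseline, candidate):
    """Relative gain of candidate over baseline, in percent."""
    return (candidate / baseline - 1.0) * 100.0


def gains(rates=RATES_MHS):
    """Per-algo gain of the second column over the first, in percent."""
    return {algo: gain_percent(a, b) for algo, (a, b) in rates.items()}


if __name__ == "__main__":
    per_algo = gains()
    for algo, g in per_algo.items():
        print(f"{algo:8s} {g:5.1f}%")
    # Simple arithmetic mean of the per-algo gains.
    print(f"mean     {sum(per_algo.values()) / len(per_algo):5.1f}%")
```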
sr. member
Activity: 703
Merit: 272
I'm running Windows 11, so /proc/cpuinfo is currently not doable... that I know of.

CPU-Z will do it on Windows.

Nice setup!

Some algos of interest for performance testing:

myr-gr, groestl: heavy VAES, some AVX512
allium: heavy AVX512, some VAES
skein2, blake2b, keccak, sha256t: pure AVX512
SHA is not used much because parallel Nway is always faster.

myr-gr uses some AVX2 with the AVX512 build; groestl is pure VAES+AVX512.
sha256d is also pure AVX512, but the AVX2 version uses Pooler ASM code so it can't be used for comparison.

Here are the results.

https://imgur.com/a/YvaVUQa
full member
Activity: 1372
Merit: 216
I'm running Windows 11, so /proc/cpuinfo is currently not doable... that I know of.

CPU-Z will do it on Windows.

Nice setup!

Some algos of interest for performance testing:

myr-gr, groestl: heavy VAES, some AVX512
allium: heavy AVX512, some VAES
skein2, blake2b, keccak, sha256t: pure AVX512
SHA is not used much because parallel Nway is always faster.

myr-gr uses some AVX2 with the AVX512 build; groestl is pure VAES+AVX512.
sha256d is also pure AVX512, but the AVX2 version uses Pooler ASM code so it can't be used for comparison.
sr. member
Activity: 703
Merit: 272
@JayDDee

Has anyone benched any of the AMD 7000 series processors yet with cpuminer-opt?

I plan on picking up a 7950 or 7900 within the next couple of days.

Are there any results you want to see?

Thanks for the offer. I hope to get one before the end of the year.

The big addition to 7000 is AVX512 but I haven't found any specifics about which AVX512 extensions are included. VNNI is the only one I've seen
mentioned because it's for AI and AI is hot right now, but it's of no use in cpuminer-opt. I assume DQ, BW & VL are included to be compatible with the
Intel Skylake-AVX-512 I use as a baseline reference.

The first thing would be to confirm the AVX512-SHA-VAES Windows build works on Ryzen 7000. Also confirm build options for GCC
(Linux or Windows) because zen4 isn't supported as of GCC-12. I've added some notes to the Wiki: https://github.com/JayDDee/cpuminer-opt/wiki/Compiling-from-source

I have read that Ryzen's implementation of AVX512 is "double pumped", which I assume is the same as the AVX2 implementation from Zen1.
This will limit the gains of using larger vectors, as was the case with Zen1 AVX2. The reason given this time was to reduce the clock penalty of
using larger vectors. It would be interesting to compare with and without AVX512. In some cases AVX2 may be faster.

The cache is larger so cache-dependent algos may support more threads before cache overflow starts reducing performance.

Also keep a close eye on temperatures; I have read that zen4 is difficult to cool.

Edit: additional request from a Linux user: cat /proc/cpuinfo to get the list of AVX512 extensions.


I just started benchmarking... I plan on doing AVX2, AVX2-sha, AVX2-sha-vaes, AVX512, and AVX512-sha-vaes.

The system is:
OS:     Win11 Pro 64-bit
Case:   Lian Li 011 Dynamic Evo mid-tower (intake fans on the bottom, exhaust fans on top with a top-mounted radiator)
CPU:    AMD 7950X (stock clocks, custom water-cooled: 360 radiator with fans in a push configuration at the top of the case; Bitspower waterblock)
MB:     Asrock X670E Taichi
RAM:    64GB (2x32) Crucial DDR5-4800
HD:     Samsung 980PRO SSD (fan cooled)
GPU:    AMD W5700 Pro (blower fan to exhaust out the back; it also lets me connect its USB-C video out to the monitor's USB-C video in)
        AMD 6900X (will put on the water loop eventually)
PSU:    Corsair HX1200

Currently only the CPU is on the water loop.
Idle CPU temps are 40C (in BIOS). Under load the CPU is 85-95C (it never goes over 95C; radiator exhaust temps measured with a USB-attached FLIR ONE are around 51C) and draws 151-190 W, mostly around 170 W.

I'll be benching against available algos on zergpool, zpool, and nicehash.

I'm running Windows 11, so /proc/cpuinfo is currently not doable... that I know of.

member
Activity: 325
Merit: 42
JayzTwoCents https://www.youtube.com/watch?v=tzm5pFq7ol0&t=909s has some thoughts on the temps.
Apparently 95C is by design and it will not go over it.


They also raised the power limit. Including the gimped AVX512 it all seems fishy. Apparently the CPUs will throttle the clock to manage
temps the way laptops do. The advertised clock rates, a significant part of the hype, may not be achievable without a monster cooler.

I look forward to some real results.

Or do something like what Optimum Tech did https://www.youtube.com/watch?v=FaOYYHNGlLs, undervolting in another way.
full member
Activity: 1372
Merit: 216
JayzTwoCents https://www.youtube.com/watch?v=tzm5pFq7ol0&t=909s has some thoughts on the temps.
Apparently 95C is by design and it will not go over it.


They also raised the power limit. Including the gimped AVX512 it all seems fishy. Apparently the CPUs will throttle the clock to manage
temps the way laptops do. The advertised clock rates, a significant part of the hype, may not be achievable without a monster cooler.

I look forward to some real results.
member
Activity: 325
Merit: 42
JayzTwoCents https://www.youtube.com/watch?v=tzm5pFq7ol0&t=909s has some thoughts on the temps.
Apparently 95C is by design and it will not go over it.
full member
Activity: 1372
Merit: 216
@JayDDee

Has anyone benched any of the AMD 7000 series processors yet with cpuminer-opt?

I plan on picking up a 7950 or 7900 within the next couple of days.

Are there any results you want to see?

Thanks for the offer. I hope to get one before the end of the year.

The big addition to 7000 is AVX512 but I haven't found any specifics about which AVX512 extensions are included. VNNI is the only one I've seen
mentioned because it's for AI and AI is hot right now, but it's of no use in cpuminer-opt. I assume DQ, BW & VL are included to be compatible with the
Intel Skylake-AVX-512 I use as a baseline reference.

The first thing would be to confirm the AVX512-SHA-VAES Windows build works on Ryzen 7000. Also confirm build options for GCC
(Linux or Windows) because zen4 isn't supported as of GCC-12. I've added some notes to the Wiki: https://github.com/JayDDee/cpuminer-opt/wiki/Compiling-from-source

I have read that Ryzen's implementation of AVX512 is "double pumped", which I assume is the same as the AVX2 implementation from Zen1.
This will limit the gains of using larger vectors, as was the case with Zen1 AVX2. The reason given this time was to reduce the clock penalty of
using larger vectors. It would be interesting to compare with and without AVX512. In some cases AVX2 may be faster.

The cache is larger so cache-dependent algos may support more threads before cache overflow starts reducing performance.

Also keep a close eye on temperatures; I have read that zen4 is difficult to cool.

Edit: additional request from a Linux user: cat /proc/cpuinfo to get the list of AVX512 extensions.
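
For Linux users, a small sketch that pulls just the AVX512 entries out of the `flags` line in /proc/cpuinfo, which is easier to paste into a post than the full output. The function name `avx512_flags` is hypothetical, and the sketch assumes the x86 layout where feature flags appear on a line starting with `flags` followed by a colon.

```python
def avx512_flags(cpuinfo_text):
    """Return the sorted AVX512 feature flags from /proc/cpuinfo-style text.

    Only the first "flags" line is read; on a normal system every logical
    CPU reports the same list.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return sorted(f for f in set(value.split()) if f.startswith("avx512"))
    return []


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            found = avx512_flags(fh.read())
        print(" ".join(found) or "no AVX512 flags reported")
    except OSError:
        # Not Linux, or /proc is not readable.
        print("/proc/cpuinfo not available on this system")
```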
member
Activity: 325
Merit: 42
@JayDDee

Has anyone benched any of the AMD 7000 series processors yet with cpuminer-opt?

I plan on picking up a 7950 or 7900 within the next couple of days.

Are there any results you want to see?

SOAT https://www.youtube.com/watch?v=DYD0X2_b_eg
Testing is done with xmrig.
sr. member
Activity: 703
Merit: 272
@JayDDee

Has anyone benched any of the AMD 7000 series processors yet with cpuminer-opt?

I plan on picking up a 7950 or 7900 within the next couple of days.

Are there any results you want to see?
full member
Activity: 1372
Merit: 216
MinotaurX is one of the top 5 in profitability at ZergPool.  As I type this, it is #2, with 3 coins being served. 

Profitability is not really the issue at this point, even if it could be sustained. The killer is yespower; when it's present it dominates
everything else. There's no point in optimizing the X16R part because it won't help the overall performance. Many of the X16R
optimizations can't be applied to this implementation anyway. There's also a unique GBT protocol change that would be difficult
to test.

I believe there are at least 2 optimized CPU miners that support minotaurx already. They likely already have any optimizations
I could add.

I'm simply not interested.
legendary
Activity: 1793
Merit: 1028
MINOTAURX --

First, thank you for the rapid response.  MinotaurX is one of the top 5 in profitability at ZergPool.  As I type this, it is #2, with 3 coins being served. 

Check it out.  The code is already out there, you are the coder with assembly skills.

--scryptr
full member
Activity: 1372
Merit: 216
It's a long story why I didn't support minotaurx. It mostly has to do with how quickly minotaur was abandoned by the same people
who asked me to support it, then asked me to support minotaurx. I'm no one's unpaid employee.

Minotaurx is essentially a slow X16R (like Hex) + Yespower, yet it pays less per hash than Yespower alone, so I don't get where this high
rating came from.