
Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 41. (Read 444067 times)

legendary
Activity: 1470
Merit: 1114
It's my turn to ask for help. I'm fine-tuning the new Windows cross-compile environment
and I'd like to remove one ugly workaround, editing configure.ac.

This code should properly link pthreadGC2 when cross compiling:

Code:
# GC2 for GNU static
if test "x$OS" = "xWindows_NT" ; then
   # MinGW
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthreadGC2",[])
else
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthread",[])
fi

but the procedure requires the following edit:

Code:
# GC2 for GNU static
if test "x$OS" = "xWindows_NT" ; then
   # MinGW
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthreadGC2",[])
else
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthreadGC2",[])
fi

suggesting the logic isn't working.

Any suggestions to make the logic work?

perhaps moving it to the case clause starting on line 45?

That's kinda what I'm looking for but that tests $MINGW_TARGET and it returns Linux.
What I really want to test is the host as specified by "--host=x86_64-w64-mingw32"
but I can't figure out how.
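
One possible approach, as a sketch under assumptions: $OS is an environment variable of the machine running configure, so it is only Windows_NT on a native Windows build, and a cross compile from Linux always takes the else branch. Autoconf's AC_CANONICAL_HOST macro sets $host_os from --host (it should be mingw32 for x86_64-w64-mingw32), and a case on that value can pick the library. The function name pthread_lib_for_host_os below is made up so the logic can be tried in a plain shell; in configure.ac the case statement would go inline, somewhere after AC_CANONICAL_HOST.

```shell
# Sketch only: pick the pthread library from the OS part of the host triplet.
# In configure.ac, $host_os would come from AC_CANONICAL_HOST; here it is
# passed as an argument. pthread_lib_for_host_os is a hypothetical helper name.
pthread_lib_for_host_os() {
    case "$1" in
        mingw*) echo "-lpthreadGC2" ;;   # MinGW target, native or cross
        *)      echo "-lpthread" ;;      # everything else
    esac
}

pthread_lib_for_host_os mingw32      # prints -lpthreadGC2
pthread_lib_for_host_os linux-gnu    # prints -lpthread
```

The point of the design is that the test follows the target given by --host rather than whatever OS the build machine happens to run.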



member
Activity: 473
Merit: 18
It's my turn to ask for help. I'm fine-tuning the new Windows cross-compile environment
and I'd like to remove one ugly workaround, editing configure.ac.

This code should properly link pthreadGC2 when cross compiling:

Code:
# GC2 for GNU static
if test "x$OS" = "xWindows_NT" ; then
   # MinGW
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthreadGC2",[])
else
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthread",[])
fi

but the procedure requires the following edit:

Code:
# GC2 for GNU static
if test "x$OS" = "xWindows_NT" ; then
   # MinGW
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthreadGC2",[])
else
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthreadGC2",[])
fi

suggesting the logic isn't working.

Any suggestions to make the logic work?

perhaps moving it to the case clause starting on line 45?
legendary
Activity: 1470
Merit: 1114
It's my turn to ask for help. I'm fine-tuning the new Windows cross-compile environment
and I'd like to remove one ugly workaround, editing configure.ac.

This code should properly link pthreadGC2 when cross compiling:

Code:
# GC2 for GNU static
if test "x$OS" = "xWindows_NT" ; then
   # MinGW
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthreadGC2",[])
else
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthread",[])
fi

but the procedure requires the following edit:

Code:
# GC2 for GNU static
if test "x$OS" = "xWindows_NT" ; then
   # MinGW
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthreadGC2",[])
else
   AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS="-lpthreadGC2",[])
fi

suggesting the logic isn't working.

Any suggestions to make the logic work?
legendary
Activity: 1470
Merit: 1114
Benchmark and getwork.

Both are provided as is. Getwork seems to work with most algos but is not guaranteed. I don't even know
if the coin supports it. If you can find a miner that works I'll look into fixing it in cpuminer-opt.
newbie
Activity: 1
Merit: 0
Can someone ELI5 how to set up multi-algo switch mining with MiningPoolHub? I can't for the life of me figure out what to put in the bat file (or what exe to use). I'm using a Ryzen 5 1500X.
full member
Activity: 187
Merit: 100
Cryptocurrency enthusiast
Btw, I've found an issue while testing...

i3-7350k @ 4.2 GHz, HT on, 8Gb RAM @ DDR4-2400 dual channels, WIN 10

The test below produces no output!

cpuminer-aes-avx -a lyra2z330 -t 4 --cpu-affinity 15

I get 100% load on all 4 logical CPUs, then it drops to 0% after 2 minutes and stays there. The miner does not log any hashrate and does not crash either. It reproduces reliably: 10 out of 10 tests.

AVX2 build works okay with same settings.

Benchmark or pool?

Benchmark and getwork.

So I tested cryptonight with the latest 3.7.7 on my Ryzen 1800X and the speed is about 60 Kh/s per thread, this is with Hyper-threading disabled and using 7 threads (7 cores).
This was with cpuminer-aes-avx (so AES and AVX). Quite a bit lower than the 80-81 Kh/s per thread I get with xmr-stak. Is this normal speed these days or should I be getting higher with cpuminer?

Well, this is the same speed you get with xmr-stak or xmrig if you disable huge pages support for them (they get a higher hashrate with it enabled). As cpuminer-opt lacks huge pages support, that's what you get. So it's normal.

Another thing that could help cpuminer-opt is compiling with something other than gcc. Xmrig and xmr-stak both run faster when compiled with the Intel compiler or MSVC (up to 5-10% faster depending on CPU and memory). Cpuminer-opt may benefit more, as it has multiple algos that are bound by different things (not only L3 cache).

I personally use xmrig compiled with the trial Intel compiler from Parallel Studio XE 2018. It has a Linux version as well. I gave up trying to build cpuminer-opt as an MSVC solution (using the MSVC 2017 compiler), but somebody may want to try the Intel compiler under Linux: compile the same way joblo does, just with the Intel compiler. Of course I can't make anyone do it, and I'm a noob at Linux builds, but I'll try over the holidays.
legendary
Activity: 1470
Merit: 1114
Btw, I've found an issue while testing...

i3-7350k @ 4.2 GHz, HT on, 8Gb RAM @ DDR4-2400 dual channels, WIN 10

The test below produces no output!

cpuminer-aes-avx -a lyra2z330 -t 4 --cpu-affinity 15

I get 100% load on all 4 logical CPUs, then it drops to 0% after 2 minutes and stays there. The miner does not log any hashrate and does not crash either. It reproduces reliably: 10 out of 10 tests.

AVX2 build works okay with same settings.

Benchmark or pool?
full member
Activity: 187
Merit: 100
Cryptocurrency enthusiast
Btw, I've found an issue while testing...

i3-7350k @ 4.2 GHz, HT on, 8Gb RAM @ DDR4-2400 dual channels, WIN 10

The test below produces no output!

cpuminer-aes-avx -a lyra2z330 -t 4 --cpu-affinity 15

I get 100% load on all 4 logical CPUs, then it drops to 0% after 2 minutes and stays there. The miner does not log any hashrate and does not crash either. It reproduces reliably: 10 out of 10 tests.

AVX2 build works okay with same settings.
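
For anyone reproducing this: as far as I know --cpu-affinity takes a bitmask with one bit per logical CPU, so 15 (binary 1111) covers CPUs 0-3 and matches -t 4. A small sketch of that arithmetic; affinity_mask is a made-up helper, not part of cpuminer-opt.

```shell
# Hypothetical helper: mask with the lowest N bits set, i.e. CPUs 0..N-1.
affinity_mask() {
    echo $(( (1 << $1) - 1 ))
}

affinity_mask 4    # prints 15
```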
sr. member
Activity: 1246
Merit: 274
This is a reminder to Windows binaries users. I intend to make some changes in the next release
to reduce the number of binaries I have to build each release.

I would like to drop the sse42 (Nehalem) build. Nehalem users would be forced to use the sse2 build.
There is no sse42 targeted code in any algos that I'm aware of so there should be no difference in
performance. Please report if you have data that shows otherwise.

AVX2 performance on Ryzen is still questionable. It was suggested I provide an avx-sha build. I will provide
either an avx-sha or avx2-sha build. This will not apply to 4way as it requires AVX2. Ryzen users please
test avx vs avx2 (not 4way) to determine which gets the best performance. The best algo to test AVX2
is lyra2rev2 as it has the most AVX2 code.

Ryzen 1700X @ 3.8 using 8 threads mining Vertcoin.

AVX = ~139 Kh/s per thread

AVX2 = ~137 Kh/s per thread

I was doing a fair bit of multi-tasking when running the tests for about 5 minutes each, but the overall CPU load should have been the same for both tests. I can try testing on my other PC at 4.0 GHz but I expect the differences will be the same percentage-wise.
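
To put a number on "percentage-wise", here is a quick sketch of the relative difference between the two readings above. pct_diff is a made-up helper name and it relies on awk being available.

```shell
# Hypothetical helper: percent change from the first value to the second.
pct_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

pct_diff 139 137   # AVX -> AVX2 on the 1700X, prints -1.4
```

A gap that small is within what background load could cause, which is why longer runs on an idle system would be more convincing.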
legendary
Activity: 1470
Merit: 1114
I would like to drop the sse42 (Nehalem) build. Nehalem users would be forced to use the sse2 build.
There is no sse42 targeted code in any algos that I'm aware of so there should be no difference in
performance. Please report if you have data that shows otherwise.

Some results for: Pentium G4600 @ stock, DDR4-2400 @ dual channel, Win 10, stock 3.7.7 binaries from github. This CPU is not old, but lacks AVX/AVX2. System was clean and idle.

I used the benchmark option for all algos: cpuminer-* -a * -t 4 --cpu-affinity 15 --benchmark. These are the ones that showed a difference:

yescryptr16 - ~498.5 H/s SSE2, ~512 H/s SSE42
skunk - ~285 kH/s SSE2, ~295 kH/s SSE42
nist5 - ~450 kH/s SSE42, ~460 kH/s SSE2 (ooops)

I retested these suspicious results 5 times each, and they don't seem to be a mistake. Btw, the difference is very minor. Maybe the OS uses some of these (sse2/sse42) in the background, which gives minor speed changes. I don't know. Disassembly may help to see whether the binaries actually differ for these algos.

Your CPU has AES so you should use the aes-sse42 build. The issue really only applies to Nehalem and any possibly similar
AMD architecture.

I found a small difference in the yescrypt code but your results differ less than skunk which has identical code.
full member
Activity: 187
Merit: 100
Cryptocurrency enthusiast
I would like to drop the sse42 (Nehalem) build. Nehalem users would be forced to use the sse2 build.
There is no sse42 targeted code in any algos that I'm aware of so there should be no difference in
performance. Please report if you have data that shows otherwise.

Some results for: Pentium G4600 @ stock, DDR4-2400 @ dual channel, Win 10, stock 3.7.7 binaries from github. This CPU is not old, but lacks AVX/AVX2. System was clean and idle.

I used the benchmark option for all algos: cpuminer-* -a * -t 4 --cpu-affinity 15 --benchmark. These are the ones that showed a difference:

yescryptr16 - ~498.5 H/s SSE2, ~512 H/s SSE42
skunk - ~285 kH/s SSE2, ~295 kH/s SSE42
nist5 - ~450 kH/s SSE42, ~460 kH/s SSE2 (ooops)

No tests for hodl - miner exits w/o AES, expected by miner.

I retested these suspicious results 5 times each, and they don't seem to be a mistake. Btw, the difference is very minor. Maybe the OS uses some of these (sse2/sse42) in the background, which gives minor speed changes. I don't know. Disassembly may help to see whether the binaries actually differ for these algos.
legendary
Activity: 1470
Merit: 1114
Hey guys, first time trying CPU mining with my 3570k and I have weird problems. I downloaded CPU miner 3.7.7 and followed these instructions:

Make a shortcut or .bat file and use this command line, replace the [] with your own values :

[PATH_TO_CPU_MINER]\cpuminer-aes-avx2.exe -a lyra2z -o stratum+tcp://europe.lyra2z-hub.miningpoolhub.com:17025 -u [MPH_USERNAME].[WORKER] -p [PASSWORD]

The thing is I can't even manage to open the window; even when I try to open avx2.exe nothing happens. Am I missing something? I tried to create a bat file inside this folder too, set the path and all the addresses, and I still can't make this work at all. Any suggestions?



Use avx, not avx2, for your Ivy Bridge CPU. As previously mentioned, you can always see the error by bypassing
the bat file and typing the command at a command prompt.
newbie
Activity: 182
Merit: 0
Hey guys, first time trying CPU mining with my 3570k and I have weird problems. I downloaded CPU miner 3.7.7 and followed these instructions:

Make a shortcut or .bat file and use this command line, replace the [] with your own values :

[PATH_TO_CPU_MINER]\cpuminer-aes-avx2.exe -a lyra2z -o stratum+tcp://europe.lyra2z-hub.miningpoolhub.com:17025 -u [MPH_USERNAME].[WORKER] -p [PASSWORD]

The thing is I can't even manage to open the window; even when I try to open avx2.exe nothing happens. Am I missing something? I tried to create a bat file inside this folder too, set the path and all the addresses, and I still can't make this work at all. Any suggestions?



open a command prompt and type it by hand.  if there's an error, you'll see it.
newbie
Activity: 1
Merit: 0
Hey guys, first time trying CPU mining with my 3570k and I have weird problems. I downloaded CPU miner 3.7.7 and followed these instructions:

Make a shortcut or .bat file and use this command line, replace the [] with your own values :

[PATH_TO_CPU_MINER]\cpuminer-aes-avx2.exe -a lyra2z -o stratum+tcp://europe.lyra2z-hub.miningpoolhub.com:17025 -u [MPH_USERNAME].[WORKER] -p [PASSWORD]

The thing is I can't even manage to open the window; even when I try to open avx2.exe nothing happens. Am I missing something? I tried to create a bat file inside this folder too, set the path and all the addresses, and I still can't make this work at all. Any suggestions?

legendary
Activity: 1470
Merit: 1114
Another note to Ryzen users with Linux. I found a note that Linux kernel 4.10 added Ryzen support.
I don't know precisely what that means but if performance isn't what you expected check the kernel
version. Linux 4.10 was included in Ubuntu 17.04 and Fedora 26.
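
To check this on a running system, uname -r prints the kernel release. A rough sketch of comparing it against 4.10; kernel_at_least is a made-up helper name and it assumes the usual major.minor[.patch][-suffix] release format.

```shell
# Hypothetical helper: succeed if the release string ($1) is at least
# major ($2) . minor ($3). Only the first two numeric fields are compared.
kernel_at_least() {
    ver=${1%%-*}        # drop a suffix such as "-21-generic"
    major=${ver%%.*}    # text before the first dot
    rest=${ver#*.}      # text after the first dot
    minor=${rest%%.*}   # text before the next dot
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 4 10; then
    echo "kernel is 4.10 or newer"
else
    echo "kernel is older than 4.10"
fi
```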
hero member
Activity: 677
Merit: 500
This is a reminder to Windows binaries users. I intend to make some changes in the next release
to reduce the number of binaries I have to build each release.

I would like to drop the sse42 (Nehalem) build. Nehalem users would be forced to use the sse2 build.
There is no sse42 targeted code in any algos that I'm aware of so there should be no difference in
performance. Please report if you have data that shows otherwise.

AVX2 performance on Ryzen is still questionable. It was suggested I provide an avx-sha build. I will provide
either an avx-sha or avx2-sha build. This will not apply to 4way as it requires AVX2. Ryzen users please
test avx vs avx2 (not 4way) to determine which gets the best performance. The best algo to test AVX2
is lyra2rev2 as it has the most AVX2 code.

Ryzen 1700 at 3.7 GHz (16 threads)

cpuminer-avx.exe
~140 kH/s per thread.

cpuminer-avx2.exe
~133 kH/s per thread.

cpuminer-4way.exe
~133 kH/s per thread.

cpuminer-sse2.exe
~135 kH/s per thread


I think you're right about Ryzen and the poor avx2 performance.  Maybe compile with avx-sha?
legendary
Activity: 1470
Merit: 1114
This is a reminder to Windows binaries users. I intend to make some changes in the next release
to reduce the number of binaries I have to build each release.

I would like to drop the sse42 (Nehalem) build. Nehalem users would be forced to use the sse2 build.
There is no sse42 targeted code in any algos that I'm aware of so there should be no difference in
performance. Please report if you have data that shows otherwise.

AVX2 performance on Ryzen is still questionable. It was suggested I provide an avx-sha build. I will provide
either an avx-sha or avx2-sha build. This will not apply to 4way as it requires AVX2. Ryzen users please
test avx vs avx2 (not 4way) to determine which gets the best performance. The best algo to test AVX2
is lyra2rev2 as it has the most AVX2 code.
full member
Activity: 187
Merit: 100
Cryptocurrency enthusiast
What 16-core model is that exactly that has the same L2 size as L3? The new bronze-gold-platinum still has a higher L3 than L2.

It's not as "cool" as you may think - it's an AMD Opteron X16 6274. This one is old, yep, but it looks something like this:

http://www.pixic.ru/i/V0B1T4Y737K7s3v0.png

So, AVX and AES support is present. I have 2 of them; the only thing missing (for a complete set) is the dual-CPU motherboard, which costs $65 in China and can hold 5 GPUs on PCIe gen 2 as well.

This screenshot is of another model; mine have 8x2048 KB 16-way L2 and 16 MB 128-way L3. Memory is DDR3-1800 (4 sticks of 4 GB for each CPU), so it's going to be quad-channel.

It will be interesting to test; the sender is shipping the motherboard to me in early February 2018.

It seems that the AMD-specific intrinsics (http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf) are not used by cpuminer-opt, but I'm unsure they could give any speed gain. This will be my first AMD CPU after the 486-DX2-66 lol
legendary
Activity: 1470
Merit: 1114
If L2 cache is the same size as L3 there would be no use for L3.

Ryzen has 2 MB L2 per 4 core CCX, or 512 kB per core.

member
Activity: 473
Merit: 18
Can anybody tell me something about L2/L3 cache usage? E.g., I've got a 16-core CPU with 16 MB L2 and 16 MB L3. So, if the thread count and the cpuminer-opt option "--cpu-affinity" are both set correctly, will the faster L2 cache be used instead of L3 (e.g. for cryptonight, 2 MB per thread)? Or does the L2 size not matter?

L2 cache is faster than L3 and holds core-specific data, which prevents other cores from overwriting it every time.
L3 is shared among the cores and is slightly slower.
Theoretically, with 16 MB of L2 cache, 8 threads will provide the highest hash rate.
I'm not sure exactly how much L3 cache will help, but it will definitely speed up memory read operations.
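
The 8-thread figure is just the cache size divided by the per-thread working set (2 MB for cryptonight, per the question). A throwaway sketch of that estimate; threads_fitting_cache is a made-up name, and the real hash rate also depends on how the cache is shared between cores.

```shell
# Hypothetical helper: how many per-thread working sets fit in a cache.
# $1 = cache size in MB, $2 = working set per thread in MB.
threads_fitting_cache() {
    echo $(( $1 / $2 ))
}

threads_fitting_cache 16 2   # 16 MB L2, 2 MB per thread: prints 8
```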