
Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 189. (Read 444131 times)

full member
Activity: 231
Merit: 150
Forgot to mention the numbers I last posted are from mining x11. More numbers coming; I'll just add them to this post.

System: EVGA SR2 Intel x2 OC 3.3GHz 21 cores in VB

x15.usa.nicehash.com:3339
[2016-01-26 03:37:10] accepted: 5/5 (100.00%), 412.61 kH/s yes!
[2016-01-26 03:37:17] CPU #15: 19.77 kH/s
[2016-01-26 03:37:17] accepted: 6/6 (100.00%), 412.80 kH/s yes!
[2016-01-26 03:37:30] CPU #0: 19.72 kH/s
[2016-01-26 03:37:30] accepted: 7/7 (100.00%), 413.08 kH/s yes!
[2016-01-26 03:37:33] CPU #6: 19.66 kH/s
[2016-01-26 03:37:33] accepted: 8/8 (100.00%), 413.08 kH/s yes!
[2016-01-26 03:37:35] CPU #18: 19.68 kH/s
[2016-01-26 03:37:37] Stratum difficulty set to 0.01
[2016-01-26 03:37:37] x15.usa.nicehash.com:3339 x15 block 85823

quark.usa.nicehash.com:3345
[2016-01-26 03:40:46] accepted: 2/2 (100.00%), 881.24 kH/s yes!
[2016-01-26 03:40:59] CPU #2: 42.48 kH/s
[2016-01-26 03:40:59] accepted: 3/3 (100.00%), 880.33 kH/s yes!
[2016-01-26 03:41:03] CPU #11: 42.58 kH/s
[2016-01-26 03:41:03] accepted: 4/4 (100.00%), 881.15 kH/s yes!
[2016-01-26 03:41:10] Stratum difficulty set to 0.02

x13.usa.nicehash.com:3337
[2016-01-26 10:20:38] accepted: 159/159 (100.00%), 480.78 kH/s yes!
[2016-01-26 10:21:23] x13.usa.nicehash.com:3337 x13 block 875258


Oh well, looks like the NiceHash server went offline. I'll do more tests at a later time.
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
Is it correct that "SSE2: No"?
AFAICS, it runs fast anyway.

Checking CPU capatibility...
        Intel(R) Core(TM) i5 CPU         760  @ 2.80GHz
AES_NI: No.
SSE2: No, start mining without optimizations...

[2016-01-26 10:17:55] Starting Stratum on stratum+tcp://hashpower.co:3533
[2016-01-26 10:17:56] 4 miner threads started, using 'x11' algorithm.
[2016-01-26 10:17:56] Stratum difficulty set to 0.016
[2016-01-26 10:17:56] hashpower.co:3533 x11 block 1821843
[2016-01-26 10:17:56] hashpower.co:3533 x11 block 108239
[2016-01-26 10:18:01] CPU #1: 57.09 kH/s
[2016-01-26 10:18:01] CPU #0: 56.81 kH/s
[2016-01-26 10:18:01] CPU #2: 53.73 kH/s
[2016-01-26 10:18:02] CPU #3: 52.40 kH/s
[2016-01-26 10:18:18] CPU #0: 56.23 kH/s
[2016-01-26 10:18:18] CPU #3: 53.29 kH/s
[2016-01-26 10:18:18] CPU #1: 56.55 kH/s
[2016-01-26 10:18:18] CPU #2: 56.11 kH/s
legendary
Activity: 2716
Merit: 1094
Black Belt Developer

Cool. I didn't mean Slack was obsolete, just that people who like it are old school.
I've played with it but never did any real work. At one time I had 8 different distros
multibooted on 2 20GB HDDs running on a Pentium 1. Now they're all in VMs.

A guy I used to work with was a Slackware fan. Give him the keyboard, toss the mouse,
and he could do magic. He was also pretty sharp in networking, knew his protocols
inside out.

I started using Linux with Slackware 2.0 :-)
full member
Activity: 231
Merit: 150
Numbers are looking very good on the 3 I have set up with v3.0.3.
2xSR2 have 24 cores if you count HT cores; 1xSR2 has 20 cores counting HT. "E5620+E5645 CPU"

3.2GHz 1002 kH/s, 21 of 24 cores used in the command line, VB set to 24 cores. No mining video cards.

3.33GHz 924.80 kH/s, 21 of 24 cores used in the command line, VB set to 22 cores. 2x R270, which is why I left 2 unused by the VM. << should get slightly better numbers once I set this one to the full 24 cores.

3.6GHz 790 kH/s, 17 of 20 cores used in the command line, VB set to 20 cores. 2x R270x

I've found that odd numbers with some free cores seem to give just a little better kH/s.
The free cores help keep my video cards running at their max MH/s. Full load on all
CPU cores hurts the video cards & gives no real improvement to the CPU scores either.
This is due to running in a VBox; I learned this back with Folding@Home, so nothing new there.


Thanks for posting your results. Running all cores does affect GPU performance on the same machine
and we don't want to do that. It's also cool you can do it in a VM.

Even on the one where I don't mine with a video card, you will never get a true 100% load on all CPU
cores while running in a VB; about 89% is the max load you can get through a VB.
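
A minimal sketch of the "leave a few cores free" rule of thumb described above. The helper name `miner_threads` and the idea of a fixed reserve are assumptions for illustration only; nothing like this is claimed to exist in cpuminer-opt, and the right reserve for a given rig still has to be found by testing.

```c
/* Hypothetical helper (not part of cpuminer-opt): pick a miner thread count
 * that leaves `reserve` logical cores free for the GPUs and the VM host. */
int miner_threads(int logical_cores, int reserve)
{
    int n = logical_cores - reserve;
    return n < 1 ? 1 : n;   /* always run at least one thread */
}
```

With the figures posted above, `miner_threads(24, 3)` gives 21 and `miner_threads(20, 3)` gives 17, which matches the thread counts used on the command line.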
legendary
Activity: 1470
Merit: 1114
Numbers are looking very good on the 3 I have set up with v3.0.3.
2xSR2 have 24 cores if you count HT cores; 1xSR2 has 20 cores counting HT. "E5620+E5645 CPU"

3.2GHz 1002 kH/s, 21 of 24 cores used in the command line, VB set to 24 cores. No mining video cards.

3.33GHz 924.80 kH/s, 21 of 24 cores used in the command line, VB set to 22 cores. 2x R270, which is why I left 2 unused by the VM. << should get slightly better numbers once I set this one to the full 24 cores.

3.6GHz 790 kH/s, 17 of 20 cores used in the command line, VB set to 20 cores. 2x R270x

I've found that odd numbers with some free cores seem to give just a little better kH/s.
The free cores help keep my video cards running at their max MH/s. Full load on all
CPU cores hurts the video cards & gives no real improvement to the CPU scores either.
This is due to running in a VBox; I learned this back with Folding@Home, so nothing new there.


Thanks for posting your results. Running all cores does affect GPU performance on the same machine
and we don't want to do that. It's also cool you can do it in a VM.
full member
Activity: 231
Merit: 150
Numbers are looking very good on the 3 I have set up with v3.0.3.
2xSR2 have 24 cores if you count HT cores; 1xSR2 has 20 cores counting HT. "E5620+E5645 CPU"

Stock Clock 3.2GHz 1002 kH/s, 21 of 24 cores used in the command line, VB set to 24 cores. No mining video cards.

OC 3.33GHz 924.80 kH/s, 21 of 24 cores used in the command line, VB set to 22 cores. 2x R270, which is why I left 2 unused by the VM. << should get slightly better numbers once I set this one to the full 24 cores.

OC 3.6GHz 790 kH/s, 17 of 20 cores used in the command line, VB set to 20 cores. 2x R270x

I've found that odd numbers with some free cores seem to give just a little better kH/s.
The free cores help keep my video cards running at their max MH/s. Full load on all
CPU cores hurts the video cards & gives no real improvement to the CPU scores either.
This is due to running in a VBox; I learned this back with Folding@Home, so nothing new there.
legendary
Activity: 1470
Merit: 1114

Edit: Packages I have to install in order for it to compile:

sudo apt-get install libssl-dev

sudo apt-get install libcurl4-openssl-dev

sudo apt-get install g++

Just a side note for anyone compiling this. I don't think the order in which each is
installed matters; this is just what I have had to do on the systems I have set up so far.

Thanks, I'll add that to the build instructions in the next release.
full member
Activity: 231
Merit: 150
As for Ubuntu 15.10, I think it really depends on which version of it you get: the GNOME "Wily Werewolf" version vs the Ubuntu Desktop
version you get from the main Ubuntu web page. I found the GNOME desktop version to work much better than the one I got from
their main site, which has the purple background vs the blue background of the Wily Werewolf version. The one with the purple background is
very laggy, especially once the miner is started, but not so with the other version, & each was set up the same with a VM.


Edit: Just to correct this...
Turns out it wasn't the software but the options picked at install that cause the lag!
Don't pick anything extra at the HDD format screen and all should be fine.
One of the options was changing how the HDD was formatted & its layout.

I'm now redoing the first 3 I set up with the Wily Werewolf version. I'm on number 2 at the moment & almost done with it; then it's
on to the other two, if all works the same as the first version did on this one rig.

Edit: Packages I have to install in order for it to compile:

sudo apt-get install libssl-dev

sudo apt-get install libcurl4-openssl-dev

sudo apt-get install g++

Just a side note for anyone compiling this. I don't think the order in which each is
installed matters; this is just what I have had to do on the systems I have set up so far.
legendary
Activity: 1470
Merit: 1114

Cool. I didn't mean Slack was obsolete, just that people who like it are old school.
I've played with it but never did any real work. At one time I had 8 different distros
multibooted on 2 20GB HDDs running on a Pentium 1. Now they're all in VMs.

A guy I used to work with was a Slackware fan. Give him the keyboard, toss the mouse,
and he could do magic. He was also pretty sharp in networking, knew his protocols
inside out.
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
legendary
Activity: 1470
Merit: 1114
I didn't trace it to understand why, but XMR rejects the shares.

Otherwise I also use Fedora 22 and 23, not yet CentOS... and I have an Ubuntu 15.10 but don't really like it, and some Slackware USB sticks :-)

I pulled support for cryptonight at the last minute because I broke it. Use Wolf0's for now.
Yeah, I know I promised. Oh well.

Slackware? Old school.
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
I didn't trace it to understand why, but XMR rejects the shares.

Otherwise I also use Fedora 22 and 23, not yet CentOS... and I have an Ubuntu 15.10 but don't really like it, and some Slackware USB sticks :-)
legendary
Activity: 1470
Merit: 1114
I got the same issue with the SSE grso-asm.c with GCC 4.8.4 (Ubuntu Server 14.04.3 LTS).

Looks like the asm() style is incorrect.

Thanks for popping in. I welcome and encourage your involvement.

Edit: I've never liked Ubuntu, even before they changed their desktop.
I like Mint better but I still prefer the Red Hat family. If only I didn't have to
add extra repositories all the time. I like the init system better, although
I'm still not comfortable with systemd. I flip-flop between Fedora and CentOS
depending on whether I need the latest features.
legendary
Activity: 1470
Merit: 1114
cpuminer v3.0.3 is released.

https://drive.google.com/file/d/0B0lVSGQYLJIZT1cyRFJEeTd0cVk/view?usp=sharing

It fixes support for SSE2 capable CPUs, core2 family, and adds support for 2 more algos.

The build instructions have changed, a new option is required for SSE capable CPUs.
See the README.md file for details. See also the OP for more details about this release.

I seem to have lost some hash in a few algos. I'll have to try to find it or maybe I
smoked it.

I will now take a little break and recharge before diving in to Windows again. This will
likely be the last release in the 3.0 stream unless there are urgent issues or Windows
continues to be delayed.

A special thanks goes out to bobben2 for his excellent testing which helped solve the
SSE2 issues.

Enjoy
legendary
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
I got the same issue with the SSE grso-asm.c with GCC 4.8.4 (Ubuntu Server 14.04.3 LTS).

Looks like the asm() style is incorrect.
legendary
Activity: 1470
Merit: 1114

I think I have something that works. I just have to bundle it up and build another debug load for you.
It should run straight out of the box on your CPU.

But things aren't perfect. There are two problems I had to work around.

The #define AES_NI in miner.h is not being seen in any files that reference AES_NI. I therefore had
to add #define AES_NI in every file with any reference to AES_NI code.

The other problem is that has_sse2() isn't working, so I hard coded it.

I tested it 4 ways:

1. AES_NI defined and march=native: my normal environment.

2. AES_NI defined and march=core2: this fails to compile, as expected.

3. AES_NI not defined and march=native: this compiles but performs at SSE2 rates, as expected.

4. AES_NI not defined and march=core2: this simulates your environment and works at SSE2 rates,
    as expected.

The only thing now is to confirm it works with the default build instructions on your machine and
performs at SSE2 levels.

Follow up items:

Investigate why the #define in miner.h is not seen.

Investigate the has_sse2() failure.

Investigate ways to define AES_NI from the configure command line.

It's currently set up to simulate a CPU without AES_NI and with SSE2.

Edit: PM sent


Hi joblo,
The package you sent almost built out of the box, but only after I modified build.sh:
I removed the -p switch and added -march=native.

With your original build.sh, the following two files threw compiler errors:
1. algo/sse2/groestl/grso-asm.c    <- will not accept the -p switch. Perhaps the function call is using too many registers when in profiling mode.
2. algo/aes_ni/echo512/hash.c    <- tons of errors when not building with -march=native.

After fixing build.sh, the produced executable ran fine.
I also built on one of my Haswell i5s, but mining X11 shares never got accepted...

EDIT: It does accept shares on my i5, but the hash rate is much lower than with the "standard" version (cpuminer-opt-3.0.2):
84 vs 128 kH/s per thread.


Thanks very much for your testing. I have never used the build script, so thanks for that too.

Your Haswell performance was lower because I rigged AES_NI to always be false and force it
to use the SSE kernels.

Test Passed!

I'll have a new release out later today thanks to your testing.
Cryptonight support coming, and maybe one or two more.
full member
Activity: 279
Merit: 104

The cpu check fails on this computer and I suspect cpuid has been disabled in the BIOS.  I will check when I get a chance.

I downloaded the code you sent.  I had to do the following mods to get it to compile.

1. Commented out #define AES_NI and added #ifdef AES_NI to effectively comment out the code in algo/cryptonight/cryptonight-aesni.c.
2. Added #undef AES_NI in algo/aesni/echo512/hash.c, since miner.h is not included.
3. Added #ifdef AES_NI_ON in a lot of places in algo/aesni/groestl/groestl-intr-aes.h

After these changes, I could leave both
#define AES_NI
#define AES_NI_ON 1
in miner.h and it compiled/runs fine.

Hashrates:
With cpu_sse2 = false I get
[2016-01-24 13:05:53] CPU #0: 25.50 kH/s
[2016-01-24 13:05:53] CPU #1: 25.50 kH/s
Set to true:
[2016-01-24 13:06:36] CPU #0: 43.38 kH/s
[2016-01-24 13:06:36] CPU #1: 43.38 kH/s

EDIT: You might want to move the calls to has_aes_ni() and has_sse2() to the top of main() and make the boolean flags
global, so that these functions are not called on every pass of the main loop. No need to call these more than once ;-)

I found there were already some cpuid checks in the code that will all have to be consolidated. It's not a priority
because the check is not expensive and is only really done at startup.
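
For reference, a minimal sketch of a run-once, cached feature check along the lines suggested above. This is not the project's has_sse2() or has_aes_ni(); the names `detect_cpu_features`, `cpu_has_sse2`, `cpu_has_aes_ni` and `cpu_checked` are hypothetical. It assumes GCC or Clang on x86, where `<cpuid.h>` provides `__get_cpuid`, and relies on the documented CPUID leaf 1 bits (EDX bit 26 for SSE2, ECX bit 25 for AES-NI). On other architectures both flags simply stay false.

```c
#include <stdbool.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* Hypothetical global flags, filled in once at startup. */
bool cpu_has_sse2   = false;
bool cpu_has_aes_ni = false;
bool cpu_checked    = false;

/* Query CPUID leaf 1 on the first call; later calls return immediately. */
void detect_cpu_features(void)
{
    if (cpu_checked)
        return;
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        cpu_has_sse2   = (edx >> 26) & 1;   /* CPUID.1:EDX bit 26 = SSE2   */
        cpu_has_aes_ni = (ecx >> 25) & 1;   /* CPUID.1:ECX bit 25 = AES-NI */
    }
#endif
    cpu_checked = true;
}
```

Calling `detect_cpu_features()` at the top of main() and reading the flags everywhere else keeps the check in one place, which is also where the existing cpuid checks could be consolidated.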

Your x86_64/SSE2 performance ratio is 25.50 / 43.38 = .59
My 4790K is 266 / 472 = .56

That's pretty close. In effect your CPU's SSE2 performance is just as good as mine. My only advantage
is the addition of AES_NI and possibly higher power efficiency. There's life in those old CPUs yet.
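
The comparison above is just the unoptimized rate divided by the SSE2 rate on each machine. A small sketch of that arithmetic, with a hypothetical helper name (`rate_ratio`) that is not taken from the miner:

```c
/* Hypothetical helper: ratio of the unoptimized hash rate to the SSE2 hash
 * rate, both in kH/s. Returns 0 if the SSE2 rate is not positive. */
double rate_ratio(double plain_khs, double sse2_khs)
{
    if (sse2_khs <= 0.0)
        return 0.0;
    return plain_khs / sse2_khs;
}
```

For example, `printf("%.2f\n", rate_ratio(266.0, 472.0));` prints 0.56 for the 4790K figures; the same call with the per-thread rates 25.50 and 43.38 gives the other machine's ratio.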

I'll take some time to digest the rest of your report, I just woke up.

I think I have something that works. I just have to bundle it up and build another debug load for you.
It should run straight out of the box on your CPU.

But things aren't perfect. There are two problems I had to work around.

The #define AES_NI in miner.h is not being seen in any files that reference AES_NI. I therefore had
to add #define AES_NI in every file with any reference to AES_NI code.

The other problem is that has_sse2() isn't working, so I hard coded it.

I tested it 4 ways:

1. AES_NI defined and march=native: my normal environment.

2. AES_NI defined and march=core2: this fails to compile, as expected.

3. AES_NI not defined and march=native: this compiles but performs at SSE2 rates, as expected.

4. AES_NI not defined and march=core2: this simulates your environment and works at SSE2 rates,
    as expected.

The only thing now is to confirm it works with the default build instructions on your machine and
performs at SSE2 levels.

Follow up items:

Investigate why the #define in miner.h is not seen.

Investigate the has_sse2() failure.

Investigate ways to define AES_NI from the configure command line.

It's currently set up to simulate a CPU without AES_NI and with SSE2.

Edit: PM sent


Hi joblo,
The package you sent almost built out of the box, but only after I modified build.sh:
I removed the -p switch and added -march=native.

With your original build.sh, the following two files threw compiler errors:
1. algo/sse2/groestl/grso-asm.c    <- will not accept the -p switch. Perhaps the function call is using too many registers when in profiling mode.
2. algo/aes_ni/echo512/hash.c    <- tons of errors when not building with -march=native.

After fixing build.sh, the produced executable ran fine.
I also built on one of my Haswell i5s, but mining X11 shares never got accepted...

EDIT: It does accept shares on my i5, but the hash rate is much lower than with the "standard" version (cpuminer-opt-3.0.2):
84 vs 128 kH/s per thread.
legendary
Activity: 1470
Merit: 1114

The cpu check fails on this computer and I suspect cpuid has been disabled in the BIOS.  I will check when I get a chance.

I downloaded the code you sent.  I had to do the following mods to get it to compile.

1. Commented out #define AES_NI and added #ifdef AES_NI to effectively comment out the code in algo/cryptonight/cryptonight-aesni.c.
2. Added #undef AES_NI in algo/aesni/echo512/hash.c, since miner.h is not included.
3. Added #ifdef AES_NI_ON in a lot of places in algo/aesni/groestl/groestl-intr-aes.h

After these changes, I could leave both
#define AES_NI
#define AES_NI_ON 1
in miner.h and it compiled/runs fine.

Hashrates:
With cpu_sse2 = false I get
[2016-01-24 13:05:53] CPU #0: 25.50 kH/s
[2016-01-24 13:05:53] CPU #1: 25.50 kH/s
Set to true:
[2016-01-24 13:06:36] CPU #0: 43.38 kH/s
[2016-01-24 13:06:36] CPU #1: 43.38 kH/s

EDIT: You might want to move the calls to has_aes_ni() and has_sse2() to the top of main() and make the boolean flags
global, so that these functions are not called on every pass of the main loop. No need to call these more than once ;-)

I found there were already some cpuid checks in the code that will all have to be consolidated. It's not a priority
because the check is not expensive and is only really done at startup.

Your x86_64/SSE2 performance ratio is 25.50 / 43.38 = .59
My 4790K is 266 / 472 = .56

That's pretty close. In effect your CPU's SSE2 performance is just as good as mine. My only advantage
is the addition of AES_NI and possibly higher power efficiency. There's life in those old CPUs yet.

I'll take some time to digest the rest of your report, I just woke up.

I think I have something that works. I just have to bundle it up and build another debug load for you.
It should run straight out of the box on your CPU.

But things aren't perfect. There are two problems I had to work around.

The #define AES_NI in miner.h is not being seen in any files that reference AES_NI. I therefore had
to add #define AES_NI in every file with any reference to AES_NI code.

The other problem is that has_sse2() isn't working, so I hard coded it.

I tested it 4 ways:

1. AES_NI defined and march=native: my normal environment.

2. AES_NI defined and march=core2: this fails to compile, as expected.

3. AES_NI not defined and march=native: this compiles but performs at SSE2 rates, as expected.

4. AES_NI not defined and march=core2: this simulates your environment and works at SSE2 rates,
    as expected.

The only thing now is to confirm it works with the default build instructions on your machine and
performs at SSE2 levels.

Follow up items:

Investigate why the #define in miner.h is not seen.

Investigate the has_sse2() failure.

Investigate ways to define AES_NI from the configure command line.

It's currently set up to simulate a CPU without AES_NI and with SSE2.

Edit: PM sent
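
On the follow-up item about defining AES_NI from the command line, one possible approach is sketched below. It is an assumption, not how cpuminer-opt is actually built: it accepts either a manual -DAES_NI passed in CFLAGS or the `__AES__` macro that GCC and Clang predefine when the target enables AES instructions (for example via -maes, or -march=native on a CPU that has them), and falls back to SSE2 via `__SSE2__`. The names `USE_AES_NI_KERNELS` and `compiled_kernel_set` are hypothetical.

```c
/* Hypothetical compile-time switch: AES_NI may come from -DAES_NI in CFLAGS,
 * __AES__ is predefined by GCC/Clang when AES instructions are enabled. */
#if defined(AES_NI) || defined(__AES__)
#define USE_AES_NI_KERNELS 1
#else
#define USE_AES_NI_KERNELS 0
#endif

/* Report which kernel set this translation unit was compiled for. */
const char *compiled_kernel_set(void)
{
#if USE_AES_NI_KERNELS
    return "AES_NI";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "generic";
#endif
}
```

Because the switch depends only on compiler-provided macros and the command line, it does not rely on every source file including miner.h, which sidesteps the "#define not seen" problem for files that never include that header.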
legendary
Activity: 1470
Merit: 1114
I can confirm that old Intel CPUs (>= Core 2) can have performance/power similar to much newer AMD processors.
So don't buy AMD CPUs for mining... well... do not buy any CPU for mining ;-)

Agreed, just use what you already have, which is why I'm trying to support older HW.