Topic: [ANN] cpuminer-opt v3.14.2, open source optimized multi-algo CPU miner - page 3. (Read 10289 times)

full member
Activity: 1424
Merit: 225
ARM YOURSELVES!

cpuminer-opt-23.5 is released with support for ARM CPUs.
Consult the Wiki for details, all links are in the first post of this thread.

As this is beta software, problems are expected with ARM CPUs. Please report them but be patient.
Priority will be given to x86_64 regressions, if any.

Crank those ARMs and start your engines.
full member
Activity: 1424
Merit: 225
RE: ARM UPDATE --

I tried to compile the latest source on my 8GB Rpi 4, utilizing the ARM build script.  It failed with an error pointing to line 1247.  The line instruction "-flax-vector-conversions" was in the ARM build script CFLAGS portion, so I did not edit that script, I just ran it.

I can run the build and log the entire thing in a text file, or just cut-n-paste the final error details, then pass the log to you in an email, zipped.  Let me know what you might want.       --scryptr

You jumped the gun, I haven't submitted the working code yet. It will be a full release but without ARM binaries. Sorry for the confusion.
Watch for the release, I'm building it as I write.
legendary
Activity: 1797
Merit: 1028
RE: ARM UPDATE --

I tried to compile the latest source on my 8GB Rpi 4, utilizing the ARM build script.  It failed with an error pointing to line 1247.  The line instruction "-flax-vector-conversions" was in the ARM build script CFLAGS portion, so I did not edit that script, I just ran it.

I can run the build and log the entire thing in a text file, or just cut-n-paste the final error details, then pass the log to you in an email, zipped.  Let me know what you might want.       --scryptr
full member
Activity: 1424
Merit: 225
ARM update.

Development is progressing well in spite of a few problems. The next release will be soon and most algos will be supported with
only a couple of exceptions. Many SSE2 optimizations were easily translated directly to NEON and some NEON development is also
applicable to SSE2 which helps older x86_64 CPUs that don't have AVX2. It's a win-win for both architectures.

Go to the github wiki for the technical details and stay tuned.
full member
Activity: 1424
Merit: 225
First accepted share on ARM. Unfortunately it's reference code, so no optimization, but still a big step forward.

Code:
./cpuminer -a sha256dt -o stratum+tcp://mine.zergpool.com:3341 -u 1FXaRoufZC6LyPzjNrs7wS47tpgzEpBSiw -p c29,c=btc,sd=0.01  --stratum-keepalive --max-temp 80

         **********  cpuminer-opt 23.5  ***********
     A CPU miner with multi algo support and optimized for CPUs
     with AVX512, SHA and VAES extensions by JayDDee.
     BTC donation address: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT

CPU: ARM
SW built on Oct 16 2023 with GCC-11.4.0 Linux
CPU features:  AArch64 NEON
SW features:   AArch64 armv8 NEON
Algo features:        AVX512

Starting miner with AArch64 NEON...

[2023-10-16 02:50:18] CPU affinity [!!!!]
[2023-10-16 02:50:18] Stratum connect stratum+tcp://mine.zergpool.com:3341
[2023-10-16 02:50:18] 4 of 4 miner threads started using 'sha256dt' algorithm
[2023-10-16 02:50:18] Stratum extranonce1 0x81013e9d, extranonce2 size 4
[2023-10-16 02:50:18] Stratum connection established
[2023-10-16 02:50:18] CPU temp: curr 41 C max 0, Freq: 0.600/0.600 GHz
[2023-10-16 02:50:18] New Stratum Diff 0.2, Block 413277, Tx 3, Job 2513
                      Diff: Net 1.7532e+05, Stratum 0.2, Target 0.2
[2023-10-16 02:50:38] 1 Submitted Diff 3.6022, Block 413277, Job 2513
[2023-10-16 02:50:38] 1 Accepted 1 S0 R0 B0, 20.327 sec (141ms)
^C[2023-10-16 02:50:46] SIGINT received, exiting
full member
Activity: 1424
Merit: 225
It's alive! Only rejects so far, so I have a lot of work to do.

Code:

./cpuminer -a allium -o stratum+tcp://mine.zergpool.com:6433  --stratum-keepalive --max-temp 80

         **********  cpuminer-opt 3.23.5  ***********
     A CPU miner with multi algo support and optimized for CPUs
     with AVX512, SHA and VAES extensions by JayDDee.
     BTC donation address: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT

CPU: ARM
SW built on Oct 14 2023 with GCC-11.4.0 Linux
CPU features:  AArch64 NEON
SW features:   AArch64 armv8 NEON
Algo features:        AVX512 VAES

Starting miner with AArch64 NEON...

[2023-10-14 23:47:26] CPU affinity [!!!!]
[2023-10-14 23:47:26] Stratum connect stratum+tcp://mine.zergpool.com:6433
[2023-10-14 23:47:26] 4 of 4 miner threads started using 'allium' algorithm
[2023-10-14 23:47:26] Stratum extranonce1 0x8100e7fc, extranonce2 size 4
[2023-10-14 23:47:26] Stratum connection established
[2023-10-14 23:47:26] CPU temp: curr 47 C max 0, Freq: 0.600/0.600 GHz
[2023-10-14 23:47:26] New Stratum Diff 0.08, Block 4397517, Tx 0, Job 4f99
                      Diff: Net 14.632, Stratum 0.08, Target 0.0003125
[2023-10-14 23:47:36] New Work: Block 4397517, Tx 1, Netdiff 14.632, Job 4f9b
                      Diff: Net 14.632, Stratum 0.08, Target 0.0003125
                      TTF @ 85.21 kh/s: Block 8d12h, Share 0m15s
[2023-10-14 23:47:36] New Block 4397518, Tx 0, Netdiff 13.453, Job 4f9c
                      Diff: Net 13.453, Stratum 0.08, Target 0.0003125
                      TTF @ 118.20 kh/s: Block 5d15h, Share 0m11s
                      Net hash rate (est) 5777.88 Mh/s
[2023-10-14 23:47:37] 1 Submitted Diff 0.00034945, Block 4397518, Job 4f9c
[2023-10-14 23:47:37] 1 A0 S0 Rejected 1 B0, 11.634 sec (143ms)
                      Reject reason: Invalid share
                      Share diff: 0.00034945, Target: 0.0003125
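
The TTF figures in the log above can be reproduced with simple arithmetic, assuming the usual convention that a difficulty 1 share takes about 2^32 hashes on average. This is only a sketch with made-up function names, not the miner's own code, and the output format is matched only to the few samples visible in the log.

```python
# Assumption: finding a difficulty-1 share takes 2**32 hashes on average.
HASHES_PER_DIFF1 = 2 ** 32


def expected_seconds(difficulty, hashrate):
    """Average time in seconds to find a hash at the given difficulty.

    hashrate is in hashes per second (85.21 kh/s -> 85_210).
    """
    return difficulty * HASHES_PER_DIFF1 / hashrate


def format_ttf(seconds):
    """Format a duration like the log samples: '8d12h' or '0m15s'.

    The hours/minutes form is a guess, the log shows no example of it.
    """
    total = int(seconds)
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    if days:
        return f"{days}d{hours}h"
    if hours:
        return f"{hours}h{minutes:02d}m"
    return f"{minutes}m{secs:02d}s"


if __name__ == "__main__":
    # Values taken from the first TTF line of the log above.
    print("Block", format_ttf(expected_seconds(14.632, 85_210)))
    print("Share", format_ttf(expected_seconds(0.0003125, 85_210)))
```

Note that the share time is computed from the Target value (0.0003125), not the Stratum diff (0.08). In this log the two differ by a factor of 256.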

full member
Activity: 1424
Merit: 225
R-PI MAY BE CRIPPLED--

Indeed, the 4B doesn't have either AES or SHA2, hoping the 5 has both, and maybe sha3 too.
I just got my Pi 4B yesterday and started compiling today. A lot of work to do to remove all the SSE2 hooks in the code.
Pi can do (almost) anything SSE2 can and some things SSE2 can't. I just have to figure out what those things are.

I've found 2 very interesting things so far:
ARM has a REV instruction that can reverse bits in a vector or word, or words in a vector. X86_64 can't do that except maybe with
AVX512VBMI2.

ARM has no shuffle instruction, a RISC issue because random shuffles need code, either SW or microcode. ARM shuffles have to
be coded in SW using integers while x86_64 is a CISC architecture with microcode.
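
To make the two operations concrete: a shuffle picks each output byte from an input position given by a control table, and a fixed reversal is just one particular table. The sketch below is a plain-Python model of those semantics only, not NEON or SSE code, and all the function names are made up for illustration.

```python
def shuffle_bytes(vec, control):
    """Arbitrary byte shuffle: output byte i is vec[control[i]]."""
    return bytes(vec[i] for i in control)


def bswap32_lanes(vec):
    """Reverse the byte order inside each 32-bit lane of a 16-byte vector.

    This is the kind of fixed reversal used for endian conversion. Here it
    is expressed as a shuffle with a constant control table.
    """
    control = [lane * 4 + (3 - b) for lane in range(4) for b in range(4)]
    return shuffle_bytes(vec, control)


def reverse_lanes32(vec):
    """Reverse the order of the four 32-bit lanes, bytes in a lane unchanged."""
    control = [(3 - lane) * 4 + b for lane in range(4) for b in range(4)]
    return shuffle_bytes(vec, control)


def reverse_bits32(x):
    """Reverse the 32 bits of a word, bit 0 becomes bit 31 and so on."""
    return int(f"{x & 0xFFFFFFFF:032b}"[::-1], 2)


if __name__ == "__main__":
    v = bytes(range(16))
    print(bswap32_lanes(v).hex())
    print(reverse_lanes32(v).hex())
    print(hex(reverse_bits32(1)))
```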

As far as AES and SHA are concerned, the code is already well segregated so Pi can live without it just like Core2 can.
I'll ignore that code and focus on other parts until I get a CPU with the features, or someone in the community with
one volunteers.

It's not really about Pi though, the Pi is just a test vehicle. ARM is moving up, they're in laptops and Macs, soon we may see
ARM desktops. I want to see how they might perform with AES and SHA2 against Intel and AMD core vs core.
legendary
Activity: 1797
Merit: 1028
R-PI MAY BE CRIPPLED--

This is an old post from "oink70", who optimized Monkins' Verushash CCminer:

        "Oink70 (2 yr. ago on Reddit)

Mining Verus is impossible on any Raspberry device. The Raspberry company did not license all CPU instructions, so the required functions are not available on Raspberry branded devices.  As far as I know Raspberry is the only brand that crippled the CPU, so any other ARM device should work."

I picked up your recent interest in coding for ARM processors after reading the CryptoMining-Blog post on VerusCoin (VRSC) mining.  I have been able to compile XMrig and tpruvot's CPUminer-Multi on my R-pi.  The above CCminer packages (Monkins & Oink70) will not compile.

I am currently mining VerusCoin on a new Orange Pi 5.  Verushash and MinotaurX look like good targets for coding, if you are able to find the right functions.

Raspberry Pi 5 is scheduled to hit the market this month, I believe.  The CPU runs at 2.4 GHZ.

Hope to see some fresh code to match!       --scryptr

Hold off on testing, it won't compile.

I now have a Raspberry Pi so I have to work through the build issues. More news soon.

Looking for miners with ARM CPUs.

cpuminer-opt-3.23.4, source code only, has initial support for 64 bit AArch64 CPUs with NEON, SHA2 & AES.
It's experimental and untested. It needs testers. See Wiki for details...

https://github.com/JayDDee/cpuminer-opt/wiki/Support-for-AARCH64
full member
Activity: 1424
Merit: 225
Hold off on testing, it won't compile.

I now have a Raspberry Pi so I have to work through the build issues. More news soon.

Looking for miners with ARM CPUs.

cpuminer-opt-3.23.4, source code only, has initial support for 64 bit AArch64 CPUs with NEON, SHA2 & AES.
It's experimental and untested. It needs testers. See Wiki for details...

https://github.com/JayDDee/cpuminer-opt/wiki/Support-for-AARCH64
full member
Activity: 1424
Merit: 225
Some notes about upcoming CPU extensions and what they mean for cpuminer-opt

TLDR: nothing for at least a year.

SHA512: this is an enhancement to SHA that adds support for SHA-512. SHA512 is expected on Intel Lunar Lake & Arrow Lake CPUs in 2024.
AMD availability is unknown, Zen5 would be optimistic. GCC-13 is required to build with SHA512. SHA-512 is used by m7m, minotaurx as well
as the x16 and x17 series including hmq1725. Performance is expected to exceed AVX512 8 way parallel SHA512. SHA512 needs new code.

SM3 & SM4: SM3 was only used by a couple of short lived algos, SM4 is not used at all in crypto AFAIK. They are of no interest.

AVX10: is a complete redesign of the AVX architecture to replace AVX2 & AVX512. For CPUs with 512 bit vector support it means very little.
Its most significant impact is on CPUs limited to 256 bit vectors. AVX10 adds support for AVX512 features & instructions limited to 256 bits (E-cores),
essentially it's AVX512VL without AVX512F. It will be supported by GCC no earlier than GCC-14. AVX10, specifically AVX10-512, will first be available on
Intel Granite Rapids which will support 512 bit vector length so it's just a rebranding of AVX512. The first real benefit of AVX10 is AVX10-256 which
adds bit rotation, opmasks, and more vector registers. Availability of AVX10-256 has not been announced. AVX10 is not an issue for AMD as all
new CPUs support AVX512 already. Minor coding changes are required to support AVX10-256. AVX10-512 requires no code changes or rebuilding,
an AVX512 build will work as is.
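
Bit rotation is a good example of what 256 bit CPUs stand to gain. Without a vector rotate instruction each rotation has to be built from two shifts and an OR, with one it's a single operation per vector. Below is a plain-Python model of the emulated form, for illustration only. The names are hypothetical and this is not code from the miner.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits using two shifts and an OR."""
    x &= MASK32
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def rotr32(x, n):
    """Rotate right, expressed as a left rotate by the complement."""
    return rotl32(x, (32 - n) % 32)


def rotl32_lanes(lanes, n):
    """Apply the same rotation to every 32-bit lane of a 'vector' (a list)."""
    return [rotl32(lane, n) for lane in lanes]


if __name__ == "__main__":
    print(hex(rotl32(0x80000001, 1)))
    print([hex(v) for v in rotl32_lanes([1, 0x80000000], 4)])
```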

APX: this will help all non-vector code by doubling the number of general purpose registers and supporting 3 operand instruction encoding that will
eliminate the need for many register moves. Nothing is known about availability but it was announced by Intel at the same time as AVX10. APX will
also improve AMD CPU performance and AMD is expected to adopt it. No coding changes are required, just rebuilding with a supported compiler.

The current system used to build the Windows binaries package will not support any of these new features because it's based on GCC-9. Providing Windows
binaries with the new features will require a new build environment with a supported compiler and OS.
newbie
Activity: 2
Merit: 0
Yes, we just want another optimizing hoe, just can't stand Rplant's hoe.
I'm just a joy miner
full member
Activity: 1424
Merit: 225

Doesn't matter if there are one or two pools, it's still a shitcoin. Founders keep 5% of every mined block, they should provide a miner or pay
to develop one.
full member
Activity: 1424
Merit: 225
The rplant pool tool has abnormal block ratio in other mining pools.  hope  can add this algorithm

Thanks for the translation.

Rplant's cpuminer works in other pools for other algos so it might not be a problem with the miner.
I couldn't find any other pools with Applecoin/Lyra2a40.
newbie
Activity: 2
Merit: 0
The rplant pool tool has abnormal block ratio in other mining pools.  hope  can add this algorithm
full member
Activity: 1424
Merit: 225
Hello! Can you develop the Lyra2a40 algorithm for CPUminer opt?
github:https://github.com/blade-coder/apple/blob/main/src/hash.h

There was no need to quote the entire first post, or any of it.

I don't see any potential for this algo. It isn't GPU or ASIC resistant as claimed. There's nothing new in the algo so all the pieces
for a GPU or ASIC miner already exist. I'm also not impressed with Apple coin based on their ANN thread here, seems like just another shitcoin.

I think I'll pass, rplant's CPU miner has it covered anyway.
newbie
Activity: 79
Merit: 0
This is the home of cpuminer-opt, the optimized x86_64 CPU miner.

Supporting over 90 algorithms with many optimized for x86_64 CPUs with the latest technologies:

Intel Haswell: AVX2
AMD Zen1: AVX2 and SHA
AMD Zen3, Intel Alderlake*: AVX2, SHA and VAES
Intel Skylake X: AVX512
AMD Zen4, Intel Rocketlake: AVX512, SHA, and VAES

Older 64 bit CPUs with SSE2 are also supported, see below for requirements.
*Alderlake and subsequent Intel desktop architectures are hybrid architectures with AVX512 disabled.
Intel has no current consumer desktop CPUs supporting AVX512.
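
As a rough illustration of the table above, the sketch below picks the best matching optimization level from a set of CPU feature flags. It is not part of cpuminer-opt. The flag names are the ones Linux reports in /proc/cpuinfo, the function names are hypothetical, and the set of AVX512 subsets required is an assumption because the post only says "AVX512".

```python
# Assumption: "AVX512" here means these four subsets are all present.
AVX512_FLAGS = {"avx512f", "avx512vl", "avx512dq", "avx512bw"}


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the feature flags of the first CPU listed (Linux, x86_64)."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


def best_build(flags):
    """Map a set of CPU flags to the highest level in the table, or None."""
    flags = set(flags)
    sha_vaes = {"sha_ni", "vaes"} <= flags
    if AVX512_FLAGS <= flags:
        return "AVX512, SHA and VAES" if sha_vaes else "AVX512"
    if "avx2" in flags:
        if sha_vaes:
            return "AVX2, SHA and VAES"
        if "sha_ni" in flags:
            return "AVX2 and SHA"
        return "AVX2"
    if "sse2" in flags:
        return "SSE2"
    return None  # not a supported x86_64 CPU


if __name__ == "__main__":
    # Example flag set for a Zen1-like CPU, typed in by hand.
    print(best_build({"sse2", "avx2", "sha_ni"}))
```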

Downloads

Source code and Windows binaries are available for download. Windows binaries support CPUs up to 64 threads,
higher requires compiling from source with CPU group support, see Wiki for details.
Download only from the official JayDDee git repository.
There may be malware masquerading as cpuminer-opt, stay alert.
I no longer post links in new messages and I don't post direct links to files. Any such links should be treated
with suspicion. The only valid download link is below and directs to a landing page that can be examined
for legitimacy before any files are downloaded.

Latest release including Windows binaries
https://github.com/JayDDee/cpuminer-opt/releases

Documentation
https://github.com/JayDDee/cpuminer-opt/wiki

List of supported algorithms
https://github.com/JayDDee/cpuminer-opt/wiki/Supported--Algorithms

Source code
https://github.com/JayDDee/cpuminer-opt

New in v3.22.3
Data interleaving and byte swap optimizations with AVX2, AVX512 & AVX512VBMI.
Faster Luffa with AVX2 & AVX512.
Other small optimizations.
Some code cleanup.

New in v3.22.2

Added sha512256d & sha256dt algos.
Fixed intermittent invalid shares lyra2v2 AVX512.
Removed application limits on the number of CPUs and threads, HW and OS limits still apply.
Added a log warning if more threads are defined than active CPUs in affinity mask.
Improved merkle tree memory management for stratum.
Added transaction count to New Work log.
Other small improvements.

New in v3.22.1

#393 fixed segfault in GBT, regression from v3.22.0.
More efficient 32 bit data interleaving.

New in v3.22.0

Stratum: faster netdiff calculation.
Merged a few updates from Pooler/cpuminer:
   Use CURLOPT_POSTFIELDS in json_rpc_call,
   Use CURLINFO_ACTIVESOCKET when supported,
   JSONRPC speedup,
   Speed up hex2bin function.
Small log improvements, notably more frequent hash rate reports.
Removed decred algo.

Full change log: https://github.com/JayDDee/cpuminer-opt/blob/master/RELEASE_NOTES

Requirements:

1. An x86_64 architecture CPU with a minimum of SSE2 support. This includes Intel Core2 and newer and AMD equivalents.
AES optimizations require a CPU with AES_NI including Intel Westmere and newer and AMD equivalents.
Further optimizations are available on some algorithms for CPUs with AVX (Sandybridge), AVX2 (Haswell, Zen1),
AVX512 (Rocketlake, Skylake-X, Zen4), SHA (Zen1, Rocketlake), and VAES (Zen3, Rocketlake).

32 bit Intel and AMD CPUs are not supported. Other architectures such as ARM, Raspberry Pi, RISC-V, etc, are not supported.
Mobile devices like laptop computers are not recommended because they aren't designed for continuous full load.

2. 64 bit Linux OS. Debian and Fedora based distributions including Ubuntu, Mint, RHEL and clones are known to work and have all dependencies in their
repositories. Others may work but may require more effort.

Windows 7 or newer 64 bits is supported using the pre-compiled binaries package or may be compiled from source using MinGW.

FreeBSD is not actively tested but should work, YMMV.
Apple and Android operating systems are not supported.

Older CPUs, other architectures and operating systems may be supported by TPruvot's cpuminer-multi.

Security warning

Cryptocurrency miners are often flagged as malware by antivirus programs. This is usually a false positive, they are flagged simply
because of what they are. However, some malware has been spread using the cover that miners are known to be subject to
false positives. Always be on alert. The source code of cpuminer-opt is open for anyone to inspect. If you don't trust the software
don't download it.

Some cryptographic code has been taken from trusted sources but has been modified for speed at the expense of accepted
security practices. This code should not be imported into applications where secure cryptography is required.

Errata:

Old algorithms that are rarely used or are too difficult to mine with a CPU will not get the latest optimizations.
Cryptonight and variants are no longer supported, use other miners.
Hodl requires a CPU with AES.

Donations

cpuminer-opt has no fees of any kind but donations are accepted.

BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT


Hello! Can you develop the Lyra2a40 algorithm for CPUminer opt?
github:https://github.com/blade-coder/apple/blob/main/src/hash.h
full member
Activity: 1424
Merit: 225
Intel has introduced AVX10, a converged vector ISA to replace AVX512.

TLDR: No, 512 bit vectors will not be supported on future Intel hybrid CPUs.

AVX10 essentially brings AVX512VL features to 256 & 128 bit vectors on E-cores. Features include: double the number of SIMD registers, mask registers to support
masking operations, bit rotation & ternary logic instructions will be implemented on E-cores but 512 bit wide vectors won't.
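
For anyone unfamiliar with ternary logic instructions: they combine three inputs bit by bit using an 8 bit truth table, so any 3 input boolean function becomes one operation. The following is a plain-Python model of that idea, not the instruction itself, with a hypothetical function name. The three immediates shown were worked out from the truth table by hand.

```python
MASK32 = 0xFFFFFFFF


def ternary_logic(a, b, c, imm8, width=32):
    """Bitwise 3-input function defined by an 8-bit truth table.

    For each bit position the three input bits form an index (a is the
    high bit, c the low bit) and the result bit is that bit of imm8.
    """
    result = 0
    for bit in range(width):
        index = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        result |= ((imm8 >> index) & 1) << bit
    return result


XOR3 = 0x96    # a ^ b ^ c
CHOOSE = 0xCA  # (a & b) | (~a & c), the "Ch" function in SHA-2
MAJ = 0xE8     # majority of a, b, c, the "Maj" function in SHA-2

if __name__ == "__main__":
    a, b, c = 0x12345678, 0x9ABCDEF0, 0x0F1E2D3C
    print(hex(ternary_logic(a, b, c, XOR3)), hex(a ^ b ^ c))
```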

512 bit vectors (AVX10-512) will only exist on P-cores and only if no E-cores are present. Hybrid CPUs will use AVX10-256 on both E-cores and P-cores.

Version 1: AVX10.1-512, a rebranding of AVX512 with support for 512, 256 & 128 bit vectors, is scheduled to be included on Granite Rapids Xeon CPUs, P-core only, in 2024.
Version 2: AVX10.2-256, supporting 256 & 128 bit vectors, will come later and is expected on Clearwater Forest Xeon E-cores in 2025.

AVX10-512 is binary compatible with AVX512 and software built for AVX512 should run fine on AVX10-512 CPUs.
AVX10-256 should be able to run software built for AVX2 but will not have the AVX512VL features. Recompiling for AVX10 will enable most features
but some tweaks to targeted source code may be needed to get the best performance.

AVX10 is not backward compatible with existing CPUs, they must continue to use AVX512 or AVX2 software builds as appropriate.

AVX10 could be a non issue for AMD because Zen4 & Zen4c already support AVX512. A Zen based hybrid CPU, should AMD decide to build one, would have a
unified ISA so the P-cores wouldn't need to be gimped because of the E-cores. Zen4 is missing a couple of the latest AVX512 extensions that are to be included
in AVX10.1. Assuming those extensions are forthcoming future Zen CPU architectures will implicitly have AVX10.1-512 support using AVX512 flags without formally
supporting AVX10. There is no official word from AMD yet.

It looks like Intel is giving up on AVX512, future Xeon server CPUs appear to be E-core based to increase thread count.

Intel also announced the APX extension, a significant improvement to the base X86_64 ISA by doubling the number of GPRs and supporting 3 operand instruction
encoding. Existing SSE* and AVX* instructions will also gain 3 op encoding. Intel didn't mention when APX would become available but coincident with AVX10
would be a reasonable assumption. No source code changes are required to take advantage of this feature, recompiling with APX on an APX capable CPU is all
that is needed.

GCC support for AVX10 version 1 is expected in gcc-14 with support for the following:
  -march=graniterapids    includes "-mavx10.1-512"
  -mavx10.1               default, same as "-mavx10.1-256"
  -mavx10.1-256           AVX512 without 512 bit vectors, no supported CPUs until AVX10.2
  -mavx10.1-512           equivalent of existing AVX512 superset

full member
Activity: 1424
Merit: 225
Does it support all Intel CPUs, old and new?


Mostly yes, see the first post for details.
newbie
Activity: 21
Merit: 0
Does it support all Intel CPUs, old and new?