[ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 64.

joblo

legendary

Activity: 1470

Merit: 1114

Quote from: speedyb on September 03, 2017, 11:22:16 PM

What's most profitable to CPU mine right now, or what are the best long-term coins to hold onto that this program can mine?
Also, how about XMR (Monero), music coin and DMD Diamond?
Are any of those profitable to CPU mine on a Core i5 7600K?

I don't recommend coins. Do your own research, there's lots of info available.

joblo

legendary

Activity: 1470

Merit: 1114

Typical whiny little child. Accuse me of being a bully then launch a personal attack with false accusations
of deleting your post. Just go away.

speedyb

full member

Activity: 203

Merit: 100

What's most profitable to CPU mine right now, or what are the best long-term coins to hold onto that this program can mine?
Also, how about XMR (Monero), music coin and DMD Diamond?
Are any of those profitable to CPU mine on a Core i5 7600K?

joblo

legendary

Activity: 1470

Merit: 1114

Quote from: semihsert on September 03, 2017, 09:13:17 AM

Is this miner compatible with Mac OS X? I can't build it. Can anyone send a built version with xevan algo support?

Thanks.

Sorry, no Mac support. It should be possible (it compiles on FreeBSD) but I know nothing about Mac development
so I can't help.

semihsert

newbie

Activity: 13

Merit: 0

Is this miner compatible with Mac OS X? I can't build it. Can anyone send a built version with xevan algo support?

Thanks.

joblo

legendary

Activity: 1470

Merit: 1114

Quote from: ?? on ??

None of them do it either. It crashes unless I set fewer cores. I can get them all to run on 3 cores, but it always defaults to SSE2. It even does it on a Xeon E5; all default to SSE2. Could the pool cause that to happen too? Even if I run the aes-avx build it does SSE2, and I only get about 20 H/s max even with all cores, or slower with 4.

You complained about AES_NI when it's the same as AES.
You complained about a crash while using the AVX2 build on a CPU without AVX2.
You failed to provide the key error message (it's not a crash).
You failed to mention it worked with fewer threads (this is a FAQ with verium).
And now you're complaining that it's only using SSE2 on an algo that only supports SSE2.

Stop complaining and learn something.

joblo

legendary

Activity: 1470

Merit: 1114

Quote from: ?? on ??

Hi, I have an AMD FX 6300 and understand it supports AES-NI, and am trying to use your miner, but every time I run it I get this readout:

   ********** cpuminer-opt 3.6.8 ***********
   A CPU miner with multi algo support and optimized for CPUs
   with AES_NI and AVX extensions.
   BTC donation address: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT
   Forked from TPruvot's cpuminer-multi with credits
   to Lucas Jones, elmad, palmd, djm34, pooler, ig0tik3d,
   Wolf0, Jeff Garzik and Optiminer.

CPU: AMD FX(tm)-6300 Six-Core Processor
CPU features: SSE2 AES AVX
SW built on Jul 31 2017 with GCC 4.8.3
SW features: SSE2 AES AVX AVX2
Algo features: SSE2
Start mining with SSE2

[2017-08-28 21:00:48] Starting Stratum on stratum+tcp://vrm.poolium.win:3333
[2017-08-28 21:00:48] 6 miner threads started, using 'scrypt' algorithm.

Now you see, when it detects my CPU it states CPU features: SSE2 AES AVX, and I know that it also does AES-NI. But you also see it says SW features: SSE2 AES AVX AVX2 but not AES-NI. Why? I downloaded it directly from here, or I'm probably using the wrong exe. Which one should I use for mining Verium the fastest? Thanks.

You're using the wrong exe, your CPU doesn't have AVX2. Use the exe that matches your CPU's features.
AES is the same as AES-NI.
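
The "match the exe to your CPU" rule can be sketched as follows. This is only an illustration: the build names and their feature sets below are hypothetical placeholders, not the actual list of released binaries, and the CPU feature set is copied from the log above.

```python
def pick_build(cpu_features, builds):
    """Return the name of the most feature-rich build the CPU can run.

    A build is usable only if every feature it was compiled with is
    present on the CPU. Returns None if no build is usable.
    """
    usable = {name: feats for name, feats in builds.items()
              if feats <= cpu_features}
    if not usable:
        return None
    return max(usable, key=lambda name: len(usable[name]))


# CPU features as reported in the log above (FX-6300).
cpu = {"SSE2", "AES", "AVX"}

# Hypothetical build names and feature sets, for illustration only.
builds = {
    "sse2": {"SSE2"},
    "aes-avx": {"SSE2", "AES", "AVX"},
    "avx2": {"SSE2", "AES", "AVX", "AVX2"},
}

print(pick_build(cpu, builds))
```

Note that the log also shows "Algo features: SSE2", so for this algo the miner will still start with SSE2 whichever usable build is picked.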

NameTaken

hero member

Activity: 630

Merit: 502

Quote from: joblo on August 26, 2017, 04:52:57 PM

Interesting question, did this come from the Xeon Phi thread? The short answer is no.

AVX512, as its name suggests, introduces 512 bit vector processing. This means it can only be used on arrays
of multiples of 512 bits (8*64, 16*32 etc) and only if the same operation is performed on all array elements.
This limits the applicability of AVX512 depending on what kind of processing is performed to hash the algorithm.

Vectorizing improves compute performance by reducing the number of instructions but it doesn't help memory performance.
If you need to process 512 bits you still need to load the same amount of data from memory. If an algo is memory
hard, vectorizing it just means the CPU spends more time idle while it waits for data.

Some algos have segments of code that can be vectorized, some to 128 bits, fewer to 256 and even fewer to 512.
However converting these sections adds overhead because the vector instructions use a different register set than
scalar instructions. Switching back and forth from scalar to vector instructions on the same data means extra instructions
are required to move the data from one register set to the other and back.

I haven't looked deeply into AVX512 but I don't see much opportunity to optimize any algos with it. Anything that can be
vectorized further is likely more efficient on a GPU which is essentially a vector processor. At best, vectorizing a CPU miner
simply reduces the performance deficit slightly compared to GPUs. Many "CPU" algos are designed to be difficult to vectorize.

An interesting article about AVX512:

https://www.hpcwire.com/2017/06/29/reinders-avx-512-may-hidden-gem-intel-xeon-scalable-processors/

Thanks for the response albeit a bit over my head. I saw that X299 and future Ice Lake CPUs will support AVX-512.

joblo

legendary

Activity: 1470

Merit: 1114

Interesting question, did this come from the Xeon Phi thread? The short answer is no.

AVX512, as its name suggests, introduces 512 bit vector processing. This means it can only be used on arrays
of multiples of 512 bits (8*64, 16*32 etc) and only if the same operation is performed on all array elements.
This limits the applicability of AVX512 depending on what kind of processing is performed to hash the algorithm.

Vectorizing improves compute performance by reducing the number of instructions but it doesn't help memory performance.
If you need to process 512 bits you still need to load the same amount of data from memory. If an algo is memory
hard, vectorizing it just means the CPU spends more time idle while it waits for data.

Some algos have segments of code that can be vectorized, some to 128 bits, fewer to 256 and even fewer to 512.
However converting these sections adds overhead because the vector instructions use a different register set than
scalar instructions. Switching back and forth from scalar to vector instructions on the same data means extra instructions
are required to move the data from one register set to the other and back.

I haven't looked deeply into AVX512 but I don't see much opportunity to optimize any algos with it. Anything that can be
vectorized further is likely more efficient on a GPU which is essentially a vector processor. At best, vectorizing a CPU miner
simply reduces the performance deficit slightly compared to GPUs. Many "CPU" algos are designed to be difficult to vectorize.

An interesting article about AVX512:

https://www.hpcwire.com/2017/06/29/reinders-avx-512-may-hidden-gem-intel-xeon-scalable-processors/
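
A small sketch of the point about instruction count versus memory traffic. It is a simplified model, not a benchmark: it assumes one instruction per vector of elements and counts only the bytes that must be loaded, ignoring caches, latency and the register-transfer overhead mentioned above.

```python
def vector_cost(n_elems, elem_bits, vector_bits):
    """Return (instructions, bytes_loaded) for one pass over an array.

    Simplified model: one instruction handles vector_bits / elem_bits
    elements; every element must be loaded from memory exactly once.
    """
    lanes = vector_bits // elem_bits
    instructions = -(-n_elems // lanes)   # ceiling division
    bytes_loaded = n_elems * elem_bits // 8
    return instructions, bytes_loaded


# 1024 32-bit words processed as scalar (32), SSE2 (128), AVX2 (256), AVX512 (512).
for width in (32, 128, 256, 512):
    instr, loaded = vector_cost(1024, 32, width)
    print(f"{width:3d}-bit: {instr:4d} instructions, {loaded} bytes loaded")
```

The instruction count drops as the vector gets wider, but the bytes loaded stay the same, which is why a memory hard algo gains little.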

NameTaken

hero member

Activity: 630

Merit: 502

Does any algorithm use AVX-512?

oldDIN

member

Activity: 85

Merit: 10

Quote from: nizzuu on August 14, 2017, 01:40:40 AM

.....
[2017-08-14 09:29:32] No payout address provided, switching to getwork
......

Specify the address in the bat file: --coinbase-addr=.....

guytechie

hero member

Activity: 677

Merit: 500

Quote from: NameTaken on August 13, 2017, 11:18:01 AM

Anyone planning on testing Threadripper?

Soon. Soon.

I have a Ryzen 7 that's OC'd to 3.7 GHz (very mild OC) on stock cooler. I only mine m7m and xevan (autoswitching). I get around 208 kh/s with m7m and around 182 kh/s with xevan. Stock is 3.5 GHz, and I get about 15-20 kh/s less for both IIRC.

I am building a 1950X soon. Just waiting for that damn backordered motherboard. However, since it's just basically 2 Ryzen cores, I'm going to assume roughly 2x the speeds if clocked at the same 3.7 GHz. Since base clock is 3.4 GHz, then maybe around 376 kh/s for m7m and 330 kh/s for xevan.
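
A back-of-envelope version of that estimate, assuming hashrate scales linearly with both the number of dies and the clock speed (an assumption, not a measurement; the function name and parameters are made up for illustration):

```python
def scaled_hashrate(measured_khs, measured_ghz, target_ghz, die_ratio=2.0):
    """Scale a measured hashrate linearly by die count and clock speed."""
    return measured_khs * die_ratio * (target_ghz / measured_ghz)


# Measured on the Ryzen 7 at 3.7 GHz, projected to a 1950X at 3.4 GHz.
m7m = scaled_hashrate(208, 3.7, 3.4)
xevan = scaled_hashrate(182, 3.7, 3.4)
print(f"m7m ~{m7m:.0f} kh/s, xevan ~{xevan:.0f} kh/s")
```

Pure linear scaling gives roughly 382 and 334 kh/s, in the same ballpark as the figures guessed above.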

NameTaken

hero member

Activity: 630

Merit: 502

Quote from: joblo on August 14, 2017, 08:44:31 AM

Quote from: NameTaken on August 13, 2017, 11:18:01 AM

Anyone planning on testing Threadripper?

Threadripper expands the concept of the CCXs, 4 core modules that are combined to form 8, 12 or 16 core CPUs.
The side effect of this is each CCX has its own piece of the L3 cache. This makes it inefficient for a CCX to access
data from the L3 cache on a different CCX. Ryzen also uses this architecture.

In many ways this is like a multi-CPU system.

Higher hashrates may be achieved by running separate miner instances for each CCX module to prevent data accesses
that cross CCXs. This will require a different cpu-affinity mask for each instance. I don't know exactly how AMD
maps logical cores to CCXs, particularly when hyperthreading (AMD calls it SMT) is enabled. It may take some trial
and error to reverse engineer the mapping and determine the best affinity mask for each miner instance.

Ryzen CPUs and multi-CPU systems may also benefit from multiple miner instances.

Unfortunately I am just a spectator as I don't have a Ryzen or Threadripper CPU (yet) to do the work myself.

This video from ServeTheHome with AMD EPYC says mining is perfect for the Zen architecture.

joblo

legendary

Activity: 1470

Merit: 1114

Quote from: NameTaken on August 13, 2017, 11:18:01 AM

Anyone planning on testing Threadripper?

Threadripper expands the concept of the CCXs, 4 core modules that are combined to form 8, 12 or 16 core CPUs.
The side effect of this is each CCX has its own piece of the L3 cache. This makes it inefficient for a CCX to access
data from the L3 cache on a different CCX. Ryzen also uses this architecture.

In many ways this is like a multi-CPU system.

Higher hashrates may be achieved by running separate miner instances for each CCX module to prevent data accesses
that cross CCXs. This will require a different cpu-affinity mask for each instance. I don't know exactly how AMD
maps logical cores to CCXs, particularly when hyperthreading (AMD calls it SMT) is enabled. It may take some trial
and error to reverse engineer the mapping and determine the best affinity mask for each miner instance.

Ryzen CPUs and multi-CPU systems may also benefit from multiple miner instances.

Unfortunately I am just a spectator as I don't have a Ryzen or Threadripper CPU (yet) to do the work myself.
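
As a starting point for that trial and error, here is a sketch that generates one affinity mask per CCX. It assumes logical CPUs are numbered contiguously within each CCX, which is exactly the mapping described above as unknown, so the masks should be treated as a first guess to test rather than the real layout.

```python
def ccx_affinity_masks(num_ccx, cores_per_ccx=4, threads_per_core=1):
    """Return one CPU affinity bitmask per CCX.

    Assumes (unverified) that logical CPUs are numbered contiguously
    within each CCX, including SMT siblings when threads_per_core is 2.
    """
    cpus_per_ccx = cores_per_ccx * threads_per_core
    block = (1 << cpus_per_ccx) - 1
    return [block << (i * cpus_per_ccx) for i in range(num_ccx)]


# 8 core Ryzen, SMT off: two CCXs, one miner instance each.
for i, mask in enumerate(ccx_affinity_masks(num_ccx=2)):
    print(f"instance {i}: cpu-affinity mask {mask:#x}")
```

If the hashrate does not improve with these masks, the numbering assumption is probably wrong for that system and a different grouping of bits is worth trying.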

joblo

legendary

Activity: 1470

Merit: 1114

Quote from: nizzuu on August 14, 2017, 01:40:40 AM

Hi there. Joblo, thanx for your great work. Can u please check the possible issue? The problem is hashrate reporting while solo mining (getwork):

********** cpuminer-opt 3.6.8 ***********

[2017-08-14 09:29:31] Binding process to cpu mask f
[2017-08-14 09:29:31] Binding thread 0 to cpu mask f
[2017-08-14 09:29:31] Binding thread 1 to cpu mask f
[2017-08-14 09:29:31] Binding thread 2 to cpu mask f
[2017-08-14 09:29:31] Binding thread 3 to cpu mask f
[2017-08-14 09:29:31] 4 miner threads started, using 'cryptonight' algorithm
[2017-08-14 09:29:32] Current block is 19325
[2017-08-14 09:29:32] No payout address provided, switching to getwork

Then it may take about 10+ minutes to display hashrate, or even up to 1 hour (the result will be printed for all that time though). Tried several coins and several algos, as well as sse42 and avx/avx2 versions, and the previous 3.6.7 version. No changes there. I personally do not see any obstacle to printing the hashrate, as all built-in-wallet miners do that. This reproduces both on win7 and win10.

It's hard to tune settings (I'm not talking about the cryptonote algo, of course) for new coins with such a wait.

I had to use a shitpool with zero miners and broken payouts to tune.

Cos on stratum work there's no issues.

I use some kinda stock .conf file, which is similar for all the coins.

Maybe there are some solutions other than changing the miner's code? Thanks.

I'm not sure I understand the problem. It is normal to have fewer hash reports when solo mining because you have to solve
the entire block, not just a share.

joblo

legendary

Activity: 1470

Merit: 1114

Quote from: Epsylon3 on August 14, 2017, 04:24:14 AM

Quote from: joblo on August 13, 2017, 10:58:08 PM

A heads up for those mining on yiimp with a CPU, you may see rejected shares for low difficulty.

https://bitcointalksearch.org/topic/m.20854352

No, it's not what I meant.. it's just the minimal static diff is now limited

My misunderstanding. The impact will be infrequent share submissions.

Epsylon3

legendary

Activity: 1484

Merit: 1082

ccminer/cpuminer developer

Quote from: joblo on August 13, 2017, 10:58:08 PM

A heads up for those mining on yiimp with a CPU, you may see rejected shares for low difficulty.

https://bitcointalksearch.org/topic/m.20854352

No, it's not what I meant.. it's just the minimal static diff is now limited

nizzuu

full member

Activity: 187

Merit: 100

Cryptocurrency enthusiast

Hi there. Joblo, thanx for your great work. Can u please check the possible issue? The problem is hashrate reporting while solo mining (getwork):

********** cpuminer-opt 3.6.8 ***********

[2017-08-14 09:29:31] Binding process to cpu mask f
[2017-08-14 09:29:31] Binding thread 0 to cpu mask f
[2017-08-14 09:29:31] Binding thread 1 to cpu mask f
[2017-08-14 09:29:31] Binding thread 2 to cpu mask f
[2017-08-14 09:29:31] Binding thread 3 to cpu mask f
[2017-08-14 09:29:31] 4 miner threads started, using 'cryptonight' algorithm
[2017-08-14 09:29:32] Current block is 19325
[2017-08-14 09:29:32] No payout address provided, switching to getwork

Then it may take about 10+ minutes to display hashrate, or even up to 1 hour (the result will be printed for all that time though). Tried several coins and several algos, as well as sse42 and avx/avx2 versions, and the previous 3.6.7 version. No changes there. I personally do not see any obstacle to printing the hashrate, as all built-in-wallet miners do that. This reproduces both on win7 and win10.

It's hard to tune settings (I'm not talking about the cryptonote algo, of course) for new coins with such a wait.

I had to use a shitpool with zero miners and broken payouts to tune.

Cos on stratum work there's no issues.

I use some kinda stock .conf file, which is similar for all the coins.

Maybe there are some solutions other than changing the miner's code? Thanks.

TexasHuck

full member

Activity: 126

Merit: 100

Ⓚ Kore Projects CTO Ⓚ

Thank you joblo for this great tool. I just started my test run with it. Can't wait to see the results. Thanks.

joblo

legendary

Activity: 1470

Merit: 1114

A heads up for those mining on yiimp with a CPU, you may see rejected shares for low difficulty.

https://bitcointalksearch.org/topic/m.20854352

Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 64. (Read 444122 times)