[ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 89.

integrale

full member

Activity: 144

Merit: 100

Eager to learn

3.5.9
timetravel
lyra2z
deep ?
I'm voting for 3.5.9 ;)

joblo

legendary

Activity: 1470

Merit: 1114

Quote from: th3.r00t on March 02, 2017, 12:32:09 AM

Quote from: joblo on March 01, 2017, 01:57:21 PM

Which is better on x2, 3.4.12 or 3.5.9?

Generally, on all AMD CPUs without AES/AVX, v3.4.12 produces higher hashrates than the newer v3.5.x branch.
If the AMD CPU supports AES/AVX, the newer v3.5.x branch is better.

Thanks. How about broken algos? Any difference?

I'm thinking of updating the legacy branch and
want it to have the most working algos. Restoring the 3.4.12 SSE2 performance is the easy part.
Tiptoeing through the bug fixes and regressions is a little trickier. I want to choose the legacy base
(3.4.12 or 3.5.9) to make it easier. Find one algo that works on one but not the other and we have
a winner.

Plans for the legacy branch:

- updated no sooner than after 3.5.12 is released, master branch always has priority
- port new algos
- port bug fixes
- maintain the SSE2 groestl performance of 3.4.12
- have the most working algos

Any update to the legacy branch can be considered the last, so it's a good idea to get it right.

th3.r00t

sr. member

Activity: 312

Merit: 250

Quote from: joblo on March 01, 2017, 01:57:21 PM

Which is better on x2, 3.4.12 or 3.5.9?

Generally, on all AMD CPUs without AES/AVX, v3.4.12 produces higher hashrates than the newer v3.5.x branch.
If the AMD CPU supports AES/AVX, the newer v3.5.x branch is better.
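
The rule of thumb above can be written down as a tiny helper. This is only a sketch: the function name `suggest_branch` is made up, the flag names follow the Linux `/proc/cpuinfo` spelling, and it assumes "AES/AVX support" means the CPU has both.

```python
def suggest_branch(flags):
    """Rule of thumb from this thread for AMD CPUs (not an official recommendation).

    Assumption: "AES/AVX support" is read as having BOTH features.
    """
    has_aes_avx = {"aes", "avx"} <= set(flags)
    return "3.5.x" if has_aes_avx else "3.4.12"


# Illustrative flag sets only, not taken from a real machine.
print(suggest_branch(["sse2", "pni"]))
print(suggest_branch(["sse2", "pni", "aes", "avx"]))
```

Actual results still depend on the algo, so a quick benchmark of both versions is the safer check.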

integrale

I must be a superhuman and a magician if I can read something from a blank white page, LOL. Or they're hiding something from us ;)

Edit: just tried "quark", it works ;)

integrale

OK, thanks for the tip. Last try.

joblo

Quote from: integrale on March 01, 2017, 03:49:49 PM

I would use 3.5.9 because it has more algos implemented, in my opinion.

By the way, the link you provided is a master release.

Looks like hmq1725 is not included.

Code:

./cpuminer -a hmq1725 -t 2 -o stratum+tcp://yiimp.ccminer.org:3747 -u BCuFcijYuwgn4oFzo3ZVecebdrw2C7Fbqy -p x,c=BOAT,stats
** cpuminer-multi 1.1-git by Tanguy Pruvot (tpruvot@github) **
BTC donation address: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd

[2017-03-01 18:29:07] Unknown algo parameter 'hmq1725'
Try `cpuminer-multi --help' for more information.
sklave@miner3-HP-500B:~/hmq1725$

I'm giving up searching for this algo for now.

LOL. Ocminer likes to take shortcuts and reuse existing algo names, I think he used quark for this one.
Check the instructions at suprnova to be sure.

integrale

I would use 3.5.9 because it has more algos implemented, in my opinion.

By the way, the link you provided is a master release.

Looks like hmq1725 is not included.

Code:

./cpuminer -a hmq1725 -t 2 -o stratum+tcp://yiimp.ccminer.org:3747 -u BCuFcijYuwgn4oFzo3ZVecebdrw2C7Fbqy -p x,c=BOAT,stats
** cpuminer-multi 1.1-git by Tanguy Pruvot (tpruvot@github) **
BTC donation address: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd

[2017-03-01 18:29:07] Unknown algo parameter 'hmq1725'
Try `cpuminer-multi --help' for more information.
sklave@miner3-HP-500B:~/hmq1725$

I'm giving up searching for this algo for now.

joblo

Which is better on x2, 3.4.12 or 3.5.9?

denravonska

newbie

Activity: 37

Merit: 0

Quote from: joblo on March 01, 2017, 11:01:08 AM

Quote from: integrale on March 01, 2017, 10:23:25 AM

the algos which are going to fail

The miner stops after checking the first block; the crash is then reported by the miner as an illegal instruction.

Tested 3.5.11: same results as 3.5.10, nothing changes for the AMD CPU.

Thanks for testing. I think I've done all I can to get those old AMDs to work.

An illegal instruction usually means the CPU hasn't implemented it, often when trying to
use AES instructions on a CPU that doesn't support AES. In this case it appears these AMD
CPUs don't have the full SSE2 implementation.

Even if there were a way for the compiler to detect this condition, the result would be a build
without any optimizations, i.e. it would be the same as cpuminer-multi.

My recommendation is to use cpuminer-multi on those algos.
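
To see ahead of time whether a CPU is likely to hit an illegal instruction, one option is to compare the feature flags the kernel reports against what a build needs. A minimal sketch, assuming Linux and a `flags` line in `/proc/cpuinfo`; the helper names `parse_flags` and `missing_features` and the example requirement list are hypothetical and not part of cpuminer-opt.

```python
def parse_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(flags, required):
    """Return the required features that the CPU does not report, sorted."""
    return sorted(set(required) - set(flags))


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = parse_flags(f.read())
    except OSError:
        flags = set()  # not Linux, or /proc is unavailable
    # Linux reports SSE3 as "pni". This requirement list is only an example.
    print(missing_features(flags, ["sse2", "pni", "ssse3", "aes", "avx"]))
```

An empty list means every listed feature is reported. Anything printed is a candidate cause for the crash.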

Oh, I didn't realize my Athlon64 was _that_ old, failing at SSE3 instructions. Reverting back to 3.4.12 then.

integrale

No need to change anything; just inform users of this type of CPU on the OP page,
because any change is hard enough.

Take it only as my advice.

joblo

Quote from: integrale on March 01, 2017, 12:05:10 PM

As I remember, you changed some memory handling in 3.5.10 because of the stack smashing, to avoid crashes. Maybe that makes some algos slower, so the miner takes longer to find an accepted hash. It's really not easy to detect every difference on my testing tour, ugh; new ones keep coming to light ;)

Yes, I converted some memory accesses to use SSE2 instructions. If those instructions are less efficient on AMD x2
it could be slower.

If that's the case and it is faster on newer CPUs I don't want to trade that speed for an old AMD architecture.
I haven't tested yet, so we'll see.

integrale

As I remember, you changed some memory handling in 3.5.10 because of the stack smashing, to avoid crashes. Maybe that makes some algos slower, so the miner takes longer to find an accepted hash. It's really not easy to detect every difference on my testing tour, ugh; new ones keep coming to light ;)

joblo

Quote from: integrale on March 01, 2017, 11:43:32 AM

I haven't noticed it on other algos. I know the timetravel hashrate keeps changing, so I tested several times to confirm the loss, switching miner versions within a few seconds of each other. Many times the hashrate on 3.5.10 is 8-10% lower than on 3.5.9, so I'm using the faster one only for that type of CPU (3 units: Athlon II x2 240, 250, 255). The peak delivered hashrate is between 126 and 140 kh/s.
If it helps, I can test other algos of your choice; just let me know which ones and I'll report back.

I misunderstood, the delta was in 3.5.10 vs 3.5.9. 3.5.10 did have some changes that would affect Timetravel.
I only tested performance on AVX2. The changes may have had a negative effect on older CPUs.
I'll do some more testing.

joblo

Quote from: ocminer on March 01, 2017, 07:17:20 AM

I've created a pull request to add support for onecoin (OC):

https://github.com/JayDDee/cpuminer-opt/pull/5

Coin:
https://bitcointalksearch.org/topic/annoc-onecoin-no-premine-fair-launch-sha256-triple-1801129

I've looked at your pull request. I would normally take the code, implement it in my own environment,
and then upload it to git. I'm willing to let git do the merge, but there are a couple of small changes
needed.

1. I like to keep the algos in alphabetical order, so sha256t would be listed after sha256d everywhere.

2. sha256t.c source file should be in algo/sha2/, same place as little brother.

3. It may be desirable to include the coin name and symbol in the help text, and fix the text alignment.
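
The alphabetical-order request in point 1 is easy to check mechanically. A small sketch, assuming the algo names live in one sorted list; the helper `add_algo` and the four-name subset are illustrative only and do not mirror how cpuminer-opt stores its algo table.

```python
import bisect


def add_algo(sorted_names, new_name):
    """Insert new_name into an alphabetically sorted list, skipping duplicates."""
    i = bisect.bisect_left(sorted_names, new_name)
    if i == len(sorted_names) or sorted_names[i] != new_name:
        sorted_names.insert(i, new_name)
    return sorted_names


# Illustrative subset of algo names, already in alphabetical order.
algos = ["scrypt", "sha256d", "shavite3", "skein"]
add_algo(algos, "sha256t")
print(algos)  # sha256t lands directly after sha256d
```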

I can easily make the changes and merge the algo into my 3.5.12 dev stream and it will get released
in the normal course. Nothing else to release yet so it may be a few days.

Or, you can make the changes and resubmit the pull request and I will merge it immediately in git.
Either way works for me.

Edit: one more thing, the algo_not_tested warning in register_sha256t_algo. You can remove it if you've
tested it yourself, or leave it in until the coin goes live and gets field tested. Your call.

Edit: Never mind. I've got it coded and it works, will be in 3.5.12, missed 3.5.11 by less than a day.

integrale

I haven't noticed it on other algos. I know the timetravel hashrate keeps changing, so I tested several times to confirm the loss, switching miner versions within a few seconds of each other. Many times the hashrate on 3.5.10 is 8-10% lower than on 3.5.9, so I'm using the faster one only for that type of CPU (3 units: Athlon II x2 240, 250, 255). The peak delivered hashrate is between 126 and 140 kh/s.
If it helps, I can test other algos of your choice; just let me know which ones and I'll report back.

joblo

Quote from: integrale on March 01, 2017, 11:05:07 AM

Yes, good advice, only multiminer doesn't support hmq1725 yet. Hopefully in future releases.

thank you

I stole the code from ocminer at suprnova.

https://github.com/ocminer/cpuminer-hmq1725

joblo

Quote from: oldDIN on March 01, 2017, 11:08:20 AM

Quote from: joblo on March 01, 2017, 10:52:06 AM

LOL. Which algo?

And yet they don't believe it :-/

Example: AMD FX-6300, t-5
hmq1725 90 - 95
zoin 201 - 210
TimeTrav 360 - 385
Although I did switch from GCC 6.2 to GCC 6.3.
Could that have an effect?

Not even AVX2, pretty good. The thing is there is nothing that changed in those algos in 3.5.11
except fixing the bug in hmq1725, but that had no effect on performance.

joblo

Quote from: integrale on March 01, 2017, 10:54:21 AM

Fine, then you can count yourself among the lucky ones.

We've talked about this config:

Xubuntu 16.04LTS GCC 5.4.1 AMD Athlon II x2 240

Cpuminer-opt 3.5.10

hmq1725 and some other x-variants. Also, the hashrate on timetravel is about 10 kh/s lower than on 3.5.9, which makes around 8% in my case.

Can you clarify the performance loss? Timetravel is difficult to measure because the hash rate is always changing.
You need to compare both immediately after each other and sometimes repeatedly to ensure the same permutation
is tested.
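
One way to follow that advice is to record the two versions in back-to-back pairs and average the per-pair difference, so each pair is more likely to cover the same permutation. A rough sketch; the function names are hypothetical and the kh/s readings are made up for illustration, not measurements from this thread.

```python
from statistics import mean


def relative_change(baseline, candidate):
    """Percent change of candidate vs. baseline hashrate (negative means slower)."""
    return (candidate - baseline) / baseline * 100.0


def paired_change(pairs):
    """Average percent change over back-to-back (baseline, candidate) runs."""
    return mean(relative_change(b, c) for b, c in pairs)


# Made-up kh/s readings: (3.5.9, 3.5.10), each pair taken back to back.
runs = [(140.0, 129.0), (126.0, 116.0), (133.0, 122.0)]
print(f"{paired_change(runs):+.1f}%")
```

Pairing matters here because a plain average of each version's readings would mix fast and slow permutations together.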

Most of the optimizations I implemented don't help on CPUs without AES and AVX, but shouldn't hurt either.
If Timetravel is slower in 3.5.10 many other algos should also be slower. It's possible some changes may have
actually hurt performance but it wasn't obvious because it was offset by gains from other changes. It may
only become apparent on CPUs that don't get the offsetting gains. The result is I may be able to restore the
lost performance on your x2 while also improving the performance on more advanced architectures.

TLDR alert.

I am discovering this more and more. My attempts to implement AVX2 for Luffa have been slower. Switching
back and forth from 128 bit to 256 bit vectors adds overhead. Conversely, converting some operations to 256 bit
adds complexity which also adds overhead.

This kind of optimization reduces the instruction count but only if the operations translate directly from 128 to 256 bits.
Anything more complex than simple arithmetic will likely add overhead by requiring more complex (and slower)
instructions or more instructions to perform the same operation. This negates the instruction savings of doubling up
the data size. Also AVX2 doesn't reduce memory accesses, it can just do them with fewer instructions.
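
The trade-off can be pictured with a toy instruction-count model. This is a deliberately crude sketch: `instruction_count` and its `overhead_per_op` parameter are invented for illustration, and it counts instructions only, ignoring latency, throughput and memory traffic.

```python
def instruction_count(n_words, lane_words, overhead_per_op=0.0):
    """Instructions needed to process n_words with vectors of lane_words each.

    overhead_per_op models extra shuffle/convert instructions per vector op
    (a made-up knob, not a measured figure).
    """
    vector_ops = -(-n_words // lane_words)  # ceiling division
    return vector_ops * (1.0 + overhead_per_op)


sse2 = instruction_count(1024, 4)         # 128-bit vectors, 4 x 32-bit words
avx2_ideal = instruction_count(1024, 8)   # 256-bit vectors, direct translation
avx2_complex = instruction_count(1024, 8, overhead_per_op=1.0)

print(sse2, avx2_ideal, avx2_complex)
```

In this model, a single extra instruction per 256-bit operation is already enough to cancel the saving from halving the number of vector operations.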

As optimizing gets more aggressive there is more potential for unintended side effects. I would like to avoid them
if possible.

oldDIN

member

Activity: 85

Merit: 10

integrale,
Why did you write this? I wasn't replying to you. There were no doubts about your problems.

integrale

FX 6300

Features

MMX instructions
Extensions to MMX
SSE / Streaming SIMD Extensions
SSE2 / Streaming SIMD Extensions 2
SSE3 / Streaming SIMD Extensions 3
SSSE3 / Supplemental Streaming SIMD Extensions 3
SSE4 / SSE4.1 + SSE4.2 / Streaming SIMD Extensions 4
SSE4a
AES / Advanced Encryption Standard instructions
AVX / Advanced Vector Extensions
BMI1 / Bit Manipulation instructions 1
FMA3 / 3-operand Fused Multiply-Add instructions
FMA4 / 4-operand Fused Multiply-Add instructions
F16C / 16-bit Floating-Point conversion instructions
TBM / Trailing Bit Manipulation instructions
XOP / eXtended Operations instructions
AMD64 / AMD 64-bit technology
AMD-V / AMD Virtualization technology
EVP / Enhanced Virus Protection
Turbo Core 3.0 technology

Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 89. (Read 444067 times)