
Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 89. (Read 444067 times)

full member
Activity: 144
Merit: 100
Eager to learn
3.5.9
timetravel
lyra2z
deep ?
im voting for 3.5.9 Wink
legendary
Activity: 1470
Merit: 1114
Which is better on x2, 3.4.12 or 3.5.9?

Generally, on all AMD CPUs without AES/AVX, v3.4.12 produces higher hashrates than the newer v3.5.x branch.
If the AMD CPU supports AES/AVX, the newer v3.5.x branch is better.

Thanks. How about broken algos? Any difference?

I'm thinking of updating the legacy branch and
want it to have the most working algos. Restoring the 3.4.12 SSE2 performance is the easy part.
Tiptoeing through the bug fixes and regressions is a little trickier. I want to choose the legacy base
(3.4.12 or 3.5.9) to make it easier. Find one algo that works on one but not the other and we have
a winner.

Plans for the legacy branch:

- updated no sooner than after 3.5.12 is released, master branch always has priority
- port new algos
- port bug fixes
- maintain the SSE2 groestl performance of 3.4.12
- have the most working algos

Any update to the legacy branch can be considered the last, so it's a good idea to get it right.
sr. member
Activity: 312
Merit: 250
Which is better on x2, 3.4.12 or 3.5.9?

Generally, on all AMD CPUs without AES/AVX, v3.4.12 produces higher hashrates than the newer v3.5.x branch.
If the AMD CPU supports AES/AVX, the newer v3.5.x branch is better.
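
The rule of thumb above can be written down as a small sketch. It assumes that "AES/AVX support" means both flags are reported by the CPU; the function name `suggest_branch` and the flag spellings (as Linux prints them) are illustrative and not part of cpuminer-opt.

```python
def suggest_branch(flags):
    """Suggest a cpuminer-opt branch for an AMD CPU from its feature flags.

    Assumption: "AES/AVX support" means both 'aes' and 'avx' are present.
    """
    if {"aes", "avx"} <= set(flags):
        return "v3.5.x"
    return "v3.4.12"


# Hypothetical flag sets, for illustration only.
print(suggest_branch({"sse2", "pni"}))
print(suggest_branch({"sse2", "aes", "avx"}))
```

This is only a starting point; actual hashrates should still be compared on the machine in question.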
full member
Activity: 144
Merit: 100
Eager to learn
i must be a superhuman and  magician  if i can read something from a white blank page   LOL  or they hiding something from us Wink



edit : just tried "quark "    works  Wink
full member
Activity: 144
Merit: 100
Eager to learn
ok  thanks for the tip   last try  Smiley
legendary
Activity: 1470
Merit: 1114
i would use 3.5.9 because of more algos implemented , my opinion


btw. the link you provided is a master release

looks like hmq1725 not included

Code:
./cpuminer -a hmq1725 -t 2 -o stratum+tcp://yiimp.ccminer.org:3747 -u BCuFcijYuwgn4oFzo3ZVecebdrw2C7Fbqy -p x,c=BOAT,stats
** cpuminer-multi 1.1-git by Tanguy Pruvot (tpruvot@github) **
BTC donation address: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd

[2017-03-01 18:29:07] Unknown algo parameter 'hmq1725'
Try `cpuminer-multi --help' for more information.
sklave@miner3-HP-500B:~/hmq1725$

im giving up for now searching this algo

LOL. Ocminer likes to take shortcuts and reuse existing algo names, I think he used quark for this one.
Check the instructions at suprnova to be sure.
full member
Activity: 144
Merit: 100
Eager to learn
i would use 3.5.9 because of more algos implemented , my opinion


btw. the link you provided is a master release

looks like hmq1725 not included

Code:
./cpuminer -a hmq1725 -t 2 -o stratum+tcp://yiimp.ccminer.org:3747 -u BCuFcijYuwgn4oFzo3ZVecebdrw2C7Fbqy -p x,c=BOAT,stats
** cpuminer-multi 1.1-git by Tanguy Pruvot (tpruvot@github) **
BTC donation address: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd

[2017-03-01 18:29:07] Unknown algo parameter 'hmq1725'
Try `cpuminer-multi --help' for more information.
sklave@miner3-HP-500B:~/hmq1725$

im giving up for now searching this algo
legendary
Activity: 1470
Merit: 1114
Which is better on x2, 3.4.12 or 3.5.9?
newbie
Activity: 37
Merit: 0
the algos which gonna fail

miner stops after checking first block , crash is then reported by miner as illegal instruction

3.5.11 tested , same results as in 3.5.10 nothing changes for Amd cpu

Thanks for testing. I think I've done all I can to get those old AMDs to work.

An illegal instruction usually means the CPU hasn't implemented it, often when trying to
use AES instructions on a CPU that doesn't support AES. In this case it appears these AMD
CPUs don't have the full SSE2 implementation.

Even if there was a way for the compiler to detect this condition the result would be a build
without any optimizations, i.e. it would be the same as cpuminer-multi.

My recommendation is to use cpuminer-multi on those algos.
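
A minimal sketch of how one might check, on Linux, which instruction-set extensions a CPU reports before picking a miner build. It assumes the `flags` line format of `/proc/cpuinfo` (where SSE3 appears as `pni`); the function names and the sample text are hypothetical and not part of cpuminer-opt.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(flags, required=("sse2", "aes", "avx", "avx2")):
    """List the required features the CPU does not report, in the order given."""
    return [name for name in required if name not in flags]


# Hypothetical sample resembling an older CPU (not copied from a real machine).
SAMPLE = "processor\t: 0\nflags\t\t: fpu mmx sse sse2 pni sse4a\n"

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample where /proc/cpuinfo is unavailable
    print("missing:", missing_features(parse_cpu_flags(text)))
```

A binary that uses any feature listed as missing can be expected to stop with an illegal instruction on that CPU.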

Oh, I didn't realize my Athlon64 was _that_ old, failing at SSE3 instructions. Reverting back to 3.4.12 then Smiley
full member
Activity: 144
Merit: 100
Eager to learn
No need to change anything, just inform users of this type of CPU on the OP page,
because any change is hard enough.

Take it only as my advice.
legendary
Activity: 1470
Merit: 1114
As I remember, you changed some memory handling in 3.5.10 because of the stack smashing, to avoid crashes. Maybe that makes some algos slower, so the miner is slower to find an accepted hash. It's really not easy to detect every difference on my testing tour. Ufff, there's always a new one coming to light Wink


Yes, I converted some memory accesses to use SSE2 instructions. If those instructions are less efficient on AMD x2
it could be slower.

If that's the case and it is faster on newer CPUs I don't want to trade that speed  for an old AMD architecture.
I haven't tested yet, so we'll see.
full member
Activity: 144
Merit: 100
Eager to learn
As I remember, you changed some memory handling in 3.5.10 because of the stack smashing, to avoid crashes. Maybe that makes some algos slower, so the miner is slower to find an accepted hash. It's really not easy to detect every difference on my testing tour. Ufff, there's always a new one coming to light Wink
legendary
Activity: 1470
Merit: 1114
I haven't noticed it on other algos. I know the Timetravel hashrate keeps changing, so I tested several times to confirm the loss for myself, switching miner versions many times within gaps of a few seconds. The hashrate is 8-10% lower on 3.5.10 compared to 3.5.9, so I'm using the faster one only for that type of CPU (3 machines: Athlon II x2 240, 250 and 255). The peak delivered hashrate is between 126 and 140 kh/s.
If it helps you, I can test other algos of your choice. Just let me know which ones and I'll report back.


I misunderstood, the delta was in 3.5.10 vs 3.5.9. 3.5.10 did have some changes that would affect Timetravel.
I only tested performance on AVX2. The changes may have had a negative effect on older CPUs.
I'll do some more testing.
legendary
Activity: 1470
Merit: 1114

I've looked at your pull request. I would normally take the code, implement it in my own environment
and then upload it to git, but I'm willing to let git do the merge. There are a couple of small changes
needed first.

1. I like to keep the algos in alphabetical order, so sha256t would be listed after sha256d everywhere.

2. sha256t.c source file should be in algo/sha2/, same place as little brother.

3. It may be desirable to include the coin name and symbol in the help text, and fix the text alignment.
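
As a rough illustration of point 1, the sketch below inserts a new algo name into an alphabetically sorted list. The list is a hypothetical subset chosen for the example and `insert_sorted` is an illustrative helper, not code from cpuminer-opt.

```python
def insert_sorted(names, new_name):
    """Return a new alphabetically sorted list that includes new_name once."""
    return sorted(set(names) | {new_name})


# Hypothetical subset of algo names, for illustration only.
algos = ["scrypt", "sha256d", "skein", "x11"]
updated = insert_sorted(algos, "sha256t")
print(updated)
```

With plain string ordering, `sha256t` lands directly after `sha256d`.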

I can easily make the changes and merge the algo into my 3.5.12 dev stream and it will get released
in the normal course. Nothing else to release yet so it may be a few days.

Or, you can make the changes and resubmit the pull request and I will merge it immediately in git.
Either way works for me.

Edit: one more thing, the algo_not_tested warning in register_sha256t_algo. You can remove it if you've
tested it yourself, or leave it in until the coin goes live and gets field tested. Your call.


Edit: Never mind. I've got it coded and it works, will be in 3.5.12, missed 3.5.11 by less than a day.
full member
Activity: 144
Merit: 100
Eager to learn
I haven't noticed it on other algos. I know the Timetravel hashrate keeps changing, so I tested several times to confirm the loss for myself, switching miner versions many times within gaps of a few seconds. The hashrate is 8-10% lower on 3.5.10 compared to 3.5.9, so I'm using the faster one only for that type of CPU (3 machines: Athlon II x2 240, 250 and 255). The peak delivered hashrate is between 126 and 140 kh/s.
If it helps you, I can test other algos of your choice. Just let me know which ones and I'll report back.
legendary
Activity: 1470
Merit: 1114
Yes, good advice, only that multiminer doesn't support hmq1725 yet. Maybe in future releases, hopefully.

thank you

I stole the code from ocminer at suprnova.

https://github.com/ocminer/cpuminer-hmq1725
legendary
Activity: 1470
Merit: 1114
LOL. Which algo?
Yet they do not believe  Undecided
AMD FX-6300 t-5 example
hmq1725   90 -  95
zoin        201 - 210
TimeTrav 360 - 385
Although I did switch from GCC 6.2 to GCC 6.3.
Could that have an influence?

Not even AVX2, pretty good. The thing is there is nothing that changed in those algos in 3.5.11
except fixing the bug in hmq1725, but that had no effect on performance.

legendary
Activity: 1470
Merit: 1114
Fine, then you can count yourself among the lucky ones.

We've talked about this config:

Xubuntu 16.04LTS GCC 5.4.1  AMD Athlon II x2 240

Cpuminer-opt 3.5.10

hmq1725 and some other x-variants. Also, the hashrate on timetravel is about 10 kh/s lower compared to 3.5.9, which makes around 8% in my case.

Can you clarify the performance loss? Timetravel is difficult to measure because the hash rate is always changing.
You need to compare both immediately after each other and sometimes repeatedly to ensure the same permutation
is tested.

Most of the optimizations I implemented don't help on CPUs without AES and AVX, but shouldn't hurt either.
If Timetravel is slower in 3.5.10 many other algos should also be slower. It's possible some changes may have
actually hurt performance but it wasn't obvious because it was offset by gains from other changes. It may
only become apparent on CPUs that don't get the offsetting gains. The result is I may be able to restore the
lost performance on your x2 while also improving the performance on more advanced architectures.
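
One way to put a number on such a back-to-back comparison is sketched below. The sample values are hypothetical, chosen only to illustrate the arithmetic, and `relative_loss` is an illustrative helper rather than anything in the miner.

```python
from statistics import mean


def relative_loss(old_rates, new_rates):
    """Mean per-pair loss of the new version, as a fraction of the old rate.

    Each pair should be measured back to back so that both runs see the
    same Timetravel permutation.
    """
    if len(old_rates) != len(new_rates) or not old_rates:
        raise ValueError("need equally many, non-empty samples")
    return mean((old - new) / old for old, new in zip(old_rates, new_rates))


# Hypothetical back-to-back samples in kH/s (illustrative, not measured data).
old = [140.0, 130.0, 126.0]
new = [128.8, 119.6, 115.92]
print(f"{relative_loss(old, new):.1%}")
```

Averaging per pair rather than over all samples keeps a slow permutation in one version from being compared against a fast permutation in the other.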

TLDR alert.

I am discovering this more and more. My attempts to implement AVX2 for Luffa have been slower. Switching
back and forth from 128 bit to 256 bit vectors adds overhead. Conversely converting some operations to 256 bit
adds complexity which also adds overhead.

This kind of optimization reduces the instruction count but only if the operations translate directly from 128 to 256 bits.
Anything more complex than simple arithmetic will likely add overhead by requiring more complex (and slower)
instructions or more instructions to perform the same operation. This negates the instruction savings of doubling up
the data size. Also AVX2 doesn't reduce memory accesses, it can just do them with fewer instructions.
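
The trade-off can be put into a toy cost model. Everything in it is an assumption made for illustration: the function, its parameters and the numbers are hypothetical and are not measurements of Luffa or of any real code.

```python
def avx2_instruction_estimate(n_ops, direct_fraction, complex_cost,
                              switches=0, switch_cost=1.0):
    """Toy estimate of the instruction count after widening to 256 bits.

    n_ops           -- number of 128-bit vector operations in the original code
    direct_fraction -- share of operations that translate one-to-one to 256 bits
    complex_cost    -- instructions needed per widened operation that does not
    switches        -- number of 128 <-> 256 bit transitions
    switch_cost     -- assumed instruction cost of each transition
    """
    wide_ops = n_ops / 2  # each 256-bit operation covers two 128-bit ones
    direct = wide_ops * direct_fraction
    complex_part = wide_ops * (1.0 - direct_fraction) * complex_cost
    return direct + complex_part + switches * switch_cost


baseline = 1000  # hypothetical 128-bit instruction count
for fraction in (1.0, 0.5, 0.25):
    estimate = avx2_instruction_estimate(baseline, fraction, complex_cost=3)
    print(fraction, estimate, "faster" if estimate < baseline else "not faster")
```

In this model the instruction savings disappear once the non-direct operations cost enough to cancel the halved operation count, and any switching overhead moves the break-even point further.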

As optimizing gets more aggressive there is more potential for unintended side effects. I would like to avoid them
if possible.

member
Activity: 85
Merit: 10
integrale
Why did you write this? I wasn't replying to you. There were no doubts about your problems.
full member
Activity: 144
Merit: 100
Eager to learn
FX 6300

Features   

    MMX instructions
    Extensions to MMX
    SSE / Streaming SIMD Extensions
    SSE2 / Streaming SIMD Extensions 2
    SSE3 / Streaming SIMD Extensions 3
    SSSE3 / Supplemental Streaming SIMD Extensions 3
    SSE4 / SSE4.1 + SSE4.2 / Streaming SIMD Extensions 4
    SSE4a
    AES / Advanced Encryption Standard instructions
    AVX / Advanced Vector Extensions
    BMI1 / Bit Manipulation instructions 1
    FMA3 / 3-operand Fused Multiply-Add instructions
    FMA4 / 4-operand Fused Multiply-Add instructions
    F16C / 16-bit Floating-Point conversion instructions
    TBM / Trailing Bit Manipulation instructions
    XOP / eXtended Operations instructions
    AMD64 / AMD 64-bit technology
    AMD-V / AMD Virtualization technology
    EVP / Enhanced Virus Protection
    Turbo Core 3.0 technology