
Topic: [XMR] JCE Miner Cryptonight/forks, now with GPU! - page 117. (Read 90815 times)

member
Activity: 367
Merit: 34
JCE by default limits itself to 1 thread per 2M of cache (4M for Cryptonight Heavy). You can force more with the -t parameter, but that tends to flood the cache and lower performance. Not always, though: Pentium 4 and Atom do better with all cores enabled, regardless of the cache. To be tested per CPU.

Good news: the K10 bug is consistent, so it will be easier to debug. Maybe try forcing another assembly, with --archi pentium4 for example, to see if it's a hardware oddity or a big asm bug.
I have a K8 somewhere, I'll try on it. Remote debugging would be complicated, I'd have to set up a whole dev environment. As a last resort I'd buy a used K10 system, it wouldn't cost that much today.

Next release should have double-hash.

ah, makes sense. the CPUs do have 25MB cache each. so it looks like 2x 12-thread VMs would be the ticket. my VM software (bhyve) limits me to 16 cores per vm anyway.

does the speed sound right for 12 threads at 2.8GHz? 560H/s
jr. member
Activity: 37
Merit: 5
Same here PhenomII X6 1090T, rejected by the pool due to "low difficulty share"
Have you tried the 32-bit version, too? It's a bit strange. With the 64-bit version the results are rejected on AMD Athlon II X2 "Regor" as well as on AMD Phenom II X4 "Deneb". The 32-bit version instead works on "Regor" but doesn't even start on my "Deneb". Can you please try if your AMD Phenom II X6 "Thuban" works with the 32-bit version?
sr. member
Activity: 1484
Merit: 253
Yes! On a 32-bit non-AES CPU, 0.19 is faster than the others. Thanks!
Now waiting for double threads and new coins!
member
Activity: 350
Merit: 22
JCE is faster than them, even with AES, but the difference is tight, I admit: about 2%.
Both xmrig and stak use old code from Wolf0, which is unfair, even if they give credit. My code is completely new, not even in the same language (I use asm).

Sure, the real gain is on non-AES and 32-bit. My own rig is made of five Core2 Xeons at 2.666 GHz: 117 h/s versus ~85 for other miners.
jr. member
Activity: 37
Merit: 5
Good job! Fixing the K8/K10 bug might be a door opener for your miner. To be realistic, xmrig and xmr-stak have such a strong position in the market and are so highly optimized that it would be difficult to displace them on normal systems. But the strong performance of your miner on non-AES systems could be a market niche: getting those hundreds and thousands of older systems into mining which at the moment are too weak for Cryptonight.
member
Activity: 350
Merit: 22
JCE by default limits itself to 1 thread per 2M of cache (4M for Cryptonight Heavy). You can force more with the -t parameter, but that tends to flood the cache and lower performance. Not always, though: Pentium 4 and Atom do better with all cores enabled, regardless of the cache. To be tested per CPU.

Good news: the K10 bug is consistent, so it will be easier to debug. Maybe try forcing another assembly, with --archi pentium4 for example, to see if it's a hardware oddity or a big asm bug.
I have a K8 somewhere, I'll try on it. Remote debugging would be complicated, I'd have to set up a whole dev environment. As a last resort I'd buy a used K10 system, it wouldn't cost that much today.

Next release should have double-hash.
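
As a rough illustration of the cache rule above, here is a minimal Python sketch. The function name and variant labels are hypothetical, and this is not JCE's actual detection code; it only applies "1 thread per 2M of cache (4M for heavy)" together with the 32-threads-per-instance limit mentioned elsewhere in this thread.

```python
# Sketch of the default thread-count rule described in the post.
# Names are hypothetical; this is not taken from JCE itself.

MB_PER_THREAD = {"cryptonight": 2, "cryptonight-heavy": 4}
MAX_THREADS_PER_INSTANCE = 32  # per-instance limit mentioned in this thread


def default_threads(l3_cache_mb, logical_cpus, variant="cryptonight"):
    """Return the thread count suggested by the cache rule."""
    by_cache = int(l3_cache_mb // MB_PER_THREAD[variant])
    # Never more threads than logical CPUs or the instance limit,
    # and always at least one thread.
    return max(1, min(by_cache, logical_cpus, MAX_THREADS_PER_INSTANCE))


if __name__ == "__main__":
    # A VM with 16 logical CPUs on a CPU with 25MB of cache:
    print(default_threads(25, 16))
    print(default_threads(25, 16, "cryptonight-heavy"))
```

With 25MB of cache and 16 visible threads, the rule gives 12 threads for plain Cryptonight, which matches the 12-thread default reported in this thread. Pentium 4 and Atom are the stated exceptions, so treat the result as a starting point and benchmark per CPU.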
member
Activity: 367
Merit: 34
managed to get my VM setup with 16 threads (host has 40, dual E5-2680v2 10c/20t).

but when i started the bat file with the defaults, it only started 12 threads. i didnt see any setting in the bat file to make it limit to 12, even though the script sees all 16 threads of the processor.

is there a limit of only 12 threads per instance?

got ~560 H/s though. is that good? seems to be on par with a single mid range GPU?
member
Activity: 107
Merit: 11
Hey guys, I've got a water-cooled i7-8700. What performance should I expect, and how should I run the bat file? I don't quite understand how to set things up to get FULL hashing performance while I am not using the computer. Thanks in advance!
jr. member
Activity: 70
Merit: 3
Great job !

XMR ( v0.18 )
-t 2 CPU G4600 @ 3.60GHz ~ 70 h/s
-t 4 CPU i7-3820 @ 3.60GHz ~ 285 h/s

it's probably not the best, but I don't like stressing my CPUs too much.
newbie
Activity: 43
Merit: 0
... but AMD Athlon II X2 "Regor" as well as AMD Phenom II X4 "Deneb" get rejected by the pool due to "low difficulty share". Tested Aeon, Monero, Turtle and Leviar. Something is wrong with the K10 implementation.

....


That's the magic of assembly: some asm code may or may not work depending on the CPU. I wanted to bypass the compiler, so I must deal with it.
Also, the CPU detection needs tuning; I already saw errors with Yorkfield detection. Without owning the CPU itself, it may be hard to debug.

I need to dig up my test Excavator, which should be close enough to your K10 to test.

Hi

Same here: Phenom II X6 1090T, rejected by the pool due to "low difficulty share". Ryzen 7 1700 doing OK (8 threads @ 725 H/s) on v0.18. Going to test 0.19 now.

Thanks and be well...
jr. member
Activity: 37
Merit: 5
I need to dig up my test Excavator, which should be close enough to your K10 to test.
Well, I'm afraid that Excavator has absolutely nothing in common with K10. Excavator is the 4th Bulldozer variant, with modules instead of cores (CMT), shared L2 caches, AES and AVX, whereas K10 is the last incarnation of good old K8, with classic cores, dedicated L2s, no AES and max SSE3.

But if I can help, I would set up a K10 test system for you, which you can use via TeamViewer, AnyDesk or RDP (depending on your preferences) to fix the issues with this arch.
member
Activity: 350
Merit: 22
0.19 available:

Code:
Big perf improvement for 32-bits
Light improvement for non-aes 64-bits
Support of Intense
Support of UltraNote
Stellite is now Cryptonight V7
new parameter --elevate-and-quit to close parent process

dual-share not ready yet
member
Activity: 350
Merit: 22
Sure, the attrib explanation is worth documenting. In short, JCE never calls attrib; it disguises itself as attrib. That's why you see an attrib process at 100% CPU. It's to confuse antiviruses, and you too, obviously.

That's the magic of assembly: some asm code may or may not work depending on the CPU. I wanted to bypass the compiler, so I must deal with it.
Also, the CPU detection needs tuning; I already saw errors with Yorkfield detection. Without owning the CPU itself, it may be hard to debug.

I need to dig up my test Excavator, which should be close enough to your K10 to test.
jr. member
Activity: 37
Merit: 5
Thank you for explaining the phenomenon. Perhaps you should document the attrib trick somewhere so people don't get suspicious.

Apropos Phenom: the 64-bit miner seems to have a bug in combination with the AMD K10 arch. I tested your miner because of the fast non-AES code. On an Intel Core i7 "Lynnfield" everything works fine (except the wrong auto-detection as Skylake), but AMD Athlon II X2 "Regor" as well as AMD Phenom II X4 "Deneb" get rejected by the pool due to "low difficulty share". Tested Aeon, Monero, Turtle and Leviar. Something is wrong with the K10 implementation.

The 32-bit miner seems to be ok, even with Regor and Deneb.
member
Activity: 350
Merit: 22
The binary pretends to be attrib to bypass some antivirus protection, because otherwise the .exe may get deleted, including by Windows Defender. This is an explicit trick; the miner itself does absolutely nothing malicious, I promise. It mines for fees 1.5% of the time, but that's also an explicit feature.
AVs are smart enough to detect mining routines. All miners, like Claymore, are detected as viruses and need tricks; that's the game.

Edit: I got this problem with the 64-bit version only. The 32-bit version always goes undetected, probably because the asm code is completely different, so I don't do the attrib trick on it.

Version 0.19 on the way, with huge perf gain on 32 bits.

If you need more coins than those in the list, just ask here. I filter coin types to do autoconfig and to check config consistency; otherwise the miner behaves the same with any coin.
Claymore did a Monero-or-Anythingelse config style; I prefer to check every wallet format to give good error messages. But again, JCE does not send your wallet to anybody, not even me. Just to your pool, to log in.
full member
Activity: 1179
Merit: 131
Putting in a request for jceminer to take advantage of L4 cache in a future version.
jr. member
Activity: 37
Merit: 5
Hi,

just tried the miner on a test machine. May I ask why the miner runs Windows' attrib command C:\Windows\System32 all the time?
newbie
Activity: 79
Merit: 0
Good day. Is it possible to mine coins that are not on the list?
member
Activity: 350
Merit: 22
Can somebody please help me extract the best performance from my rig?
E5-2660 v4 @2GHz x2
I can clearly see that there's a lot of room for performance, as the CPU tab in the Windows Task Manager only tops out at 60%

The compute limit tends to be the CPU on old ones (Core2, P4...) but the cache on newer ones. If you run too many threads, you'll flood the cache and get worse performance. My Ryzen is a lot better at 8 threads than 9, for example.

Also, on your big dual Xeon, use the config file to ensure no core is used twice, once by jce 1 and once by jce 2. Otherwise perf will be very bad.
The rule is
One thread per 2M cache L3
No two threads on same core

I'm afraid the -t 27 you set pinned both jce instances to CPUs 0-26, which is bad for perf.
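
To illustrate the "no two threads on same core" rule when two instances run side by side, here is a minimal Python sketch that gives each instance its own physical cores and one logical CPU per core. The function name is hypothetical, and it assumes hyperthread siblings are numbered consecutively (core k owns logical CPUs 2k and 2k+1); check your real topology before relying on it. It only computes CPU ids and does not produce a JCE config file, whose format is not shown here.

```python
from typing import List


def split_cores(physical_cores: int, instances: int, smt: int = 2) -> List[List[int]]:
    """Return one list of logical CPU ids per instance, with no core shared."""
    # Assumption: logical CPUs of core k are k*smt .. k*smt + smt - 1,
    # so k*smt is the first logical CPU of each physical core.
    first_logical = [core * smt for core in range(physical_cores)]
    # Cores that do not divide evenly between instances are left unused.
    per_instance = physical_cores // instances
    return [
        first_logical[i * per_instance:(i + 1) * per_instance]
        for i in range(instances)
    ]


if __name__ == "__main__":
    # Example value: 28 physical cores in total, split between two instances.
    for number, cpus in enumerate(split_cores(28, 2), start=1):
        print("jce", number, "->", cpus)
```

The two lists never overlap, unlike two instances both pinned to CPUs 0-26. The one-thread-per-2M-of-L3 rule still applies on top, so each instance may need fewer threads than the cores it is given.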
member
Activity: 350
Merit: 22
That's a Piledriver. I don't have one but I have an Excavator, which I could use as a test base for that arch. The current optimized assembly is for Ryzen only. And I haven't tested Intel AES yet, but I got tester reports of only modest improvement, very close to xmrig.

On forks, JCE is better, but on pure Cryptonight AES 64-bit I beat the compilers by 1%, so if you use one thread only, I'm not surprised the gain is only 1 h/s. When I implement double hash, I may do better.

P.S. I don't see any substantial difference in speed from XMRig on 32-bit nonAES CPU Core i3.

You were right, I re-read my assembly and found an optimization mistake. After retesting I got a big +5% perf increase: my test Core2 gives 93 instead of 88. You can expect 0.19 to be significantly faster on non-AES 32-bit, and slightly faster on non-AES 64-bit.
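
For anyone comparing their own numbers, the relative gain can be worked out as below. This is just a small Python helper with a hypothetical name, applied to hashrates quoted in this thread.

```python
def gain_percent(old_hs, new_hs):
    """Relative hashrate gain of new_hs over old_hs, in percent."""
    return (new_hs - old_hs) / old_hs * 100.0


if __name__ == "__main__":
    # Test Core2 before and after the assembly fix (88 -> 93 h/s).
    print(round(gain_percent(88, 93), 1))
    # Core2 Xeon rig: ~85 h/s with other miners vs 117 h/s with JCE.
    print(round(gain_percent(85, 117), 1))
```

Going from 88 to 93 h/s works out to about 5.7%, in line with the "+5%" above, and 117 versus ~85 h/s is roughly a 38% gain.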

Don't start several jce instances on the same computer; rather use the -t parameter or the config file to enable more threads. Just note there's a limit of 32 threads per jce instance.