Topic: [XMR] JCE Miner Cryptonight/forks, now with GPU! - page 13. (Read 90858 times)

member
Activity: 350
Merit: 22
OK, thanks for the report. I'll separate the Bulldozer-like assemblies from the Zen ones, since one got a perf increase and the other a perf decrease.
I'll pack this into 0.33b14, along with the GPU port of uPlexa and a more automatic legacy mode to avoid some GPU perf regressions.
newbie
Activity: 41
Merit: 0
0.33j CPU Windows is online.

* Increase/restore v8 speed
* uPlexa fork, as --variation 19
* Updated shitcoins: Saronite forked to Haven, XFH to Swap

About the speed: I tested it on the CPUs I have, but made only theoretical fixes for the CPUs I don't have.
On Zen I still got +0.2% extra perf.

Unfortunately, the same speed as 0.33i CPU :(
member
Activity: 350
Merit: 22
I'm giving priority to the GPU version this time :D
jr. member
Activity: 313
Merit: 8
Hi! No Linux build for the j release? :)
member
Activity: 350
Merit: 22
It was ready before you asked, in test session.
Next release is 0.33b14 GPU with the backport of CPU version (including uPlexa) and the fix for --legacy eating too much CPU.
member
Activity: 762
Merit: 35
0.33j CPU Windows is online.

* Increase/restore v8 speed
* uPlexa fork, as --variation 19
* Updated shitcoins: Saronite forked to Haven, XFH to Swap

About the speed: I tested it on the CPUs I have, but made only theoretical fixes for the CPUs I don't have.
On Zen I still got +0.2% extra perf.

Fast as lightning! Thanks JCE!
member
Activity: 350
Merit: 22
0.33j CPU Windows is online.

* Increase/restore v8 speed
* uPlexa fork, as --variation 19
* Updated shitcoins: Saronite forked to Haven, XFH to Swap

About the speed: I tested it on the CPUs I have, but made only theoretical fixes for the CPUs I don't have.
On Zen I still got +0.2% extra perf.
member
Activity: 350
Merit: 22
Quote
uPlexa

It will be in the next version; the code is done and tested.
Stellite v8 is ready too, but I need a test pool and haven't found one yet; the testnet still says v7.

Vishera performance: I admit this one is a real surprise. It's a modern AMD CPU with AES, and although I couldn't test it since I have no such CPU (I have an old Athlon64 and a Ryzen, but nothing in between), I expected the perf to be at least on par.

The next version will restore Core2 and older CPU perf (sure), give a little +0.1% on Zen (quite sure, but it's within the margin of bench error) and add perf for Intel and those FX CPUs (not sure, theoretical optimization).
It will also add some new shitcoins like Swap and uPlexa.
newbie
Activity: 41
Merit: 0
I mean that in my case, older versions are faster on CPU on CNv8 :(

And please add the miner version to the log.
newbie
Activity: 76
Merit: 0
Nice !
newbie
Activity: 41
Merit: 0
Hi, JCE!

AMD FX-8320E, Turbo boost OFF, CNv8

0.33i CPU

Code:
03:41:01 | Hashrate CPU Thread 0: 39.68 h/s
03:41:01 | Hashrate CPU Thread 1: 41.26 h/s
03:41:01 | Hashrate CPU Thread 2: 41.06 h/s
03:41:01 | Hashrate CPU Thread 3: 41.17 h/s
03:41:01 | Hashrate CPU Thread 4: 40.56 h/s
03:41:01 | Hashrate CPU Thread 5: 40.86 h/s
03:41:01 | Hashrate CPU Thread 6: 50.80 h/s
03:41:01 | Total: 295.35 h/s - Max: 296.09 h/s

0.33g CPU

Code:
23:39:08 | Hashrate CPU Thread 0: 40.86 h/s
23:39:08 | Hashrate CPU Thread 1: 41.28 h/s
23:39:08 | Hashrate CPU Thread 2: 41.38 h/s
23:39:08 | Hashrate CPU Thread 3: 41.46 h/s
23:39:08 | Hashrate CPU Thread 4: 41.06 h/s
23:39:08 | Hashrate CPU Thread 5: 41.38 h/s
23:39:08 | Hashrate CPU Thread 6: 52.71 h/s
23:39:08 | Total: 300.09 h/s - Max: 301.51 h/s

Code:
Analyzing Processors topology...
AMD FX-8320E Eight-Core Processor
Assembly codename: generic_aes_avx
  SSE2    : Yes
  SSE3    : Yes
  SSE4    : Yes
  AES     : Yes
  AVX     : Yes
  AVX2    : No
Auto-configuration, selected CPUs will be highlighted...
Found CPU 0, with:
  L1 Cache:    16 KB
  L2 Cache:  2048 KB, shared with CPU 1
  L3 Cache:  8192 KB, shared with CPU 1, 2, 3, 4, 5, 6, 7
Found CPU 1, with:
  L1 Cache:    16 KB
  L2 Cache:  2048 KB, shared with CPU 0
  L3 Cache:  8192 KB, shared with CPU 0, 2, 3, 4, 5, 6, 7
Found CPU 2, with:
  L1 Cache:    16 KB
  L2 Cache:  2048 KB, shared with CPU 3
  L3 Cache:  8192 KB, shared with CPU 0, 1, 3, 4, 5, 6, 7
Found CPU 3, with:
  L1 Cache:    16 KB
  L2 Cache:  2048 KB, shared with CPU 2
  L3 Cache:  8192 KB, shared with CPU 0, 1, 2, 4, 5, 6, 7
Found CPU 4, with:
  L1 Cache:    16 KB
  L2 Cache:  2048 KB, shared with CPU 5
  L3 Cache:  8192 KB, shared with CPU 0, 1, 2, 3, 5, 6, 7
Found CPU 5, with:
  L1 Cache:    16 KB
  L2 Cache:  2048 KB, shared with CPU 4
  L3 Cache:  8192 KB, shared with CPU 0, 1, 2, 3, 4, 6, 7
Found CPU 6, with:
  L1 Cache:    16 KB
  L2 Cache:  2048 KB, shared with CPU 7
  L3 Cache:  8192 KB, shared with CPU 0, 1, 2, 3, 4, 5, 7
Found CPU 7, with:
  L1 Cache:    16 KB
  L2 Cache:  2048 KB, shared with CPU 6
  L3 Cache:  8192 KB, shared with CPU 0, 1, 2, 3, 4, 5, 6
HTTP Local Server on port 3334

Preparing 7 Mining Threads...

+-- Thread 0 config ------------------------+
| Run on CPU:            0                  |
| Use cache:             yes                |
| Multi-hash:            no                 |
| Assembly module:       generic_aes_avx    |
+-------------------------------------------+

+-- Thread 1 config ------------------------+
| Run on CPU:            1                  |
| Use cache:             yes                |
| Multi-hash:            no                 |
| Assembly module:       generic_aes_avx    |
+-------------------------------------------+

+-- Thread 2 config ------------------------+
| Run on CPU:            2                  |
| Use cache:             yes                |
| Multi-hash:            no                 |
| Assembly module:       generic_aes_avx    |
+-------------------------------------------+

+-- Thread 3 config ------------------------+
| Run on CPU:            3                  |
| Use cache:             yes                |
| Multi-hash:            no                 |
| Assembly module:       generic_aes_avx    |
+-------------------------------------------+

+-- Thread 4 config ------------------------+
| Run on CPU:            4                  |
| Use cache:             yes                |
| Multi-hash:            no                 |
| Assembly module:       generic_aes_avx    |
+-------------------------------------------+

+-- Thread 5 config ------------------------+
| Run on CPU:            5                  |
| Use cache:             yes                |
| Multi-hash:            no                 |
| Assembly module:       generic_aes_avx    |
+-------------------------------------------+

+-- Thread 6 config ------------------------+
| Run on CPU:            6                  |
| Use cache:             yes                |
| Multi-hash:            no                 |
| Assembly module:       generic_aes_avx    |
+-------------------------------------------+

Cryptonight Variation: Cryptonight V8 fork of Oct-2018

Low intensity.

Starting CPU Thread 0, affinity: CPU 0
Thread 0 successfully bound to CPU 0
Allocated shared Large Page at: 0000014709e00000
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 0 of NUMA node 0 at: 000001470a000000

Starting CPU Thread 1, affinity: CPU 1
Thread 1 successfully bound to CPU 1
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 1 of NUMA node 0 at: 000001470a200000

Starting CPU Thread 2, affinity: CPU 2
Thread 2 successfully bound to CPU 2
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 2 of NUMA node 0 at: 000001470a400000

Starting CPU Thread 3, affinity: CPU 3
Thread 3 successfully bound to CPU 3
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 3 of NUMA node 0 at: 000001470a600000

Starting CPU Thread 4, affinity: CPU 4
Thread 4 successfully bound to CPU 4
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 4 of NUMA node 0 at: 000001470a800000

Starting CPU Thread 5, affinity: CPU 5
Thread 5 successfully bound to CPU 5
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 5 of NUMA node 0 at: 000001470aa00000

Starting CPU Thread 6, affinity: CPU 6
Thread 6 successfully bound to CPU 6
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 6 of NUMA node 0 at: 000001470ac00000
15:59:58 | Monero (XMR/XMV) Mining session starts!

Both with --auto --archi vishera -t 7 --low in config
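
To put the two totals above side by side, here is a minimal Python sketch that pulls the Total value out of each summary line and computes the relative change. The helper names (total_hashrate, relative_change) are hypothetical; only the two log lines come from the report above.

```python
import re

# Summary lines copied from the two logs in this report.
LOG_033I = "03:41:01 | Total: 295.35 h/s - Max: 296.09 h/s"
LOG_033G = "23:39:08 | Total: 300.09 h/s - Max: 301.51 h/s"

# Matches the "Total: <number> h/s" part of a miner summary line.
TOTAL_RE = re.compile(r"Total:\s*([\d.]+)\s*h/s")


def total_hashrate(line: str) -> float:
    """Extract the total hashrate in h/s from a summary line."""
    match = TOTAL_RE.search(line)
    if match is None:
        raise ValueError(f"no total hashrate in line: {line!r}")
    return float(match.group(1))


def relative_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


old = total_hashrate(LOG_033G)
new = total_hashrate(LOG_033I)
print(f"0.33g -> 0.33i: {relative_change(old, new):+.2f}%")  # prints -1.58%
```

On these numbers, 0.33i comes out about 1.58% slower than 0.33g on this FX-8320E, which is a single pair of readings and may sit close to normal run-to-run variation.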
member
Activity: 762
Merit: 35
Hi JCE, could you please add uPlexa coin to the miner?
https://bitcointalksearch.org/topic/dai-mainnet-upx-uplexa-ai-anonymity-and-ecommerce-via-iot-5058404

You can speak to Quantumleaper on Discord if you decide to include it:
https://discord.gg/ddRVYCb
sr. member
Activity: 652
Merit: 266
Hi all,

Linux GPU: unlikely. I'm a niche miner (CPU and older GPUs), and adding Linux would make it a niche of a niche while taking a lot of dev time. The Windows GPU version is already about 15% of my fees but 90% of the support; the Linux version would be about 1% of my users for 95% of the support. I cannot afford this :(
Sometimes I ignore the market and do things for fun, like supporting the HD6000, but that stays within an acceptable dev time. Linux GPU wouldn't.

BTW, try TeamRed on Linux for v8 mining, it burns like fire ;)

@PIOUPIOU99: yeah, thanks, my new CPU miner also burns like fire 8)

Speed on Intel: I don't even have any big Intel CPU, I'm all AMD, same for the GPUs (I have zero nVidia). But that's OK, I'll do some theoretical optimizations for big Intel CPUs too.
Can you tell me exactly which CPU you have? Maybe a good config can close the gap with xmrstak. I know I must beat it by more than 1.5% to compensate for the devfee. That's true in most cases, but yeah, maybe not on the i7.
I do use it, but a competitive Linux miner is always welcome :)
member
Activity: 350
Merit: 22
Right, I'll rephrase the comment explicitly as: all CPUs where I lacked extra performance versus xmrig.
Of course, my new v8 assembly is for modern AES CPUs like Zen; the one for non-AES Core2 is already ultra-optimized, and 33i gives no extra perf compared to 33h.

I also observed a small regression, you're right. It was hard to understand how there could be a side effect, but I found it: it's a cache allocation problem. It will be fixed in 33j, which I already planned to release with an optimization for modern Intel CPUs and the uPlexa fork.
newbie
Activity: 33
Merit: 0
jce_cn_cpu_miner.windows.033i
-1 h/s compared with the previous version. Where is the optimization on v8? No improvement seen.
CPUs: Xeon E5440, Core2Quad Q9400.
full member
Activity: 1120
Merit: 131
Mining bittube with the latest GPU version (4x RX574: 1240/2070; 1240/2070; 1240/2040; 1240/2035).


Code:
Starting GPU Thread 0, on GPU 0
Created OpenCL Context for GPU 0 at 000001cf487080a0
Created OpenCL Thread 0 Command-Queue for GPU 0 at 000001cf48720ca0
Scratchpad Allocation success for OpenCL Thread 0
Allocating big 1856MB scratchpad for OpenCL Thread 0...
Compiling kernels of OpenCL Thread 0...
Kernels of OpenCL Thread 0 compiled.

Starting GPU Thread 1, on GPU 0
Created OpenCL Thread 1 Command-Queue for GPU 0 at 000001cf4d55d740
Scratchpad Allocation success for OpenCL Thread 1
Allocating big 1856MB scratchpad for OpenCL Thread 1...
Compiling kernels of OpenCL Thread 1...
Kernels of OpenCL Thread 1 compiled.

Starting GPU Thread 2, on GPU 1
Created OpenCL Context for GPU 1 at 000001cf487839f0
Created OpenCL Thread 2 Command-Queue for GPU 1 at 000001cf4d55db30
Scratchpad Allocation success for OpenCL Thread 2
Allocating big 1856MB scratchpad for OpenCL Thread 2...
Compiling kernels of OpenCL Thread 2...
Kernels of OpenCL Thread 2 compiled.

Starting GPU Thread 3, on GPU 1
Created OpenCL Thread 3 Command-Queue for GPU 1 at 000001cf4d55d200
Scratchpad Allocation success for OpenCL Thread 3
Allocating big 1856MB scratchpad for OpenCL Thread 3...
Compiling kernels of OpenCL Thread 3...
Kernels of OpenCL Thread 3 compiled.

Starting GPU Thread 4, on GPU 2
Created OpenCL Context for GPU 2 at 000001cf487844f0
Created OpenCL Thread 4 Command-Queue for GPU 2 at 000001cf4d55d4a0
Scratchpad Allocation success for OpenCL Thread 4
Allocating big 1856MB scratchpad for OpenCL Thread 4...
Compiling kernels of OpenCL Thread 4...
Kernels of OpenCL Thread 4 compiled.

Starting GPU Thread 5, on GPU 2
Created OpenCL Thread 5 Command-Queue for GPU 2 at 000001cf588aa3a0
Scratchpad Allocation success for OpenCL Thread 5
Allocating big 1856MB scratchpad for OpenCL Thread 5...
Compiling kernels of OpenCL Thread 5...
Kernels of OpenCL Thread 5 compiled.

Starting GPU Thread 6, on GPU 3
Created OpenCL Context for GPU 3 at 000001cf48783470
Created OpenCL Thread 6 Command-Queue for GPU 3 at 000001cf588ab210
Scratchpad Allocation success for OpenCL Thread 6
Allocating big 1856MB scratchpad for OpenCL Thread 6...
Compiling kernels of OpenCL Thread 6...
Kernels of OpenCL Thread 6 compiled.

Starting GPU Thread 7, on GPU 3
Created OpenCL Thread 7 Command-Queue for GPU 3 at 000001cf588aa640
Scratchpad Allocation success for OpenCL Thread 7
Allocating big 1856MB scratchpad for OpenCL Thread 7...
Compiling kernels of OpenCL Thread 7...
Kernels of OpenCL Thread 7 compiled.
Keep-Alive enabled
Devfee for GPU is 0.9%

12:39:39 | Miner uptime 4:05:07
12:39:39 | Effective net hashrate 3591.50 h/s
12:39:39 | Devices results - Shares Accepted/Ignored/Rejected - Net Hashrate
12:39:39 | * GPU 0 -  98/0/0 - 891.67 h/s
12:39:39 | * GPU 1 -  87/0/0 - 817.14 h/s
12:39:39 | * GPU 2 - 101/0/0 - 976.39 h/s
12:39:39 | * GPU 3 -  91/0/0 - 906.30 h/s
12:40:56 | Hashrate GPU Thread 0: 462.00 h/s
12:40:56 | Hashrate GPU Thread 1: 461.32 h/s - Total GPU 0: 923.31 h/s
12:40:56 | Hashrate GPU Thread 2: 440.52 h/s
12:40:56 | Hashrate GPU Thread 3: 444.89 h/s - Total GPU 1: 885.41 h/s
12:40:56 | Hashrate GPU Thread 4: 449.96 h/s
12:40:56 | Hashrate GPU Thread 5: 449.84 h/s - Total GPU 2: 899.79 h/s
12:40:56 | Hashrate GPU Thread 6: 464.51 h/s
12:40:56 | Hashrate GPU Thread 7: 464.82 h/s - Total GPU 3: 929.33 h/s
12:40:56 | Total: 3637.83 h/s - Max: 3649.71 h/s
member
Activity: 350
Merit: 22
Hi all,

Linux GPU: unlikely. I'm a niche miner (CPU and older GPUs), and adding Linux would make it a niche of a niche while taking a lot of dev time. The Windows GPU version is already about 15% of my fees but 90% of the support; the Linux version would be about 1% of my users for 95% of the support. I cannot afford this :(
Sometimes I ignore the market and do things for fun, like supporting the HD6000, but that stays within an acceptable dev time. Linux GPU wouldn't.

BTW, try TeamRed on Linux for v8 mining, it burns like fire ;)

@PIOUPIOU99: yeah, thanks, my new CPU miner also burns like fire 8)

Speed on Intel: I don't even have any big Intel CPU, I'm all AMD, same for the GPUs (I have zero nVidia). But that's OK, I'll do some theoretical optimizations for big Intel CPUs too.
Can you tell me exactly which CPU you have? Maybe a good config can close the gap with xmrstak. I know I must beat it by more than 1.5% to compensate for the devfee. That's true in most cases, but yeah, maybe not on the i7.
copper member
Activity: 293
Merit: 11
Still no Linux :)

0.33i CPU is online for Windows and Linux, 32 and 64 bits:
a major release with a big +2% speed on v8, making my miner the best in all cases on CPU, even with fees deducted.


for my light config v8
0.33e

0.33i


:)
sr. member
Activity: 652
Merit: 266
I'd say the latest one, 0.33b13.

My autoconfig aims for safety; for max perf, use the manual config. The GitHub page provides some examples.
https://github.com/jceminer/cn_gpu_miner

But each card may be different (overclocking, memory...), so take time to tune the values. Only three are relevant: multi_hash (a multiple of 16), alpha (64 or 128) and beta (8 or 16).
Actually, I was talking about the GPU version not having a Linux port :)
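
The tuning advice quoted above names three values to vary: multi_hash (a multiple of 16), alpha (64 or 128) and beta (8 or 16). As a rough aid for working through them methodically, here is a minimal Python sketch that lists every combination inside a chosen multi_hash range. The function name and the 432 to 464 bounds are hypothetical, and the dictionaries are only a checklist of values to try by hand, not the miner's config file format.

```python
from itertools import product

STEP = 16           # multi_hash must be a multiple of 16 (from the post)
ALPHAS = (64, 128)  # allowed alpha values (from the post)
BETAS = (8, 16)     # allowed beta values (from the post)


def candidate_configs(multi_hash_min: int, multi_hash_max: int):
    """Yield every (multi_hash, alpha, beta) combination inside the bounds."""
    # Round the lower bound up to the next multiple of STEP.
    first = -(-multi_hash_min // STEP) * STEP
    multi_hashes = range(first, multi_hash_max + 1, STEP)
    for multi_hash, alpha, beta in product(multi_hashes, ALPHAS, BETAS):
        yield {"multi_hash": multi_hash, "alpha": alpha, "beta": beta}


# Hypothetical bounds; pick a range that fits your card's memory.
for config in candidate_configs(432, 464):
    print(config)
```

With these example bounds the grid has 12 entries (3 multi_hash values, 2 alpha, 2 beta), which is small enough to benchmark one by one.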
jr. member
Activity: 77
Merit: 6
On my i7 I'm getting 295 h/s on xmr-stak vs 286 h/s on JCE. How can I improve these numbers? I do have hyperthreading on.

I'm unsure how to configure this. I currently use xmr-stak with the following:

    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "auto", "affine_to_cpu" : 0 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "auto", "affine_to_cpu" : 2 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "auto", "affine_to_cpu" : 4 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "auto", "affine_to_cpu" : 6 },


This is on v8, and I compiled xmr-stak with 0 dev fee. Can you compete? If so, I'll happily make the move if it's worth it.
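
For comparisons like this one, the dev fee has to be taken off the gross hashrate before the two miners are compared. Here is a minimal Python sketch of that arithmetic. The 1.5% fee is an assumption inferred from the developer's remark elsewhere in this thread about having to beat xmr-stak by more than 1.5%; the function names are hypothetical.

```python
def net_hashrate(gross: float, devfee: float) -> float:
    """Hashrate left for the user after the dev fee (fee as a fraction)."""
    return gross * (1.0 - devfee)


def breakeven_gross(competitor_net: float, devfee: float) -> float:
    """Gross hashrate a fee-taking miner needs to match a competitor's net."""
    return competitor_net / (1.0 - devfee)


DEVFEE = 0.015    # assumption: 1.5% CPU dev fee
XMR_STAK = 295.0  # h/s reported above, built with 0 dev fee
JCE = 286.0       # h/s reported above, before the dev fee

needed = breakeven_gross(XMR_STAK, DEVFEE)
print(f"JCE net after fee:      {net_hashrate(JCE, DEVFEE):.2f} h/s")
print(f"JCE gross to break even: {needed:.2f} h/s")
print(f"Speed-up still needed:   {(needed / JCE - 1.0) * 100.0:.2f}%")
```

Under that fee assumption, 286 h/s gross leaves about 281.71 h/s net, and JCE would need roughly 299.49 h/s gross on this i7 to match a fee-free 295 h/s.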