Topic: [XMR] JCE Miner Cryptonight/forks, now with GPU! - page 91. (Read 90814 times)

gvb
jr. member
Activity: 140
Merit: 9
I added a directory change to my start script, to where the miner is.

cd c:\miners\Monero\gpu

Then I can use right-click start as admin as well.



My GPU mining went fine for hours, but now it's back to 0 H/s, even after quitting the miner, disabling the card, re-enabling the card, and starting the miner again.

Any idea what could be the cause?

The driver seemed to crash sometimes, but it kept mining (and no, it's not my display driver as well).
member
Activity: 350
Merit: 22

Try to lower "multi_hash":1200 in decrements of 16, so first try "multi_hash":1184

I rebooted and even with "multi_hash":944, which was stable, it doesn't compile.


I think I found the bug: the miner doesn't support being launched from a different dir. In other words, its current dir must be the path where the .exe is.

If you call the .bat or .exe from a different dir, it fails. I'll fix this in the next revision.
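Until that fix lands, the workaround is to make sure the miner's current directory is its own folder. Here is a minimal Python sketch of a launcher that does this. The function names and the "miner.exe" file name are placeholders made up for illustration, not anything from JCE itself.

```python
import subprocess
from pathlib import Path


def miner_working_dir(exe_path):
    """Return the folder that holds the miner executable."""
    return Path(exe_path).resolve().parent


def launch_miner(exe_path, args=()):
    """Start the miner with its current directory set to its own folder."""
    exe = Path(exe_path).resolve()
    # cwd= does the same job as running `cd` into the miner folder first.
    return subprocess.Popen([str(exe), *args], cwd=str(exe.parent))


# Hypothetical usage; "miner.exe" is a placeholder name:
# launch_miner(r"c:\miners\Monero\gpu\miner.exe")
```

With this, it should not matter which directory the launcher itself is started from.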
sr. member
Activity: 1484
Merit: 253
I really cannot use a multiple smaller than 16; it would invalidate all my optimizations. I did a lot of data grouping by chunks of 16, because that's 4x4. Going down to 2x2 would be impractical.
About cached compile: that's possible, but more complicated than what Stak does.

And please, switch to the SRB topic for the SRB tips. It's not a rant, and there's no problem comparing with SRB, even for me, but it's out of place and makes the topic harder to follow.
Yes, I got it.
But think about how to allow finer tuning...
sr. member
Activity: 1484
Merit: 253
The max multi_hash for the heavy algo on 8GB cards is about 960, which uses nearly 7.5GB of video memory. You two are talking about different algos: one about heavy, the other about normal v7.

For the heavy algo, optimal values for the 580 8GB:
Code:
{ "mode" : "GPU", "worksize" : 8, "alpha" : 64, "beta" : 8, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 1, "multi_hash":944 },
{ "mode" : "GPU", "worksize" : 8, "alpha" : 64, "beta" : 8, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 1, "multi_hash":944 },

This is also good:
Code:
{ "mode" : "GPU", "worksize" : 8, "alpha" : 64, "beta" : 8, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 1, "multi_hash":896 },
{ "mode" : "GPU", "worksize" : 8, "alpha" : 64, "beta" : 8, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 1, "multi_hash":896 },

For normal v7:
Code:
{ "mode" : "GPU", "worksize" : 8, "alpha" : 64, "beta" : 8, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 1, "multi_hash":1152 },
{ "mode" : "GPU", "worksize" : 8, "alpha" : 64, "beta" : 8, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 1, "multi_hash":1152 },
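The memory figures above can be sanity-checked with a bit of arithmetic. This is only a rough sketch: it assumes video memory use is dominated by the scratchpads, at 2 MB per hash for v7 and 4 MB per hash for heavy, and that each config line is one thread on the same card. The function name is made up for illustration.

```python
# Scratchpad size per hash, in MB. The 4 MB figure for heavy agrees with the
# "Allocating big 4800MB scratchpad" log line posted for multi_hash 1200.
SCRATCHPAD_MB = {"v7": 2, "heavy": 4}


def scratchpad_mb(multi_hash, algo, threads=2):
    """Estimate total scratchpad memory in MB for all threads on one card."""
    if multi_hash % 16 != 0:
        raise ValueError("multi_hash must be a multiple of 16")
    return multi_hash * SCRATCHPAD_MB[algo] * threads


print(scratchpad_mb(944, "heavy"))   # 7552
print(scratchpad_mb(960, "heavy"))   # 7680, i.e. 7.5 GB
print(scratchpad_mb(1152, "v7"))     # 4608
print(scratchpad_mb(1200, "heavy"))  # 9600, more than an 8 GB card holds
```

The estimate ignores whatever else the driver and kernels keep in video memory, so treat it as a lower bound.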
newbie
Activity: 70
Merit: 0

Try to lower "multi_hash":1200 in decrements of 16, so first try "multi_hash":1184

I rebooted and even with "multi_hash":944, which was stable, it doesn't compile.
newbie
Activity: 81
Merit: 0
Try to lower "multi_hash":1200 in decrements of 16, so first try "multi_hash":1184
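To list the values to try, here is a small Python sketch that walks multi_hash down in steps of 16. The function name and the floor of 16 are made up for illustration; the only rule taken from the thread is that values must be multiples of 16.

```python
def multi_hash_candidates(start, floor=16, step=16):
    """Yield multi_hash values from `start` downward in steps of 16."""
    value = start - (start % step)  # snap to a multiple of 16 first
    while value >= floor:
        yield value
        value -= step


first_three = list(multi_hash_candidates(1200))[:3]
print(first_three)  # [1200, 1184, 1168]
```

Each candidate still has to be tested by hand in the config; the sketch only generates the list.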
newbie
Activity: 70
Merit: 0
I get

Code:
Starting GPU Mining thread 0, on GPU 0
Created OpenCL Context for GPU 0 at 0000019eeeb67980
Created OpenCL Thread 0 Command-Queue for GPU 0 at 0000019efc3225f0
Allocating big 4800MB scratchpad for OpenCL Thread 0...
Scratchpad Allocation success for OpenCL Thread 0
Compiling kernels of OpenCL Thread 0, it will be long...
Compilation of OpenCL kernels failed.
Error: CL_BUILD_PROGRAM_FAILURE
when I try to run this for both threads.
newbie
Activity: 81
Merit: 0
For the RX580 8GB, the best settings I found are below:

Code:
{ "mode" : "GPU", "worksize" : 8, "alpha" : 128, "beta" : 16, "gamma" : 8, "delta" : 8, "epsilon" : 8, "zeta" : 8, "index" : 0, "multi_hash":1200 },
{ "mode" : "GPU", "worksize" : 8, "alpha" : 128, "beta" : 16, "gamma" : 8, "delta" : 8, "epsilon" : 8, "zeta" : 8, "index" : 0, "multi_hash":1200 },
newbie
Activity: 70
Merit: 0
Is there a better config for an RX580 8GB? With Cast XMR I can get a pretty much stable 1 kH/s.

"gpu_threads_conf" :
[
     { "mode" : "GPU", "worksize" : 8, "alpha" : 128, "beta" : 16, "gamma" : 8, "delta" : 8, "epsilon" : 8, "zeta" : 8, "index" : 1, "multi_hash":944 },
     { "mode" : "GPU", "worksize" : 8, "alpha" : 128, "beta" : 16, "gamma" : 8, "delta" : 8, "epsilon" : 8, "zeta" : 8, "index" : 1, "multi_hash":944 },
]


EDIT: Algo is cryptonight heavy. Also, the hashrate with JCE is much more stable. Nice work!
member
Activity: 350
Merit: 22
I really cannot use a multiple smaller than 16; it would invalidate all my optimizations. I did a lot of data grouping by chunks of 16, because that's 4x4. Going down to 2x2 would be impractical.
About cached compile: that's possible, but more complicated than what Stak does.

And please, switch to the SRB topic for the SRB tips. It's not a rant, and there's no problem comparing with SRB, even for me, but it's out of place and makes the topic harder to follow.
sr. member
Activity: 1484
Merit: 253
Wow, so many comments, thanks all

Multiple of 16: that's a technical constraint due to how my code is written. I cannot work with smaller multiples.

Cached pre-compile: not that simple. On JCE, since the OpenCL code is generated, pre-optimized and injected on the fly, it's not just per-GPU like for Stak. Not impossible to cache, but not trivial. I know that on an 8-GPU system it takes a long time.
Multihash... You should think about smaller multiples. It could give higher speed in many cases...

Pre-compiling... Something needs to be done about this too; it's too long a wait every time...
sr. member
Activity: 1484
Merit: 253
You can try to make these kernel bins yourself. Just try running the 4GB cards with the following parameters:

        { "id" : 0, "intensity" : 58.0, "worksize" : 8, "threads" : 2, "persistent_memory" : false, "fragments": 43650010},

Maybe the new kernel will be faster.
member
Activity: 350
Merit: 22
JCE GPU is on Win64 only.
I don't know if I'll release a Linux version. For the CPU version, Linux users are about 5% of the total, which is not very motivating. And Linux miners are a lot rarer than Windows ones.

For the same reason, there will be no Win32 version.

Quote
In castxmr and xmr-stak (intensity 1932 double threads) I get around 13800-13900 h/s.
In JCE (intensity 1920) got 14300-14400 h/s. So a little bit better than those.
Dunno about SRB. Probably gonna be same results as cast and stak.
All this hashrate is what the mining software has reported.

So I beat Cast? Wow, I didn't even do it on purpose.
JCE reports the exact instant hashrate with no tweaking, at 0.01 precision. And bad shares are reported, never hidden.
member
Activity: 564
Merit: 19
I get an error when running with GPU:

Compilation of Open CL Kernels failed CL_BUILD_PROGRAM_FAILURE

It works on another system with no issues though. Any ideas?

Maybe this?

Code:
sudo apt install opencl-headers

Can't wait to test the GPU version on my Ubuntu 18.04 LTS Server AMD rig.
jr. member
Activity: 41
Merit: 1
Monero (Cryptonight V7)
AMD Driver 18.6.1
7 x Vega 64

In castxmr and xmr-stak (intensity 1932 double threads) I get around 13800-13900 h/s.
In JCE (intensity 1920) got 14300-14400 h/s. So a little bit better than those.
Dunno about SRB. Probably gonna be same results as cast and stak.
All this hashrate is what the mining software has reported.

Also, I briefly tried haven algo but saw no significant improvement over competitors.

Vega 64s

confgpu.txt
     { "mode" : "GPU", "worksize" : 8, "alpha" : 64, "beta" : 16, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 0, "multi_hash":1920 },
     { "mode" : "GPU", "worksize" : 8, "alpha" : 64, "beta" : 16, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 0, "multi_hash":1920 },
(etc.)
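For a multi-GPU rig, the repeated lines can be generated rather than typed. This is a Python sketch only. It assumes "index" simply counts up from 0 for each card (the post shows index 0 and then "(etc.)") and that every card gets two threads; the function name is made up for illustration.

```python
import json


def gpu_threads_conf(gpu_count, multi_hash, threads_per_gpu=2):
    """Build one config entry per GPU thread, using the values from the post."""
    entries = []
    for index in range(gpu_count):  # assumption: GPU indexes run 0..gpu_count-1
        for _ in range(threads_per_gpu):
            entries.append({
                "mode": "GPU", "worksize": 8, "alpha": 64, "beta": 16,
                "gamma": 4, "delta": 4, "epsilon": 4, "zeta": 4,
                "index": index, "multi_hash": multi_hash,
            })
    return entries


conf = gpu_threads_conf(gpu_count=7, multi_hash=1920)
for entry in conf:
    print(json.dumps(entry) + ",")
```

For 7 cards this prints 14 lines, one per GPU thread, ready to paste between the brackets of "gpu_threads_conf".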

12:08:34 | Hashrate GPU Thread 0: 1019.38 h/s
12:08:34 | Hashrate GPU Thread 1: 1042.35 h/s
12:08:34 | Hashrate GPU Thread 2: 1022.37 h/s
12:08:34 | Hashrate GPU Thread 3: 1035.88 h/s
12:08:34 | Hashrate GPU Thread 4: 1026.74 h/s
12:08:34 | Hashrate GPU Thread 5: 1029.77 h/s
12:08:34 | Hashrate GPU Thread 6: 1030.88 h/s
12:08:34 | Hashrate GPU Thread 7: 1024.28 h/s
12:08:34 | Hashrate GPU Thread 8: 1020.92 h/s
12:08:34 | Hashrate GPU Thread 9: 1031.15 h/s
12:08:34 | Hashrate GPU Thread 10: 1007.70 h/s
12:08:34 | Hashrate GPU Thread 11: 1027.57 h/s
12:08:34 | Hashrate GPU Thread 12: 1019.47 h/s
12:08:34 | Hashrate GPU Thread 13: 1027.29 h/s
12:08:34 | Total: 14365.70 h/s - Max: 14388.12 h/s
Results from other miners such as SRB and Cast XMR are needed for comparison. And please state the algo and the amount of video memory.
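For anyone who wants to check the total in the log above, here is a small Python sketch that pulls the per-thread figures out of the log text and adds them up. The regular expression is written against the line format shown in the post; the names are made up for illustration.

```python
import re

LINE = re.compile(r"Hashrate GPU Thread (\d+): ([\d.]+) h/s")


def thread_hashrates(log_text):
    """Return {thread_number: hashrate} for every per-thread line in the log."""
    return {int(n): float(h) for n, h in LINE.findall(log_text)}


log = """\
12:08:34 | Hashrate GPU Thread 0: 1019.38 h/s
12:08:34 | Hashrate GPU Thread 1: 1042.35 h/s
12:08:34 | Hashrate GPU Thread 2: 1022.37 h/s
12:08:34 | Hashrate GPU Thread 3: 1035.88 h/s
12:08:34 | Hashrate GPU Thread 4: 1026.74 h/s
12:08:34 | Hashrate GPU Thread 5: 1029.77 h/s
12:08:34 | Hashrate GPU Thread 6: 1030.88 h/s
12:08:34 | Hashrate GPU Thread 7: 1024.28 h/s
12:08:34 | Hashrate GPU Thread 8: 1020.92 h/s
12:08:34 | Hashrate GPU Thread 9: 1031.15 h/s
12:08:34 | Hashrate GPU Thread 10: 1007.70 h/s
12:08:34 | Hashrate GPU Thread 11: 1027.57 h/s
12:08:34 | Hashrate GPU Thread 12: 1019.47 h/s
12:08:34 | Hashrate GPU Thread 13: 1027.29 h/s
12:08:34 | Total: 14365.70 h/s - Max: 14388.12 h/s
"""

rates = thread_hashrates(log)
print(len(rates), round(sum(rates.values()), 2))
```

The 14 displayed figures add up to about 14365.75, within 0.05 of the reported total of 14365.70. The small gap is presumably rounding in the displayed per-thread values.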
member
Activity: 350
Merit: 22
Wow, so many comments, thanks all

Multiple of 16: that's a technical constraint due to how my code is written. I cannot work with smaller multiples.

Cached pre-compile: not that simple. On JCE, since the OpenCL code is generated, pre-optimized and injected on the fly, it's not just per-GPU like for Stak. Not impossible to cache, but not trivial. I know that on an 8-GPU system it takes a long time.

Vega: I'm at 2k+ on a Vega 64? Nice, I didn't even try to beat Cast, which is ultra-optimized for them. Is it on par with Cast? Of course, I've no Vega to test on.

CL_BUILD_PROGRAM_FAILURE: hard to tell with just that message, but in the next version I'll add more diagnostic details.

The RX550 and 560 are the cards I optimized the most for, along with the HD7800s, just because... I own a lot of them. I've no RX570/580 or better.

>>2. When reporting the share value, its always the same and is equal to a static diff i set to mine at, here is the log
That's normal; don't confuse it with Claymore logs. JCE logs the share value, meaning its value in hashes, not the hashcode itself (which is perfectly useless to know). So a static diff of N will display N every time; that's the expected behavior. And that was the behavior of the old Claymore 9.x; I don't know why Claymore changed it to the useless hashcode...

Separate algo: I plan to separate the algo of the CPU and of every GPU (since some are better on Heavy, some on v7...). The algo on the command line will be the CPU algo and the default for the GPUs. The GPU algo will then be overridable per GPU in the config file.

270X speed: Strange that your card goes so high with Claymore and so low with JCE. I retested my HD7870 with Claymore, configured so it doesn't find only bad shares (which is what it does if I raise -h too high), and I can get 516 with ~10% bad shares. JCE goes to 540 with zero bad shares. No screen plugged into the card. And on my HD7850 2G I get 512 h/s, while Claymore does not go over 460. But on my HD7850 1G, Claymore is still at 455 while JCE drops to a pitiful 355.
newbie
Activity: 81
Merit: 0
Thanks for the link, I just tried it and it is in fact a better kernel for the RX580 8GB.
Where did you get it? Is it possible to get one for the 4GB RX570/580? I only have two 8GB cards.
newbie
Activity: 71
Merit: 0
I get an error when running with GPU:

Compilation of Open CL Kernels failed CL_BUILD_PROGRAM_FAILURE

It works on another system with no issues though. Any ideas?
sr. member
Activity: 1484
Merit: 253
OK. I don't have a 4GB 570/580 card. Here is an .srb kernel bin file for the RX 580 8GB: https://yadi.sk/d/1QFcuG-g3YVNWg
This file needs to be copied to the "cache" subfolder of the SRB miner folder.
Then you need version 1.6.0 of the miner. On 1.6.1, the speed on v7 is lower.
In config:

        { "id" : 0, "intensity" : 116.0, "worksize" : 8, "threads" : 2, "persistent_memory" : false},

It's for a 580 8GB card without a connected monitor (more available video memory). Made and tested on Windows 10 x64 1803 with the AMD 18.6.1 driver.
These settings require about 7.2GB of free video memory.

P.S. The kernel bin from the link won't work on version 1.6.1. doctor changed the cn kernels in version 1.6.1...
newbie
Activity: 81
Merit: 0
Quote
XMR-Stak is not the fastest cryptonight miner. You must try SRB miner. It's faster than XMR-Stak or XMRig.

I tried them all, my friend. For me SRB does show a higher hashrate on the rig, but on the pool the hashrate is lower. Also, SRB is only about 1-3% higher, so it's not worth the hassle considering the fee.
You just didn't know the optimal settings for SRB miner for your cards. It's significantly faster, especially on heavy. On normal v7, SRB 1.6.0 can give max speed with the right .srb kernel file. For heavy, 1.6.1 can give max speed with the right kernel.
On the pool side everything is all right. I checked SRB on a few pools and it was fine. Many forget that pools often take a fee.


Can you share your SRB config/kernel for V7 on RX570/580 4/8GB?
I would love to try it out