
Topic: [ANN] sgminer v5 - optimized X11/X13/NeoScrypt/Lyra2RE/etc. kernel-switch miner - page 119. (Read 877859 times)

member
Activity: 98
Merit: 10
 Here are all the bins getting produced:

  I get this from my output.log:

Code:
[14:07:01] Building binary neoscryptHawaiigw64l4ku0big7hs.bin
[14:07:01] Error -11: Building Program (clBuildProgram)
[14:07:01] "C:\Users\ANIMAL~1\AppData\Local\Temp\OCL4772T27.cl", line 368: warning:
          variable "t" was declared but never referenced
   uint4 t, st[4];
        ^

"C:\Users\ANIMAL~1\AppData\Local\Temp\OCL4772T27.cl", line 495: error:
          identifier "MAX_GLOBAL_THREADS" is undefined
   __global ulong16 *V = (__global ulong16 *)(padcache + (0x8000 * (get_global_id(0) % MAX_GLOBAL_THREADS)));
                                                                                      ^

"C:\Users\ANIMAL~1\AppData\Local\Temp\OCL4772T27.cl", line 513: warning:
          argument of type "__global ulong16 *" is incompatible with parameter
          of type "__global uint16 *"
   SMix(X, V, flag);
          ^

1 error detected in the compilation of "C:\Users\ANIMAL~1\AppData\Local\Temp\OCL4772T27.cl".

Frontend phase failed compilation.

Any ideas?  Could CGWatcher be interfering somehow when bins are made?

You have the wrong marucoin-mod.cl.
Find the right one and replace the copy in ./kernels.
Fix: https://bitcointalk.org/index.php?topic=854257.240
full member
Activity: 347
Merit: 100
Thanks to Wolf0 for your .bin files.

Increase from 4.3 to 6.6 MH/s
X11, R9 280X

But profit is the same as it was at 4.3 MH/s;
maybe the difficulty rose like crazy.

Hopefully we get a special .bin that doesn't cause a crazy difficulty rise  Wink
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
1.66 Mh/s R9 290@1000/1500

EDIT: still far from profitable
not enough people sold their gpu's  Grin

Sadly, you're so right :-D
member
Activity: 81
Merit: 1002
It was only the wind.
Still, the X2 works like a charm. Currently it runs at stock clocks, 0.9V, 56C, about 150W/core and delivers 5.4MH/s/core, or about 35 kH/J. Are you saying you're doing 60+ kH/J?

Here are some stats, you do the math:

Mithra, running 2x270X gets 3.33MH/s & 3.45MH/s X11 for a total of 6.78MH/s, clocked at 875/1400 & 900/1500, both undervolted to 950mV, and she pulls 165W at the wall.
Screenshot (NSFW): https://ottrbutt.com/miner/x11localrigwolf-lowpower-11222014.png
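
For anyone who wants to "do the math" on the figures quoted above, here is a minimal sketch. The helper name `khash_per_joule` is made up for illustration. Note that the two numbers are not strictly comparable: the X2 figure is per core, while the Mithra figure is measured at the wall for the whole rig.

```python
def khash_per_joule(hashrate_mhs, watts):
    """Efficiency in kH/J: MH/s converted to kH/s, divided by W (1 W = 1 J/s)."""
    return hashrate_mhs * 1000.0 / watts

# X2, per core: 5.4 MH/s at about 150 W (figures from the quoted post)
x2_per_core = khash_per_joule(5.4, 150)

# Mithra, whole rig at the wall: 3.33 + 3.45 MH/s at 165 W
mithra = khash_per_joule(3.33 + 3.45, 165)

print(f"X2 per core: {x2_per_core:.1f} kH/J")
print(f"Mithra rig:  {mithra:.1f} kH/J")
```

With these inputs the X2 comes out at 36 kH/J per core (close to the "35 kH/J" quoted) and Mithra at roughly 41 kH/J at the wall.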
sr. member
Activity: 384
Merit: 250
I know a few people can estimate the MH/s from their GPU model, but I am unsure how to determine it for the R7 240 shown in the following image!



It'd be great to know the MH/s from the image above!
Any feedback?

About optimized X11 etc.:
https://bitcointalksearch.org/topic/annx11x13-x11-darkcoinx13-marucoin-miner-based-on-sph-sgminer-623409

Are the settings the same as in v5?
member
Activity: 81
Merit: 1002
It was only the wind.
Yes, I understand. BTW, is there an easier way to pass a parameter to a "searchX" function other than through global memory? Wouldn't local memory work as long as all threads are in the same local group/compute unit?

Still, I wonder: wouldn't it be better for a single thread to compute the entire hash (all 11 functions) rather than having multiple threads evaluate different functions of X11, from a message/argument passing standpoint? (With the former approach you would not need any.)





No, you can't set local memory like that - it is local to a workgroup.

Dear fucking god no - you don't understand basic GPU architecture. Without getting too technical, they cannot STAND large chunks of work - you MUST break it down into small chunks that can be parallelized. As a matter of fact, it's not good enough yet - 8 threads should be used per SIMD hash instead of one, because SIMD is too goddamned big to fit in the code cache, and it spills EVERYTHING to global memory. You want to figure out why X11 relies on memory when it shouldn't, look at SIMD. The access to get the work is seriously nothing at all.
legendary
Activity: 1400
Merit: 1050
1.66 Mh/s R9 290@1000/1500

EDIT: still far from profitable
not enough people sold their gpu's  Grin
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
1.66 Mh/s R9 290@1000/1500

EDIT: still far from profitable
legendary
Activity: 1400
Merit: 1050
ok current longer test on verter pools: -I 19 -g2 -w64 290x at 1030/1250: 1.320MH/s (the average still oscillates a bit between 1.318 and 1.322)
(using my latest on  https://github.com/djm34/sgminer )

Getting 1.5 per on badman74's build, 1040/1500, no hangs for past 6 hours.

algo still hasn't been added to br or mrr so nh is the only place rentable right now, even with f9=# it doesn't switch on port 4342 yet
not bad after all  Grin (I am not too ridiculous after all ...), they both use the same kernel... (I wish I was able to get my card to 1500MHz...)

member
Activity: 81
Merit: 1002
It was only the wind.
Same as the others unfortunately, I guess only Wolf and few others enjoy better kernels at the moment.

I see. Well I'll look into kernel files in the next few days and will post the results if there are any Smiley ...
... I see where the main problem is: darkcoin_mod.cl consists of 11 functions, one for each algo in X11 (search0-10), all of which end up writing their results to global memory. This is slow, and on top of that all threads compete for writes to global memory.

Interestingly, the original darkcoin.cl does not do that and it is still slower. Perhaps other optimizations in darkcoin_mod make more of an impact.

No, that IS an optimization - you don't split the kernels up and you fuck occupancy so hard. It's better to take the hit of scheduling kernel launches than to do some dumb shit like that.
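
To make the two layouts being argued about concrete, here is a rough sketch in plain Python. It assumes three `hashlib` digests as stand-ins for the X11 stages (they are NOT the real X11 functions), and the names `chained` and `monolithic` are made up for illustration. It only shows that both structures produce the same output; it says nothing about occupancy or speed, which is the actual point of disagreement and depends on the GPU.

```python
import hashlib

# Stand-in stages, assumed for illustration only (not the real X11 chain).
STAGES = [hashlib.sha256, hashlib.sha512, hashlib.blake2b]

def chained(inputs):
    """Split layout: one 'kernel launch' per stage.
    Each launch reads the shared buffer and writes its results back to it."""
    buffer = list(inputs)              # plays the role of global memory
    for stage in STAGES:               # one launch per hash function
        buffer = [stage(item).digest() for item in buffer]
    return buffer

def monolithic(inputs):
    """Single-kernel layout: each 'thread' runs every stage on its own item,
    so nothing is handed over between stages through a shared buffer."""
    results = []
    for item in inputs:
        state = item
        for stage in STAGES:
            state = stage(state).digest()
        results.append(state)
    return results

headers = [bytes([i]) * 80 for i in range(4)]   # dummy 80-byte inputs
same = chained(headers) == monolithic(headers)
print("identical output:", same)
```

In the split layout each stage is a small piece of code that many work-items can run at once, at the cost of a round trip through the buffer between stages; the monolithic layout avoids the hand-over but makes each work-item's program much larger.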
sr. member
Activity: 547
Merit: 250
ok current longer test on verter pools: -I 19 -g2 -w64 290x at 1030/1250: 1.320MH/s (the average still oscillates a bit between 1.318 and 1.322)
(using my latest on  https://github.com/djm34/sgminer )

Getting 1.5 per on badman74's build, 1040/1500, no hangs for past 6 hours.

algo still hasn't been added to br or mrr so nh is the only place rentable right now, even with f9=# it doesn't switch on port 4342 yet
legendary
Activity: 1400
Merit: 1050
ok current longer test on verter pools: -I 19 -g2 -w64 290x at 1030/1250: 1.320MH/s (the average still oscillates a bit between 1.318 and 1.322)
(using my latest on  https://github.com/djm34/sgminer )

legendary
Activity: 2716
Merit: 1094
Black Belt Developer
1.37 Mh/s R9 290@1000/1250

That is really awesome. Mind letting us know the power consumption too? Is it almost the same as mining scrypt?

no it's less, more like X11.
legendary
Activity: 1400
Merit: 1050
1.37 Mh/s R9 290@1000/1250

That is really awesome. Mind letting us know the power consumption too? Is it almost the same as mining scrypt?
hopefully not  Grin
member
Activity: 81
Merit: 1002
It was only the wind.
Uh... dude, I'm powering two off of an 850W. Also, memory clocks are in the notepad, the best one is at 1700.

I see we've come down a bit since the days of scrypt(N) mining Smiley Smiley. Today I measured the power consumption at the wall for the X2: it reads approx. 380W at stock 1V, and the card yields 2x 5.2MH/s. (The 280x (dualX) undervolted uses about 125W, which is phenomenal since it was using about 230W mining scrypt(N).)

Will try to undervolt a bit... indeed, at 0.9V power usage drops to 300W.

But what I don't get is how you can get almost double the hash rate with the same power consumption. Whatever hash rate improvements you get by altering the kernel always translate into more power usage. Or do they?

Also, I noticed that the X2 card slows down a bit after mining for a day or so, and this has nothing to do with temperature, which is in the low 60s. Powerplay is as usual set to +20. Should I set it to +50?

Code that uses only half the damned GPU is horribly inefficient. The GPU draws power just because it's on - now, yes, unless you reduce instructions, you will get a higher power usage, but it is always worth it. At 50% hashrate improvement, I had 17% more power usage than stock.
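
A quick worked version of that last claim, as a sketch (the function name `efficiency_gain` is made up for illustration): if hashrate rises by 50% while power rises by 17%, hashes per joule improve by the ratio of the two factors.

```python
def efficiency_gain(hashrate_gain, power_gain):
    """Relative change in hashes per joule, given fractional gains
    in hashrate and in power draw (0.50 means +50%)."""
    return (1.0 + hashrate_gain) / (1.0 + power_gain) - 1.0

gain = efficiency_gain(0.50, 0.17)
print(f"hashes per joule: {gain:+.1%}")
```

So a kernel that draws more power can still be the more efficient one, as long as the hashrate factor grows faster than the power factor; here 1.50 / 1.17 is about 1.28.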
sr. member
Activity: 481
Merit: 250
1.37 Mh/s R9 290@1000/1250

That is really awesome. Mind letting us know the power consumption too? Is it almost the same as mining scrypt?
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
1.37 Mh/s R9 290@1000/1250
legendary
Activity: 1400
Merit: 1050
BTW I got another 10%, programming on the phone.

I tuned for 290X - you hit 1.6MH/s yet?

I'm on 290@1000/1250, about 1.1 Mh/s.
But income is so low it's not worth working on it.
actually I tried to use your groestl kernel yesterday (at least one part... wasn't in the mood to unroll everything  Grin), but I didn't see any difference; actually, it was a bit slower for me.

Wolf0, can you post your speed at core/mem 1030/1250? (I can't get my card to work at 1500MHz.)
My current average speed with my r9 290x and my latest kernel is 1.3MH/s


1.35 or so, but I don't have your blake fix in.
ok so it isn't that bad  Grin
You won't gain much with the blake fix; blake takes practically no time compared to the rest. It is a lot of pain to put in, and the sgminer5 dev admin doesn't like it  Grin
(I need to change a few things before they agree to update.)
legendary
Activity: 1400
Merit: 1050
Hey djm34, there's something wrong with the compiled binary that you posted: https://bitcointalksearch.org/topic/m.9817141

When hashing lyra2RE (vtc), the hashrate shown in the miner is indeed 5-10% faster than with the metalicjames one, but the pool only records half the hashrate.
All stats are normal: low rejected shares, and no hardware errors shown in the miner.
I tested it on two different pools, coinotron and hashlink.eu; with your Windows binary, only half the hashrate is recorded on the web.
Somehow only half the shares are sent by the miner or accepted by the pool.
And it's not just an estimate: the coins received are halved for the same time period.
When I switch back to the metalicjames version, the hashrate goes back to normal.

Where has that half of the hashrate gone?
Could you please recheck the binaries? Thanks.
It must be related to the difficulty adjustment... I will re-upload it. In the meantime, there is also a difficulty multiplier option in sgminer (it shows as deprecated in the help, but it still works).
Also, I think that several pools are still tuning their hashrate reporting...  Grin
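
Why a difficulty mismatch could show up as exactly half the hashrate can be sketched as follows. This is an illustration under stated assumptions, not how any particular pool or sgminer build works: the function names are made up, and `hashes_per_diff1` (the expected number of hashes per difficulty-1 share) is assumed to be 2**32 here, while the real value depends on the algorithm and the pool's conventions.

```python
def expected_shares(true_hashrate, submit_difficulty, seconds,
                    hashes_per_diff1=2**32):
    """Average number of shares a miner finds in `seconds` when it only
    submits results that meet `submit_difficulty`."""
    return true_hashrate * seconds / (submit_difficulty * hashes_per_diff1)

def pool_estimated_hashrate(accepted_shares, credited_difficulty, seconds,
                            hashes_per_diff1=2**32):
    """Hashrate a pool infers from accepted shares, crediting each share
    at `credited_difficulty`."""
    return accepted_shares * credited_difficulty * hashes_per_diff1 / seconds

true_rate = 1.3e6      # H/s, hypothetical miner
pool_diff = 0.01       # difficulty the pool assigns (hypothetical)
window = 3600          # seconds

# Miner and pool agree on the difficulty: the estimate matches.
matched = pool_estimated_hashrate(
    expected_shares(true_rate, pool_diff, window), pool_diff, window)

# Miner filters at twice the pool's difficulty: half the shares arrive,
# each still credited at pool_diff, so the pool sees half the hashrate.
mismatched = pool_estimated_hashrate(
    expected_shares(true_rate, 2 * pool_diff, window), pool_diff, window)

print(matched / true_rate, mismatched / true_rate)
```

In this model the miner's own display is unaffected, rejected shares stay low, and payouts drop along with the credited work, which is consistent with the symptoms reported above.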
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
actually I tried to use your groestl kernel yesterday (at least one part... wasn't in the mood to unroll everything  Grin), but I didn't see any difference; actually, it was a bit slower for me.

It is heavily tuned for groestlcoin/diamond.
To use it in x11, lyra2re or another multi-algo, it needs to be tuned differently.