
Topic: [ANN][GRS][DMD][DGB] Pallas optimized groestl opencl kernels - page 9. (Read 61229 times)

legendary
Activity: 2716
Merit: 1094
Black Belt Developer
experimental new bin for Hawaii (r9 290/290X) only:

https://dl.dropboxusercontent.com/u/40353042/Diamond/diamondHawaiiw128l8.bin

use worksize 128.

this is my opencl kernel, tweaked for speed and compatibility.
please report hashrates and show your support!
member
Activity: 143
Merit: 10
Is there still interest in this? Utahjohn and.... :-D
I could dedicate some time to finishing the open-source kernel v2 if it's worth it.

Oh, I'm totally interested :)
Thanks for your great work btw
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
Is there still interest in this? Utahjohn and.... :-D
I could dedicate some time to finishing the open-source kernel v2 if it's worth it.
full member
Activity: 151
Merit: 100
Good day! Could you tell me, is the new .cl file in the first post? The one that gives 30 Mh on a 290 card? Thanks
member
Activity: 89
Merit: 10
Hi, I need a miner for an Nvidia 560 Ti DS. It does about 3600 kh/s and works 100% with Groestl, but with --algo=dmd-gr: boom... boom... rejects. I don't understand??

Hello, anyone home??? This is an OpenCL kernel for AMD GPUs, not Nvidia.
hero member
Activity: 597
Merit: 500
Hi, I need a miner for an Nvidia 560 Ti DS. It does about 3600 kh/s and works 100% with Groestl, but with --algo=dmd-gr: boom... boom... rejects. I don't understand??
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
Oh, that's a pity you moved away...
Not sure I like the idea of learning another asm, even if it's very cool!
I understand the first and last round optimizations are boring to do, but could you please, before leaving us, fix the problem with multiple cards? Where card 0 doesn't provide any work unit while card 1 works fine? Thanks!

I built my bin as posted using Hetpass.
Two machines.
Two 7950s reference cards in one.
And a Dualx  7950 in the other.
All Sapphires. I use Sgminer 4.1 the original if you will.
Not sgminer 5.1. Too many bells and whistles.
Hetpass said these cards should do so many hashes, and it is correct.
I run two cards in one machine without any problems.
.......

On my machine, card 0 hashed fine but no work submitted (WU=0).
Card 1 had normal WU.
I'm using 4.1 as well.
Never had this problem with any kernel before.
hero member
Activity: 630
Merit: 500
@realhet
can u make the right hand pane detachable in hetpas?  I like to use multiple monitors and have a full screen for IDE ...

Also would really appreciate if you could do the finishing touches on first/last pass as neither of us are up to speed on asm yet and could be quite a while ...
hero member
Activity: 630
Merit: 500
This one is going to take a lot of learning for me (I'm 53yo LOL learning takes more time for me hehehe) ... I'm going to copy asm src to flash drive and print at apt complex office so it's a bit easier for me to follow through.  Can u send me your latest greatest fastest OCL and I'll get that printed too ...

Funny u can only get 1 card to run, I have 280x as card 0, 7950 as card 1 and they both run kernel fine ...
Just an afterthought: I have not tried a single instance of sgminer controlling both cards. I run one instance of sgminer for each card individually, so I do not know if this problem affects me ...
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
Oh, that's a pity you moved away...
Not sure I like the idea of learning another asm, even if it's very cool!
I understand the first and last round optimizations are boring to do, but could you please, before leaving us, fix the problem with multiple cards? Where card 0 doesn't provide any work unit while card 1 works fine? Thanks!
newbie
Activity: 32
Merit: 0
Hi,

"- asm is cooler than ocl ;-)"  Haha, yes!
And ocl needs black magic to optimize, asm just does what you tell it to.

"Nice, even gives opcodes, I bet this is reference realhet used to build hetpas Smiley"
I remember, I had a job at that time, when the 7970 came out. I got one in December 2011. There was no manual for more than half a year, but the disassembler worked well. So I decoded the instruction set using the disassembler. I even found some undocumented ones that way. It was fun.
But for some unknown reason this approach has been broken for the last 1-2 years: the disassembler just does nothing when the .elf is a binary-only .elf (which is the case when you use my assembler).

Some tips:
- Use Ctrl+Space in the IDE! It's like Intellisense/codeInsight. (Just start typing v_something!)
- Press F1 on any instruction, it will show a mini help.
- You can DD anything that isn't implemented. (e.g. "dd $12345678, 0x74732921, 1234" emits 3 uints into the code)
- Disassembling small OpenCL programs is a good source of knowledge. It is also the 'documentation' on how a specific set of kernel parameters is passed.
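
To illustrate the DD tip above, here is a rough Python sketch of what such a directive amounts to: each literal becomes one raw 32-bit word in the code stream. This is not HetPas code. The helper names `parse_literal` and `emit_dd` are made up for illustration, and the sketch assumes that `$` is a Pascal-style hex prefix and that the words are stored little-endian.

```python
import struct

def parse_literal(token):
    """Parse one DD operand. Assumption: '$' is a Pascal-style hex prefix."""
    token = token.strip()
    if token.startswith("$"):
        return int(token[1:], 16)
    return int(token, 0)  # accepts 0x... hex and plain decimal

def emit_dd(line):
    """Turn a 'dd a, b, c' line into raw bytes, one 32-bit word per operand.

    Assumption: words are unsigned and little-endian.
    """
    body = line.strip()
    if not body.lower().startswith("dd "):
        raise ValueError("not a dd directive")
    values = [parse_literal(t) for t in body[3:].split(",")]
    return struct.pack("<%dI" % len(values), *values)

# The example from the post: three uints, so 12 bytes in total.
blob = emit_dd("dd $12345678, 0x74732921, 1234")
print(len(blob), blob.hex())
```

The same idea covers the hand-encoded instructions mentioned elsewhere in the thread: look up the encoding in the ISA manual, build the 32-bit words, and drop them in with DD.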

"Looks like realhet has moved on to another project ..."
Yea, I have to continue my job soon, as my free time runs out. I'm only planning to experiment with a bit of 2D rope physics. But whatever, now I'm in a Red Alert 2 'project', haha :D
hero member
Activity: 630
Merit: 500
@pallas
will you be able to do your best first/last pass implementation in ASM?
Looks like realhet has moved on to another project ...
hero member
Activity: 630
Merit: 500
Realhet has given us an impressive tool to work with ... time to learn ASM coding ...
I don't really have a grasp on parallel processing and all the nuances of register usage on GPU ... looks like the ball is in your court Pallas.  Hopefully Realhet returns to continue on this project.

Found this, looking thru it now ...
http://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf

Wish I had a printed book of this reference ... wonder if one could order it online, I have no printer ...

This is what I was looking for :) Wow, a lot to grasp :)

Nice, even gives opcodes, I bet this is the reference realhet used to build hetpas :)

In his dev thread he mentions u can inline an instruction opcode for any that are not supported by the hetpas assembler. The opcode tables for generating instructions manually are in the ref manual :)
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
Utahjohn: true!
Realhet: still there?
hero member
Activity: 630
Merit: 500
Quote
* "Guys! We do not need more optimization!"
I've thought about this too. But I think if everyone uses better kernels, then everyone will use the same power to get the same profit, as difficulty will get harder, but mining will require less power.
But what if not everyone uses the faster kernel? I think my compiler/IDE is helping with this a lot, as it is kind of user-unfriendly.
Just as I thought also, there has not been a widespread migration to the new kernel :) Difficulty for newbs to set up properly, combined with falling BTC value, makes direct mining less attractive to the "Dumpers" ... Diff is back to a reasonable range again as miners drop out of the game ...
This is good for those of us who stick with it :) BTC will recover eventually, I am a long-term DMD holder anyway :)
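
The argument in the quote can be put into numbers with a small Python sketch. All hashrates here are hypothetical, and the model is deliberately crude: it assumes your expected share of block rewards is simply your hashrate divided by the network total, which is roughly what difficulty retargeting gives you over time.

```python
def reward_share(my_rate, others_rate):
    """Expected fraction of block rewards, assuming share = my hashrate / network hashrate."""
    return my_rate / (my_rate + others_rate)

# Hypothetical numbers in MH/s, not taken from the thread.
MY_RATE = 20.0
OTHERS_RATE = 980.0
SPEEDUP = 1.5  # hypothetical gain from a faster kernel

baseline = reward_share(MY_RATE, OTHERS_RATE)
everyone_upgrades = reward_share(MY_RATE * SPEEDUP, OTHERS_RATE * SPEEDUP)
only_i_upgrade = reward_share(MY_RATE * SPEEDUP, OTHERS_RATE)

print(f"baseline share:        {baseline:.4f}")
print(f"everyone upgrades:     {everyone_upgrades:.4f}")
print(f"only I upgrade:        {only_i_upgrade:.4f}")
```

If everyone upgrades, the share is unchanged; the gain only shows up while part of the network is still on the slower kernel, which is the point being made about slow adoption.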
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
For those of you with 290/290x cards that are locked:
http://www.overclock.net/t/1443242/the-r9-290-290x-unlock-thread

That thread is about transforming a 290 into 290x by means of firmware flashing.
But some are hardware locked, like mine... :-(
Still may be useful for someone ... 7% gain from 290 to 290x ... :)

I have 30 on 290 and 33 on 290x, same clock, so it's 10% :-)
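
A quick Python check of the 10% figure, assuming the "30" and "33" are hashrates in MH/s. The stream processor counts (2560 for the R9 290, 2816 for the R9 290X) are not from this thread; they are the published specs for those cards, added here only for comparison.

```python
# Hashrates from the post; the MH/s unit is an assumption.
rate_290 = 30.0
rate_290x = 33.0
hash_gain = (rate_290x - rate_290) / rate_290

# Published stream processor counts, not taken from the thread.
shaders_290 = 2560
shaders_290x = 2816
shader_gain = (shaders_290x - shaders_290) / shaders_290

print(f"hashrate gain: {hash_gain:.1%}")
print(f"shader gain:   {shader_gain:.1%}")
```

Both ratios come out the same, which fits a kernel that scales with the number of compute units at equal clock.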
member
Activity: 81
Merit: 1002
It was only the wind.
Wow I got Best share: 702K
Was it a block??? :-D

Now let's get serious: I finally have a little time to write some considerations on the ocl and asm kernels.
I believe we should pursue the asm path for a number of reasons:

- currently the OCL kernel is a little faster on Hawaii but not on any other cards, and I don't think it can be improved in this respect
- the OCL kernel has been tweaked and optimized for months, while the asm one is new, so there is probably much more room for improvement
- just by applying the first and last round optimization the asm kernel will probably be faster on Hawaii as well; I'm sure that Realhet will find other asm tricks to apply
- with all these Catalyst version problems, the best way to share kernels for people to mine with is as bin files, which makes the asm and OCL versions equivalent (for distribution purposes); better yet would be a miner with all the bin files bundled (takes time)
- asm is cooler than ocl ;-)

what do you guys think?
I'm all for sticking with the asm route ... u need to feed your ocl tweaks to realhet and let's maximize the asm kernel.
As I already suggested to realhet, a "cross-compile" to generate bins for all the archs we support is possible; he needs our bins created on each arch to dig out the minor diffs between bins.

ASM route seems better.
hero member
Activity: 630
Merit: 500
For those of you with 290/290x cards that are locked:
http://www.overclock.net/t/1443242/the-r9-290-290x-unlock-thread

That thread is about transforming a 290 into 290x by means of firmware flashing.
But some are hardware locked, like mine... :-(
Still may be useful for someone ... 7% gain from 290 to 290x ... :)
legendary
Activity: 2716
Merit: 1094
Black Belt Developer
For those of you with 290/290x cards that are locked:
http://www.overclock.net/t/1443242/the-r9-290-290x-unlock-thread

That thread is about transforming a 290 into 290x by means of firmware flashing.
But some are hardware locked, like mine... :-(
hero member
Activity: 732
Merit: 500
Trust me 14.7RC3 is best.
Then u are unlucky enough to have a "Locked" card.
Only recourse for u is vbios modding your card if u want to lower memclock to 150 and be able to do higher overclock on gpu.
Do the research on vbios modding ... there are pointers in this thread by myself. I hate repeating myself a hundred times, that's why the info is in the thread.
https://bitcointalksearch.org/topic/m.9043545

BTW I can still clock mem at 1625 via sgminer setting when I mine X11 or Neoscrypt with it ... just have to set it manual for them ...

Welcome to Extreme Diamond Mining LOL

I have 3 different 280x cards (dual, vapor, toxic). Strange to see that all are locked. What they have in common is that all are Sapphire. Will try this vbios modding one of these days.
Thank you for the support. I will leave the rigs for now at 22.3mh (1020/1500) / 23.4mh (1070/1550) / 24mh (1100/1600). A little tired after the last 2 days of trying to configure everything :)) maybe I am wrong somewhere.
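
The three settings above can be compared with a few lines of Python. This assumes the numbers in parentheses are core/memory clocks in MHz and the hashrates are in MH/s, which the post does not state explicitly.

```python
# (hashrate, core clock, memory clock) as read from the post.
# Assumption: MH/s for the hashrate, MHz for both clocks.
rigs = [
    (22.3, 1020, 1500),
    (23.4, 1070, 1550),
    (24.0, 1100, 1600),
]

# Hashrate per MHz of core clock, in kH/s, for each card.
per_mhz = [mh / core * 1000.0 for mh, core, _mem in rigs]

for (mh, core, mem), khs in zip(rigs, per_mhz):
    print(f"{core}/{mem}: {mh} MH/s -> {khs:.2f} kH/s per core MHz")

spread = max(per_mhz) / min(per_mhz) - 1.0
print(f"spread between cards: {spread:.2%}")
```

The per-MHz figure is nearly identical across the three cards, so the hashrate appears to track core clock almost linearly here. That is consistent with the earlier advice in the thread to drop the memory clock and push the GPU clock instead.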