Pages:
Author

Topic: further improved phatk_dia kernel for Phoenix + SDK 2.6 - 2012-01-13 - page 16. (Read 106928 times)

member
Activity: 77
Merit: 10
Hi Diapolo!

Great to see you're making progress!
There's one thing that pops into my eye:

you already do:
if(Vals[7].x == -H[7])

why not add the K[60] right into it and remove from upper instruction? Saves a whole instruction and will work 100% ;-)

if(Vals[7].x == -H[7]-K[60])


Lets see if I can find more ..

how would this save an instruction? did you just move -K[60]?
member
Activity: 77
Merit: 10
Vals[7] += K[60] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

to this
Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);


is this a typo or did you leave out the "="?

should be:

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);



do i need to change    if(Vals[7].y == -H[7])?
and    if(Vals[7] == -H[7])?
full member
Activity: 224
Merit: 100
That Worked , i managed to go from 306-308MH to 308-310MH on 5830.
donations forth coming if it remains stable. Thx...
newbie
Activity: 38
Merit: 0
Vals[7] += K[60] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

to this
Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);


is this a typo or did you leave out the "="?

should be:

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);


hero member
Activity: 772
Merit: 500
Next kernel version will, once more, be faster for 69XX and 58XX cards Smiley. Stay tuned!

Dia
full member
Activity: 224
Merit: 100
In Kernel.cl
I changed this:
if(Vals[7].x == -H[7])

to this
if(Vals[7].x == -H[7]-K[60])

Try also changing:
if(Vals[7].y == -H[7])
... to ...
if(Vals[7].y == -H[7]-K[60])

notice Y instead of X. Will be just below the X line


Same error
newbie
Activity: 17
Merit: 0
In Kernel.cl
I changed this:
if(Vals[7].x == -H[7])

to this
if(Vals[7].x == -H[7]-K[60])

Try also changing:
if(Vals[7].y == -H[7])
... to ...
if(Vals[7].y == -H[7]-K[60])

notice Y instead of X. Will be just below the X line
member
Activity: 112
Merit: 10
This most recent update is everything you said it would be!

My 5830 went from 295 MH/s to 305 MH/s with just this update!

Thanks for the great work.
full member
Activity: 224
Merit: 100
Vince i editted the code like you said and I got errors.

Which one of the changes did you try, both?

Tell me about the error you got, just "does not work" helps nobody!

In Kernel.cl
I changed this:
if(Vals[7].x == -H[7])

to this
if(Vals[7].x == -H[7]-K[60])

and changed this

Vals[7] += K[60] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

to this
Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);


Then i changed this
Vals[7] = (Vals[3] = (u)0xb956c25b + D1 + s1(4) + ch(4)) + H1;

to this

Vals[7] = (Vals[3] = D1 + s1(4) + ch(4)) + H1;


as instructed here and yes i added the line to init.py

Quote
Add

self.state2[3] = np.uint32(self.state2[3] + 0xb956c25b);

to __init__.py, line 77 for me, right behind:

self.calculateF(data)

And remove (u)0xb956c25b from kernel.cl


The error is opencl is having unusual behavior or something. it shows MH etc... just when it seems to want to accept a share it spits that out
newbie
Activity: 38
Merit: 0
Vince i editted the code like you said and I got errors.

Which one of the changes did you try, both?

Tell me about the error you got, just "does not work" helps nobody!
full member
Activity: 224
Merit: 100
Vince i editted the code like you said and I got errors.

Diapola what you did works fine. I have 2 5830's testing with now. Expect some Bit.love is all goes well.
full member
Activity: 154
Merit: 100
I hope you can make an optimization for the next OCL version (v2.5 (684.212)) available in beta form already. =)
full member
Activity: 140
Merit: 100

What parameters are you using with phatk?

On my normally aspirated R6970 Lightning I get 419.xMH/s (Ma fix in poclbm):

Code:
-k poclbm DEVICE=0 AGGRESSION=13 BFI_INT WORKSIZE=64 VECTORS FASTLOOP=false

The fastest I've ever gotten with phatk is 403.xMH/s

Code:
-k phatk DEVICE=0 AGGRESSION=13 BFI_INT WORKSIZE=256 VECTORS FASTLOOP=false

Any suggestions?
newbie
Activity: 38
Merit: 0
Another addition waiting to be removed:

 Vals[7] = (Vals[3] = (u)0xb956c25b + D1 + s1(4) + ch(4)) + H1;

-> D1 is only used here, so why not add (u)0xb956c25b during precalculation?  Wink

Add

self.state2[3] = np.uint32(self.state2[3] + 0xb956c25b);

to __init__.py, line 77 for me, right behind:

self.calculateF(data)

And remove (u)0xb956c25b from kernel.cl

This also works 100%, no logic change involved here.

hero member
Activity: 772
Merit: 500
Hi Diapolo!

Great to see you're making progress!
There's one thing that pops into my eye:

you already do:
if(Vals[7].x == -H[7])

why not add the K[60] right into it and remove from upper instruction? Saves a whole instruction and will work 100% ;-)

if(Vals[7].x == -H[7]-K[60])


Lets see if I can find more ..


Good idea and works, can't verify via KernelAnalyzer, but seems like a vector addition less.
Will be included in the next version!

Dia
newbie
Activity: 38
Merit: 0
Hi Diapolo!

Great to see you're making progress!
There's one thing that pops into my eye:

you already do:
if(Vals[7].x == -H[7])

why not add the K[60] right into it and remove from upper instruction? Saves a whole instruction and will work 100% ;-)

if(Vals[7].x == -H[7]-K[60])


Lets see if I can find more ..
hero member
Activity: 772
Merit: 500
These two changes have taken my 5850s from 345MHash/sec to 360MHash/sec. Very nice.

5830s seem like THE card for my modded kernel. Great to hear Smiley.

Dia
hero member
Activity: 772
Merit: 500
I have a feature request

Can you please make it work with intel openCL?

thank you

I know that Intel recently released and OpenCL gold SDK ... the kernel uses standard OpenCL commands and an AMD extension only for BFI_INT / BITALIGN. I see no reason why it should not work. Have you got some error logs for me? You know that for Intel it will only use the CPUs!?

Dia
I have tried.  Poclum will not run at all.  It crashed upon starting.

I took a look at the code.  I think the comments are messy and some not really helpful.  Do you think its an good idea to 'fix' the comment?  btw comment why you type cast the hex value so other developer wont think its unnecessary and remove it.

Hex-values are type-casted so that the kernel works with AMD 2.1 SDK, which throws an error, if NOT type-casted.
I don't understand what you want to tell me with the "comments are messy" part.

If you get an error log with Intel SDK please post it here, so I can have a look at it.

Dia
hero member
Activity: 588
Merit: 500
These two changes have taken my 5850s from 345MHash/sec to 360MHash/sec. Very nice.
member
Activity: 77
Merit: 10
I have a feature request

Can you please make it work with intel openCL?

thank you

I know that Intel recently released and OpenCL gold SDK ... the kernel uses standard OpenCL commands and an AMD extension only for BFI_INT / BITALIGN. I see no reason why it should not work. Have you got some error logs for me? You know that for Intel it will only use the CPUs!?

Dia
I have tried.  Poclum will not run at all.  It crashed upon starting.

I took a look at the code.  I think the comments are messy and some not really helpful.  Do you think its an good idea to 'fix' the comment?  btw comment why you type cast the hex value so other developer wont think its unnecessary and remove it.
Pages:
Jump to: