
Topic: Modified Kernel for Phoenix 1.5 - page 8. (Read 96725 times)

newbie
Activity: 52
Merit: 0
August 02, 2011, 12:34:57 PM
@Phat:

I don't understand how base can always be a uint kernel parameter now that (uint2)(0, 1) or (uint4)(0, 1, 2, 3) is added to it via the init file. If I try this with my mod it just crashes Phoenix; if I use const u base instead of const uint base, it seems to work (because u reflects the correct variable type: uint, uint2 or uint4). Do you have any idea why?

Thanks,
Dia

I'm not sure I understand you... Depending on whether the number of nonces per thread (VECTORS) is 1, 2, or 4, the kernel compiles with base being either uint, uint2 or uint4.  The init file packs either 1, 2 or 4 uints into each base entry, so the init file always produces the same size variable as the kernel needs.  So, in short, both the base[i] variable being passed to the kernel and the "u base" value in the kernel can be either 1, 2 or 4 uints.  Does that answer your question?
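
A rough sketch of the packing described above, not the actual phatk __init__.py code. pack_base is a hypothetical helper, and the idea that the lanes hold consecutive offsets (base, base+1, ...) is an assumption taken from Dia's description of (uint2)(0, 1) or (uint4)(0, 1, 2, 3) being added. The only point is that the host-side entry is 4, 8 or 16 bytes wide, which is what a kernel argument typed uint, uint2 or uint4 expects.

```python
import struct

def pack_base(base, vectors):
    """Pack one base entry as `vectors` 32-bit unsigned ints (hypothetical helper)."""
    if vectors not in (1, 2, 4):
        raise ValueError("vectors must be 1, 2 or 4")
    # Assumption: lane i carries base + i, wrapped to 32 bits.
    lanes = [(base + i) & 0xFFFFFFFF for i in range(vectors)]
    # '=' selects standard sizes, so every 'I' is exactly 4 bytes.
    return struct.pack('=%dI' % vectors, *lanes)

for width in (1, 2, 4):
    entry = pack_base(100, width)
    # Print the vector width, the byte length and the unpacked lanes.
    print(width, len(entry), struct.unpack('=%dI' % width, entry))
```

If the kernel argument were declared as plain uint while the host passed an 8- or 16-byte entry, the sizes would no longer agree, which may be what Dia ran into.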
full member
Activity: 182
Merit: 100
August 02, 2011, 10:52:39 AM
I have a 3 MH/s average improvement over Diapolo's last kernel update (393 -> 396)

Setup is as follows:

Reference 5850, 1.100v 920 core / 350 mem.  11.4 preview / SDK 2.4.  Latest GUIMiner / phoenix 1.50


Going to run it for a day to check its stability and report back if anything arises.
hero member
Activity: 504
Merit: 502
August 02, 2011, 10:21:02 AM
using poclbm fork with phatk2.1 and it's the fastest kernel so far
I tried with 2.4 opencl and it was slower, so I went back to 2.1 which is the fastest on my card (hd 5750)

Yeah, I must say, the poclbm fork with phatk2 outperformed phatk2 on phoenix 1.5 (and up till now all phatk mods performed better on phoenix 1.5 for me)

quite interesting.

ps. iopq, can you post the changes made to run phatk2.1 on the poclbm mod by fpgaminer? I assume you are using that. Also, what arg is added to use vectors4? I've replaced phatk2.cl with phatk2.1 cl but I get ~11 MH/s less with phatk2.1, so I am wondering if there are other changes required. I am using SDK 2.4

I'm using that, just replaced the phatk2 kernel with phatk2.1 and that's it
vectors4 should be slower, why would you want to use it? I use -v only

Just wanted to test vectors4 with default memory, not high priority.

Still phatk2.1 is much slower than phatk2 for me as I said ~11mh per card, ati hd5850 , I wonder why o_0
hero member
Activity: 658
Merit: 500
August 02, 2011, 10:14:14 AM
using poclbm fork with phatk2.1 and it's the fastest kernel so far
I tried with 2.4 opencl and it was slower, so I went back to 2.1 which is the fastest on my card (hd 5750)

Yeah, I must say, the poclbm fork with phatk2 outperformed phatk2 on phoenix 1.5 (and up till now all phatk mods performed better on phoenix 1.5 for me)

quite interesting.

ps. iopq, can you post the changes made to run phatk2.1 on the poclbm mod by fpgaminer? I assume you are using that. Also, what arg is added to use vectors4? I've replaced phatk2.cl with phatk2.1 cl but I get ~11 MH/s less with phatk2.1, so I am wondering if there are other changes required. I am using SDK 2.4

I'm using that, just replaced the phatk2 kernel with phatk2.1 and that's it
vectors4 should be slower, why would you want to use it? I use -v only
hero member
Activity: 772
Merit: 500
August 02, 2011, 10:10:38 AM
@Phat:

I don't understand how base can always be a uint kernel parameter now that (uint2)(0, 1) or (uint4)(0, 1, 2, 3) is added to it via the init file. If I try this with my mod it just crashes Phoenix; if I use const u base instead of const uint base, it seems to work (because u reflects the correct variable type: uint, uint2 or uint4). Do you have any idea why?

Thanks,
Dia
newbie
Activity: 26
Merit: 0
August 02, 2011, 09:43:35 AM
I tried VECTORS4 on my 6870, since I can't underclock the memory (it's at 1050).

Results:
VECTORS
WS 64: 295 MH/s
WS 128: 299 MH/s
WS 256: 292 MH/s

VECTORS4
WS 64: 278 MH/s
WS 128: 258 MH/s
WS 256: 230 MH/s

So VECTORS4 doesn't give me any boost. But thanks for putting it in. New functionality is always a plus.
hero member
Activity: 504
Merit: 502
August 02, 2011, 09:41:19 AM
using poclbm fork with phatk2.1 and it's the fastest kernel so far
I tried with 2.4 opencl and it was slower, so I went back to 2.1 which is the fastest on my card (hd 5750)

Yeah, I must say, the poclbm fork with phatk2 outperformed phatk2 on phoenix 1.5 (and up till now all phatk mods performed better on phoenix 1.5 for me)

quite interesting.

ps. iopq, can you post the changes made to run phatk2.1 on the poclbm mod by fpgaminer? I assume you are using that. Also, what arg is added to use vectors4? I've replaced phatk2.cl with phatk2.1 cl but I get ~11 MH/s less with phatk2.1, so I am wondering if there are other changes required. I am using SDK 2.4
hero member
Activity: 658
Merit: 500
August 02, 2011, 09:31:02 AM
using poclbm fork with phatk2.1 and it's the fastest kernel so far
I tried with 2.4 opencl and it was slower, so I went back to 2.1 which is the fastest on my card (hd 5750)
newbie
Activity: 26
Merit: 0
August 02, 2011, 09:23:03 AM
All that stuff I just said.

HA! Got it fixed. Anyone who's having the error I just had, go through __init__.py, and every time there's an 'unpack' or a 'pack' statement that gets passed some number of 'L's (they will be 2, 4, or 8 'L's long), just add an '=' to the beginning. So 'LLLL' becomes '=LLLL'.

If you look up the documentation on how struct (which is where pack and unpack come from) parses its arguments, found here, it's system dependent by default. But if you add the '=', it forces the size characters (like the 'L's and 'I's and such) to be standard size.
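
For anyone who wants to see the difference on their own box, here is a quick check with the standard struct module. The native sizes (no prefix) depend on the platform, so no expected values are given for them; with the '=' prefix, 'L' and 'I' are both 4 bytes everywhere. The standard_sizes name is just for illustration.

```python
import struct

# Without a prefix, struct uses the platform's native sizes and alignment.
# With '=', it uses standard sizes: 'L' and 'I' are both 4 bytes.
for fmt in ('LLLL', '=LLLL', 'IIII', '=IIII'):
    print(fmt, struct.calcsize(fmt))

# The standard-size results are the same on every platform.
standard_sizes = {fmt: struct.calcsize(fmt) for fmt in ('=LLLL', '=LLLLLLLL')}
print(standard_sizes)
```

If the unprefixed 'LLLL' line prints 32 on your machine, your native 'L' is 8 bytes, and that is the case where the '=' prefix matters.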

(Also, I had to uncomment the self.commandQueue.finish() statement, as per this post. I thought that was fixed, but it was still broken when I dled this morning.)

Kernel - 6870 945/1050 - 5830 1030/330
Diapolo 7-17 - 293 MH/s - 325 MH/s
phatk2.1 - 299 MH/s - 328 MH/s

Thanks for the work, phateus.
newbie
Activity: 26
Merit: 0
August 02, 2011, 08:43:38 AM
Using 2.1, I'm still getting the same error as before.
I'm using Ubuntu 11.04, Catalyst 11.6, Phoenix 1.50. I unpacked the phatk version 2 files into my phoenix-1.50/kernels/phatk folder.
When I ran my phoenix with kernel options
Code:
-k phatk DEVICE=0 BFI_INT VECTORS AGGRESSION=12 FASTLOOP=FALSE WORKSIZE=256
I got the following error:
Code:
user@computer:~$ sudo ./btcg0.sh
[31/07/2011 18:04:08] Phoenix 1.50 starting...
[31/07/2011 18:04:09] Connected to server
[0 Khash/sec] [0 Accepted] [0 Rejected] [RPC]Unhandled error in Deferred:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 361, in callback
    self._startRunCallbacks(result)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 455, in _startRunCallbacks
    self._runCallbacks()
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 542, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/user/phoenix-1.50/QueueReader.py", line 136, in preprocess
    d2 = defer.maybeDeferred(self.preprocessor, nr)
--- <exception caught here> ---
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 133, in maybeDeferred
    result = f(*args, **kw)
  File "kernels/phatk/__init__.py", line 167, in <lambda>
    self.qr = QueueReader(self.core, lambda nr: self.preprocess(nr),
  File "kernels/phatk/__init__.py", line 361, in preprocess
    kd = KernelData(nr, self.core, self.VECTORS, self.AGGRESSION)
  File "kernels/phatk/__init__.py", line 46, in __init__
    unpack('LLLL', nonceRange.unit.data[64:]), dtype=np.uint32)
struct.error: unpack requires a string argument of length 32
I then had to CTRL+Z to kill the process.

I'm still getting the unpack 'LLLL' error. (Note: I tried to pipe the output of phoenix through tee to make a log, but tee gave a completely garbled log file and I didn't notice it until after I reverted my kernel files. This is obviously not your problem; I just want you to know why I don't have any new error messages to show.)

This is the same error as znort (except I'm on python 2.7 and he's on 2.6).

You suggested a fix...
Hmmm... can you try replacing the 'LLLL' with 'IIII' (line 46 of __init__.py), I think the windows version uses python 2.7 which may handle that differently.
...which I tried (even though you say it's a windows problem and I'm not on windows). It gave another error at a later 'unpack' call being passed 'LLLLLLLL', and the error said 'unpack requires a string argument of length 64'. I tried changing that one to 'IIIIIIII', but it gave another error down the line that said something like 'incorrect arguments passed to kernel'. (Again, apologies for not having an error log.)

EDIT: I checked something else, and now I'm more confused than ever. I loaded up the python interpreter on my machine so I can check how it sees the 'LLLL' and 'IIII' strings.
Code:
>>> import struct
>>> struct.calcsize('LLLL')
32
>>> struct.calcsize('IIII')
16
>>> struct.calcsize('LLLLLLLL')
64
>>> struct.calcsize('IIIIIIII')
32
Does this mean it's the nonceRange data and not the 'LLLL' that's the wrong size? How could that be? Does that mean there's some error wherever that nonceRange got packed in the first place?

Like I said, I'm confused.
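
One way to read those numbers, as a sketch under assumptions: if the work data is an 80-byte block header (an assumption, the traceback does not show its length), then data[64:] is 16 bytes, which is exactly four 32-bit words. In that reading the data is fine and a native 'LLLL' that reports 32 is simply asking for more bytes than are there. fake_data and try_unpack are made up for illustration and are not part of Phoenix.

```python
import struct

def try_unpack(fmt, payload):
    """Return the unpacked tuple, or None if fmt's size does not match payload."""
    try:
        return struct.unpack(fmt, payload)
    except struct.error:
        return None

# Assumption: stand-in for nonceRange.unit.data, taken to be 80 bytes long.
fake_data = bytes(bytearray(range(80)))
tail = fake_data[64:]
print(len(tail))

# Compare what each format expects with what the 16-byte tail can supply.
for fmt in ('LLLL', '=LLLL', '=IIII'):
    print(fmt, struct.calcsize(fmt), try_unpack(fmt, tail))
```

Comparing len(payload) with struct.calcsize(fmt) before unpacking is a cheap way to tell whether the format or the data is the odd one out.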
hero member
Activity: 658
Merit: 500
August 02, 2011, 07:04:17 AM
I'm getting a warning:

D:\sw\python27\lib\site-packages\pyopencl\__init__.py:173: UserWarning: Build succeeded, but resulted in non-empty logs:
Build on           ' at 0x414d7a0> succeeded, but said:

C:\Users\Igor\AppData\Local\Temp\OCL6496.tmp.cl(155): warning: variable "t1"
          was set but never used
        u t1;
          ^

NT -D
  warn("Build succeeded, but resulted in non-empty logs:\n"+message)
hero member
Activity: 504
Merit: 502
August 02, 2011, 05:51:29 AM
Good stuff, Phateus. I'm getting an extra 2-3 MH/s with your newest kernel compared to Diapolo's last kernel. I merged the code into my fork of poclbm and it seems to be working fine there (with command line option --phatk2):
https://github.com/progranism/poclbm

The only bug I found was that the kernel wouldn't compile without BITALIGN. Not really important, since all my mining cards support BITALIGN. It complained about rotate being ambiguous.

Keep up the good work!

Hey fpgaminer, I really like this poclbm version of phatk2, but could you update the same version with a --phatk2_1 switch or something so we could test-drive both versions with ease?
legendary
Activity: 1288
Merit: 1227
Away on an extended break
August 02, 2011, 04:36:04 AM
With the new 2.1, my hash rate has improved by 4 MH/s, from 410 to 414, on each of my 5850s!
Thank you!
legendary
Activity: 1855
Merit: 1016
August 02, 2011, 04:11:32 AM
With NEW 2.1, same results only. 448 & 432.
I am using 11.8 beta.
AMD APP 2.5.709.2
AMD Display Driver 8.880.3.0000
legendary
Activity: 1344
Merit: 1004
August 02, 2011, 03:47:56 AM
Quote
Thanks for the update. Apparently guiminer needs to be updated for this kernel to work though (outdated phoenix?..) It just spams idle on the console. I really need to use guiminer. This is so frustrating

nah phoenix seems up to date I'm guessing it is due to it using python 2.6 instead of 2.7

Woooo!, found the bug... it is in my kernel...
replace
Code:
#self.commandQueue.finish()
with
Code:
self.commandQueue.finish()
near the end of __init__.py

*sigh*... Uploaded the file yet again...

THIS FIXED IT! THANK YOU!!!!!

Donation coming your way. EXCELLENT improvement. Gained 4 MH/s on my 5830 and 5.4 on my 5870. Amazing!
Also, my 3 x 5830 rig went from 966.1 to 977.3, an increase of 11.2 MH/s or 1.159%
newbie
Activity: 52
Merit: 0
August 02, 2011, 03:19:39 AM
Quote
Thanks for the update. Apparently guiminer needs to be updated for this kernel to work though (outdated phoenix?..) It just spams idle on the console. I really need to use guiminer. This is so frustrating

nah phoenix seems up to date I'm guessing it is due to it using python 2.6 instead of 2.7

Woooo!, found the bug... it is in my kernel...
replace
Code:
#self.commandQueue.finish()
with
Code:
self.commandQueue.finish()
near the end of __init__.py

*sigh*... Uploaded the file yet again...
sr. member
Activity: 476
Merit: 250
moOo
August 02, 2011, 02:49:03 AM
Quote
Thanks for the update. Apparently guiminer needs to be updated for this kernel to work though (outdated phoenix?..) It just spams idle on the console. I really need to use guiminer. This is so frustrating

nah phoenix seems up to date I'm guessing it is due to it using python 2.6 instead of 2.7
legendary
Activity: 1344
Merit: 1004
August 02, 2011, 02:31:26 AM
Version 2.1: with VECTORS4 and worksize 128 or 64, I only get 365 instead of 441. I underclock memory.
But with just VECTORS I get 448 & 432.
Using version 2 I got 441 & 427.

cards MSI Lightning 5870 & Sapphire HD 5870
MSI  448 Mhash/s - 975/325, 1175mV - aggression 13
Sapphire 432 Mhash/s - 939/313, 1163mV - aggression 12

Windows 7, 64 bit, AERO enabled, AOCLBF 1.75

VECTORS4 is only if you DON'T underclock memory i.e. stock memory clocks or the glitch where you can only underclock memory 100mhz lower than core speeds.

Low memory speed (<400MHz) = VECTORS and WORKSIZE=256
High memory speed (>900MHz) = VECTORS4 and WORKSIZE=64 or WORKSIZE=128
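
The rule of thumb above can be written down as a tiny helper. This is only a sketch: suggest_options is a made-up name, the thresholds are the ones quoted in the post, and the result is a starting point to benchmark on your own card rather than a guarantee.

```python
def suggest_options(memory_mhz):
    """Map a memory clock in MHz to the kernel options suggested in the post."""
    if memory_mhz < 400:
        return ['VECTORS', 'WORKSIZE=256']
    if memory_mhz > 900:
        # The post offers WORKSIZE=64 or WORKSIZE=128; 128 is picked arbitrarily here.
        return ['VECTORS4', 'WORKSIZE=128']
    # The post gives no advice for clocks in between.
    return None

for clock in (325, 600, 1050):
    print(clock, suggest_options(clock))
```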
legendary
Activity: 1855
Merit: 1016
August 02, 2011, 02:13:21 AM
Version 2.1: with VECTORS4 and worksize 128 or 64, I only get 365 instead of 441. I underclock memory.
But with just VECTORS I get 448 & 432.
Using version 2 I got 441 & 427.

cards MSI Lightning 5870 & Sapphire HD 5870
MSI  448 Mhash/s - 975/325, 1175mV - aggression 13
Sapphire 432 Mhash/s - 939/313, 1163mV - aggression 12

Windows 7, 64 bit, AERO enabled, AOCLBF 1.75
legendary
Activity: 1344
Merit: 1004
August 02, 2011, 01:52:50 AM
Thanks for the update. Apparently guiminer needs to be updated for this kernel to work though (outdated phoenix?..) It just spams idle on the console. I really need to use guiminer. This is so frustrating.