Modified Kernel for Phoenix 1.5 - page 8.

Phateus

newbie

Activity: 52

Merit: 0

Quote from: Diapolo on August 02, 2011, 10:10:38 AM

@Phat:

I don't understand how you ensure that base is always a uint kernel parameter, now that base has (uint2)(0, 1) or (uint4)(0, 1, 2, 3) added to it via the init file. If I try to do this with my mod, it just crashes Phoenix. If I use const u base instead of const uint base, it seems to work (because u reflects the correct variable type: uint, uint2 or uint4). Do you have any idea why?

Thanks,
Dia

I'm not sure I understand you... Depending on whether the number of nonces per thread (VECTORS) is 1, 2, or 4, the kernel compiles with base being either uint, uint2 or uint4. The init file packs either 1, 2 or 4 uints into each base entry, so the init file always produces a variable of the same size as the kernel needs. In short, both the base[i] variable being passed to the kernel and the "u base" value in the kernel can be either 1, 2 or 4 uints. Does that answer your question?
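A minimal sketch of the size matching described above, not the actual code from __init__.py. The helper name pack_base is hypothetical, and the lane layout (base plus lane index 0..n-1) is an assumption based on Diapolo's description of the (uint2)(0, 1) and (uint4)(0, 1, 2, 3) offsets. It only shows that the host side can emit 4, 8 or 16 bytes to match a uint, uint2 or uint4 kernel argument.

```python
import struct

def pack_base(base, vectors):
    """Hypothetical helper: pack `vectors` 32-bit lanes starting at `base`.

    Assumption: lane i holds base + i, wrapped to 32 bits.
    """
    if vectors not in (1, 2, 4):
        raise ValueError("vectors must be 1, 2 or 4")
    lanes = [(base + i) & 0xFFFFFFFF for i in range(vectors)]
    # '=' forces standard sizes, so every 'I' is exactly 4 bytes
    return struct.pack('=%dI' % vectors, *lanes)

# 1, 2 or 4 uints give 4, 8 or 16 bytes, matching uint, uint2, uint4
sizes = [len(pack_base(100, n)) for n in (1, 2, 4)]
```

If the kernel is compiled for one vector width and the host packs another, the argument sizes no longer agree, which may be the kind of mismatch behind the crash Diapolo describes.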

Tx2000

full member

Activity: 182

Merit: 100

I have a 3 MH/s average improvement over Diapolo's last kernel update (393 -> 396).

Setup is as follows:

Reference 5850, 1.100v, 920 core / 350 mem. 11.4 preview / SDK 2.4. Latest GUIMiner / phoenix 1.50

Going to run it for a day to check its stability and report back if anything arises.

Clipse

hero member

Activity: 504

Merit: 502

Quote from: iopq on August 02, 2011, 10:14:14 AM

Quote from: Clipse on August 02, 2011, 09:41:19 AM

Quote from: iopq on August 02, 2011, 09:31:02 AM

using poclbm fork with phatk2.1 and it's the fastest kernel so far
I tried with 2.4 opencl and it was slower, so I went back to 2.1 which is the fastest on my card (hd 5750)

Yeah, I must say, the poclbm fork with phatk2 outperformed phatk2 on phoenix 1.5 (and up till now all phatk mods performed better on phoenix 1.5 for me)

quite interesting.

ps. iopq, can you post the changes made to run phatk2.1 on the poclbm mod by fpgaminer? I assume you are using that. Also, what arg is added to use vectors4? I've replaced phatk2.cl with the phatk2.1 .cl but I get ~11 MH/s less with phatk2.1, so I am wondering if there are other changes required. I am using SDK 2.4.

I'm using that, just replaced the phatk2 kernel with phatk2.1 and that's it
vectors4 should be slower, why would you want to use it? I use -v only

Just wanted to test vectors4 with default memory, not high priority.

Still, phatk2.1 is much slower than phatk2 for me, as I said: ~11 MH/s less per card (ATI HD 5850). I wonder why o_0

iopq

hero member

Activity: 658

Merit: 500

Quote from: Clipse on August 02, 2011, 09:41:19 AM

Quote from: iopq on August 02, 2011, 09:31:02 AM

using poclbm fork with phatk2.1 and it's the fastest kernel so far
I tried with 2.4 opencl and it was slower, so I went back to 2.1 which is the fastest on my card (hd 5750)

Yeah, I must say, the poclbm fork with phatk2 outperformed phatk2 on phoenix 1.5 (and up till now all phatk mods performed better on phoenix 1.5 for me)

quite interesting.

ps. iopq, can you post the changes made to run phatk2.1 on the poclbm mod by fpgaminer? I assume you are using that. Also, what arg is added to use vectors4? I've replaced phatk2.cl with the phatk2.1 .cl but I get ~11 MH/s less with phatk2.1, so I am wondering if there are other changes required. I am using SDK 2.4.

I'm using that, just replaced the phatk2 kernel with phatk2.1 and that's it
vectors4 should be slower, why would you want to use it? I use -v only

Diapolo

hero member

Activity: 772

Merit: 500

@Phat:

I don't understand how you ensure that base is always a uint kernel parameter, now that base has (uint2)(0, 1) or (uint4)(0, 1, 2, 3) added to it via the init file. If I try to do this with my mod, it just crashes Phoenix. If I use const u base instead of const uint base, it seems to work (because u reflects the correct variable type: uint, uint2 or uint4). Do you have any idea why?

Thanks,
Dia

UniverseMan

newbie

Activity: 26

Merit: 0

I tried VECTORS4 on my 6870, since I can't underclock the memory (it's at 1050).

Results:
VECTORS
WS 64: 295 MH/s
WS 128: 299 MH/s
WS 256: 292 MH/s

VECTORS4
WS 64: 278 MH/s
WS 128: 258 MH/s
WS 256: 230 MH/s

So VECTORS4 doesn't give me any boost. But thanks for putting it in. New functionality is always a plus.

Clipse

hero member

Activity: 504

Merit: 502

Quote from: iopq on August 02, 2011, 09:31:02 AM

using poclbm fork with phatk2.1 and it's the fastest kernel so far
I tried with 2.4 opencl and it was slower, so I went back to 2.1 which is the fastest on my card (hd 5750)

Yeah, I must say, the poclbm fork with phatk2 outperformed phatk2 on phoenix 1.5 (and up till now all phatk mods performed better on phoenix 1.5 for me)

quite interesting.

ps. iopq, can you post the changes made to run phatk2.1 on the poclbm mod by fpgaminer? I assume you are using that. Also, what arg is added to use vectors4? I've replaced phatk2.cl with the phatk2.1 .cl but I get ~11 MH/s less with phatk2.1, so I am wondering if there are other changes required. I am using SDK 2.4.

iopq

hero member

Activity: 658

Merit: 500

using poclbm fork with phatk2.1 and it's the fastest kernel so far
I tried with 2.4 opencl and it was slower, so I went back to 2.1 which is the fastest on my card (hd 5750)

UniverseMan

newbie

Activity: 26

Merit: 0

Quote from: UniverseMan on August 02, 2011, 08:43:38 AM

All that stuff I just said.

HA! Got it fixed. Anyone who's having the error I just had, go through __init__.py, and every time there's an 'unpack' or a 'pack' statement that gets passed some number of 'L's (they will be 2, 4, or 8 'L's long), just add an '=' to the beginning. So 'LLLL' becomes '=LLLL'.

If you look up the documentation for struct (which is where pack and unpack come from), the way it parses its format string is system dependent by default: it uses the platform's native sizes and alignment. But if you add the '=', it forces the size characters (like the 'L's and 'I's and such) to be standard size.
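The effect of the '=' prefix can be checked directly with the standard struct module. This is a small sketch, not code from the kernel's __init__.py; the 16-byte buffer is a stand-in for the real data.

```python
import struct

# Native mode: the size of 'L' depends on the platform's C unsigned long,
# so this is 16 on some systems and 32 on others (e.g. 64-bit Linux).
native_size = struct.calcsize('LLLL')

# Standard mode: '=' fixes 'L' at 4 bytes everywhere, with no padding.
standard_size = struct.calcsize('=LLLL')

# A 16-byte buffer always unpacks cleanly with the '=' form.
buf = bytes(16)  # placeholder data, all zero bytes
values = struct.unpack('=LLLL', buf)
```

With the '=' form the same format string expects 16 bytes on every platform, which is why the change makes the unpack call behave the same on 32-bit and 64-bit systems.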

(Also, I had to uncomment the self.commandQueue.finish() statement, as per this post. I thought that was fixed, but it was still broken when I downloaded it this morning.)

Kernel - 6870 945/1050 - 5830 1030/330
Diapolo 7-17 - 293 MH/s - 325 MH/s
phatk2.1 - 299 MH/s - 328 MH/s

Thanks for the work, phateus.

UniverseMan

newbie

Activity: 26

Merit: 0

Using 2.1, I'm still getting the same error as before.

Quote from: UniverseMan on July 31, 2011, 06:18:42 PM

I'm using Ubuntu 11.04, Catalyst 11.6, Phoenix 1.50. I unpacked the phatk version 2 files into my phoenix-1.50/kernels/phatk folder.
When I ran my phoenix with kernel options

Code:

-k phatk DEVICE=0 BFI_INT VECTORS AGGRESSION=12 FASTLOOP=FALSE WORKSIZE=256

I got the following error:

Code:

user@computer:~$ sudo ./btcg0.sh
[31/07/2011 18:04:08] Phoenix 1.50 starting...
[31/07/2011 18:04:09] Connected to server
[0 Khash/sec] [0 Accepted] [0 Rejected] [RPC]Unhandled error in Deferred:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 361, in callback
   self._startRunCallbacks(result)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 455, in _startRunCallbacks
   self._runCallbacks()
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 542, in _runCallbacks
   current.result = callback(current.result, *args, **kw)
  File "/home/user/phoenix-1.50/QueueReader.py", line 136, in preprocess
   d2 = defer.maybeDeferred(self.preprocessor, nr)
--- ---
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 133, in maybeDeferred
   result = f(*args, **kw)
  File "kernels/phatk/__init__.py", line 167, in
   self.qr = QueueReader(self.core, lambda nr: self.preprocess(nr),
  File "kernels/phatk/__init__.py", line 361, in preprocess
   kd = KernelData(nr, self.core, self.VECTORS, self.AGGRESSION)
  File "kernels/phatk/__init__.py", line 46, in __init__
   unpack('LLLL', nonceRange.unit.data[64:]), dtype=np.uint32)
struct.error: unpack requires a string argument of length 32

I then had to CTRL+Z to kill the process.

I'm still getting the unpack 'LLLL' error. (Note: I tried to pipe the output of phoenix through tee to make a log, but tee gave a completely garbled log file and I didn't notice it until after I reverted my kernel files. This is obviously not your problem; I just want you to know why I don't have any new error messages to show.)

This is the same error as znort (except I'm on python 2.7 and he's on 2.6).

You suggested a fix...

Quote from: Phateus on August 02, 2011, 01:30:23 AM

Hmmm... can you try replacing the 'LLLL' with 'IIII' (line 46 of __init__.py), I think the windows version uses python 2.7 which may handle that differently.

...which I tried (even though you say it's a windows problem and I'm not on windows). It gave another error at a later 'unpack' call being passed 'LLLLLLLL', and the error said 'unpack requires a string argument of length 64'. I tried changing that one to 'IIIIIIII', but it gave another error down the line that said something like 'incorrect arguments passed to kernel'. (Again, apologies for not having an error log.)

EDIT: I checked something else, and now I'm more confused than ever. I loaded up the Python interpreter on my machine so I could check how it sees the 'LLLL' and 'IIII' strings.

Code:

>>> import struct
>>> struct.calcsize('LLLL')
32
>>> struct.calcsize('IIII')
16
>>> struct.calcsize('LLLLLLLL')
64
>>> struct.calcsize('IIIIIIII')
32

Does this mean it's the nonceRange data and not the 'LLLL' that's the wrong size? How could that be? Does that mean there's some error wherever that nonceRange got packed in the first place?
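One way to tell which side is the wrong size is to compare the byte count the format expects with the byte count actually supplied. The sketch below uses a hypothetical helper, size_mismatch, and assumes an 80-byte buffer as a stand-in for nonceRange.unit.data (the real contents are not shown in this thread).

```python
import struct

def size_mismatch(fmt, data):
    """Hypothetical helper: return (bytes the format expects, bytes supplied)."""
    return struct.calcsize(fmt), len(data)

# Assumption: an 80-byte buffer standing in for nonceRange.unit.data
data = bytes(80)
tail = data[64:]  # the same slice that line 46 unpacks

# Native 'L' is platform dependent, so the expected size is 16 or 32
expected_native, actual = size_mismatch('LLLL', tail)

# With '=', 'L' is always 4 bytes, so the expected size is 16
expected_standard, _ = size_mismatch('=LLLL', tail)
```

Under that assumption the slice is 16 bytes, so a calcsize of 32 for 'LLLL' would point at the format string rather than the data being packed incorrectly.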

Like I said, I'm confused.

iopq

hero member

Activity: 658

Merit: 500

I'm getting a warning:

D:\sw\python27\lib\site-packages\pyopencl\__init__.py:173: UserWarning: Build succeeded, but resulted in non-empty logs:
Build on ' at 0x414d7a0> succeeded, but said:

C:\Users\Igor\AppData\Local\Temp\OCL6496.tmp.cl(155): warning: variable "t1" was set but never used
u t1;
^

NT -D
warn("Build succeeded, but resulted in non-empty logs:\n"+message)

Clipse

hero member

Activity: 504

Merit: 502

Quote from: fpgaminer on July 30, 2011, 09:36:57 AM

Good stuff Phateus

I'm getting an extra 2-3MH/s with your newest kernel compared to Diapolo's last kernel. I merged the code into my fork of poclbm and it seems to be working fine there (with command line option --phatk2):
https://github.com/progranism/poclbm

The only bug I found was that the kernel wouldn't compile without BITALIGN. Not really important, since all my mining cards support BITALIGN. It complained about rotate being ambiguous.

Keep up the good work!

Hey fpgaminer, I really like this poclbm version of phatk2, but could you update it with a --phatk2_1 switch or something so we could test-drive both versions with ease?

John (John K.)

legendary

Activity: 1288

Merit: 1227

Away on an extended break

With the new 2.1, hash rates have improved by 4 MH/s, from 410 to 414, on each of my 5850s!

Thank you!

dishwara

legendary

Activity: 1855

Merit: 1016

With NEW 2.1, same results only. 448 & 432.
I am using 11.8 beta.
AMD APP 2.5.709.2
AMD Display Driver 8.880.3.0000

ssateneth

legendary

Activity: 1344

Merit: 1004

Quote from: Phateus on August 02, 2011, 03:19:39 AM

Quote from: joulesbeef on August 02, 2011, 02:49:03 AM

Quote

Thanks for the update. Apparently guiminer needs to be updated for this kernel to work though (outdated phoenix?..) It just spams idle on the console. I really need to use guiminer. This is so frustrating

Nah, Phoenix seems up to date. I'm guessing it is due to it using Python 2.6 instead of 2.7.

Woooo!, found the bug... it is in my kernel...
replace

Code:

#self.commandQueue.finish()

with

Code:

self.commandQueue.finish()

near the end of __init__.py

*sigh*... Uploaded the file yet again...

THIS FIXED IT! THANK YOU!!!!!

Donation coming your way. EXCELLENT improvement. Gained 4 mhash on my 5830 and 5.4 on my 5870. Amazing!
Also, my 3 x 5830 rig went from 966.1 to 977.3, an increase of 11.2 mhash or 1.159%.
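The percentage quoted for the 3 x 5830 rig can be reproduced from the two hash rates in the post. The variable names are just for illustration.

```python
before = 966.1  # MH/s before the fix, from the post
after = 977.3   # MH/s after the fix, from the post

gain = after - before
percent = gain / before * 100

print(round(gain, 1), round(percent, 3))  # 11.2 1.159
```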

Phateus

newbie

Activity: 52

Merit: 0

Quote from: joulesbeef on August 02, 2011, 02:49:03 AM

Quote

Thanks for the update. Apparently guiminer needs to be updated for this kernel to work though (outdated phoenix?..) It just spams idle on the console. I really need to use guiminer. This is so frustrating

Nah, Phoenix seems up to date. I'm guessing it is due to it using Python 2.6 instead of 2.7.

Woooo!, found the bug... it is in my kernel...
replace

Code:

#self.commandQueue.finish()

with

Code:

self.commandQueue.finish()

near the end of __init__.py

*sigh*... Uploaded the file yet again...

joulesbeef

sr. member

Activity: 476

Merit: 250

moOo

Quote

Thanks for the update. Apparently guiminer needs to be updated for this kernel to work though (outdated phoenix?..) It just spams idle on the console. I really need to use guiminer. This is so frustrating

Nah, Phoenix seems up to date. I'm guessing it is due to it using Python 2.6 instead of 2.7.

ssateneth

legendary

Activity: 1344

Merit: 1004

Quote from: dishwara on August 02, 2011, 02:13:21 AM

Version 2.1: with VECTORS4 and worksize 128 or 64, I only get 365 instead of 441. I underclock memory.
But with just VECTORS I get 448 & 432.
Using version 2 I got 441 & 427.

cards MSI Lightning 5870 & Sapphire HD 5870
MSI 448 Mhash/s - 975/325, 1175mV - aggression 13
Sapphire 432 Mhash/s - 939/313, 1163mV - aggression 12

Windows 7, 64 bit, AERO enabled, AOCLBF 1.75

VECTORS4 is only for when you DON'T underclock memory, i.e. stock memory clocks, or the glitch where you can only underclock memory to 100 MHz below the core speed.

Low memory speed (<400MHz) = VECTORS and WORKSIZE=256
High memory speed (>900MHz) = VECTORS4 and WORKSIZE=64 or WORKSIZE=128
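The rule of thumb above can be written as a small lookup. This is only a sketch: suggest_options is a hypothetical helper, the thresholds are the ones given in this post, and memory clocks between 400 and 900 MHz are not covered by the rule, so the function returns an empty list for them.

```python
def suggest_options(memory_mhz):
    """Hypothetical helper: candidate phatk option sets for a memory clock.

    Thresholds come from the rule of thumb in this post; the 400-900 MHz
    range is not covered, so an empty list is returned there.
    """
    if memory_mhz < 400:
        return [("VECTORS", "WORKSIZE=256")]
    if memory_mhz > 900:
        return [("VECTORS4", "WORKSIZE=64"), ("VECTORS4", "WORKSIZE=128")]
    return []

low = suggest_options(325)    # underclocked memory
high = suggest_options(1050)  # stock memory clock
```

Treat the result as a starting point for benchmarking rather than a guarantee: elsewhere in this thread a 6870 with memory at 1050 MHz was still faster with VECTORS than with VECTORS4.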

dishwara

legendary

Activity: 1855

Merit: 1016

Version 2.1: with VECTORS4 and worksize 128 or 64, I only get 365 instead of 441. I underclock memory.
But with just VECTORS I get 448 & 432.
Using version 2 I got 441 & 427.

cards MSI Lightning 5870 & Sapphire HD 5870
MSI 448 Mhash/s - 975/325, 1175mV - aggression 13
Sapphire 432 Mhash/s - 939/313, 1163mV - aggression 12

Windows 7, 64 bit, AERO enabled, AOCLBF 1.75

ssateneth

legendary

Activity: 1344

Merit: 1004

Thanks for the update. Apparently guiminer needs to be updated for this kernel to work though (outdated phoenix?..) It just spams idle on the console. I really need to use guiminer. This is so frustrating.
