Topic: further improved phatk_dia kernel for Phoenix + SDK 2.6 - 2012-01-13 - page 13. (Read 106928 times)

member
Activity: 98
Merit: 10
Every new version up to 07~ was faster than the previous one on both of my rigs. The 11~ version, however, is faster on my single 5850, but on the 5870+5850 rig I'm noticing a minor slowdown (the deltas are bigger: best performance is the same, but it can drop a little lower). In addition, the primary GPU set to '-f0' stopped bottlenecking the other GPU (but changing from -f35 to -f0 didn't add any speed). After all the tests I switched versions on the single-GPU rig and left the old one on the dual-GPU rig.
full member
Activity: 219
Merit: 120
Testing the 2011-07-11 kernel.7z

XFX 5870 @ 940/300
Ubuntu 11.04, ATI Drivers 11.5, SDK 2.1
Before: [428.56 Mhash/sec] -> After: [432.28 Mhash/sec]
Stales before: 0.22% -> Stales after: 3.48%
Over a 4 hour test period

XFX 5870 @ 940/300
Ubuntu 11.04, ATI Drivers 11.5, SDK 2.4
Before: [422.34 Mhash/sec] -> After: [189.58 Mhash/sec]
Stales: Not tested due to adverse effects.

Well that is very strange, but at least you are able to mine faster with SDK2.1 and the current kernel version ^^.
Btw. I had other things to do, but during the next week I will release a new version.

Dia
Awesome, I look forward to it.
I think the rejected shares were random variance on my side, as the rate seems to have settled down to a more realistic 0.88%.

Keep in mind that OpenCL kernel changes have NO effect on stale shares (aside from the VERY small difference in the time it takes to run 1 execution of some number of hashes). All nonces found by the kernel to satisfy H == 0 are verified on the CPU prior to sending. Shares are also checked against the current known block before sending, in case new work was received while the kernel was executing. Basically this means that every share sent to the server is valid as far as the miner is concerned. If the OpenCL kernel is returning bad work it will never be sent to the server, and instead you will get "Unusual behavior from OpenCL. Hardware problem?"

That said, changes to the python portion of a Phoenix kernel can increase stale shares if badly implemented. (see: FASTLOOP excessive stales with high aggression in older versions of Phoenix)
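
For anyone curious what the CPU-side check described above amounts to, here is a minimal Python sketch. It is not Phoenix's actual code: the names `block_hash`, `meets_difficulty_one` and `build_header` are made up for illustration, and it assumes the standard Bitcoin rule that a difficulty-1 share is an 80-byte header whose double SHA-256 digest ends in four zero bytes (the final state word H, hence "H == 0"). The genesis block header is used as a known-good input.

```python
import hashlib
import struct


def block_hash(header: bytes) -> bytes:
    """Double SHA-256 of an 80-byte block header (raw digest byte order)."""
    if len(header) != 80:
        raise ValueError("block header must be exactly 80 bytes")
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def meets_difficulty_one(digest: bytes) -> bool:
    """True if the last 32 bits of the digest (state word H) are zero."""
    return digest[28:32] == b"\x00\x00\x00\x00"


def build_header(version, prev_hash_hex, merkle_root_hex, timestamp, bits, nonce):
    """Serialize header fields; hashes are passed in the usual display order."""
    return (
        struct.pack("<I", version)
        + bytes.fromhex(prev_hash_hex)[::-1]    # stored byte-reversed
        + bytes.fromhex(merkle_root_hex)[::-1]  # stored byte-reversed
        + struct.pack("<III", timestamp, bits, nonce)
    )


# Genesis block fields, used here only as a well-known test vector.
GENESIS = build_header(
    1,
    "00" * 32,
    "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    1231006505,
    0x1D00FFFF,
    2083236893,
)

if __name__ == "__main__":
    digest = block_hash(GENESIS)
    # Reverse the digest to get the familiar display form of the hash.
    print(digest[::-1].hex(), meets_difficulty_one(digest))
```

In a miner, a check like this would run on every nonce the kernel reports, and only passing nonces would be submitted. A nonce that fails it points at the kernel or the hardware, not at the pool.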
hero member
Activity: 927
Merit: 1000
฿itcoin ฿itcoin ฿itcoin
Testing the 2011-07-11 kernel.7z

XFX 5870 @ 940/300
Ubuntu 11.04, ATI Drivers 11.5, SDK 2.1
Before: [428.56 Mhash/sec] -> After: [432.28 Mhash/sec]
Stales before: 0.22% -> Stales after: 3.48%
Over a 4 hour test period

XFX 5870 @ 940/300
Ubuntu 11.04, ATI Drivers 11.5, SDK 2.4
Before: [422.34 Mhash/sec] -> After: [189.58 Mhash/sec]
Stales: Not tested due to adverse effects.

Well that is very strange, but at least you are able to mine faster with SDK2.1 and the current kernel version ^^.
Btw. I had other things to do, but during the next week I will release a new version.

Dia
Awesome, I look forward to it.
I think the rejected shares were random variance on my side, as the rate seems to have settled down to a more realistic 0.88%.
hero member
Activity: 772
Merit: 500
Testing the 2011-07-11 kernel.7z

XFX 5870 @ 940/300
Ubuntu 11.04, ATI Drivers 11.5, SDK 2.1
Before: [428.56 Mhash/sec] -> After: [432.28 Mhash/sec]
Stales before: 0.22% -> Stales after: 3.48%
Over a 4 hour test period

XFX 5870 @ 940/300
Ubuntu 11.04, ATI Drivers 11.5, SDK 2.4
Before: [422.34 Mhash/sec] -> After: [189.58 Mhash/sec]
Stales: Not tested due to adverse effects.

Well that is very strange, but at least you are able to mine faster with SDK2.1 and the current kernel version ^^.
Btw. I had other things to do, but during the next week I will release a new version.

Dia
hero member
Activity: 927
Merit: 1000
฿itcoin ฿itcoin ฿itcoin
Testing the 2011-07-11 kernel.7z

XFX 5870 @ 940/300
Ubuntu 11.04, ATI Drivers 11.5, SDK 2.1
Before: [428.56 Mhash/sec] -> After: [432.28 Mhash/sec]
Stales before: 0.22% -> Stales after: 3.48%
Over a 4 hour test period

XFX 5870 @ 940/300
Ubuntu 11.04, ATI Drivers 11.5, SDK 2.4
Before: [422.34 Mhash/sec] -> After: [189.58 Mhash/sec]
Stales: Not tested due to adverse effects.
newbie
Activity: 12
Merit: 0
Sadly it only gives me a 1 Mhash/s increase, from 344 to 345, on my server, but keep up the good work!
newbie
Activity: 35
Merit: 0
1 - Phatk Improved - it's what this topic is all about.
2 - Most probably he meant 5870 ;)
What this guy said.
hero member
Activity: 812
Merit: 502
Uzza, two questions:

1.) what is "phatk imp"

2.) Surely you meant a 5870 instead of a 4870?  Either that or you must have four 4870's to hit 430 MH/s!

1 - Phatk Improved - it's what this topic is all about.
2 - Most probably he meant 5870 ;)
sr. member
Activity: 418
Merit: 250
Uzza, two questions:

1.) what is "phatk imp"

2.) Surely you meant a 5870 instead of a 4870?  Either that or you must have four 4870's to hit 430 MH/s!
newbie
Activity: 35
Merit: 0
I just did a quick comparison against poclbm for me.

On my dedicated 5870:
poclbm    SDK 2.1: ~424
phatk     SDK 2.1: <420
phatk imp SDK 2.1: ~432

poclbm    SDK 2.4: <424
phatk     SDK 2.4: ~424
phatk imp SDK 2.4: ~437

So your improvements made phatk better than poclbm on SDK 2.1, and way better on 2.4.

The init optimizations gave me a minor boost of ~0.5 MH/s over 2011-07-11.
full member
Activity: 219
Merit: 120
>I modified the shipping phatk Kernel from Phoenix 1.50. I now get round about 9-10 MHash/s more on my 5830 (up from 310 to 319/320)!

I would really like to replicate this. Currently getting 310 Mh with 2.1 + 11.5 with bitless Ma() changes. Using the 7-11 I only get a small jump to 311. Can you share the config you use? SDK/driver version, etc. Thanks.


That might be tough since phatk doesn't work very well on older SDK versions. It was designed to be used with SDK 2.4, and on 2.1 I get better results with poclbm. The Ma() changes also apply to poclbm, so you won't see a gain there.
hero member
Activity: 772
Merit: 500
>I modified the shipping phatk Kernel from Phoenix 1.50. I now get round about 9-10 MHash/s more on my 5830 (up from 310 to 319/320)!

I would really like to replicate this. Currently getting 310 Mh with 2.1 + 11.5 with bitless Ma() changes. Using the 7-11 I only get a small jump to 311. Can you share the config you use? SDK/driver version, etc. Thanks.


- Win7 X64 SP1
- Cat 11.7 with SDK 2.4 and Runtime 2.4 (in order to be able to use AMD APP KernelAnalyzer)
- Sapphire 5830 Xtreme @ 1000 MHz core / 350 MHz Mem
- Phoenix 1.5: aggression 12, vectors, bfi_int

Dia
hero member
Activity: 504
Merit: 500
>I modified the shipping phatk Kernel from Phoenix 1.50. I now get round about 9-10 MHash/s more on my 5830 (up from 310 to 319/320)!

I would really like to replicate this. Currently getting 310 Mh with 2.1 + 11.5 with bitless Ma() changes. Using the 7-11 I only get a small jump to 311. Can you share the config you use? SDK/driver version, etc. Thanks.
hero member
Activity: 772
Merit: 500
With the ideas that Vince gave to us, I was able to lower the ALU OP usage even further. This means the next version will speed things up for 69XX and 58XX again :).
Thank you Vince, I didn't need all your changes (some seem to reduce kernel speed, even if they look good), but merged the ones I like and verified everything with KernelAnalyzer.

Edit: The drawback is that you will need to replace the Phoenix __init__.py file, so it won't be easily usable for non-Phoenix users, sorry for that (some init values changed)!

Dia
what about for poclbm? there is no __init__.py

That's what my edit was about ;). It's all a matter of how much time I and others have, and the current focus is on Phoenix, because that's my main miner software.
Perhaps some mods can be done without new init values, so they will work without a new __init__.py. But then I would have to take care of 2 kernel versions. For now there is no need to worry, the new version is not out yet :).

Dia
member
Activity: 77
Merit: 10
With the ideas that Vince gave to us, I was able to lower the ALU OP usage even further. This means the next version will speed things up for 69XX and 58XX again :).
Thank you Vince, I didn't need all your changes (some seem to reduce kernel speed, even if they look good), but merged the ones I like and verified everything with KernelAnalyzer.

Edit: The drawback is that you will need to replace the Phoenix __init__.py file, so it won't be easily usable for non-Phoenix users, sorry for that (some init values changed)!

Dia
what about for poclbm? there is no __init__.py
full member
Activity: 219
Merit: 120

Stales have nothing to do with GPU errors in Phoenix. All results returned by the GPU are verified before being sent to the server, which means if the kernel finds invalid shares you will get "Unexpected behavior from OpenCL. Hardware problem?" instead of the share being submitted. If you are not getting any of these errors there is NO WAY a kernel change can affect the number of stale shares. It might affect the total number of shares submitted in a given time period, but every share sent to the server by Phoenix is verified to be H == 0 on the CPU beforehand.

Thanks for clarification! Good to know, this was new for me.

Dia

Thanks for the heads up. I am running 4 5830s on 1 box clocked at 950/300 pointed at bitcoins.lc. It seems the stale count went up for me; I didn't mean to blame you or anything, just saying that was what happened. I'll look into it on the pool end though.

My post was mainly intended to clarify that stale shares are not a good measurement for kernel changes.

A much more reliable test would be to count the total number of shares submitted over a long period (say 24 hours or so). This includes stales, since the goal is to test how many shares the kernel finds, not how many the server accepts. If this number is higher than without the kernel modifications, you know that it's helping.
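
A rough way to see why the test period needs to be long: assuming difficulty-1 shares (one hash in 2^32 qualifies) and treating share arrivals as a Poisson process, the expected share count and its random noise can be estimated. The function names below are hypothetical, and the numbers are back-of-the-envelope estimates, not measurements.

```python
import math

HASHES_PER_SHARE = 2 ** 32  # assumption: pool hands out difficulty-1 shares


def expected_shares(mhash_per_sec: float, hours: float) -> float:
    """Average number of shares found at a given hashrate over a given time."""
    return mhash_per_sec * 1e6 * hours * 3600 / HASHES_PER_SHARE


def relative_noise(share_count: float) -> float:
    """One standard deviation of a Poisson count, as a fraction of the count."""
    return 1 / math.sqrt(share_count)


if __name__ == "__main__":
    for hours in (4, 24, 24 * 7):
        n = expected_shares(430, hours)
        print(f"{hours:>4} h: ~{n:.0f} shares, noise ~{relative_noise(n):.1%}")
```

At roughly 430 Mhash/s, a 24 hour run yields on the order of 8,650 shares with about 1% random noise, so a kernel gain of 1% or less is hard to tell apart from luck without an even longer run.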
legendary
Activity: 1855
Merit: 1016
=> another addition almost effortless

here the files with these changes:
http://www.filesonic.com/file/1423103594

still some more to come!
Thanks.
It increased 440 to 443 in 5870 @ 975/325 Windows.
431-434 in 6970 & 5870 @ 975/1375 & 984/300 Ubuntu - Smartcoin

With the inclusion of __init__.py, I hope there will still be some room to tweak.
newbie
Activity: 53
Merit: 0
I was getting only .01% stales, with your patch I have an increase of 10mhash on average with my 5830 ( 295->305) but my stale count is now around 3-4%.....

Perhaps the kernel pushes your card harder and it generates errors. But could be the pool, driver and so on, like Vince said.

Dia

Stales have nothing to do with GPU errors in Phoenix. All results returned by the GPU are verified before being sent to the server, which means if the kernel finds invalid shares you will get "Unexpected behavior from OpenCL. Hardware problem?" instead of the share being submitted. If you are not getting any of these errors there is NO WAY a kernel change can affect the number of stale shares. It might affect the total number of shares submitted in a given time period, but every share sent to the server by Phoenix is verified to be H == 0 on the CPU beforehand.

Thanks for clarification! Good to know, this was new for me.

Dia

Thanks for the heads up. I am running 4 5830s on 1 box clocked at 950/300 pointed at bitcoins.lc. It seems the stale count went up for me; I didn't mean to blame you or anything, just saying that was what happened. I'll look into it on the pool end though.
member
Activity: 224
Merit: 10
cool, went from 349 (original improved kernel) to 350.3 with latest. Keep 'em coming :D
hero member
Activity: 772
Merit: 500
I was getting only .01% stales, with your patch I have an increase of 10mhash on average with my 5830 ( 295->305) but my stale count is now around 3-4%.....

Perhaps the kernel pushes your card harder and it generates errors. But could be the pool, driver and so on, like Vince said.

Dia

Stales have nothing to do with GPU errors in Phoenix. All results returned by the GPU are verified before being sent to the server, which means if the kernel finds invalid shares you will get "Unexpected behavior from OpenCL. Hardware problem?" instead of the share being submitted. If you are not getting any of these errors there is NO WAY a kernel change can affect the number of stale shares. It might affect the total number of shares submitted in a given time period, but every share sent to the server by Phoenix is verified to be H == 0 on the CPU beforehand.

Thanks for clarification! Good to know, this was new for me.

Dia