Topic: further improved phatk_dia kernel for Phoenix + SDK 2.6 - 2012-01-13 - page 8. (Read 106700 times)

hero member
Activity: 769
Merit: 500
To all happy new kernel users, there is one thing you should know ... there have been NO donations since 2011-07-31, which makes me a bit sad.

It's my free time that I put in here (many hours so far), and the motivation is not only to get a "Thank you!". Remember, you guys generate more BTC with the kernel mods. It doesn't matter if it's my mod, Phateus' mod or anyone else's mod ... just be a little thankful and you keep a free and fast kernel + a motivated kernel mixer Diapolo ;).

No offense to all the great people who already donated a few bitcents or even more, who helped me test this, who helped me fix bugs or who contributed great ideas to this work!

Regards,
Diapolo
hero member
Activity: 769
Merit: 500
Let me make sure I understand you ... are you saying that my current pre-release version runs 3 °C cooler on your card and that the invalid share rate is lower than with the latest Phateus phatk?

Dia
hero member
Activity: 532
Merit: 500
I can confirm the temperature difference, which I thought was strange. Using Catalyst 11.6b / SDK 2.5 on a 6950 @ 867/1250 with V 4 W64 F3, temps are 3 °C lower in GUIMiner. Hash rate has also increased by 3 MH/s with those settings, and invalids are definitely much lower vs. Phateus.
hero member
Activity: 769
Merit: 500
Updated the kernel performance data in the first post, using SDK 2.5 and the KernelAnalyzer 1.9 CAL 11.7 profile.
hero member
Activity: 769
Merit: 500
You have to weigh the loss of valid nonces against the higher efficiency gained by removing control flow from the kernel (all current GPUs dislike if/else and the like). I thought this tradeoff would be well worth it, but you could prove me wrong. I was thinking about a better way of writing the positive (valid) nonces into the output buffer, but that didn't work.

Any good ideas for that part of the kernel will be a big plus!

Dia
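
One possible direction for the output-writing problem, offered only as a sketch and not as Diapolo's or Phateus' implementation: give the output buffer several slots and pick the slot from the low bits of the nonce, so that two valid nonces from the same execution usually land in different slots without any counter or atomic operation. The Python below only models the buffer logic on the host side; the slot count and the names `write_single` and `write_slotted` are hypothetical and chosen for illustration.

```python
SLOTS = 16            # hypothetical slot count; must be a power of two
MASK = SLOTS - 1


def write_single(valid_nonces):
    """Model of a one-slot output buffer: a later hit overwrites an earlier one."""
    out = None
    for nonce in valid_nonces:
        out = nonce
    return [] if out is None else [out]


def write_slotted(valid_nonces):
    """Model of a multi-slot buffer: the slot index comes from the nonce's low bits."""
    out = [None] * SLOTS
    for nonce in valid_nonces:
        out[nonce & MASK] = nonce      # two hits collide only when their low bits match
    return [n for n in out if n is not None]


hits = [0x1A2B3C05, 0x7F00000A]        # two made-up valid nonces from one execution
print(len(write_single(hits)), len(write_slotted(hits)))
```

In a real kernel the host would have to scan and clear every slot after each execution, which is extra work of its own, so whether this is a net win would need to be measured.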

full member
Activity: 219
Merit: 120
The latest version (2011-08-04) has a major problem that I can see.

The assumption that there won't be more than 1 valid nonce per kernel execution is very wrong. At aggression 14, for example, each kernel execution tests 2^30 nonces. The chance that there will be more than 1 valid nonce in any given kernel execution in this case is going to be about 2.5% (if I did the math right). This effectively causes a net loss in performance compared to the previous version at high aggression. At lower aggression values (10 and below) this is less of a problem, since the performance loss in these cases will be much less than 1%.
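
The estimate above can be checked with a short calculation. This is a sketch under stated assumptions: each nonce is taken to pass the difficulty-1 check independently with probability 2^-32, the 2^30 nonces per execution is the figure quoted above for aggression 14, and the hit count is approximated as Poisson. The function names are illustrative only.

```python
import math

P_SHARE = 2.0 ** -32  # assumed chance that a single nonce passes the difficulty-1 check


def p_multiple_hits(nonces_per_run: int) -> float:
    """Poisson estimate of P(two or more valid nonces in one kernel execution)."""
    lam = nonces_per_run * P_SHARE           # expected valid nonces per execution
    return 1.0 - math.exp(-lam) * (1.0 + lam)


def lost_fraction(nonces_per_run: int) -> float:
    """Expected share of valid nonces dropped if only one can be reported per execution."""
    lam = nonces_per_run * P_SHARE
    expected_reported = -math.expm1(-lam)    # P(at least one valid nonce)
    return 1.0 - expected_reported / lam


# 2^30 is the figure quoted for aggression 14; 2^26 is a hypothetical smaller batch.
for exponent in (26, 30):
    n = 2 ** exponent
    print(f"2^{exponent} nonces: P(>=2) = {p_multiple_hits(n):.4%}, lost = {lost_fraction(n):.2%}")
```

With 2^30 nonces per execution the first function gives roughly 2.6%, in line with the "about 2.5%" figure. The second function is only meaningful if the kernel really can report no more than one nonce per execution.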
legendary
Activity: 1855
Merit: 1016
just to be clear though.. VECTORS VECTORS2 doesn't hurt anything.. it is just extraneous?
I preferred to leave it as it was: less typing and deleting when testing the versions.
But if there are two vector switches, as in "VECTORS VECTORS2", which one is taken into account: the first or the last on the command line?
Because VECTORS2 and VECTORS give different performance.
full member
Activity: 140
Merit: 100
New version was just released, it should be the fastest for 69XX cards:
Download version 2011-08-04 (pre-release): http://www.mediafire.com/?upwwud7kfyx7788

This is the preferred switch for Phoenix in order to achieve comparable performance:
Code:
-k phatk AGGRESSION=12 BFI_INT FASTLOOP=false VECTORS VECTORS2 WORKSIZE=256
or
Code:
-k phatk AGGRESSION=12 BFI_INT FASTLOOP=false VECTORS VECTORS2 WORKSIZE=128

Please test this version with SDK 2.4 / SDK 2.5! SDK 2.1 performance seems worse, but at least it should work. Report any errors and problems here and let me know what you think.
Have a look at your cards' temperatures. I got a report that they may be lower, which would be great :).

Regards,
Dia

I get 0.8 MH/s faster with phoenix-r112, but temps do appear to be 3-4 °C lower.

6970 Lightning (940,1375) x2
Ubuntu 10.10
SDK 2.4
Cat 11.3

hero member
Activity: 769
Merit: 500
Yeah, it doesn't hurt, it's just ignored.

Dia
sr. member
Activity: 476
Merit: 250
moOo
Quote
Phoenix 1.5 has the bfipatcher.py included, so I never included it in any kernel package
Strange, it must not have been in my GUIMiner's version of Phoenix 1.5, which does seem different.

Quote
You are right, sorry for that! Just VECTORS2 is the way to go. I edited the first post
just to be clear though.. VECTORS VECTORS2 doesn't hurt anything.. it is just extraneous?

I preferred to leave it as it was: less typing and deleting when testing the versions.
hero member
Activity: 769
Merit: 500
You are right, sorry for that! Just VECTORS2 is the way to go. I edited the first post.

Dia
sr. member
Activity: 252
Merit: 250
It says otherwise in the first post :P :P
hero member
Activity: 769
Merit: 500
Correct :)
legendary
Activity: 1344
Merit: 1004
I am happy to report these results for my 5830 running on SDK 2.4, fglrx-driver 1:11-6-2 (Debian unstable ~3 weeks snapshot), phoenix 1.50 and options "VECTORS VECTORS2 BFI_INT FASTLOOP=false AGGRESSION=14 WORKSIZE=256":

blah blah blah

You don't need VECTORS VECTORS2. Just VECTORS2
sr. member
Activity: 252
Merit: 250
I am happy to report these results for my 5830 running on SDK 2.4, fglrx-driver 1:11-6-2 (Debian unstable ~3 weeks snapshot), phoenix 1.50 and options "VECTORS VECTORS2 BFI_INT FASTLOOP=false AGGRESSION=14 WORKSIZE=256":

* 07-11 @ 1040 MHz - 334 Mhash/s
* 08-04 @ 1040 MHz - 335.7 Mhash/s

* 07-11 @ 1050 MHz - 337.3 Mhash/s
* 08-04 @ 1050 MHz - 338.9 Mhash/s

07-17 was untested because of lots of hardware errors. Even happier news for me is that 08-04 brought the number of hardware errors to about the same level as 07-11 (~0.2% of the accepted shares).

Thank you again for your work and I'm looking forward to having this kernel ported to cgminer!
hero member
Activity: 769
Merit: 500
No problem mate, that's 10 BTC each card :-D.

Dia
newbie
Activity: 23
Merit: 0
Catalyst 11.6, SDK 2.4 here. Jumped from ~428 to ~435 with 5870, 950/300 clocks.

My friend got from ~416 to 425 with 920/300 clocks, also 5870 with Catalyst 11.5, SDK 2.1.

Keep up the good work, Dia!!
hero member
Activity: 769
Merit: 500
The current phatk should be faster for 58XX, sorry, but I said it _IS_ faster for 69XX ;).

Dia
legendary
Activity: 1855
Merit: 1016
04-08-2011 version reduces hash speed on 5870.
I used vectors 4 & got only 444, while 2.1 phateus gives 447.
 
hero member
Activity: 769
Merit: 500
The 08-04 pre-release for some reason lowered the hash rate on only one of my two cards (the second card went from an average of 426 to 389) with the same OC settings, while card #1 remained around 426 (both 6970s).

Are you sure the VECTORS2 switch IS set!?

Dia