Topic: DiaKGCN kernel for CGMINER + Phoenix 2 (79XX / 78XX / 77XX / GCN) - 2012-05-25 - page 5.

hero member
Activity: 642
Merit: 500
I tried it, and I'm sorry that I haven't reported back.  Work has been chaotic.

The only way I can get a hashrate similar to DiabloMiner's with this kernel is to use a very high intensity (greater than 10).  By doing this, though, CPU usage skyrockets and I burn more wattage than the hashrate increase is worth.  I can make a few changes to the poclbm kernel included with CGMiner, though, and get 96 percent of DiabloMiner's performance while leaving the intensity at 9.  By using CGMiner, I am able to use backup pools, RPC, thermal controls, etc.  This more than makes up for the ~4 percent loss in performance.  I'm not at home right now to look at every change, but defining the Ch and Ma functions to use bitselect is basically all that was needed.

I'll try to send you a PM tonight with more details.
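
For anyone wondering what "defining Ch and Ma to use bitselect" amounts to, here is a minimal Python model of the idea. It is only a sketch, not the actual kernel change: `bitselect` below is a hand-written stand-in for the OpenCL built-in of the same name, restricted to 32-bit words, and the `*_classic` helpers are the textbook SHA-256 definitions, included only for comparison.

```python
MASK32 = 0xFFFFFFFF  # SHA-256 operates on 32-bit words


def bitselect(a, b, c):
    """Model of OpenCL bitselect: take the bit from b where c is 1, else from a."""
    return ((a & ~c) | (b & c)) & MASK32


def ch_classic(x, y, z):
    # Textbook SHA-256 "choose": x selects between y and z, bit by bit.
    return ((x & y) ^ (~x & z)) & MASK32


def ma_classic(x, y, z):
    # Textbook SHA-256 "majority": each result bit is the majority of the three inputs.
    return ((x & y) ^ (x & z) ^ (y & z)) & MASK32


def ch(x, y, z):
    # Where x is 1 pick y, otherwise pick z: a single select.
    return bitselect(z, y, x)


def ma(x, y, z):
    # Where x and z agree the majority is x; where they differ, y breaks the tie.
    return bitselect(x, y, z ^ x)
```

Whether a single select is actually faster than the AND/XOR chain depends on how the hardware and compiler map `bitselect` to instructions, so treat this as an equivalence check rather than a performance claim.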
hero member
Activity: 772
Merit: 500
Okay, I've downloaded the kernel and am trying it now.  So far, not bad.  VECTORS4 still has the issue of showing twice as many hashes as are actually being computed, but I think that has to do with the init file, as you said there were incompatibilities in the code when using the VECTORS4 option.  Also, why the (u) variable when using bitselect?
I like how you used the nonce here.  It seems that it could be better than using a series of if-else statements.
You've managed to keep the instruction count low, but somehow the darn thing's not hashing faster.  Probably because it's not repeating the same task again and again with the same variables.  But, as you said, it's optimized for GCN, so I have no idea.

VEC4 is bugged until I say it got fixed, sorry :D. The (u) is a typecast: after round 64 I use some mixed scalar and vector values, and the cast is needed to bring them to the same type.
For me this is the fastest version on my 7970 ... but it seems no one cares to try it (on GCN cards).

Dia
sr. member
Activity: 378
Merit: 250
Okay, I've downloaded the kernel and am trying it now.  So far, not bad.  VECTORS4 still has the issue of showing twice as many hashes as are actually being computed, but I think that has to do with the init file, as you said there were incompatibilities in the code when using the VECTORS4 option.  Also, why the (u) variable when using bitselect?
I like how you used the nonce here.  It seems that it could be better than using a series of if-else statements.
You've managed to keep the instruction count low, but somehow the darn thing's not hashing faster.  Probably because it's not repeating the same task again and again with the same variables.  But, as you said, it's optimized for GCN, so I have no idea.
hero member
Activity: 772
Merit: 500
@Diapolo:

So do you have any opinions on GCN vs. VLIW4/5 when it comes to optimizations for the mining cores that are out there? Do you expect GCN to be a nice step forward, or at best, should we be happy that GCN didn't nerf performance when compared to the VLIW4/5 architecture?

I'm curious to get your feedback. :)

Thanks for all your work!

I think GCN is a great step in the right direction. It's far easier for me AND the compiler to write / generate code, which results in pretty good utilization of the GPU's compute units. The CUs, in contrast to VLIW4/VLIW5 units, consist of independent vector units, which makes code or wavefronts on the GPU depend less on the results of other units. The OpenCL compiler for GCN feels far more mature than it was after the release of the 69XX series of cards. The drawback seems to be that the current kernels all have very similar performance levels :D.

Dia
hero member
Activity: 772
Merit: 500
New version 02-02-2012 is ready for download. Release highlights include support for the OpenCL 1.1 global offset parameter (THX DiabloD3 for the idea - damn it sucked to do this in Python ^^), a fixed non-VECTOR code path and faster kernel execution on GCN cards (achieved by saving instructions in the GPU ISA code).
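
To illustrate what the global offset buys, here is a small Python model of the two ways a scalar kernel can arrive at its nonce. This is a sketch under assumptions, not the DiaKGCN code: both function names are made up for illustration, and the vector code paths are left out. In OpenCL 1.0 the global work offset had to be NULL, so work-item ids always started at 0 and the host had to pass the starting nonce as a kernel argument; OpenCL 1.1 lets the host pass the start as the global work offset instead.

```python
def nonces_with_base_arg(base, global_size):
    # OpenCL 1.0 style: ids run from 0, the kernel adds a base nonce
    # that the host passed in as an extra argument.
    return [base + gid for gid in range(global_size)]


def nonces_with_global_offset(offset, global_size):
    # OpenCL 1.1 style: the host sets the global work offset, so the
    # id each work-item sees already starts at `offset`.
    return [gid for gid in range(offset, offset + global_size)]
```

Both variants cover the same nonce range; the second presumably spares the kernel one argument and one add per work-item.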

download current version:
http://www.filedropper.com/diakgcn02-02-2012

Dia
hero member
Activity: 914
Merit: 500
@Diapolo:

So do you have any opinions on GCN vs. VLIW4/5 when it comes to optimizations for the mining cores that are out there? Do you expect GCN to be a nice step forward, or at best, should we be happy that GCN didn't nerf performance when compared to the VLIW4/5 architecture?

I'm curious to get your feedback. :)

Thanks for all your work!
hero member
Activity: 772
Merit: 500
~695MH/s on a 7970 at 1175/1375 clocks, with the command line from the OP.

DiabloMiner gives ~700MH/s with less interface lag though.

Some reports indicate that a lower AGGRESSION could lead to higher values, but I can't confirm this for my machine.
I'm working hard on the next version, the optimisation is not finished...

Dia
member
Activity: 121
Merit: 10
~695MH/s on a 7970 at 1175/1375 clocks, with the command line from the OP.

DiabloMiner gives ~700MH/s with less interface lag though.
hero member
Activity: 772
Merit: 500
download current version:
http://www.filedropper.com/diakgcn29-01-2012

Should be faster than the previous one, changelog is included and I edited the first post to be more informative!

Dia
hero member
Activity: 772
Merit: 500
Is the hashrate display broken for VECTORS4? Running VEC2/AGG10/WS256 I get ~626MH/s at 1080/366 (about 10 MH/s less than Diablo, not bad!).

If I use VEC4 my hashrate display doubles - ~1.22GH/s. I wish this wasn't a bug or something :o

VEC4 is broken, sorry to say ;) ... it currently runs at VEC2 speed. VEC4 does not seem to be a good option for GCN.
I will polish the kernel further and supply a changelog in the future. I only wanted to get it released first.

Dia
sr. member
Activity: 406
Merit: 250
Is the hashrate display broken for VECTORS4? Running VEC2/AGG10/WS256 I get ~626MH/s at 1080/366 (about 10 MH/s less than Diablo, not bad!).

If I use VEC4 my hashrate display doubles - ~1.22GH/s. I wish this wasn't a bug or something :o

Yes, I see this too at stock (1.09GH/s, VEC4, AGG12). Shares are accepted, though ... gonna wait to see what my site says the actual shares are.

UPDATE: Actual Hashrate is about the same as vectors2. Seems like a reporting issue.
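
Checking the miner's display against the pool's share count, as done above, can be sketched like this. It is only a rough estimate under stated assumptions: the pool is assumed to hand out difficulty-1 shares, each of which takes 2^32 hashes on average, and the function name and the numbers in the usage note are made up for illustration, not measurements from this thread.

```python
HASHES_PER_DIFF1_SHARE = 2 ** 32  # expected hashes needed per difficulty-1 share


def hashrate_from_shares(accepted_shares, seconds, share_difficulty=1):
    """Estimate the average hashrate in MH/s from accepted shares over a time window."""
    hashes = accepted_shares * share_difficulty * HASHES_PER_DIFF1_SHARE
    return hashes / seconds / 1e6
```

For example, a hypothetical 525 accepted shares in one hour works out to roughly 626 MH/s. If the miner shows about twice that over the same window, the display is the more likely culprit. Finding shares is a matter of luck, so short windows give noisy estimates; an hour or more is a better basis.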
hero member
Activity: 772
Merit: 500
A second version was sent to the testers, if others are interested in trying this out just give me a shout.

Dia
hero member
Activity: 914
Merit: 500
Totes subbing to this thread. I have the money, just waiting for the results :)
legendary
Activity: 1876
Merit: 1000

I have a 4x7970 rig.  Would love to test.
sr. member
Activity: 406
Merit: 250
Nice work for sure! The more 7970 kernels the better :)
hero member
Activity: 772
Merit: 500
If everything keeps going this smoothly, a release is just around the corner ... stay tuned.

Dia
sr. member
Activity: 406
Merit: 250
got a 1 card rig if you need it.
hero member
Activity: 772
Merit: 500
Same story.  Have a 4 card rig and would be glad to help.

PM sent, thanks!

Dia
hero member
Activity: 642
Merit: 500
Same story.  Have a 4 card rig and would be glad to help.