DiaKGCN kernel for CGMINER + Phoenix 2 (79XX / 78XX / 77XX / GCN) - 2012-05-25 - page 5.

sveetsnelda

hero member

Activity: 642

Merit: 500

I tried it, and I'm sorry that I haven't reported back. Work has been chaotic.

The only way I can get a hashrate similar to DiabloMiner's with this kernel is to use a very high intensity (greater than 10). Doing so, though, makes CPU usage skyrocket, and I burn more wattage than the hashrate increase is worth. With a few changes to the poclbm kernel included with CGMiner, however, I get 96 percent of DiabloMiner's performance while leaving the intensity at 9. CGMiner also gives me backup pools, RPC, thermal controls, etc., which more than makes up for the ~4 percent loss in performance. I'm not at home right now to look at every change, but defining the Ch and Ma functions to use bitselect is basically all that was needed.
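For readers unfamiliar with the Ch and Ma change mentioned above, here is a minimal sketch of the idea in Python rather than OpenCL. The `bitselect` helper only emulates the OpenCL built-in on 32-bit words, and every function name below is hypothetical; this is not the code from the modified poclbm kernel, which was not posted in the thread.

```python
MASK32 = 0xFFFFFFFF  # SHA-256 works on 32-bit words


def bitselect(a, b, c):
    """Emulate OpenCL bitselect(a, b, c): take the bit from b where c is 1, else from a."""
    return ((a & ~c) | (b & c)) & MASK32


def ch_reference(x, y, z):
    """Textbook SHA-256 Ch: for each bit, x chooses between y and z."""
    return ((x & y) ^ (~x & z)) & MASK32


def ma_reference(x, y, z):
    """Textbook SHA-256 Maj: each result bit is the majority of the three input bits."""
    return ((x & y) ^ (x & z) ^ (y & z)) & MASK32


def ch_bitselect(x, y, z):
    """Ch as one select: where x is 1 take y, otherwise take z."""
    return bitselect(z, y, x)


def ma_bitselect(x, y, z):
    """Maj as one select: where x and z differ, y decides; where they agree, take x."""
    return bitselect(x, y, z ^ x)


if __name__ == "__main__":
    x, y, z = 0x6A09E667, 0xBB67AE85, 0x3C6EF372
    print(ch_bitselect(x, y, z) == ch_reference(x, y, z))
    print(ma_bitselect(x, y, z) == ma_reference(x, y, z))
```

Each function collapses several AND/XOR operations into a single select, which is presumably where the instruction savings come from.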

I'll try to send you a PM tonight with more details.

Diapolo

hero member

Activity: 772

Merit: 500

Quote from: d3m0n1q_733rz on February 02, 2012, 02:30:30 PM

Okay, I've downloaded the kernel and am trying it now. So far, not bad. The Vectors4 still has the whole issue with showing twice as many hashes as are actually computing, but I think that has to do with the init file as you said there were incompatibilities in the code when using the VECTORS4 option. Also, why the (u) variable when using bitselect?
I like how you used the nonce here. It seems that it could be better than using a series of if-else statements.
You've managed to keep the instructions low, but somehow the darn thing's not hashing faster. Probably because it's not repeating the same task again and again for and with the same variables. But, as you said, it's optimized for GCN so I have no idea.

VEC4 is bugged until I say it's fixed, sorry. The (u) is a typecast: after round 64 I use some mixed scalar and vector values, and the cast is needed to make their types match.
For me this is the fastest version on my 7970 ... but it seems no one cares to try it (on GCN cards).

Dia

d3m0n1q_733rz

sr. member

Activity: 378

Merit: 250

Okay, I've downloaded the kernel and am trying it now. So far, not bad. The Vectors4 still has the whole issue with showing twice as many hashes as are actually computing, but I think that has to do with the init file as you said there were incompatibilities in the code when using the VECTORS4 option. Also, why the (u) variable when using bitselect?
I like how you used the nonce here. It seems that it could be better than using a series of if-else statements.
You've managed to keep the instructions low, but somehow the darn thing's not hashing faster. Probably because it's not repeating the same task again and again for and with the same variables. But, as you said, it's optimized for GCN so I have no idea.

Diapolo

hero member

Activity: 772

Merit: 500

Quote from: wndrbr3d on February 01, 2012, 02:48:43 PM

@Diapolo:

So do you have any opinions on GCN vs. VLIW4/5 when it comes to optimizations for the mining cores that are out there? Do you expect GCN to be a nice step forward, or at best, should we be happy that GCN didn't nerf performance when compared to the VLIW4/5 architecture?

I'm curious to get your feedback.

Thanks for all your work!

I think GCN is a great step in the right direction. It's far easier for me to write code AND for the compiler to generate it, which results in pretty good utilization of the GPU's compute units. The CUs, in contrast to VLIW4/VLIW5 units, consist of independent vector units, which makes code or wavefronts on the GPU depend less on the results of other units. The OpenCL compiler for GCN feels far more mature than it was after the release of the 69XX series of cards. The drawback seems to be that the current kernels all have very similar performance levels.

Dia

Diapolo

hero member

Activity: 772

Merit: 500

New version 02-02-2012 is ready for download. Release highlights include support for the OpenCL 1.1 global offset parameter (THX DiabloD3 for the idea - damn it sucked to do this in Python ^^), a fixed non-VECTOR code path and faster kernel execution on GCN cards (achieved by saving instructions in the GPU ISA code).
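As a rough illustration of what the global offset parameter changes, the sketch below simulates in plain Python how each work item could arrive at its nonce with and without it. It assumes a scalar (non-vector) kernel; both function names are hypothetical, and this is not the actual host or kernel code of this release. In OpenCL 1.1 the runtime adds the global work offset to the value that `get_global_id(0)` returns, so the kernel no longer needs a separate base argument.

```python
NONCE_MASK = 0xFFFFFFFF  # nonces are 32-bit values


def nonces_with_base_argument(base, global_size):
    """Older style: global IDs start at 0 and the kernel adds a base passed as an argument."""
    return [(base + gid) & NONCE_MASK for gid in range(global_size)]


def nonces_with_global_offset(offset, global_size):
    """OpenCL 1.1 style: the runtime applies the offset, so the global ID already is the nonce."""
    global_ids = range(offset, offset + global_size)
    return [gid & NONCE_MASK for gid in global_ids]


if __name__ == "__main__":
    # Both approaches scan the same nonce range; the second saves one argument and one add.
    print(nonces_with_base_argument(1000, 4))
    print(nonces_with_global_offset(1000, 4))
```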

download current version:
http://www.filedropper.com/diakgcn02-02-2012

Dia

wndrbr3d

hero member

Activity: 914

Merit: 500

@Diapolo:

So do you have any opinions on GCN vs. VLIW4/5 when it comes to optimizations for the mining cores that are out there? Do you expect GCN to be a nice step forward, or at best, should we be happy that GCN didn't nerf performance when compared to the VLIW4/5 architecture?

I'm curious to get your feedback.

Thanks for all your work!

Diapolo

hero member

Activity: 772

Merit: 500

Quote from: Dyaheon on January 30, 2012, 09:22:55 AM

~695MH/s on a 7970 at 1175/1375 clocks, with the command line from the OP.

Diablominer gives ~700MH/s with less interface lag though.

Some reports indicate that a lower AGGRESSION could lead to higher values, but I can't confirm this on my machine.
I'm working hard on the next version, the optimisation is not finished...

Dia

Dyaheon

member

Activity: 121

Merit: 10

~695MH/s on a 7970 at 1175/1375 clocks, with the command line from the OP.

Diablominer gives ~700MH/s with less interface lag though.

Diapolo

hero member

Activity: 772

Merit: 500

download current version:
http://www.filedropper.com/diakgcn29-01-2012

Should be faster than the previous one, changelog is included and I edited the first post to be more informative!

Dia

Diapolo

hero member

Activity: 772

Merit: 500

Quote from: ?? on ??

Is the hashrate display broken for VECTORS4? Running VEC2/AGG10/WS256 I get ~626 MH/s at 1080/366 (about 10 MH/s less than Diablo, not bad!).

If I use VEC4 my hashrate display doubles - ~1.22 GH/s. I wish this wasn't a bug or something.

VEC4 is broken, sorry to say ... it currently runs at VEC2 speed. VEC4 does not seem to be a good option for GCN.
I will polish the kernel further and supply a changelog in the future. I only wanted to get it released first.

Dia

simplecoin

sr. member

Activity: 406

Merit: 250

Quote from: ?? on ??

Is the hashrate display broken for VECTORS4? Running VEC2/AGG10/WS256 I get ~626 MH/s at 1080/366 (about 10 MH/s less than Diablo, not bad!).

If I use VEC4 my hashrate display doubles - ~1.22 GH/s. I wish this wasn't a bug or something.

Yes, I see this too at stock (1.09Gh v4 agg12). Although, shares are accepted..... gonna wait to see what my site says actual shares are

UPDATE: Actual Hashrate is about the same as vectors2. Seems like a reporting issue.
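The cross-check described above (comparing the miner's displayed hashrate against what the pool sees) can be approximated by hand. The sketch below assumes the usual rule of thumb that one difficulty-1 share represents roughly 2^32 hashes on average; the function name and the sample numbers are hypothetical and are not taken from this thread.

```python
HASHES_PER_DIFF1_SHARE = 2 ** 32  # average work behind one difficulty-1 share (approximation)


def estimated_hashrate_mhs(accepted_shares, elapsed_seconds, share_difficulty=1):
    """Estimate the real hashrate in MH/s from shares accepted by the pool."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    hashes = accepted_shares * share_difficulty * HASHES_PER_DIFF1_SHARE
    return hashes / elapsed_seconds / 1e6


if __name__ == "__main__":
    # Hypothetical sample: 900 difficulty-1 shares accepted over one hour.
    print(round(estimated_hashrate_mhs(900, 3600)))  # about 1074 MH/s
```

Because share arrival is random, the estimate only settles after many shares, so a short sample can differ noticeably from the true rate. If the displayed figure is about double the share-based estimate over a long run, that points to a reporting issue rather than a real speedup.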

Diapolo

hero member

Activity: 772

Merit: 500

http://www.filedropper.com/diakgcn28-01-2012

I'll leave this without comments for now ...

Dia

Diapolo

hero member

Activity: 772

Merit: 500

A second version was sent to the testers, if others are interested in trying this out just give me a shout.

Dia

wndrbr3d

hero member

Activity: 914

Merit: 500

Totes subbing to this thread. I have the money, just waiting for the results

jjiimm_64

legendary

Activity: 1876

Merit: 1000

I have a 4x7970 rig. would love to test.

simplecoin

sr. member

Activity: 406

Merit: 250

Nice work for sure! The more 7970 kernels the better

Diapolo

hero member

Activity: 772

Merit: 500

If everything keeps going this smoothly, a release is just around the corner ... stay tuned.

Dia

simplecoin

sr. member

Activity: 406

Merit: 250

got a 1 card rig if you need it.

Diapolo

hero member

Activity: 772

Merit: 500

Quote from: sveetsnelda on January 27, 2012, 06:05:39 PM

Same story. Have a 4 card rig and would be glad to help.

PM sent, thanks!

Dia

sveetsnelda

hero member

Activity: 642

Merit: 500

Same story. Have a 4 card rig and would be glad to help.

Topic: DiaKGCN kernel for CGMINER + Phoenix 2 (79XX / 78XX / 77XX / GCN) - 2012-05-25 - page 5. (Read 27839 times)