Hey everyone, I wanted to share some good results. Thanks to NameTaken I've been able to bump things up:
Just to reiterate, here is exactly what I did. First I switched over to the optimized .BIN files with SGMiner (since I'm running 64-bit Windows I had to change the last digit in the file name from an "8" to a "4" for SGMiner to use the bin). Then I loaded in the optimized .CL file (after backing up the old .cl file first). I let that run for a few days because I got busy, and noticed about a 5-10Kh bump overall, up to an average of 675-680Kh per card. Then last night I went ahead and flashed my cards with the BIOS that NameTaken pointed me to, and that finally allowed me to break the 700 barrier. The screenshot above is from this morning after the rig had been running for about 7 hours. (Links for all files below.)
Interestingly, GPU 2 has now become the best performer, whereas before it was GPU 0. I'm getting a very reliable 720Kh out of it. Meanwhile GPU 3 is doing the worst at 685ish. I didn't have much time because I was getting really tired, but I did mess around a little with the gpu-engine settings and right now I have it set to "--gpu-engine 1065,1075,1080,1030". Yup, doesn't look like a good config, does it? But that seems to be my sweet spot (so far). GPU 2 loves the extra engine speed: it jumps 10Kh when I move from 1065 to 1080. It's the reverse for GPU 3, which either does nothing or goes down when I go over 1030. But I do need to play around with it a bit more, maybe next week.
I've also been trying to find the optimal GPU temperature. I notice that at 75C the cards definitely throttle back a bit. I have the target set to 68C right now, but I may be going too far and wasting energy on fans (any thoughts?).
I'd still like to see if I can inch things up a bit further than this. But I am very pleased with the results (and thanks to everyone who helped!). Later on I'm going to try to troubleshoot GPU 3 and also see if GPU 0 & 1 can't do more. Then I'd like to do some voltage experiments to decrease the power draw. I also have an average rejected share rate of about 3-4%, which I think I could lower too (although I have no clue where to start).
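To put the 3-4% reject rate in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes rejected shares simply don't get credited, so the credited rate is about the raw rate times the accepted fraction. The 700Kh figure is just the per-card "barrier" mentioned above used as a round number, and `credited_rate` is a name I made up for this illustration.

```python
def credited_rate(hashrate_kh, reject_percent):
    """Approximate credited hashrate, assuming rejected shares earn nothing."""
    return hashrate_kh * (100 - reject_percent) / 100


# Illustrative per-card rate (round number taken from the post, not a measurement).
PER_CARD_KH = 700

for reject_percent in (3, 4):
    rate = credited_rate(PER_CARD_KH, reject_percent)
    print(f"{reject_percent}% rejects -> about {rate:.0f}Kh credited per card")
```

So under that assumption a card showing 700Kh is credited for somewhere around 672-679Kh, which is about the same size as the gain from the engine clock tweaks.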
Here is my current config that results in the screenshot above:
--lookup-gap 2 --thread-concurrency 8192,8192,8192,8192 -g 2 -I 13 -w 256 --auto-fan --temp-cutoff 90,90,90,90 --temp-overheat 85,85,85,85 --temp-target 68,68,68,68 --gpu-memclock 1500,1500,1500,1500 --gpu-engine 1065,1075,1080,1030 --expiry 1 --scan-time 1 --queue 0
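Since the comma-separated values are easy to misread, here is a small Python sketch that splits the line above into per-GPU settings. It only assumes that the Nth value in each comma-separated list belongs to GPU N (which is how I've been using them). It does not call SGMiner at all, and `parse_flags` and `per_gpu` are helper names invented for this example.

```python
import shlex

CONFIG = (
    "--lookup-gap 2 --thread-concurrency 8192,8192,8192,8192 -g 2 -I 13 -w 256 "
    "--auto-fan --temp-cutoff 90,90,90,90 --temp-overheat 85,85,85,85 "
    "--temp-target 68,68,68,68 --gpu-memclock 1500,1500,1500,1500 "
    "--gpu-engine 1065,1075,1080,1030 --expiry 1 --scan-time 1 --queue 0"
)


def parse_flags(line):
    """Map each flag to its value string; flags with no value map to None."""
    tokens = shlex.split(line)
    flags = {}
    i = 0
    while i < len(tokens):
        name = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if nxt is not None and not nxt.startswith("-"):
            flags[name] = nxt
            i += 2
        else:
            flags[name] = None  # switch such as --auto-fan
            i += 1
    return flags


def per_gpu(flags):
    """Build one dict per GPU from the flags that carry comma-separated lists."""
    lists = {k: v.split(",") for k, v in flags.items() if v and "," in v}
    count = len(next(iter(lists.values())))
    return [{k: int(vals[i]) for k, vals in lists.items()} for i in range(count)]


for gpu, settings in enumerate(per_gpu(parse_flags(CONFIG))):
    print(f"GPU {gpu}: engine {settings['--gpu-engine']} MHz, "
          f"temp target {settings['--temp-target']}C")
```

This makes it easier to see at a glance that GPU 2 gets the 1080 engine clock and GPU 3 gets 1030, while everything else is identical across the four cards.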
I left out Powertune for now, although I'm going to experiment and see if it changes anything. I also might experiment again with thread concurrency in the 11200 range. Will post back with results. Oh, and one last thing: I noticed that CGWatcher does some damage to my hashrate because it is always pinging the GPUs for information. I was able to solve this by changing the monitor from checking in every 10 seconds to every 180 seconds (every 3 minutes). This seems like a good fit because I still get the benefit of CGWatcher being able to reboot the miner if something goes wrong.
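For a sense of how much the monitoring interval change matters, this is the simple arithmetic in Python. It only counts how often the monitor checks in per hour at each interval. It says nothing about how expensive each check actually is, which I haven't measured.

```python
def polls_per_hour(interval_seconds):
    """Number of monitor check-ins per hour at a fixed polling interval."""
    return 3600 // interval_seconds


before = polls_per_hour(10)    # old interval: every 10 seconds
after = polls_per_hour(180)    # new interval: every 3 minutes

print(f"10s interval:  {before} checks per hour")
print(f"180s interval: {after} checks per hour")
print(f"That is {before // after}x fewer interruptions")
```

That works out to 360 checks per hour before and 20 after, so 18 times fewer, while a crashed miner would still be noticed within about 3 minutes.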
Comments / critique / whatever welcome.