First off, excellent miner and great work. Java has far better multi-platform support than Python, and the Boost libraries that pycl requires totally suck and are huge. I hate Boost.
Anyhow, I'm consistently running into a strange issue with my dual 5830 setup using your miner. After 4-5 hours of mining, one of the cores (at random) will just stop. The miner won't report any errors, and the cumulative hashing average will continue to slowly drop. In fact I get no errors anywhere: I can still use all the aticonfig tools to manage the cores, and simply restarting the miner seems to fix the problem. The cores are overclocked (temps are ~70C under load) and I poll temps every minute using aticonfig.
Would it be worth adding some code to detect this stall? Right now I just parse the GPU load output from aticonfig and restart the miners if it drops below a certain threshold.
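A rough sketch of that kind of external check in Python, for anyone who wants to try the same workaround. It assumes the load report contains one line per adapter that looks like `GPU load :    99%`, and that `aticonfig --odgc --adapter=all` is the command that prints it; both are assumptions to verify against your own driver version. The function names and the threshold of 50 are made up for illustration.

```python
import re
import subprocess

# Assumed line format in the aticonfig report, e.g. "GPU load :    99%".
LOAD_RE = re.compile(r"GPU load\s*:\s*(\d+)\s*%")


def read_load_report():
    """Run aticonfig and return its text output (command line is an assumption)."""
    result = subprocess.run(
        ["aticonfig", "--odgc", "--adapter=all"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_gpu_loads(text):
    """Return the load percentages in the order they appear in the report."""
    return [int(value) for value in LOAD_RE.findall(text)]


def stalled_adapters(text, threshold=50):
    """Return the positions (0-based, in report order) whose load is below threshold."""
    loads = parse_gpu_loads(text)
    return [i for i, load in enumerate(loads) if load < threshold]
```

A wrapper would call `stalled_adapters(read_load_report())` once a minute and restart the miner when the list is non-empty. Note that the positions are in report order, which may or may not match the adapter numbers the driver uses.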
I'm using the latest drivers and SDK. I could not get the 2.1 SDK to work for the life of me in my environment, always "No OpenCL devices found" (or something to that effect).
Also, to anyone else using the 5830: I seem to get best results with: -w 128 -f 1 -v 18
Thanks again!
Stalling cores could very well be a "normal" driver bug. It happens to some OpenCL users (it isn't just my miner, but it does seem to require multiple GPUs), and I'm not sure why. That said, I think almost every report I've heard involved using aticonfig to poll temps, so it might be related.
I used to have code that detected the condition, but I can't stop the threads for that GPU, and if I try to "solve" it everything can lock up to the point where the machine needs a reboot.
I'm having the same issue with random GPU stalls on a dual 6990 system. I don't remember if I noticed it before I started polling temps with aticonfig, so I turned off the polling to see if it helps. If not, I would very much like DiabloMiner to go ahead and reboot the machine (or run a script, or just exit) if it can detect the condition.
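The detect-and-exit behaviour asked for here could look roughly like the sketch below. It is Python, not DiabloMiner code, and everything in it is hypothetical: the `StallWatchdog` class, the idea that the miner can report a running per-GPU hash count, and the 120 second timeout. It only detects the stall; it deliberately does not try to stop the stuck GPU's threads.

```python
import time


class StallWatchdog:
    """Flags a GPU as stalled when its hash counter stops increasing."""

    def __init__(self, timeout_s=120.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_count = {}        # gpu id -> last hash count seen
        self.last_progress = {}     # gpu id -> time the count last increased

    def update(self, gpu, hash_count):
        """Record the current cumulative hash count for one GPU."""
        now = self.clock()
        if gpu not in self.last_count or hash_count > self.last_count[gpu]:
            self.last_count[gpu] = hash_count
            self.last_progress[gpu] = now

    def stalled(self):
        """Return the ids of GPUs with no progress for at least timeout_s."""
        now = self.clock()
        return sorted(
            gpu for gpu, seen in self.last_progress.items()
            if now - seen >= self.timeout_s
        )
```

When `stalled()` returns a non-empty list, the simplest reaction is probably to end the whole process with a non-zero exit code and let an outer script decide whether to restart the miner or reboot, rather than attempting any cleanup of the hung GPU.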