Dime
Thanks for your feedback on everything. I'll try to respond to your comments:
So I have a handful of mining rigs running 4 cards each (Sempron 140 / 5850s).
At the moment, I have a heavily edited linuxcoin image that I PXE-boot. It automatically starts mining with a cascade of screens that report back to the head node, and I have a watch window over each miner, etc. The drawback of my approach is that it relies on either 1. NFS for loading the user/pass database or 2. hard-packing it into squashfs. I can see how using smartcoin might be an easier approach, assuming it worked the way it is advertised. In testing, I've found several bugs/issues, and I'd be grateful if you could confirm whether these are being worked on or acknowledged.
*) In my testing, I used the latest phoenix, poclbm, and smartcoin with the ATI 11.6 and 11.8 drivers.
1) When handling multiple machines, the windows for remote machines constantly stop updating after a while. This is not a connection problem. It's just smartcoin that does this.
I'm currently in the middle of building another mining rig. I only had one rig when the multi-machine code was written, and my testing time on other people's rigs was very limited. I'm going to continue to improve the multi-machine stuff once I get my other rig in place (should be about a week from now).
2) I haven't done enough testing yet to see what causes the problem (and this is why I'm reaching out to see if others have similar issues), but CPU usage is inconsistent.
How inconsistent is the CPU usage for you? Mine normally varies by about 0.2 or so.
On my local machines (not dedicated mining rigs), I run a 4 window screen with phoenix instances with low aggression in each. They each use .3% cpu and the computers are usable.
When I set these machines up as remote machines from the dedicated smartcoin machine however, they run only 3 phoenix instances (exact same path, also with low aggression) and phoenix takes up 33% cpu.
With 2 phoenix instances, each uses 50%. Both cases make the GUI on my local machines lag, leaving them unusable by the housemates.
On a remote machine, there is no extra code being executed - the only difference should be the extra communication overhead, which should be fairly minimal on the CPU load. Though it's not 100% clear to me what you are saying above...
On dedicated mining rigs, I have them running 4 phoenix instances (on 4 cards) and it eats up .7% cpu, so again, negligible.
When I set up smartcoin, however, I ideally wanted each card to run 3 instances, so I set it up to run 12 phoenix instances. Each instance took up the maximum available CPU, and 1 of the instances was always killed. When it was killed, the numbers in the summary window were skewed because the window did not show that the instance had been killed. For instance:
miner00: [133 mh/s] (not really running but still showing 133.3 mh/s)
miner01: [133 mh/s]
miner02: [133 mh/s]
miner03: [133 mh/s]
miner04: [200 mh/s]
miner05: [133 mh/s]
miner06: [133 mh/s]
miner07: [133 mh/s]
miner08: [200 mh/s]
miner09: [133 mh/s]
miner10: [133 mh/s]
miner11: [133 mh/s]
The total then showed 1733 mh/s instead of the correct 1600 mh/s because it couldn't tell what was going on. When I load up the miner screen, I can press o to reload that window and have it relaunch the miner. As soon as I do this, another window (usually 1 or 2 windows after that one) gets killed. It's not hung. It says "Killed" in the window, so it's smartcoin killing the processes.
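To make the arithmetic behind those two totals explicit, here is a small Python sketch. It is not smartcoin code: the `reported` and `killed` names and the `total_rate` helper are made up for illustration, and each "133" entry is assumed to be 400/3 (about 133.3 mh/s), as the note on miner00 suggests.

```python
# Rates as listed in the summary window, in mh/s.
# Assumption: every "133" entry is really 400/3 (about 133.3).
reported = {f"miner{i:02d}": 400 / 3 for i in range(12)}
reported["miner04"] = 200.0
reported["miner08"] = 200.0

# Hypothetical set of instances that were actually killed.
killed = {"miner00"}


def total_rate(rates, dead=frozenset()):
    """Sum the rate of every instance whose name is not in `dead`."""
    return sum(rate for name, rate in rates.items() if name not in dead)


shown = total_rate(reported)            # what the summary window added up
actual = total_rate(reported, killed)   # what was really being hashed

print(f"shown:  {shown:.0f} mh/s")
print(f"actual: {actual:.0f} mh/s")
```

The gap between the two totals is exactly the stale rate of the one dead instance, which is why the summary only looks right if it knows which instances are still alive.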
Regarding CPU usage, smartcoin will definitely use more CPU than running instances manually. This is because it is running a loop on each instance, monitoring for failover, lockup, etc - which is pretty CPU intensive. Under Edit Settings->Machine Settings there is a "Status screen loop delay" setting that you can adjust for each machine that will greatly help your CPU load. It injects a delay in each loop iteration so that you don't pound the CPU so much. I usually run mine at 5 seconds, but you can go as high as you want (of course your visual refreshes will be slower as well)
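As a rough illustration of what a loop delay does, here is a minimal Python sketch of a monitoring loop that sleeps between passes. This is not smartcoin's actual loop: `monitor`, `check_instances`, and `loop_delay` are hypothetical names, and the check itself is a stand-in that only records a timestamp.

```python
import time


def monitor(check_instances, loop_delay, iterations):
    """Call `check_instances` `iterations` times, sleeping `loop_delay` seconds between passes."""
    for i in range(iterations):
        check_instances()
        if i < iterations - 1:
            # Without this sleep the loop would spin as fast as the CPU allows.
            time.sleep(loop_delay)


calls = []
start = time.monotonic()
# Stand-in check: just record when each pass ran.
monitor(lambda: calls.append(time.monotonic()), loop_delay=0.05, iterations=3)
elapsed = time.monotonic() - start

print(f"{len(calls)} passes in {elapsed:.2f} s")
```

A longer delay means fewer passes per minute, so less CPU goes to monitoring, at the cost of slower status refreshes.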
Regarding a killed process, it's definitely not smartcoin killing it - smartcoin will only kill the entire screen session and start a new one - it has no code to kill only a single instance (it should be something at the OS level which kills it). What's the memory usage when this happens?
If your machines are secure (no wallet.dat or anything personal lying around) and you could give me some temporary ssh access, I'd be more than happy to help you figure out what is happening.
So:
a. Summary window needs work on updating. And also detecting when miners are killed (especially when smartcoin is the one doing it).
b. Why is the cpu being eaten up? Because it works properly when I launch it from my scripts, but when smartcoin launches it (with the exact same parameters), it eats up more cpu.
c. Does anyone have any experience running smartcoin with 4+ cards and 3+ workers? (12+ phoenix instances) on a single sempron?
Thanks for your time, it's pretty impressive work, but so far, my custom image that I whipped up over a weekend does a far better job with less bloat: 0.0? load averages as opposed to 1.5+ consistent. SmartCoin will be great if it gets to where it needs to be but it still needs a bit to get there.
a) Yes, some work is still needed - though as I've mentioned earlier, smartcoin is definitely not killing the miner instance.
b) See if the "Status screen loop delay" setting helps.
c) I do 3 cards with 4 workers very often - and I've personally tested 48 instances on a single-core P4 (3 cards * 16 instances each) with no problems at all (other than very high CPU usage, which would be expected). Also, my mining rig (soon to be more than one rig) is dedicated, so I have no experience in keeping a usable GUI while mining. Have you messed with the phoenix aggression setting at all?
I was looking over a small part of the code, specifically the status window, and I think there are lots of things that can be trimmed to make smartcoin a bit faster.
A good example is this part:
I'll be messing with optimizations pretty soon. I'll definitely give this one a good look and do some actual benchmarking.
A couple more bugs:
1. If I don't split the screen fast enough when smartcoin loads up, and focus each window of the screen, I don't see the timestamp at the top (which is important to tell if smartcoin froze up).
2. When smartcoin creates its persistent ssh connections, it should do so using screen, or use a pid file so it can kill old ones. After running smartcoin for a while, I had this many:
1) Can you explain what you mean by split the screen?
2) There are apparently some issues left to resolve with the persistent SSH connections, and this is undoubtedly why your remote miner status screens stop refreshing after a while. This will be my first priority once I get my second miner put together.
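For reference, the pid file idea suggested above could look roughly like the Python sketch below. It is only an illustration under stated assumptions, not how smartcoin manages its connections: `replace_tracked_process`, `pid_file`, and `stand_in` are hypothetical names, and a short-lived Python child stands in for the real ssh command. It also assumes a POSIX system and does not guard against the OS reusing a stale PID for an unrelated process.

```python
import os
import signal
import subprocess
import sys
import tempfile


def replace_tracked_process(pid_file, command):
    """Terminate the process recorded in `pid_file` (if any), then start `command` and record its PID."""
    try:
        with open(pid_file) as f:
            old_pid = int(f.read().strip())
        os.kill(old_pid, signal.SIGTERM)
    except (FileNotFoundError, ValueError, ProcessLookupError):
        pass  # no pid file yet, unreadable contents, or the old process is already gone
    proc = subprocess.Popen(command)
    with open(pid_file, "w") as f:
        f.write(str(proc.pid))
    return proc


# Demo with a stand-in child process instead of a real ssh connection.
pid_file = os.path.join(tempfile.mkdtemp(), "connection.pid")
stand_in = [sys.executable, "-c", "import time; time.sleep(60)"]

first = replace_tracked_process(pid_file, stand_in)
second = replace_tracked_process(pid_file, stand_in)  # terminates `first`

first.wait(timeout=10)
first_gone = first.returncode is not None
second_alive = second.poll() is None
with open(pid_file) as f:
    recorded = int(f.read())

# Clean up the demo process.
second.terminate()
second.wait(timeout=10)
```

With this pattern, each relaunch leaves at most one tracked connection behind instead of letting old ones pile up.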
Thanks again for all of the feedback!