Topic: Smartcoin Linux mining administration. [MULTI-MACHINE SUPPORT NOW IN!] - page 8. (Read 105029 times)

newbie
Activity: 42
Merit: 0
Feature request:  If lm-sensors is installed, it would be nice if the CPU temperature was displayed alongside the GPU temps. 

+1 for this.
And I can't wait for the multi-machine support for this to be ready! Excellent job so far.

I've had a bit of a problem trying to figure out why my failover settings aren't working.
I had the failover thresholds (apparently there are 3 of them?) set to 1 just to test, and I had the failover profile selected. I'm not sure what exactly they're measured in; I was expecting minutes of downtime, but I guess not.

The pool went down and I waited just under 10 minutes and it still sat there without changing to another profile.
Any ideas why this isn't working? :(

Other than that though, great job!

There was a push that had an issue in it; this should be resolved in the latest version. You may want to try an update, which should clear up the multiple thresholds (which will prevent failover from working).

--Enzo
hero member
Activity: 481
Merit: 502
Feature request:  If lm-sensors is installed, it would be nice if the CPU temperature was displayed alongside the GPU temps. 

+1 for this.
And I can't wait for the multi-machine support for this to be ready! Excellent job so far.

I've had a bit of a problem trying to figure out why my failover settings aren't working.
I had the failover thresholds (apparently there are 3 of them?) set to 1 just to test, and I had the failover profile selected. I'm not sure what exactly they're measured in; I was expecting minutes of downtime, but I guess not.

The pool went down and I waited just under 10 minutes and it still sat there without changing to another profile.
Any ideas why this isn't working? :(

Other than that though, great job!
sr. member
Activity: 349
Merit: 250
BTCPak.com - Exchange your Bitcoins for MP!
Feature request:  If lm-sensors is installed, it would be nice if the CPU temperature was displayed alongside the GPU temps. 
full member
Activity: 238
Merit: 100
Update 546e now available
- Fixes the hardstatus line at the bottom of the miner screen session

I don't know why, but the rejection rate with this version is around 10% in all tested pools. Before 500e it was 0.2-0.5%.


Code:
Smartcoin r546 Wed Jul 27 09:15:11 EDT 2011
---------------------------------------------------------
Host: localhost
G0: Temp °C: 70.00 Load: 99%
G1: Temp °C: 71.00 Load: 99%
G2: Temp °C: 74.00 Load: 99%
G3: Temp °C: 70.00 Load: 99%
CPU Load: 2.38%

Profile: Failover
--------BTCGuild--------
G0:     [207.76 MHash/s] [100 OK] [5 BAD] [5.000% BAD]
G1:     [207.67 MHash/s] [96 OK] [6 BAD] [6.250% BAD]
G2:     [207.77 MHash/s] [82 OK] [8 BAD] [9.756% BAD]
G3:     [207.78 MHash/s] [86 OK] [8 BAD] [9.302% BAD]
CP:     [20.4 MHash/s] [13 OK] [6 BAD] [46.153% BAD]
Total : [851.38 MHash/s] [377 OK] [33 BAD] [8.753% BAD]

Grand Total : [851.38 MHash/s] [377 OK] [33 BAD] [8.753% BAD]

Can you post a .tar of the stable version from before multi-machine support and put it in the OP?

I know there is a stable version, but stable will eventually move to higher numbers. This way we can preserve it. I think the majority of people are using one PC anyway.



Smartcoin can't do anything to influence or change rejection rates (aside from calculation errors, but those would only change the perceived rejection rate); it merely starts and stops miners based on a set of rules. FYI, all the new multi-machine code (which isn't even fully active yet) does is determine whether a command needs to run on the local machine or on a remote connection over SSH. It changes nothing about the miners; it only chooses where to launch them, and then launches them.
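To illustrate the local-versus-SSH decision described above, here is a minimal sketch. It is not Smartcoin's actual code: the function name `run_on_machine` and the convention that the host `localhost` means "run locally" are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper (not taken from Smartcoin): run a command either on the
# local machine or on a remote machine over SSH.
# Usage: run_on_machine HOST COMMAND [ARGS...]
run_on_machine() {
    host="$1"
    shift
    if [ -z "$host" ] || [ "$host" = "localhost" ]; then
        # Local machine: run the command directly.
        "$@"
    else
        # Remote machine: hand the same command to ssh.
        # BatchMode=yes makes ssh fail instead of waiting for a password prompt.
        # Note: ssh joins the arguments with spaces for the remote shell, so
        # arguments containing spaces would need extra quoting.
        ssh -o BatchMode=yes "$host" "$@"
    fi
}

run_on_machine localhost echo "miner launched here"
```

The point of a wrapper like this is that the calling code stays identical for every machine; only the host argument decides where the command ends up running.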

newbie
Activity: 56
Merit: 0
Update 546e now available
- Fixes the hardstatus line at the bottom of the miner screen session

I don't know why, but the rejection rate with this version is around 10% in all tested pools. Before 500e it was 0.2-0.5%.


Code:
Smartcoin r546 Wed Jul 27 09:15:11 EDT 2011
---------------------------------------------------------
Host: localhost
G0: Temp °C: 70.00 Load: 99%
G1: Temp °C: 71.00 Load: 99%
G2: Temp °C: 74.00 Load: 99%
G3: Temp °C: 70.00 Load: 99%
CPU Load: 2.38%

Profile: Failover
--------BTCGuild--------
G0:     [207.76 MHash/s] [100 OK] [5 BAD] [5.000% BAD]
G1:     [207.67 MHash/s] [96 OK] [6 BAD] [6.250% BAD]
G2:     [207.77 MHash/s] [82 OK] [8 BAD] [9.756% BAD]
G3:     [207.78 MHash/s] [86 OK] [8 BAD] [9.302% BAD]
CP:     [20.4 MHash/s] [13 OK] [6 BAD] [46.153% BAD]
Total : [851.38 MHash/s] [377 OK] [33 BAD] [8.753% BAD]

Grand Total : [851.38 MHash/s] [377 OK] [33 BAD] [8.753% BAD]

Can you post a .tar of the stable version from before multi-machine support and put it in the OP?

I know there is a stable version, but stable will eventually move to higher numbers. This way we can preserve it. I think the majority of people are using one PC anyway.
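A note on the percentages in the status output above: the numbers shown are consistent with BAD divided by OK (33 / 377 is about 8.753%), rather than BAD divided by all submitted shares (33 / 410 is about 8.049%). The sketch below compares the two formulas. The function names are made up for this example, and which formula Smartcoin really uses is inferred from the posted numbers only.

```shell
#!/bin/sh
# Two ways to turn OK/BAD share counts into a rejection percentage.
# Both functions take: OK_COUNT BAD_COUNT (OK_COUNT must be non-zero).

# BAD relative to accepted shares only.
bad_over_ok() {
    awk -v ok="$1" -v bad="$2" 'BEGIN { printf "%.3f\n", 100 * bad / ok }'
}

# BAD relative to all submitted shares (accepted + rejected).
bad_over_total() {
    awk -v ok="$1" -v bad="$2" 'BEGIN { printf "%.3f\n", 100 * bad / (ok + bad) }'
}

bad_over_ok 377 33      # prints 8.753
bad_over_total 377 33   # prints 8.049
```

The displayed values look truncated rather than rounded (6 / 13 is shown as 46.153%), so the last digit from `printf` may differ by one from what the status screen shows.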

full member
Activity: 238
Merit: 100
Update 546e now available
- Fixes the hardstatus line at the bottom of the miner screen session
sr. member
Activity: 349
Merit: 250
BTCPak.com - Exchange your Bitcoins for MP!
+1 for an option to exclude profiles from failover (or better yet, to include just the ones specified).  I have some "experimental" profiles that I would like it to stay away from.

Dan
full member
Activity: 238
Merit: 100
Update r544e now available:
- Many changes to different areas of the code to support multiple machines.  Though multiple machine support isn't fully included yet, the "backend" functions are now machine-aware. This puts multi-machine support probably near 98% complete. Just a couple of key components remain....  These were among the biggest changes needed yet for multi-machine support, and aside from some very small odds and ends, the only thing left to implement is a "Configure Machines" config screen option :)


NOTE: There is currently one side effect that I haven't fixed yet.  The miner screen session's (screen -r miner) status line is all messed up, though it doesn't affect anything.  I'm going to continue to experiment until I get the status line back, but it's really not even needed, and for now it is only a visual artifact if you are looking at the miner screen session.
newbie
Activity: 42
Merit: 0
Hello,

  I am not sure if it is something I am doing wrong, but I am unable to trigger the failover.

I have profiles set up to mine on 3 different servers on the same pool, and I have tried simulating an outage using the /etc/hosts file, but it never fails over. Is there any specific data that might help in locating this? It is a fresh smartcoin install.

Smartcoin r495s

here is the failover order
1 was a deleted profile

2) BTCGuild All
3) BTCGuild US
4) BTCGuild USWest
5) BTCGuild USEast
6) BitClockers


Found the solution. Apparently I ended up with 2 menu options / database entries for the failover threshold, which was confusing the system. As soon as I deleted one of them, failover started working.

--Enzo
full member
Activity: 238
Merit: 100
Just a note not to do an update for a while. I'm uploading a bunch of VERY experimental multi-machine changes that *could* leave your install inoperable for just a little while. I'll post back here in a little bit once they are all in and tested, at which time an experimental update will be relatively safe.
full member
Activity: 238
Merit: 100
Not sure if this is doable, but whenever there's a failover or profile change, Smartcoin shuts down all miners and starts them up again. Would it be possible for Smartcoin to keep track of which miners are running, so that when there's a profile update or whatever, it would only stop and/or start the miners affected by the change?

For example, if I'm mining with 3 GPUs at bitcoins.lc and there's a failover to deepbit, Smartcoin will shut down the 3 bitcoins.lc miners and start up 6 new miner instances (3 for bitcoins.lc and 3 for deepbit). The restarting of the bitcoins.lc miners in this case is redundant. If connectivity with bitcoins.lc is restored, Smartcoin will kill all 6 miners and start up 3 new bitcoins.lc miners, when in fact, just killing the deepbit miners would suffice.

I've already done some experiments, and I will be going this route sometime in the future ("hot reloading"). It's something I have been thinking about for a while; I just haven't got around to it yet, as the current method works just fine, but eventually this would be a great optimization!
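The "hot reloading" idea boils down to comparing the set of miners currently running with the set the new profile wants, then touching only the difference. Below is a minimal sketch under stated assumptions: the `device@pool` labels and the helper name `only_in_first` are hypothetical, and this is not how Smartcoin tracks its miners.

```shell
#!/bin/sh
# Hypothetical sketch of a "hot reload" diff. Each miner instance is
# identified by a made-up label of the form device@pool, one per line.

# Print every line of list $1 that does not appear in list $2.
only_in_first() {
    printf '%s\n' "$1" | while IFS= read -r item; do
        [ -n "$item" ] || continue
        # grep -Fxq: fixed-string, whole-line, quiet match against list $2.
        printf '%s\n' "$2" | grep -Fxq -- "$item" || printf '%s\n' "$item"
    done
}

# Miners running before the failover (3 GPUs on bitcoins.lc).
running=$(printf '%s\n' gpu0@bitcoins.lc gpu1@bitcoins.lc gpu2@bitcoins.lc)
# Miners wanted after the failover (the same 3, plus 3 on deepbit).
wanted=$(printf '%s\n' "$running" gpu0@deepbit gpu1@deepbit gpu2@deepbit)

to_stop=$(only_in_first "$running" "$wanted")   # empty: nothing needs killing
to_start=$(only_in_first "$wanted" "$running")  # only the 3 deepbit miners

echo "stop:  ${to_stop:-<none>}"
echo "start: $to_start"
```

With the example from the post, the three bitcoins.lc miners are left alone on failover, and swapping the two lists when the pool comes back yields only the deepbit miners to stop.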
full member
Activity: 238
Merit: 100
Have you ever considered an optional ncurses interface that would allow for some fairly nice console displays? I realize that this would add another dependency; however, if it was optional, it would give the end user the choice of taking advantage of it or not.

--Enzo

I think it's a good idea... Once all core functionality is in, I may look into this!
full member
Activity: 168
Merit: 100
I'll have a steak sandwich and a... steak sandwich
Not sure if this is doable, but whenever there's a failover or profile change, Smartcoin shuts down all miners and starts them up again. Would it be possible for Smartcoin to keep track of which miners are running, so that when there's a profile update or whatever, it would only stop and/or start the miners affected by the change?

For example, if I'm mining with 3 GPUs at bitcoins.lc and there's a failover to deepbit, Smartcoin will shut down the 3 bitcoins.lc miners and start up 6 new miner instances (3 for bitcoins.lc and 3 for deepbit). The restarting of the bitcoins.lc miners in this case is redundant. If connectivity with bitcoins.lc is restored, Smartcoin will kill all 6 miners and start up 3 new bitcoins.lc miners, when in fact, just killing the deepbit miners would suffice.
newbie
Activity: 42
Merit: 0
Have you ever considered an optional ncurses interface that would allow for some fairly nice console displays? I realize that this would add another dependency; however, if it was optional, it would give the end user the choice of taking advantage of it or not.

--Enzo
full member
Activity: 238
Merit: 100
Update r513e now available
- Fixed small typo in the generation of the autodonation fieldarray. Thanks Enzomatrix!
newbie
Activity: 42
Merit: 0
In r512e there appears to be an issue with the status display of the auto donate function. It keeps flashing an error on refresh about "=python command not found", and the error reports line 355 of smartcoin_ops.sh as the culprit.

--Enzo
full member
Activity: 238
Merit: 100
Update r512e now available:
- CPU load is now determined by /proc/loadavg 1-minute average
- Huge lockup detection optimization..  30%-50% faster status loops and lower CPU load as a result!
- New "<<>> instance status output. This tells you that the miner failed to load for some reason (you can then verify with screen -r miner)
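For readers unfamiliar with `/proc/loadavg`: on Linux its first whitespace-separated field is the 1-minute load average. The sketch below shows one way to read it and compare it against a limit. The function names, the optional file argument, and the limit of 2.0 are assumptions for illustration and are not taken from Smartcoin.

```shell
#!/bin/sh
# Read the 1-minute load average, which is the first field of /proc/loadavg.
# The optional argument (a hypothetical addition) lets you point the function
# at another file, which is handy for testing.
load_1min() {
    cut -d' ' -f1 "${1:-/proc/loadavg}"
}

# Hypothetical threshold check: succeeds (exit status 0) when LOAD > LIMIT.
# awk is used because the shell's built-in arithmetic is integer-only.
load_exceeds() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load > limit) }'
}

# Example usage with a made-up limit of 2.0 (Linux only).
if [ -r /proc/loadavg ]; then
    if load_exceeds "$(load_1min)" 2.0; then
        echo "load is above the limit"
    else
        echo "load is within the limit"
    fi
fi
```

Reading the file directly needs no external monitoring tool, which fits the goal of dropping a dependency.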
full member
Activity: 168
Merit: 100
I'll have a steak sandwich and a... steak sandwich
I'll experiment  with the /proc load average tonight (just got to decide whether to go with 1 minute or 5 minute average, though I'm leaning towards 1 minute) Additionally,  it would get rid of another dependency.
+1 for going with the 1 minute average.
newbie
Activity: 42
Merit: 0
If you are going to utilize the load / CPU averages, make sure that the thresholds are adjustable, as my mining box also runs a Minecraft server, so my CPU runs at around 30% - 35% pretty constantly.
newbie
Activity: 42
Merit: 0
kennel -
Your settings look good to me.  If you get any false triggers of failover and/or lockup, you can always increase them 1 or 2 at a time until you find a good balance.

I'll experiment  with the /proc load average tonight (just got to decide whether to go with 1 minute or 5 minute average, though I'm leaning towards 1 minute) Additionally,  it would get rid of another dependency.

Regarding better descriptions of the settings, I'm probably going to leave it to documentation in the future.  The settings system is very dynamic in that I just add an entry into the database and it automagically appears and just works; this makes development a little easier on me.  I may rethink it a bit once I can get out of beta, as I won't be adding entries so often then and it may be worth extending the system to be more verbose.

If you are using the database for that, then you could add a description field and use it for basic info without too much modification to your existing code (just a thought). Keep up the great work, love the code so far.

--Enzo