
Topic: Hardware Watchdog via USB/Serial for Debian (Read 4287 times)

full member
Activity: 124
Merit: 100
July 27, 2011, 01:08:57 AM
#15
Yes, it's more tolerant and especially it restarts much faster in such a situation as power failure or hardware reset.

I don't personally know coinlinux, but just type "mount" in a terminal window. If you see "ext2" in the line that has the lone slash, you're not using a journaling file system. If you see "ext3" or "ext4", you are on a journaling system. In most modern Linux setups the latter is the default, so your chances are pretty good.

(If you see none of the three, post here again with the full output from "mount"; a few Linux distributions use filesystems other than those from the standard "ext" series.)

Excuse me, but a journaling fs will kill USB flash memory with a persistent installation after one week, two at most ...
full member
Activity: 154
Merit: 102
I once played a thousand rounds of Russian roulette, too, and not once have I ever had a serious problem.

I would have to guess that you either are extremely fortunate, or mistaken. I've actually worked with computer hardware, and I've lost count of the number of components which were damaged by an unexpected loss of power. Usually crappy low end power supplies. And that's not even getting into filesystem corruption. If you've never seen a corrupted hard drive from a Windows box, you've likely either never used a computer or you're extremely anal about taking care of it.

Perhaps the damage was caused by a spike that caused the unexpected loss of power.  I've had that happen plenty of times.  In fact, right now I have a Boxee which can only work on WiFi because a damn lightning strike blew out its wired adapter.  And no, I have never had a hard drive become corrupted by an unexpected shutdown, and really I don't even see how that could happen unless you are rewriting the geometry of the drive when it happens.  Under any file system, if a file failed writing before a commit, you'd simply lose the file.  If the file were mission critical to the operating system, then you might have issues booting, but that would only happen during a system update.  Even then, it would be unlikely.

My statement was more of an introspection than anything.  That is a standard axiom of techs, and upon reflection of my experiences I wonder how valid it actually is. 

Also, granted you don't know me so of course you get a pass, the insinuation that I never used a computer or I'm anal about taking care of it is laughable.  I've been programming since I was 9, I ran a BBS in my teens (5 geek points if you even know what that means), and I run a network at home that most small businesses would envy.  I know my shit :)
hero member
Activity: 588
Merit: 500

/live/cow (which I would guess is the persistent overlay) is journaled. The rest is not.
Is that good?  I'm guessing the persistence file is the only thing that changes while the system is running. I do remember having to add a persistence file to the USB stick image. So should I be okay resetting this system forcefully?

There's always some risk to a hard reset and you're better off to do a normal shutdown and reboot whenever possible.

You know, I have heard that my entire life.  I have been working on computers in a fairly advanced capacity for over 2 decades, and I must have done tens of thousands of hard resets, either intentional or forced, and not once have I ever had a serious problem.  

I once played a thousand rounds of Russian roulette, too, and not once have I ever had a serious problem.

I would have to guess that you either are extremely fortunate, or mistaken. I've actually worked with computer hardware, and I've lost count of the number of components which were damaged by an unexpected loss of power. Usually crappy low end power supplies. And that's not even getting into filesystem corruption. If you've never seen a corrupted hard drive from a Windows box, you've likely either never used a computer or you're extremely anal about taking care of it.
full member
Activity: 154
Merit: 102

/live/cow (which I would guess is the persistent overlay) is journaled. The rest is not.
Is that good?  I'm guessing the persistence file is the only thing that changes while the system is running. I do remember having to add a persistence file to the USB stick image. So should I be okay resetting this system forcefully?

There's always some risk to a hard reset and you're better off to do a normal shutdown and reboot whenever possible.

You know, I have heard that my entire life.  I have been working on computers in a fairly advanced capacity for over 2 decades, and I must have done tens of thousands of hard resets, either intentional or forced, and not once have I ever had a serious problem. 
hero member
Activity: 588
Merit: 500

/live/cow (which I would guess is the persistent overlay) is journaled. The rest is not.
Is that good?  I'm guessing the persistence file is the only thing that changes while the system is running. I do remember having to add a persistence file to the USB stick image. So should I be okay resetting this system forcefully?

There's always some risk to a hard reset and you're better off to do a normal shutdown and reboot whenever possible.
sr. member
Activity: 302
Merit: 250

/live/cow (which I would guess is the persistent overlay) is journaled. The rest is not.
Is that good?  I'm guessing the persistence file is the only thing that changes while the system is running. I do remember having to add a persistence file to the USB stick image. So should I be okay resetting this system forcefully?
hero member
Activity: 588
Merit: 500
Here's what I got from mount:
Code:
user@linuxcoin:~$ mount
aufs on / type aufs (rw)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,size=5242880,mode=755,size=5242880,mode=755)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=755,size=10%,mode=755)
tmpfs on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880,mode=1777,size=5242880,mode=1777)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,size=20%,mode=1777)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /run/shm type tmpfs (rw,nosuid,nodev,size=20%,mode=1777,size=20%,mode=1777)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620,gid=5,mode=620)
/dev/sda1 on /live/image type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=cp437,iocharset=utf8,shortname=mixed,errors=remount-ro)
/dev/loop1 on /live/cow type ext4 (rw,noatime,user_xattr,acl,barrier=1,data=ordered)
tmpfs on /live type tmpfs (rw,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
I am running off a USB stick, which may mean it's not a "standard" install.

/live/cow (which I would guess is the persistent overlay) is journaled. The rest is not.
sr. member
Activity: 302
Merit: 250
Here's what I got from mount:
Code:
user@linuxcoin:~$ mount
aufs on / type aufs (rw)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,size=5242880,mode=755,size=5242880,mode=755)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=755,size=10%,mode=755)
tmpfs on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880,mode=1777,size=5242880,mode=1777)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,size=20%,mode=1777)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /run/shm type tmpfs (rw,nosuid,nodev,size=20%,mode=1777,size=20%,mode=1777)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620,gid=5,mode=620)
/dev/sda1 on /live/image type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=cp437,iocharset=utf8,shortname=mixed,errors=remount-ro)
/dev/loop1 on /live/cow type ext4 (rw,noatime,user_xattr,acl,barrier=1,data=ordered)
tmpfs on /live type tmpfs (rw,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
I am running off a USB stick, which may mean it's not a "standard" install.
member
Activity: 78
Merit: 10
Also, how do I know if I'm using a journaling file system? I'm running coinlinux (Debian from a USB stick). I'm guessing a journaling file system is more tolerant of random restarts?

Yes, it's more tolerant and especially it restarts much faster in such a situation as power failure or hardware reset.

I don't personally know coinlinux, but just type "mount" in a terminal window. If you see "ext2" in the line that has the lone slash, you're not using a journaling file system. If you see "ext3" or "ext4", you are on a journaling system. In most modern Linux setups the latter is the default, so your chances are pretty good.

(If you see none of the three, post here again with the full output from "mount"; a few Linux distributions use filesystems other than those from the standard "ext" series.)
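
The "mount" check described above could also be scripted. The following is only a rough sketch: the function names `is_journaling` and `fs_type_of` are made up for illustration, the list of journaling types is an assumption and far from complete, and an ext4 filesystem can in principle be created without a journal, so treat the answer as a hint rather than proof. It reads a mounts table in /proc/mounts format (device, mount point, type, options).

```shell
#!/bin/bash
# Sketch only: guess whether a mount point sits on a journaling filesystem.
# ASSUMPTION: the list of journaling types below is illustrative, not exhaustive.

# Print "yes" if the given filesystem type normally journals, "no" otherwise.
is_journaling() {
  case "$1" in
    ext3|ext4|xfs|jfs|reiserfs) echo yes ;;
    *) echo no ;;
  esac
}

# Print the filesystem type of a mount point, read from a mounts table
# in /proc/mounts format. The table path defaults to /proc/mounts.
fs_type_of() {
  local mountpoint=$1 table=${2:-/proc/mounts}
  # Later entries shadow earlier ones, so keep the last match.
  awk -v mp="$mountpoint" '$2 == mp { t = $3 } END { if (t != "") print t }' "$table"
}
```

For example, `is_journaling "$(fs_type_of /)"` prints "yes" or "no" for the root filesystem. On a live USB system the root is often an overlay (such as aufs), in which case the overlay's backing mount point is the one worth checking.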
sr. member
Activity: 302
Merit: 250

... Of course you had better use a journaling file system with such a setup. My stripped down Windows 7 setup with NTFS will be mining again just two minutes and a few seconds after the original failure.

As for a serial port - if your mainboard doesn't have a built in serial port or a serial pin header (some still do even today) you can get a USB to serial dongle. Check that it uses a chip that's supported on Linux - most should work these days. Then just point your software at /dev/ttyUSB0 or whatever it's called.
Good call on the header on the MB, I'm going to use that and see what I can come up with. I tried looking on eBay, but it seems like the items are expensive now, so I will have to wait.

Also, how do I know if I'm using a journaling file system? I'm running coinlinux (Debian from a USB stick). I'm guessing a journaling file system is more tolerant of random restarts?
full member
Activity: 154
Merit: 102
I picked up a bunch of iBoot devices on eBay for under $50 each.  They intercept the power, and allow you to reboot machines manually through a web interface or automatically via an auto ping feature.  When the computer locks up, it automatically reboots, and if a driver craps out (and I can't ssh to the box), I can always reboot it manually.
member
Activity: 78
Merit: 10
Look at http://www.quancom.de they offer fairly affordable and reliable USB, PCI and PCI-e watchdogs. Their cheapest USB watchdog costs just €50 or so. I got an older PCI card made by them off eBay for just €25.

I'm not triggering the watchdog in the kernel or a background daemon process. Instead I modified the Phoenix miner to trigger the card every time a result is accepted by a pool (the modification is less than 20 lines of code). Together with a timeout of a bit more than one minute, this rather brutal approach works splendidly - no matter what fails (except power obviously), as soon as no results go to the pool for a while, the box does a hard reboot. Of course you had better use a journaling file system with such a setup. My stripped down Windows 7 setup with NTFS will be mining again just two minutes and a few seconds after the original failure.
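
For anyone who would rather not patch the miner, the same "trigger on every accepted result" idea can be approximated from outside by filtering the miner's output. This is only a sketch and not the modification described above: it assumes the miner prints a line containing "accepted" for each accepted result, and both the function name `feed_heartbeat` and the heartbeat file path are hypothetical. Instead of poking a watchdog card it just refreshes a file's timestamp, which some other component would have to monitor.

```shell
#!/bin/bash
# Sketch only: refresh a heartbeat file whenever the miner reports an
# accepted result on its output.
# ASSUMPTIONS: accepted results show up as lines containing "accepted";
# the heartbeat path is a placeholder chosen by the caller.

feed_heartbeat() {
  local heartbeat=$1 line
  while IFS= read -r line; do
    case "$line" in
      *[Aa]ccepted*) touch "$heartbeat" ;;   # proof of useful work
    esac
    printf '%s\n' "$line"                    # pass miner output through unchanged
  done
}
```

A hypothetical invocation would be `./miner1.sh 2>&1 | feed_heartbeat /tmp/miner.heartbeat`. The point of keying on accepted results rather than on the process being alive is that a hung driver or dead network connection also stops the heartbeat.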

As for a serial port - if your mainboard doesn't have a built in serial port or a serial pin header (some still do even today) you can get a USB to serial dongle. Check that it uses a chip that's supported on Linux - most should work these days. Then just point your software at /dev/ttyUSB0 or whatever it's called.

sr. member
Activity: 302
Merit: 250
You could use something like this that looks at the GPU usage from aticonfig directly (I wish I could link back to the original thread where I got this, but I lost the link):
Code:
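#!/bin/bash
# Restart miner1.sh if it is no longer running or GPU load has dropped below 50%.
export DISPLAY=$(cat /home/user/.display)
# Count matching process lines; the grep itself usually shows up too, hence "-lt 2" below.
pc=$(ps waxuf | grep -c miner1.sh)
# Extract the GPU load percentage from aticonfig's fixed-width output.
ld=$(aticonfig --odgc --adapter=0 | grep "GPU load" | cut -c 30-35 | cut -d % -f 1)
if [[ $pc -lt 2 || $ld -lt 50 ]]; then
  killall -KILL miner1.sh
  nohup lxterminal --title miner1 --command /home/user/miner1.sh &
fi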
So I think the software side would be pretty easy to implement (the guy even has a sample script on his site at the link above). But does anyone know if there is any way to pulse a USB pin from Linux?
hero member
Activity: 588
Merit: 500
The problem with a watchdog for a mining rig is that it's rarely Linux that goes to hell, but the crappy AMD Catalyst video driver. The rest of the system keeps on going, and the watchdog would never do anything. A software watchdog is needed in this case, something that watches your miner's output, determines if the system has stopped mining, and takes corrective action.
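
A software watchdog along these lines might look roughly like the sketch below. It assumes there is some file the miner touches or appends to while it is doing useful work (a log file, for instance) and treats that file going stale as "stopped mining". The function names `heartbeat_state` and `check_miner`, the file path, and the timeout are all hypothetical, and `stat -c %Y` is the GNU coreutils form.

```shell
#!/bin/bash
# Sketch only: decide whether the miner has gone quiet and, if so, run a
# corrective command supplied by the caller.
# ASSUMPTION: the watched file's modification time advances while mining works.

# Print "stale" if the file is missing or older than max_age seconds, else "fresh".
heartbeat_state() {
  local file=$1 max_age=$2 now mtime
  [ -e "$file" ] || { echo stale; return; }
  now=$(date +%s)
  mtime=$(stat -c %Y "$file")   # GNU stat: modification time in epoch seconds
  if [ $(( now - mtime )) -gt "$max_age" ]; then
    echo stale
  else
    echo fresh
  fi
}

# Run the given command (all arguments after the first two) when stale.
check_miner() {
  local file=$1 max_age=$2
  shift 2
  if [ "$(heartbeat_state "$file" "$max_age")" = stale ]; then
    "$@"
  fi
}
```

Run from cron every minute, something like `check_miner /tmp/miner.log 120 killall -KILL miner1.sh` would kill a miner that has been silent for two minutes. What the corrective action should be (restart the miner, reboot the box) depends on whether the driver recovers without a reboot.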
sr. member
Activity: 302
Merit: 250
I found something that may be useful to many people here: a hardware watchdog that seems pretty easy to make and pretty cheap.

http://linuxfocus.org/English/July2002/article239.shtml

I would like to build it and try it out, I'm just worried that there is no way to connect this to current motherboards, as it goes through a serial cable. Is there something that one can use to add a serial port to a Linux box? Or maybe control the pins in a serial cable directly to send pulses to a device like this?