bugfix: rework olsrd-watchdog to avoid race condition

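The olsrd watchdog module truncates the watchdog file and writes the latest timestamp to it every 5 seconds. The olsrd-watchdog script could read the file mid-write, after the truncate but before the new timestamp landed, and the resulting empty read caused an immediate restart.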

Changed to require 3 consecutive failed checks before restarting olsrd, reducing the chance of collision-based restarts.

fixes BBHN->ticket:68
Conrad Lara - KG6JEI 2014-11-03 19:59:34 -08:00
parent 5a39316a7e
commit 94981313f8
1 changed files with 13 additions and 7 deletions

@@ -3,17 +3,21 @@
 # wait for the watchdog file to appear
 while(not -e "/tmp/olsrd.watchdog") { sleep 15 }
 $stamp = 0;
+$failcount = 0;
+$last_olsrstamp = 0;
 while(1)
 {
 $last_stamp = $stamp;
 $stamp = time;
 chomp ($olsr = `cat /tmp/olsrd.watchdog`);
 $olsr = 0 unless $olsr;
 # avoid false restarts due to changes in system time
-if($stamp - $last_stamp < 20 and $stamp - $olsr > 20)
+if ( $olsr && $olsr ne $last_olsrstamp ){
+$failcount = 0;
+} else {
+$failcount += 1;
+}
+if( $failcount >= 3 )
 {
 ($uptime) = `cat /proc/uptime` =~ /^(\d+)/;
 $date = `date`;
@@ -21,5 +25,7 @@ while(1)
 system "/etc/init.d/olsrd restart";
 }
-sleep 15;
+$last_olsrstamp = $olsr;
+sleep 10;
 }
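
The failure-count logic in this change can be exercised in isolation. The following is a minimal sketch, not part of the commit: the helper names `next_failcount` and `simulate` are hypothetical, and the list of reads stands in for successive reads of /tmp/olsrd.watchdog, with an empty string modelling a read that caught the file just after truncation.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the commit): returns the failure count after one check.
# $olsr is the timestamp just read (0 if the read caught an empty file),
# $last is the timestamp seen on the previous check.
sub next_failcount {
    my ($olsr, $last, $failcount) = @_;
    return 0 if $olsr && $olsr ne $last;   # fresh timestamp: olsrd is alive
    return $failcount + 1;                 # empty or unchanged: count a failure
}

# Hypothetical driver: feeds a list of simulated reads through the check and
# returns the indexes of the checks on which a restart would be triggered.
sub simulate {
    my @reads = @_;
    my ($failcount, $last) = (0, 0);
    my @restarts;
    for my $i (0 .. $#reads) {
        my $olsr = $reads[$i] || 0;        # mirrors "$olsr = 0 unless $olsr"
        $failcount = next_failcount($olsr, $last, $failcount);
        push @restarts, $i if $failcount >= 3;
        $last = $olsr;
    }
    return @restarts;
}

# A stalled timestamp triggers a restart on the third unchanged read.
print join(',', simulate(100, 100, 100, 100)), "\n";   # prints 3
```

A single empty read, as in `simulate(100, '', 110, 115)`, counts one failure and is cleared by the next fresh timestamp, so no restart is triggered.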