I have a script that keeps a server alive: it starts the server process and restarts it whenever the process stops. Sometimes, though, the server becomes unresponsive. To handle that, I want a second script that pings the server and kills the process if it doesn't respond within 60 seconds.

The problem is that if I kill the server process, the bash script is terminated as well.

The start script is just a while loop around sh Server.sh; that is, it calls another shell script, which holds the additional parameters for starting the server. The server is written in Java, so this starts a java process. If the server hangs, I use kill -9 pid because nothing else stops it. If the server doesn't hang and does its usual restart, it stops gracefully and the bash script starts the next loop iteration.

  • Have you considered using a real process supervision system (daemontools, runit, monit, god, upstart, systemd, etc.) rather than rolling your own? Some include built-in watchdog support, which means you get the functionality of the second script more tightly integrated. (Example: systemd's watchdog mechanism involves having your process send periodic keep-alive notifications over a socket that systemd provides, to indicate that it's actively responsive, with a forced restart triggered if no notification has arrived within a configurable interval.) Commented Aug 30, 2016 at 15:14
  • Regardless, if you're having your script killed, you're Doing It Wrong, and unless you show us how you're doing it (for all the relevant values of "it"), we can't tell you what's wrong. Please update your question to include an MCVE -- the smallest necessary set of code that anyone reading this can copy/paste/run to get the same problem. Commented Aug 30, 2016 at 15:19
  • @CharlesDuffy Updated. And I don't really need any tools for monitoring, as it's quite a simple task and it's only for personal use. Commented Aug 30, 2016 at 15:28
  • This still isn't adequate to reproduce the issue. How do you get the pid that you run kill -9 pid against? (BTW, best practice is not to go straight to a SIGKILL, but to send a SIGTERM, wait for a while, and escalate to KILL only if the process isn't able to shut itself down in the interim.) Commented Aug 30, 2016 at 15:44
  • (Also, note that sh Server.sh is not running Server.sh as a bash script, but as a POSIX sh script; given that this question is tagged bash, is that really what you want?) Commented Aug 30, 2016 at 15:45

1 Answer

Doing The Right Thing

Use a real process supervision system -- your Linux distribution almost certainly includes one.

Directly monitoring the supervised process by PID

An awful, ugly, moderately buggy approach (for instance, able to kill the wrong process in the event of a PID collision) is the following:

while :; do
  ./Server.sh & server_pid=$!
  echo "$server_pid" > server.pid
  wait "$server_pid"
done

...and, to kill the process:

#!/bin/bash
#      ^^^^ - DO NOT run this with "sh scriptname"; it must be "bash scriptname".

server_pid="$(<server.pid)"; [[ $server_pid ]] || exit
# ask for a clean shutdown first (plain kill sends SIGTERM)
kill "$server_pid" 2>/dev/null
# allow 5 seconds for clean shutdown -- adjust to taste
for (( i=0; i<5; i++ )); do
  if kill -0 "$server_pid" 2>/dev/null; then
    sleep 1
  else
    exit 0 # server exited gracefully, nothing else to do
  fi
done

# still running after the grace period: escalate to a SIGKILL
kill -9 "$server_pid"

Note that we're storing the PID of the server in our pidfile, and killing that directly -- thus, avoiding inadvertently targeting the supervision script.
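
The stop-then-escalate sequence can also be wrapped in a function so the grace period is a parameter. This is only a sketch: stop_with_escalation is a hypothetical name, not something from the scripts above, and it assumes the PID it is given belongs to the process that should receive the signals (so Server.sh would need to exec the java command rather than run it as a child).

```shell
#!/bin/bash
# Sketch only: stop_with_escalation is a hypothetical helper name.
# Usage: stop_with_escalation PID [GRACE_SECONDS]
# Sends SIGTERM, polls once per second, and sends SIGKILL only if the
# process is still present after the grace period.
stop_with_escalation() {
  local pid=$1 grace=${2:-5} i=0
  # If the signal cannot be delivered (for instance, the process is
  # already gone), there is nothing further to do.
  kill -TERM "$pid" 2>/dev/null || return 0
  while [ "$i" -lt "$grace" ]; do
    kill -0 "$pid" 2>/dev/null || return 0   # gone, no escalation needed
    sleep 1
    i=$((i + 1))
  done
  kill -KILL "$pid" 2>/dev/null
  return 0
}

# Example call, reading the pidfile written by the supervision loop:
#   server_pid=$(cat server.pid) && [ -n "$server_pid" ] &&
#     stop_with_escalation "$server_pid" 5
```

The PID-collision caveat mentioned earlier still applies: if server.pid is stale, the signals go to whatever process currently owns that PID.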


Monitoring the supervised process and all children via lockfile

Note that this uses some Linux-specific tools -- but you do have the corresponding tag on your question.

A more robust approach -- one that keeps working across reboots, even when a stale pidfile names a PID that has since been reused -- is to use a lockfile:

while :; do
  flock -x Server.lock sh Server.sh
done

...and, on the other end:

#!/bin/bash

# ask all programs having a handle on Server.lock to shut down
# (without an explicit signal, fuser -k sends SIGKILL right away)
fuser -k -TERM Server.lock
for ((i=0; i<5; i++)); do
  if fuser -s Server.lock; then
    sleep 1
  else
    exit 0
  fi
done
fuser -k -KILL Server.lock
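
The second script from the question (ping the server, kill it after 60 seconds without a response) could sit in front of the kill script. A minimal sketch, assuming GNU coreutils timeout is available; probe_ok, ./ping-server.sh and ./kill-server.sh are hypothetical names standing in for whatever health check and kill script you actually use:

```shell
#!/bin/bash
# Sketch only: probe_ok is a hypothetical helper name.
# Usage: probe_ok TIMEOUT_SECONDS COMMAND [ARGS...]
# Succeeds only if COMMAND exits with status 0 within TIMEOUT_SECONDS;
# timeout(1) sends SIGTERM to COMMAND and returns 124 when the limit
# is exceeded.
probe_ok() {
  local limit=$1
  shift
  timeout "$limit" "$@" >/dev/null 2>&1
}

# Example watchdog loop (both script names are placeholders):
#   while sleep 10; do
#     probe_ok 60 ./ping-server.sh || ./kill-server.sh
#   done
```

Because the watchdog runs as its own process and the kill script only signals processes holding Server.lock, the supervising loop is left alone and simply starts the server again.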