[shell] Check if program is running with bash shell script?

This is an example of a bash script which checks for some running process (daemon or service) and does specific actions (reload, sends mail) if there is no such process running.

check_process(){
        # check the args
        if [ "$1" = "" ];
        then
                return 0
        fi

        #PROCESS_NUM => get the process number regarding the given thread name
        PROCESS_NUM='ps -ef | grep "$1" | grep -v "grep" | wc -l'
        # for degbuging...
        $PROCESS_NUM
        if [ $PROCESS_NUM -eq 1 ];
        then
                return 1
        else
                return 0
        fi
}

# Check whether the instance of thread exists:
while [ 1 ] ; do
        echo 'begin checking...'
        check_process "python test_demo.py" # the thread name
        CHECK_RET = $?
        if [ $CHECK_RET -eq 0 ]; # none exist
        then
                # do something...
        fi
        sleep 60
done

However, it doesn't work. I got "ERROR: Garbage option." for the ps command. What's wrong with these scripts? Thanks!

This question is related to shell ps

The answer is


PROCESS="process name shown in ps -ef"
START_OR_STOP=1        # 0 = start | 1 = stop

MAX=30
COUNT=0

until [ $COUNT -gt $MAX ] ; do
        echo -ne "."
        PROCESS_NUM=$(ps -ef | grep "$PROCESS" | grep -v `basename $0` | grep -v "grep" | wc -l)
        if [ $PROCESS_NUM -gt 0 ]; then
            #runs
            RET=1
        else
            #stopped
            RET=0
        fi

        if [ $RET -eq $START_OR_STOP ]; then
            sleep 5 #wait...
        else
            if [ $START_OR_STOP -eq 1 ]; then
                    echo -ne " stopped"
            else
                    echo -ne " started"
            fi
            echo
            exit 0
        fi
        let COUNT=COUNT+1
done

if [ $START_OR_STOP -eq 1 ]; then
    echo -ne " !!$PROCESS failed to stop!! "
else
    echo -ne " !!$PROCESS failed to start!! "
fi
echo
exit 1

You can achieve almost everything in PROCESS_NUM with this one-liner:

[ `pgrep $1` ] && return 1 || return 0

if you're looking for a partial match, i.e. program is named foobar and you want your $1 to be just foo you can add the -f switch to pgrep:

[[ `pgrep -f $1` ]] && return 1 || return 0

Putting it all together your script could be reworked like this:

#!/bin/bash

check_process() {
  echo "$ts: checking $1"
  [ "$1" = "" ]  && return 0
  [ `pgrep -n $1` ] && return 1 || return 0
}

while [ 1 ]; do 
  # timestamp
  ts=`date +%T`

  echo "$ts: begin checking..."
  check_process "dropbox"
  [ $? -eq 0 ] && echo "$ts: not running, restarting..." && `dropbox start -i > /dev/null`
  sleep 5
done

Running it would look like this:

# SHELL #1
22:07:26: begin checking...
22:07:26: checking dropbox
22:07:31: begin checking...
22:07:31: checking dropbox

# SHELL #2
$ dropbox stop
Dropbox daemon stopped.

# SHELL #1
22:07:36: begin checking...
22:07:36: checking dropbox
22:07:36: not running, restarting...
22:07:42: begin checking...
22:07:42: checking dropbox

Hope this helps!