
I am facing some weird behavior with my bash script. It's basically a script that tries to ping a remote host a number of times if it fails the first time. I do this to rule out false alerts. I thought I could quickly achieve this by writing a recursive function that calls itself and attempts the ping again.

My problem is with the returned value. I've found that the function emits its value once per invocation, so the value shows up as many times as the function was called. This is very odd. For instance, in my code below, the ip_up() function is supposed to return 1 for remote host up and 0 for down. However, when the remote host is down, the function returns 0 twice, once for each call that was made.

What could be the problem with my code or is this how bash works?

#!/bin/bash
    ip_up(){
            server_ip=$1
            trials=$2
            max_trials=2
            status=0
            echo "server ip is: $server_ip, trial $trials" >&2
            if ping -i 1 -c 3 "$server_ip" &> /dev/null
            then
                status=1
            else
                status=0
                while ((  "$trials" < "$max_trials" )); do
                    echo -e "$server_ip is down: Trial $trials, checking again after 1 sec" >&2
                    sleep 1
                    ((trials++))
                    ip_up "$server_ip" "$trials"
                done
            fi
            echo "$status"
    }

    status=$(ip_up "$ip" 1)
    echo -e "the returned status is: ====$status====\n"
   if [ "$status" -eq 0 ]; then
        msg="$timestamp: Server $hostname ($ip) is DOWN"; echo "$msg"
   fi

    <<'COMMENT'
    //results

     $ ./check_servers.sh
    checking box1(173.36.232.6)
    server ip is: 173.36.232.6, trial 1
    173.36.232.6 is down: Trial 1, checking again after 1 sec
    server ip is: 173.36.232.6, trial 2
    the returned status is: ====0
    0====

    ./check_servers.sh: line 41: [: 0
    0: integer expression expected
    Sat Jun  4 15:16:11 EAT 2016 box2 (173.36.232.7) is UP
    checking box2 (173.36.232.7)
    server ip is: 173.36.232.7, trial 1
    the returned status is: ====1====

    COMMENT
  • It's mostly a question of how recursion works in general, and therefore how Bash works when it implements recursion. Each recursive call to the function eventually returns; there is one return per invocation. If you don't want recursion and multiple returns, use iteration instead. In fact, you already have iteration in your code; it just iteratively recurses, which is pretty weird. Where you have if ping, you should use a while loop that tests both the iteration count and the ping status. The whole lot needs a serious rewrite. Commented Jun 4, 2016 at 14:21
  • Thanks @JonathanLeffler for your answer. It certainly makes sense and I've done a rewrite based on your advice. For the sake of learning recursion in bash, how would you implement it? Commented Jun 6, 2016 at 5:12
  • With all due respect, I'd not implement the code recursively, period. Well, not unless it was to be written in LISP, when I might not have a choice. Learn recursion on a problem defined recursively. I'll look at providing an answer to show how I'd write the process, but it won't involve recursion. If you're adamant that you need recursion, find someone else to help. Commented Jun 6, 2016 at 5:26

2 Answers

I can't imagine many circumstances where I'd use code with a one-second delay in the loop often enough to make it worth writing as a function; I'd use a relatively straightforward (iterative) script. However, it is far from impossible to turn the script into a function if you're sure that's a benefit to you; your circumstances are different from mine.

#!/bin/sh

[ $# = 1 ] || [ $# = 2 ] || { echo "Usage: $0 ip-address [max-trials]" >&2; exit 1; }
server_ip="$1"
maxtrials="${2:-2}"
trial=1

while echo "server: $server_ip, trial $trial" >&2
      ! ping -i 1 -c 3 "$server_ip" > /dev/null 2>&1 || exit 0
do
    trial=$(($trial + 1))
    [ "$trial" -gt "$maxtrials" ] && break
    echo "$0: $server_ip is down: checking again after 1 sec" >&2
    sleep 1
done

echo "$(date +'%Y-%m-%d %H:%M:%S'): Server $server_ip is DOWN"
exit 1

The first block of code sets up the controls, defaulting to 2 attempts.

The while loop control contains the echo and then attempts to ping the IP address (or host name). If the command succeeds (the host is pingable), then the ! ping status is false, so the || exit 0 is executed, and the script exits with a 0 status, indicating success (the host is pingable). If the command fails (the host is not pingable), then the ! ping status is true, so the || exit 0 is not executed, and the body of the loop is entered. It increments the trial number and breaks the loop if the limit is reached. Otherwise, it prints its message and sleeps and goes back to the start of the loop.
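
The control flow described above can be tried without a network by swapping ping for a stand-in. In this minimal sketch, probe and check are hypothetical names that do not appear in the script: probe fails a fixed number of times before succeeding, and check mirrors the loop but uses return instead of exit so that it can be called more than once.

```shell
#!/bin/bash
# Hypothetical stand-in for ping: fails until it has been called
# more than $fail_count times, then succeeds.
calls=0
probe() {
    calls=$((calls + 1))
    [ "$calls" -gt "$fail_count" ]
}

# Same shape as the script's loop: the exit status of the LAST command in
# the while condition list decides whether the body runs.
check() {
    maxtrials="$1"
    trial=1
    calls=0
    while echo "trial $trial" >&2
          ! probe || return 0      # probe succeeded: leave with status 0
    do
        trial=$((trial + 1))
        [ "$trial" -gt "$maxtrials" ] && break
    done
    return 1                       # every trial failed
}

fail_count=1
if check 2; then echo "up"; else echo "down"; fi

fail_count=5
if check 2; then echo "up"; else echo "down"; fi
```

With fail_count=1 the second trial succeeds, so the first test takes the "up" branch; with fail_count=5 both trials fail and the second test takes the "down" branch.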

The end block is only reached if the exit 0 was not executed, so the ping failed and the server is 'down' (or non-existent). You then get a time-stamped message indicating that the server is down, and exit with a non-zero status to indicate failure.

There are probably a myriad other ways to do this. I'd probably be more consistent with the error messaging — for example, I might well save arg0="$(basename "$0" .sh)" and then use $arg0 as a prefix to all messages (or possibly add it after the timestamp). It's possible to adapt this to report that the server is up. The code works with POSIX shells, not just Bash (so dash accepts it, for example, as does Korn shell, but the Heirloom (Bourne) Shell doesn't because it doesn't like either $(…) or $((…))).
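
As a rough sketch of the more consistent messaging mentioned above, the program name can be computed once and reused by a small helper. The log function is a hypothetical name introduced here for illustration, and 192.0.2.1 is just a documentation-range placeholder address.

```shell
#!/bin/bash
# Program name without its directory or a trailing .sh, as suggested above.
arg0="$(basename "$0" .sh)"

# Hypothetical helper: prefix every diagnostic with a timestamp and the
# program name, and send it to standard error.
log() {
    echo "$(date +'%Y-%m-%d %H:%M:%S'): $arg0: $*" >&2
}

log "192.0.2.1 is down: checking again after 1 sec"
```

Each echo ... >&2 line in the script could then become a call to log, so the prefix is defined in one place only.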

It would also be possible to write it as a simple counting loop which tests the status of ping, exiting on success, and doing the reporting and retry. However, it's tricky to avoid a last sleep 1 when the loop will exit without double testing the value of $trial. That isn't expensive at run-time, but it is a source of repetition and DRY — Don't Repeat Yourself — is a worthwhile principle to live up to.

#!/bin/bash

[ $# = 1 ] || [ $# = 2 ] || { echo "Usage: $0 ip-address [max-trials]" >&2; exit 1; }
server_ip="$1"
maxtrials="${2:-2}"

for ((trial = 1; trial <= maxtrials; trial++))
do
    echo "server: $server_ip, trial $trial" >&2
    if ping -i 1 -c 3 "$server_ip" > /dev/null 2>&1
    then exit 0
    elif [ "$trial" -lt "$maxtrials" ]
    then
        echo "$0: $server_ip is down: checking again after 1 sec" >&2
        sleep 1
    fi
done

echo "$(date +'%Y-%m-%d %H:%M:%S'): Server $server_ip is DOWN"
exit 1

I'm not entirely keen on that, but it works with Bash and Korn shell.

Converting the last script to a function is basically trivial — change the exit statements into return statements, and wrap a function start and end around it:

#!/bin/bash

function upip()
{
    [ $# = 1 ] || [ $# = 2 ] || { echo "Usage: $0 ip-address [max-trials]" >&2; return 1; }
    server_ip="$1"
    maxtrials="${2:-2}"

    for ((trial = 1; trial <= maxtrials; trial++))
    do
        echo "server: $server_ip, trial $trial" >&2
        if ping -i 1 -c 3 "$server_ip" > /dev/null 2>&1
        then return 0
        elif [ "$trial" -lt "$maxtrials" ]
        then
            echo "$0: $server_ip is down: checking again after 1 sec" >&2
            sleep 1
        fi
    done

    echo "$(date +'%Y-%m-%d %H:%M:%S'): Server $server_ip is DOWN"
    return 1
}

With the function saved in upip-func.sh, I sourced the file and ran a few tests:

$ . upip-func.sh
$ upip www.google.com
server: www.google.com, trial 1
$ echo $?
0
$ upip ping.google.com
server: ping.google.com, trial 1
bash: ping.google.com is down: checking again after 1 sec
server: ping.google.com, trial 2
2016-06-06 00:35:18: Server ping.google.com is DOWN
$ echo $?
1
$ if upip www.google.com; then echo OK; else echo Fail; fi
server: www.google.com, trial 1
OK
$ if upip ping.google.com; then echo OK; else echo Fail; fi
server: ping.google.com, trial 1
bash: ping.google.com is down: checking again after 1 sec
server: ping.google.com, trial 2
2016-06-06 00:38:32: Server ping.google.com is DOWN
Fail
$

4 Comments

Thanks @jonathan. This is well explained. If I were to wrap this in a bash function, say ip_up, which is called at a certain part of the script, how do I test for success or failure? Can I, for instance, use an if statement such as if ip_up; then echo "server is up"; fi?
You'd need to provide the 'ip-address' or hostname as an argument, but yes: if ip_up www.google.com; then echo server is up; fi. I used 'www.google.com' and 'ping.google.com' for positive and negative testing. You'd use return 0 in place of exit 0 (and return 1 in place of exit 1) in a function, of course. The rest should be substantially unchanged, I believe.
I've replaced exit with return, but it doesn't return anything. If I want to return a value, I have to echo it. I think I'll just have to go with the if test on the function. Thanks a lot.
Yes, it does: it returns either 0 (success) or 1 (failure) as the status, which can be tested by if ip_up www.google.com; then echo "OK"; fi. It's a testable status, not a character string in the output. You don't see any output from a test command [ do you? You shouldn't expect to see output from the ip_up function either, unless you decide to keep all the verbosity going to standard error.
1

Your function is not "returning" anything. It prints a value to stdout, and each invocation will do that.
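
The effect can be reproduced without ping. In this minimal sketch, countdown is a hypothetical recursive function in which every invocation echoes its own status line, the way ip_up does in the question.

```shell
#!/bin/bash
# Hypothetical recursive function with no ping involved: every invocation
# echoes a status line, and all of those lines go to the same stdout.
countdown() {
    if (( $1 > 1 )); then
        countdown $(( $1 - 1 ))   # the inner call's echo also reaches stdout
    fi
    echo 0
}

# Command substitution captures everything written to stdout,
# so two invocations yield two lines.
result=$(countdown 2)
printf '====%s====\n' "$result"
```

The captured value contains one line per invocation, which is why the question's output shows 0 twice and why the later [ ... -eq 0 ] test complains about a non-integer.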

If you want to simulate a function return with this mechanism, you need to capture and resend the value:
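
One way that might look, as a minimal sketch rather than a definitive implementation: the inner call is wrapped in a command substitution so its output is captured instead of leaking to stdout, and only the outermost echo is seen by the caller. Here probe is a hypothetical stand-in for the ping test (it always fails), and the use of local is an assumption added to keep each invocation's variables separate.

```shell
#!/bin/bash
# Hypothetical stand-in for the ping test; swap in the real command.
probe() { false; }

ip_up() {
    local server_ip=$1 trials=$2 max_trials=2 status=0
    if probe "$server_ip"; then
        status=1
    elif (( trials < max_trials )); then
        # Capture the inner call's output instead of letting it reach stdout,
        # and adopt it as this invocation's result.
        status=$(ip_up "$server_ip" $(( trials + 1 )))
    fi
    echo "$status"    # exactly one line reaches the outermost caller
}

status=$(ip_up 192.0.2.1 1)
echo "the returned status is: ====$status===="
```

Because each level resends only what it captured, the caller receives a single value no matter how deep the recursion went.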

Bash functions return an exit status, and this works as you might expect (as long as you expect 0 to be success). If you don't specify otherwise, the return value is that of the last command. So the following would work:

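tryn() {
  # $1 is the number of attempts left; the remaining arguments are the command.
  if (($1 == 0)); then return 2; fi
  "${@:2}" || tryn $(($1-1)) "${@:2}"
}

if tryn 2 ping -c 3 "$host"; then
   echo "$host is up"
fi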
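
To try the retry helper without a network, a command that fails a known number of times can stand in for ping. This sketch assumes a variant of tryn that runs only the arguments after the count (via "${@:2}"); flaky and attempts are hypothetical names introduced for the demonstration.

```shell
#!/bin/bash
# Retry helper: $1 is the number of attempts left, the rest is the command.
tryn() {
    if (( $1 == 0 )); then return 2; fi
    "${@:2}" || tryn $(( $1 - 1 )) "${@:2}"
}

# Hypothetical flaky command: fails on its first call, succeeds afterwards.
attempts=0
flaky() {
    attempts=$(( attempts + 1 ))
    (( attempts > 1 ))
}

if tryn 2 flaky; then
    echo "succeeded after $attempts attempts"
else
    echo "gave up"
fi
```

The result travels entirely through the exit status: nothing is echoed by tryn itself, so there is nothing to capture and nothing to duplicate.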

1 Comment

Thanks @rici. I've read that to return a value in bash, you echo the value instead of using return. In fact, I tried to use return, but it was never returning the desired value.
