19

I am trying to create a script that starts many background commands. For each background command I need to get the return code.

I have been trying the following script:

#!/bin/bash
set -x
pid=()
return=()


for i in 1 2
do
 echo start $i
 ssh mysql "/root/test$i.sh" &
 pid[$i]=$!
done

for i in ${#pid[@]}
do
echo ${pid[$i]}
wait ${pid[$i]}
return[$i]=$?

if [ ${return[$i]} -ne 0 ]
then
  echo mail error
fi

done

echo ${return[1]}
echo ${return[2]}

My issue is in the wait loop: if the second PID finishes before the first one, I won't be able to get its return code.

I know that I can run wait pid1 pid2, but with that command I can't get the return codes of all the commands.

Any ideas?

5 Answers

12

The issue is more with your

for i in ${#pid[@]}

which is for i in 2, because ${#pid[@]} expands to the number of elements in the array, not to its indices.

It should rather be:

for i in 1 2

or

for ((i = 1; i <= ${#pid[@]}; i++))
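
To see why the original loop visits only one index, it may help to print the expansions side by side. This is a minimal sketch that uses made-up placeholder values (111 and 222) in place of real PIDs; the variable names visited_by_count and visited_by_range are only for illustration.

```shell
#!/usr/bin/env bash
# Placeholder values standing in for real PIDs (illustrative only).
pid=()
pid[1]=111
pid[2]=222

count=${#pid[@]}        # number of elements in the array
keys="${!pid[*]}"       # the indices that are actually set

echo "count: $count"
echo "keys: $keys"

# Looping over the count visits a single value: the count itself.
visited_by_count=()
for i in ${#pid[@]}; do visited_by_count+=("$i"); done

# The C-style loop visits every index from 1 up to the count.
visited_by_range=()
for ((i = 1; i <= ${#pid[@]}; i++)); do visited_by_range+=("$i"); done

echo "for i in \${#pid[@]} visits: ${visited_by_count[*]}"
echo "C-style loop visits: ${visited_by_range[*]}"
```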

wait "$pid" will return the exit code of the job with bash (and POSIX shells, but not zsh) even if the job had already terminated when wait was started.
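
The following sketch tries to reproduce the situation from the question without ssh: two stand-in subshells (the sleep duration and the exit codes 3 and 7 are arbitrary assumptions) where the second job finishes well before the first. Waiting on the PIDs in order should still yield both statuses in bash.

```shell
#!/usr/bin/env bash
# Two stand-in jobs: the first is slower, so the second exits first.
pid=()
status=()

( sleep 1; exit 3 ) &   # slow job, exits with 3
pid[1]=$!
( exit 7 ) &            # fast job, exits with 7 almost immediately
pid[2]=$!

for i in 1 2; do
    # By the time job 2 is waited for it has already terminated,
    # yet wait still reports its exit status.
    # The "|| status=..." form also keeps the loop going under set -e.
    status[$i]=0
    wait "${pid[$i]}" || status[$i]=$?
    echo "job $i (pid ${pid[$i]}) exited with ${status[$i]}"
done
```

Because bash remembers the status of background children it has started, the order in which the jobs finish does not matter here, only that each PID is waited for once.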

8

You can do this by using a temporary directory.

# Create a temporary directory to store the statuses
dir=$(mktemp -d)

# Execute the backgrounded code. Create a file that contains the exit status.
# The filename is the PID of this group's subshell.
for i in 1 2; do
    { ssh mysql "/root/test$i.sh" ; echo "$?" > "$dir/$BASHPID" ; } &
done

# Wait for all jobs to complete
wait

# Get return information for each pid
for file in "$dir"/*; do
    printf 'PID %d returned %d\n' "${file##*/}" "$(<"$file")"
done

# Remove the temporary directory
rm -r "$dir"

6

A generic implementation without temporary files.

#!/usr/bin/env bash

## associative array for job status
declare -A JOBS

## run command in the background
background() {
  eval "$1" & JOBS[$!]="$1"
}

## check exit status of each job
## preserve exit status in ${JOBS}
## returns 0 if every job succeeded, otherwise the status of a failed job
reap() {
  local cmd pid
  local status=0
  for pid in "${!JOBS[@]}"; do
    cmd=${JOBS[${pid}]}
    wait ${pid} ; JOBS[${pid}]=$?
    if [[ ${JOBS[${pid}]} -ne 0 ]]; then
      status=${JOBS[${pid}]}
      echo -e "[${pid}] Exited with status: ${status}\n${cmd}"
    fi
  done
  return ${status}
}

background 'sleep 1 ; false'
background 'sleep 3 ; true'
background 'sleep 2 ; exit 5'
background 'sleep 5 ; true'

reap || echo "Ooops! Some jobs failed"
1
  • Thank you :-) This is exactly what I was looking for! Commented Dec 7, 2018 at 2:40
1

Bash 4.3 added -n to the wait builtin, and -p was added in version 5.1.

From https://www.gnu.org/software/bash/manual/html_node/Job-Control-Builtins.html

wait -n

If the -n option is supplied, wait waits for a single job from the list of pids or jobspecs or, if no arguments are supplied, any job, to complete and returns its exit status. [...]

wait -p

If the -p option is supplied, the process or job identifier of the job for which the exit status is returned is assigned to the variable varname named by the option argument. [...]

The combination of the two options means Bash 5.1+ is actually quite decent at basic multiprocessing. The main drawback now is really just tracking/managing stdout/stderr.

_job1 () { sleep "$( shuf -i 1-3 -n 1 )"s ; true ; }
_job2 () { sleep "$( shuf -i 1-3 -n 1 )"s ; return 42 ; }

limit="2"
i="0"

set -- _job1 _job2
while [ "$#" -gt "0" ] ;do

    until [ "$i" -eq "$limit" ] || [ "$#" -eq "0" ] ;do
        printf 'starting %s\n' "$1"
        "$1" &
        pids[$!]="$1"
        i="$(( i + 1 ))"
        shift
    done

    if wait -n -p ended_pid ;then
        return_code="$?"
        printf '%s succeeded, returning "%s"\n' "${pids[ended_pid]}" "$return_code"
    else
        return_code="$?"
        printf '%s FAILED, returning "%s"\n' "${pids[ended_pid]}" "$return_code"
    fi
    unset 'pids[ended_pid]'
    i="$(( i - 1 ))"

done

while [ "${#pids[@]}" -gt "0" ] ;do
    if wait -n -p ended_pid ;then
        printf '%s succeeded, returning "%s"\n' "${pids[ended_pid]}" "$?"
    else
        printf '%s FAILED, returning "%s"\n' "${pids[ended_pid]}" "$?"
    fi
    unset 'pids[ended_pid]'
done

More information (though not on wait -p): http://mywiki.wooledge.org/ProcessManagement

1

Stéphane's answer is good, but I would prefer

for i in ${!pid[@]}
do
    wait "${pid[i]}"
    return_status[i]=$?
    unset "pid[$i]"
done

which iterates over whichever keys of the pid array still exist, so you can adapt it, break out of the loop, and restart the whole loop, and it'll just work. You also don't need consecutive values of i to begin with.
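
As a rough illustration of the non-consecutive case, the sketch below stores PIDs under arbitrary keys (10 and 42 are assumptions chosen for the example, as are the exit codes) and removes each entry once its status has been recorded.

```shell
#!/usr/bin/env bash
# Sparse, non-consecutive keys (10 and 42 are arbitrary labels).
pid=()
return_status=()

( exit 0 ) &
pid[10]=$!
( exit 4 ) &
pid[42]=$!

for i in "${!pid[@]}"; do
    return_status[i]=0
    wait "${pid[i]}" || return_status[i]=$?
    unset "pid[$i]"      # drop the entry once its status is recorded
done

echo "remaining pids: ${#pid[@]}"
for i in "${!return_status[@]}"; do
    echo "job $i returned ${return_status[i]}"
done
```

Since finished entries are unset, running the first loop again would only visit jobs that have not been reaped yet.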

Of course, if you're dealing with thousands of processes then perhaps Stéphane's approach would be fractionally more efficient when you have a non-sparse list.

3
  • naming your array return freaked me out, man! Commented Dec 20, 2022 at 19:21
  • @ToddiusZho is return_status better? Commented Jan 2, 2023 at 20:44
  • (To be fair, the original question names the variable return; I was just copying that.) Commented Jan 2, 2023 at 20:51
