13

I use SSH to run some commands on multiple remote machines in a for loop. It executes the same command(s) for a list of IP addresses. Some of the IP addresses might be unreachable, so I used the ConnectTimeout option.

However, my script did not behave as I intended: it got stuck at the first unreachable IP instead of giving up and moving on to the next IP address in my list.

Here is the relevant part of my script:

for ip in ${IP} ; do
    ssh  -o BatchMode=yes \
         -o StrictHostKeyChecking=no \
         -o ConnectTimeout=10 \
         -l ${USERNAME} \
         ${SCRIPT_HOST} \
         "${COMMAND} -i $ip || echo timeout" \
         >> ./myscript.out
done

It works fine for reachable IPs, but if a specific IP is down, it waits for a while (much longer than 10 seconds, maybe 35-40 seconds) and prints an error message to my terminal:

ERROR connecting : Connection timed out

So I'm wondering which option I didn't use correctly.

2
  • Can't it run in the background? You could also ignore the error with <your command> 2>/dev/null. Commented Mar 20, 2014 at 14:47
  • Have you tried executing ssh in debugging mode (i.e. verbose mode)? Commented Apr 17, 2014 at 0:37

2 Answers

15

Your use of ConnectTimeout is correct, so it is not obvious why it only times out after 30 or more seconds.

Here's how I would change your script to avoid the timeout problem entirely:

  • Use GNU parallel to connect to more than one destination host at the same time.
  • Use the -f option of ssh to run each connection in the background.

Here is a solution with GNU parallel, running at most 50 connections at the same time:

parallel --gnu --bg --jobs 50 \
ssh -o BatchMode=yes \
    -o StrictHostKeyChecking=no \
    -o ConnectTimeout=10 \
    -l ${USERNAME} \
    {} \
    "${COMMAND} -i {} || echo timeout" \
::: ${IP}

parallel <command> ::: <arguments> runs <command> <argument> once for each item in the <arguments> list, with several of those runs executing at the same time. The placeholder for <argument> is {}.

Use parallel --jobs n to limit the number of parallel connections.
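If GNU parallel is not available, the same idea can be sketched with plain shell background jobs. This is only an illustration, not a drop-in replacement: check_host, run_all and SSH_CMD are hypothetical names that do not appear in the question, and the sketch assumes, as in the parallel example, that ssh connects directly to each IP.

```shell
# Sketch only: check_host and run_all are hypothetical helper names.
# SSH_CMD is an assumption too; it defaults to the real client and exists
# so the loop can be dry-run with a stand-in command.
SSH_CMD=${SSH_CMD:-ssh}

check_host() {
    local ip="$1" rc
    "$SSH_CMD" -o BatchMode=yes \
        -o StrictHostKeyChecking=no \
        -o ConnectTimeout=10 \
        -l "$USERNAME" \
        "$ip" \
        "$COMMAND"
    rc=$?
    # ssh exits with 255 when ssh itself fails (for example, a connect timeout);
    # otherwise it returns the exit status of the remote command.
    if [ "$rc" -eq 255 ]; then
        echo "ssh failed: $ip"
    fi
    return "$rc"
}

run_all() {
    local ip
    for ip in "$@"; do
        check_host "$ip" &   # one background job per host
    done
    wait                     # block until every background job has finished
}
```

It could be called as run_all ${IP} >> ./myscript.out. Unlike parallel --jobs 50, this sketch puts no cap on the number of simultaneous connections, and output from different hosts may interleave, so it is best suited to short lists.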


1

The connection timeout applies once a connection has already been established: if the connection stays idle for that many seconds, it is closed (unless you have also enabled the KEEP_ALIVE ssh parameter, which prevents a connection from ever being idle).

It takes 30+ seconds before you get a timeout because the TCP protocol's internal timer keeps trying to connect for that long and then returns the error message saying it cannot connect to the sftp server. The message does not come from ssh.

1 Comment

This answer contradicts the SSH documentation (also, even if the OS does not allow you to shorten the timeout on the socket proper, you could still run your own timer and drop the attempt after any time). Here's the relevant part from the ssh_config(5) manual page about the ConnectTimeout option: "Specifies the timeout (in seconds) used when connecting to the SSH server, instead of using the default system TCP timeout. This value is used only when the target is down or really unreachable, not when it refuses the connection."
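The "run your own timer" idea from this comment can be sketched with the timeout utility from GNU coreutils, which kills a command that runs past a wall-clock limit. This is a minimal sketch under assumptions: run_with_deadline is a hypothetical helper name, and GNU timeout is assumed to be installed.

```shell
# Sketch only: run_with_deadline is a hypothetical helper name.
# Assumes the GNU coreutils `timeout` command is available.
run_with_deadline() {
    local limit="$1" rc
    shift
    timeout "$limit" "$@"    # run the remaining arguments as a command
    rc=$?
    # GNU timeout exits with status 124 when it had to stop the command.
    if [ "$rc" -eq 124 ]; then
        echo "timeout"
    fi
    return "$rc"
}
```

Inside the loop from the question it might be used as run_with_deadline 15 ssh -o BatchMode=yes -o ConnectTimeout=10 -l "${USERNAME}" "${SCRIPT_HOST}" "${COMMAND} -i $ip". The limit bounds the whole ssh invocation, including the remote command, so it should be chosen larger than the command's normal running time.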
