Use Bash ssh to run a Java program on multiple remote machines

Question

I need to run a Java program on many remote machines. I'm using ssh in a loop and calling a remote script that runs the Java program.

As you can imagine, this is used for testing a distributed system on a cluster.

The problem is that the script hangs right after I enter the password for the first ssh session. It's probably a bash error, as the Java program runs fine locally.

The exact structure is this: a local bash script runs many remote bash scripts. Each remote script compiles and runs a Java program. This Java program starts a separate thread to do some work. When a SIGINT signal is received, the Java thread is informed so it can exit cleanly.

I made a simplified working example.

EDIT: the code below now works (fixed for posterity)

Please, if you want to answer, don't change the structure of the code too much, or it won't resemble the original one and I won't be able to understand what's wrong.

Bash script that's run by hand

#!/bin/bash

function startBatch()
{
    #the problem was using -n
    ssh -f "$1" "cd $projectDir;./startBatch.sh $2"
}

function stopBatch()
{
    #the problem was using -n
    ssh -f "$1" "pkill -f jnode_.*"
}

projectDir=NetBeansProjects/Runner

#start nodes
nodeNumber=0
while read node; do
    startBatch "$node" "$nodeNumber"
    nodeNumber=$(($nodeNumber + 1))
done < ./nodes.txt

sleep 3

#stop nodes
while read node; do
    stopBatch "$node"
done < ./nodes.txt

Bash script that is run by the other script

#!/bin/bash

#this is a simplified working example
myNumber=$1
$(exec -a jnode_"$myNumber" java -cp build/classes runner.Runner "$myNumber.txt")

Here's a less simplified version of the above script. Check the second part of the accepted answer if you want proper logging.

#!/bin/bash

batchNumber=$1
procNumber=0
batchSize=3
while [ "$procNumber" -lt "$batchSize" ]; do
    procName="$batchNumber"_"$procNumber"
    #this line was no good
    #$(exec -a jnode_"$procName" java -cp build/classes runner.Runner "$procName.txt" &)
    #this line works fine
    exec -a jnode_"$procName" java -cp build/classes runner.Runner "$procName.txt" 1>/dev/null 2>/dev/null &
    procNumber=$(($procNumber + 1))
done

Java Runner (the thing that starts the thread)

import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintStream;

public class Runner {

    public static void main(String[] args) throws FileNotFoundException, InterruptedException {
        //redirect all outputs to a given file
        PrintStream output = new PrintStream(new File(args[0]));
        System.setOut(output);
        System.setErr(output);

        //controlled object
        final MyRunnable myRunnable = new MyRunnable();

        //shutdown the controlled process on command
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                myRunnable.stop = true;
            }
        });

        //run the process
        new Thread(myRunnable).start();
    }
}

Java MyRunnable (the running thread)

public class MyRunnable implements Runnable {

    //volatile: written by the shutdown hook thread, read by the worker thread
    public volatile boolean stop = false;

    @Override
    public void run() {
        while (!stop) {
            try {
                System.out.println("running");
                Thread.sleep(1000);
            } catch (InterruptedException ex) {
                System.out.println("interrupted");
            }
        }
        System.out.println("stopping");
    }
}

Do not use System.exit() in your Java program, or the shutdown hook may not be properly called (or completely executed). Send a signal from outside instead: SIGINT works, and so does the SIGTERM that pkill sends by default.

As was mentioned in the comments, typing passwords gets tedious. Password-less RSA keys are an option, but we can do better. Let's add some security features.

Create the public/private key pair

ssh-keygen -t rsa
Enter file in which to save the key (/home/your_user/.ssh/id_rsa): [input ~/.ssh/nameOfKey]
Enter passphrase (empty for no passphrase): [input a passphrase not weaker than your ssh password]

Add the public key to the authorized_keys file of the remote hosts, so it can be authenticated.

#first option (use proper command)
ssh-copy-id [email protected]

#second option (append the key at the end of the file)
cat ~/.ssh/nameOfKey.pub | ssh [email protected] "cat >> ~/.ssh/authorized_keys"

Now, if we use ssh-agent, each passphrase will be asked for only once (when the key is added to the agent). Notice that it will ask for the passphrases (the ones entered when creating the keys), not for the actual ssh passwords.

#activate the agent
eval `ssh-agent`

#add the key, its passphrase will be asked
ssh-add ~/.ssh/keyName1

#add more keys, if needed
ssh-add ~/.ssh/keyName2

You now have a very simple yet functional testing framework for your distributed system. Have fun.

I know this isn't a direct answer to your question, but one possibility is to create a passwordless key (ssh-keygen -t rsa) on your machine then stick the public key in the authorized_keys2 on each of the remote machines, then you won't have to deal with the passwords when connecting from your machine. The SSH password prompt tends to wreak havoc on script interactivity sometimes. Comes with associated security pitfalls, but they may not matter for your situation.
– Jason C, Commented Mar 5, 2014 at 17:15

Jason C · Accepted Answer · 2014-03-10 15:45:20Z


When executing remote commands, SSH won't exit until the remote command is complete (and nothing is still holding its output streams open). Your remote script won't exit until the Java program is complete, a Java program won't exit until all its non-daemon threads exit, and your Java program runs forever. Therefore, the ssh invocation in your controlling script runs forever (well, until you kill it through some other means) and your script hangs.

You need to decide on a way to make your SSH remote command return immediately. You have options. The easiest is probably just to add & to the remote command in the controlling script, as:

ssh -n "$1" "cd $projectDir;./startBatch.sh $2 &"

A more robust option is to invoke java with & in the remote script, and leave the ssh call in the controlling script as you have it now (no &). That way you have a chance to completely read e.g. error messages produced by the remote script.
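One detail that may explain why adding & alone is sometimes not enough: a caller that reads a command's output keeps waiting while any background process still holds that output stream open. The sketch below reproduces the effect locally, using command substitution as a stand-in for ssh and sleep as a stand-in for java. The helper name elapsed_seconds is hypothetical, and this is only an analogy for what ssh does, not a test of ssh itself.

```shell
#!/bin/bash
# Local stand-in for the ssh behaviour: command substitution, like ssh,
# keeps waiting until every process holding the output pipe has closed it.

elapsed_seconds() {
    # Run the given command string inside a command substitution and
    # print how many whole seconds passed before it returned.
    local start end
    start=$(date +%s)
    : "$(sh -c "$1")"
    end=$(date +%s)
    echo $((end - start))
}

# Background job inherits stdout: the caller waits for it to finish.
inherited=$(elapsed_seconds 'sleep 2 & echo started')

# Background job detached from stdout/stderr: the caller returns right away.
detached=$(elapsed_seconds 'sleep 2 >/dev/null 2>&1 & echo started')

echo "inherited stdout: ${inherited}s, detached: ${detached}s"
```

This is consistent with the observation in the comments that redirecting the java output to /dev/null made the production script behave.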

Side Note: As for the password itself (which you will ultimately have to deal with once you get past the current hurdle), as mentioned in my comment on the question: One possibility is to create a passwordless key (ssh-keygen -t rsa) on your machine then stick the public key in the authorized_keys2 on each of the remote machines, then you won't have to deal with the passwords when connecting from your machine. The SSH password prompt tends to wreak havoc on script interactivity sometimes. Comes with associated security pitfalls, but they may not matter for your situation.

Responding to comments below. You have a couple of options. If you want to capture everything to the same log file, with append, don't redirect your program outputs, and just redirect everything the while loop does to a log, e.g.:

while [ "$procNumber" -lt "$batchSize" ]; do
    procName="$batchNumber"_"$procNumber"
    exec -a jnode_"$procName" java -cp build/classes runner.Runner "$procName.txt" &
    procNumber=$(($procNumber + 1))
done >> "$myLog" 2>&1

If you want one log per process, with append:

while [ "$procNumber" -lt "$batchSize" ]; do
    procName="$batchNumber"_"$procNumber"
    exec -a jnode_"$procName" java -cp build/classes runner.Runner "$procName.txt" >> "$myLog.$procNumber" 2>&1 &
    procNumber=$(($procNumber + 1))
done

You could also combine the above two, if you want to separate application output from the output of other commands in the loop.
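A minimal sketch of that combination, assuming a hypothetical run_worker function in place of the java command so it can run anywhere, and a temporary directory for the logs (the answer leaves myLog undefined):

```shell
#!/bin/bash
# Hypothetical stand-in for the java invocation.
run_worker() { echo "worker $1 running"; }

logDir=$(mktemp -d)          # assumption: any writable directory would do
myLog="$logDir/batch.log"
batchNumber=0
procNumber=0
batchSize=3

while [ "$procNumber" -lt "$batchSize" ]; do
    procName="${batchNumber}_${procNumber}"
    # Output of other commands in the loop goes to the shared loop log.
    echo "starting jnode_$procName"
    # Application output goes to its own per-process log.
    run_worker "$procName" >> "$myLog.$procNumber" 2>&1 &
    procNumber=$((procNumber + 1))
done >> "$myLog" 2>&1

# Only here so the sketch can inspect the logs afterwards;
# the real remote script would return without waiting.
wait
```

The redirection on the inner command takes precedence over the one on the loop, so the worker output never reaches the shared log.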

edited Mar 10, 2014 at 15:45

answered Mar 5, 2014 at 17:23

Jason C


6 Comments

Agostino Over a year ago

The small example now works, but the production version seems to start only one process per remote machine. I added a less simplified version of the remote script that shows the problem. All the log files are created immediately, but only the first instance on each machine starts filling them immediately, while the other 2 only start filling them once the 1st instance is shut down.

Agostino Over a year ago

I think I got it. I simplified the execution line and threw away the output (redirect to /dev/null). Now it works, apparently. Weird!

Jason C Over a year ago

@Agostino Glad you got it. A bottom-up approach to testing would be the best approach in a situation like this. First make sure your Java program is functioning as expected. Then develop and test your remote script, make sure it starts all required instances and exits asap, printing/returning relevant status info (if you care) and leaving your program running in the background. Then put together your server-side script, and watch everything fall into place.

Jason C Over a year ago

BTW if this is some kind of "service" type application, you might also consider using a System V style (/etc/init.d) or upstart (if your system has it) job to manage local instances on the remote machines instead of custom scripts. The init systems have nice easy-to-use frameworks for starting and stopping background applications smoothly.

Agostino Over a year ago

I tried to go by steps and make this work locally at first. Not enough, I watched it fall to pieces :D I remember I saw some warnings on the bigger application complaining about some missing file that was really there. Is this OK to capture such complaints? 1>"$myLog";2>>"$myLog"; before the while and then exec ... 1>>"$myLog" 2>>"$myLog" & inside it.

Richard Miskin · Accepted Answer · 2014-03-05 17:21:53Z

The man page for ssh suggests that using -n will not work if ssh needs to ask for a password. You should be using -f, or set up passwordless ssh so you don't need to enter the passwords.

Quoting from the Mac OS X man page for ssh gives:

 -n      Redirects stdin from /dev/null (actually, prevents reading from stdin).  This must be used when ssh is run in the background.  A common trick is to use this to run X11 programs on a remote machine.  For example, ssh -n shadows.cs.hut.fi emacs & will start an emacs on shadows.cs.hut.fi, and the X11 connection will be automatically forwarded over an encrypted channel.  The ssh program will be put in the background.  (This does not work if ssh needs to ask for a password or passphrase; see also the -f option.)

And also:

 -f      Requests ssh to go to background just before command execution.  This is useful if ssh is going to ask for passwords or passphrases, but the user wants it in the background.  This implies -n.  The recommended way to start X11 programs at a remote site is with something like ssh -f host xterm.

         If the ExitOnForwardFailure configuration option is set to "yes", then a client started with -f will wait for all remote port forwards to be successfully established before placing itself in the background.
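
The stdin handling quoted above also matters for the loop in the question, because nodes.txt is the loop's stdin. A command inside a while read loop that reads stdin will swallow the remaining lines, so the loop ends after the first node. The sketch below shows this locally, assuming two hypothetical stand-ins for ssh built from cat: one that reads stdin, and one whose stdin comes from /dev/null (as with -n, which -f implies).

```shell
#!/bin/bash
# Hypothetical stand-ins for ssh: one reads stdin (like plain ssh),
# one has stdin redirected from /dev/null (like ssh -n or ssh -f).
greedy_remote()  { cat > /dev/null; }
careful_remote() { cat < /dev/null > /dev/null; }

count_nodes_visited() {
    # $1 is the stand-in to call once per node; prints the iteration count.
    visited=0
    while read -r node; do
        "$1" "$node"
        visited=$((visited + 1))
    done <<EOF
node1
node2
node3
EOF
    echo "$visited"
}

greedy_count=$(count_nodes_visited greedy_remote)
careful_count=$(count_nodes_visited careful_remote)
echo "stdin inherited: $greedy_count node(s) visited"
echo "stdin from /dev/null: $careful_count node(s) visited"
```

With stdin inherited, only the first node is visited; with stdin from /dev/null, all three are.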
