
I can't think of a way to do what I'm trying to do and am hoping for a little advice. I am working with data on a computing cluster and would like to process individual files on separate computing nodes. The workflow I have right now is something like the following:

**file1.py**

Get files, parameters, other info from user

Then call: file2.sh

**file2.sh**

Submit file3.py to a computing node

**file3.py**

Process input file with parameters given

What I am trying to do is call file2.sh and pass it each input data file one at a time so that there are multiple instances of file3.py running, one per file. Is there a good way to do this?

I suppose the root of the problem is that if I were to iterate through a list of input files in file1.py, I don't know how to then pass that information to file2.sh and on to file3.py.
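For concreteness, here is a rough sketch of what I imagine file1.py doing; the file names, parameters, and argument order are just placeholders:

```python
import subprocess

# Placeholder list of input files and parameters gathered from the user
input_files = ["data1.txt", "data2.txt", "data3.txt"]
params = "--threshold 0.5"

for f in input_files:
    # Launch file2.sh once per input file; each invocation would then
    # submit its own copy of file3.py to a computing node
    subprocess.Popen(["./file2.sh", f, params])
```

file2.sh would presumably forward its positional arguments ("$1", "$2") to whatever line submits file3.py, but I'm not sure whether this is the right approach.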

  • use subprocess (Commented Feb 3, 2016 at 0:22)

1 Answer


From this description, I'd say the straightforward way is to call file2.sh directly from Python.

import commands  # Python 2 standard-library module (removed in Python 3)
status, result = commands.getstatusoutput("file2.sh " + arg_string)

Is that enough of a start to get you moving? Are the nodes able to launch a command directly on one another? If not, you may want to look up interprocess communication on Linux. If they aren't even on the same host, you'll likely need REST calls (POST and GET operations), which adds considerably more overhead.
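Note that the commands module is Python 2 only (it was removed in Python 3). If you're on Python 3, a roughly equivalent call with subprocess might look like this; arg_string here is just a placeholder for whatever arguments file1.py builds up:

```python
import subprocess

arg_string = "input1.dat --threshold 0.5"  # placeholder arguments

# Run file2.sh with those arguments and capture its output and exit status.
# shell=True mirrors the string-based call above; passing a list of
# arguments instead is generally safer.
proc = subprocess.run("./file2.sh " + arg_string, shell=True,
                      capture_output=True, text=True)
status, result = proc.returncode, proc.stdout
```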
