2

Is it possible to parallelize the following code?

for word in $(cat FileNames.txt)
do 
   for i in {1..22}
   do  
      Rscript assoc_test.R...........

   done >> log.txt
done 

I have tried to parallelize it without success. I put ( ) around the Rscript assoc_test.R........... call and followed it with &, but that does not produce the expected results and the log file ends up empty. Any suggestions or help would be appreciated. TIA.

1
  • 1
    What does your code look like with the & added? That should do the parallelizing, but you may need to wait a while for the Rscript jobs to finish. Commented Sep 5, 2019 at 10:29
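
A minimal sketch of the & approach the comment describes, under stated assumptions: run_test is a hypothetical stand-in for the elided Rscript command, the inline word list stands in for FileNames.txt, and the loop bounds are shortened. Each job writes to its own log file, and wait blocks until every background job has finished. Without the wait, the logs may be inspected before the jobs have written anything.

```shell
# Hypothetical stand-in for the elided "Rscript assoc_test.R..." command.
run_test() {
    echo "processed $1 index $2"
}

logdir=$(mktemp -d)   # scratch directory for the per-job logs

for word in geneA geneB; do          # stand-in for: $(cat FileNames.txt)
    for i in 1 2 3; do               # the original loops over {1..22}
        # Start each job in the background with its own log file.
        run_test "$word" "$i" > "$logdir/log.$word.$i" &
    done
done

wait   # block until every background job has finished

job_count=$(ls "$logdir" | wc -l | tr -d ' ')
echo "finished $job_count jobs"
```

Note that this starts all jobs at once, with no limit on how many run concurrently, which may be too many for a long list of file names.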

2 Answers

3

You can change your script to output the commands to run, and feed the results into GNU parallel:

for word in $(cat FileNames.txt)
do 
   for i in {1..22}
   do  
      echo Rscript assoc_test.R........... \> log.$word.$i
   done
done | parallel -j 4

Some details:

  • parallel -j 4 will keep 4 jobs running at a time; replace 4 with the number of CPUs you want to use.
  • Notice that I redirect the output to log.$word.$i and escape the redirection operator > as \>, so that it is printed as part of the command instead of redirecting the echo itself. I still need to test this to make sure it works, but the point is that since you're going parallel, you don't want to jumble all your outputs together.
  • Make sure you escape anything else the echo might interpret. The output should be valid command lines that parallel can run.
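
To illustrate the escaping point, here is a small sketch that only prints the generated command lines so they can be checked before anything runs. my_command is a hypothetical placeholder for the elided Rscript invocation, and the inline word list stands in for FileNames.txt.

```shell
# Print the command lines the loop would hand to parallel, without running them.
generate_commands() {
    for word in geneA geneB; do      # stand-in for: $(cat FileNames.txt)
        for i in 1 2; do             # the original loops over {1..22}
            # \> keeps the redirection literal, so it becomes part of the
            # printed command instead of redirecting echo's own output.
            echo my_command "$word" "$i" \> "log.$word.$i"
        done
    done
}

commands=$(generate_commands)
echo "$commands"
```

Each printed line should read like my_command geneA 1 > log.geneA.1. If the > is missing from the printed lines, the redirection was applied to the echo itself.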

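As an alternative to parallel, you can also use xargs with -P to set the number of simultaneous jobs and -I to substitute each input line (GNU xargs deprecates -i in favor of -I). See this question for more information.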
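
A rough sketch of the xargs route, assuming an xargs that supports -P (the GNU and BSD versions do). The loop prints one complete command line per job and xargs passes each line to sh -c. Here echo is a hypothetical stand-in for the elided Rscript command, and the inline word list stands in for FileNames.txt.

```shell
outdir=$(mktemp -d)   # scratch directory for the per-job logs

for word in geneA geneB; do          # stand-in for: $(cat FileNames.txt)
    for i in 1 2 3; do               # the original loops over {1..22}
        # One complete command line per job; the inner echo is a
        # hypothetical stand-in for the "Rscript assoc_test.R..." call.
        echo "echo done $word $i > $outdir/log.$word.$i"
    done
done | xargs -P 4 -I{} sh -c '{}'    # run up to 4 command lines at a time

log_count=$(ls "$outdir" | wc -l | tr -d ' ')
echo "wrote $log_count log files"
```

Because xargs interprets quotes and backslashes in its input, command lines that contain them would need extra care with this approach.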

3 Comments

Okay, this is much better than the system I've been using, which backgrounds a block of code and tracks whether it's complete through a complicated mess of PIDs and counters. I'll be rewriting it with this. I like spending time looking at random questions, because I learn things. =)
Thank you, Joanis. I have tried the above solution, but it gives a syntax error, the same error I got with the "&" approach.
Troubleshooting ideas: 1) Please cut and paste the syntax error you are getting here. 2) Can you show the exact Rscript... command you are running? Maybe that's where the problem lies. 3) Can you show what the output of the loop's echo looks like? That might tell us whether something is not escaped correctly. 4) Is the error from the loop or from running the commands? Save the output of the loop in a file and then cat that file to parallel to find out when the error is printed.
2

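GNU Parallel is made for replacing loops, so the double loop can be replaced by a single command. Here :::: reads arguments from a file, ::: takes them from the command line, and {1} and {2} refer to the first and second input source: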

parallel Rscript assoc_test.R... \> log.{1}.{2} :::: FileNames.txt ::: {1..22} > log.txt 
