
How can I run a perl script in parallel with different input params each time:

Illustration:

perl example.pl param1 param2
perl example.pl param3 param4

I want to run the Perl script example.pl two or more times with different input parameters. Every run should execute in parallel.

A sample algorithm is as follows:

my $params='1,2,3,4,5';   
my @all_params = split(/\;/, $params);
foreach my $entry (@all_param)
    {
      perl example.pl $entry
    }

I want to run the Perl script in parallel for each iteration of the loop.

  • split using for comma or semicolon Commented Apr 18, 2017 at 5:06
  • @ssr1012, kindly request you to please elaborate a bit.. Commented Apr 18, 2017 at 5:07
  • perl example.pl $entry instead of system("call perl example.pl $entry"). try to find out some information's in search engine. Commented Apr 18, 2017 at 5:08
  • There are many ways. Perhaps try Parallel::ForkManager. You will need to learn a bit about what is behind it but this may be an easiest one to start with. Search this site, there are many many posts about what you are asking. Commented Apr 18, 2017 at 5:50
  • Voting to close as duplicate of how-to-call-single-perl-script-to-run-parallely-through-loop-for-different-input Commented Apr 18, 2017 at 5:59

2 Answers


You're asking about something that seems pretty simple, but is actually altogether more complicated than it seems.

It's not too hard to parallelise in perl, but ... here be dragons. Parallel code introduces a whole new set of bugs and race conditions as your program becomes non-deterministic. You can no longer know the sequence of execution reliably. (And if you assume that you do, you'll create a race condition).

But with that in mind - there are really 3 (ish?) ways to go about it.

Fork

Use Parallel::ForkManager and enclose your inner loop in a fork. This works nicely for 'simple' parallelism, but communicating between your forks is difficult.

#!/usr/bin/env perl

use strict;
use warnings;

use Parallel::ForkManager;

my $manager = Parallel::ForkManager->new(2);    #2 concurrent

my $params = '1,2,3,4,5';
my @all_params = split( /,/, $params );

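foreach my $entry (@all_params) {
   $manager->start and next;    # parent: fork a child, then move on to the next entry
   # your code to run in parallel here (this part runs in the child)
   print "$entry\n";
   $manager->finish;            # child exits here
}

$manager->wait_all_children;    # parent waits until every child has finished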

You can just roll your own using fork but you're probably going to trip over doing that. So Parallel::ForkManager is the tool for the job.
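For illustration, here is a minimal sketch of what "rolling your own" might look like using only core Perl (fork, exec and waitpid). The helper name run_in_parallel is hypothetical, and the one-liner passed as the command is only a stand-in for `perl example.pl $entry`; it assumes a Unix-like system where fork is available.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Run one child process per parameter, at most $max_procs at a time.
# Returns a hash ref mapping each parameter to its child's exit code.
# (run_in_parallel is a hypothetical helper, not part of any module.)
sub run_in_parallel {
    my ( $max_procs, $command, @params ) = @_;
    my ( %param_for_pid, %exit_code_for );

    # Block until one child exits and record its exit code.
    my $reap_one = sub {
        my $pid = waitpid( -1, 0 );
        die "no child process left to wait for\n" if $pid <= 0;
        my $param = delete $param_for_pid{$pid};
        $exit_code_for{$param} = $? >> 8 if defined $param;
    };

    foreach my $param (@params) {
        # At the limit: wait for a free slot before forking again.
        $reap_one->() while keys(%param_for_pid) >= $max_procs;

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {
            # Child: replace this process with the command (no shell involved).
            exec( @$command, $param ) or die "exec failed: $!";
        }
        $param_for_pid{$pid} = $param;    # parent remembers which child got which param
    }

    # Wait for the children that are still running.
    $reap_one->() while %param_for_pid;
    return \%exit_code_for;
}

# Stand-in for "perl example.pl $entry": a one-liner that exits with its argument.
my @command = ( $^X, '-e', 'exit $ARGV[0]' );
my $results = run_in_parallel( 2, \@command, 1 .. 5 );
print "param $_ exited with code $results->{$_}\n" for sort keys %$results;
```

Even this small version has to track PIDs, enforce the concurrency limit and reap children by hand, which is the bookkeeping Parallel::ForkManager does for you.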

Thread:

#!/usr/bin/env perl

use strict;
use warnings;

use threads;
use Thread::Queue;

my $work_q = Thread::Queue->new;

sub worker {
   while ( defined( my $item = $work_q->dequeue ) ) {    # defined() so an item of 0 does not end the loop
      print $item, "\n";
   }
}

my $params = '1,2,3,4,5';
my @all_params = split( /,/, $params );
$work_q->enqueue(@all_params);
$work_q->end;

threads->create( \&worker ) for 1 .. 2;    #2 in parallel
foreach my $thr ( threads->list ) {
   $thr->join;
}

This is more suitable if you need to do more IPC - threading is (IMO) generally better for that. However, you shouldn't treat Perl threads as lightweight, whatever you may expect from other languages: each new thread gets its own copy of the interpreter and its data, so creating one is about as heavy as a fork.

Using IO::Select and multiple open calls to parallelise:

#!/usr/bin/env perl

use strict;
use warnings;

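use IO::Select;

my $params = '1,2,3,4,5';
my @all_params = split( /,/, $params );

my $select = IO::Select->new;

foreach my $param (@all_params) {
   # start the command and read its STDOUT through a pipe
   open( my $io, '-|', "program_name $param" )
      or die "Cannot run program_name: $!";
   $select->add($io);
}

# keep going while at least one command is still producing output
while ( $select->count ) {
   foreach my $fh ( $select->can_read ) {
      my $line = <$fh>;
      if ( defined $line ) {
         print $line;
      }
      else {
         # EOF: this command has finished
         $select->remove($fh);
         close $fh;
      }
   }
}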

You can do something similar via IPC::Open3 to open file descriptors for STDIN and STDERR.
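As a rough sketch of that idea, the following uses the core IPC::Open3 module together with IO::Select to collect a child's STDOUT and STDERR separately. The helper name capture_streams is hypothetical, and the one-liner is only a stand-in for a real command.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use IPC::Open3;
use IO::Select;
use Symbol 'gensym';

# Run a command and return its STDOUT text, STDERR text and exit code.
# (capture_streams is a hypothetical helper name.)
sub capture_streams {
    my (@command) = @_;

    my $err = gensym;    # open3 needs a ready-made handle for STDERR
    my $pid = open3( my $in, my $out, $err, @command );
    close $in;           # nothing to send to the child's STDIN

    my %buffer  = ( out => '', err => '' );
    my %name_of = ( fileno($out) => 'out', fileno($err) => 'err' );
    my $select  = IO::Select->new( $out, $err );

    # Read whichever stream has data until both reach EOF.
    while ( $select->count ) {
        foreach my $fh ( $select->can_read ) {
            my $bytes = sysread( $fh, my $chunk, 4096 );
            if ($bytes) {
                $buffer{ $name_of{ fileno($fh) } } .= $chunk;
            }
            else {
                $select->remove($fh);    # EOF or read error: stop watching
                close $fh;
            }
        }
    }

    waitpid( $pid, 0 );    # reap the child so it does not linger as a zombie
    return ( $buffer{out}, $buffer{err}, $? >> 8 );
}

# Stand-in command: writes one line to each stream and exits with code 3.
my ( $stdout, $stderr, $code ) = capture_streams(
    $^X, '-e', 'print "to stdout\n"; print STDERR "to stderr\n"; exit 3'
);
print "STDOUT: $stdout", "STDERR: $stderr", "exit code: $code\n";
```

Using sysread rather than <$fh> avoids mixing buffered reads with select, which can otherwise leave data sitting unseen in Perl's buffer.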

Should I?

Parallel code isn't a magic bullet. What it does is reduce 'blocks' and lets you consume resources. If your limiting resource is CPU, and you have 10 CPUs, then using 10 in parallel is going to speed you up.

... but if your limiting resource is IO - network or disk bandwidth - it often doesn't help, because contention actually makes the problem worse. Disk controllers in particular already parallelise, prefetch and cache quite efficiently, so your gains from hitting them in parallel are often quite marginal.


3 Comments

Thanks a lot @Sobrique for a detailed explanation, it helped a lot in understanding the intrinsics. For now I am using the ForkManager method for parallel builds and its giving me expected results. Just for inquisitiveness, is it also possible to implement nested forks? I want to do this to parallelise the builds even more..
Yes. You can fork and fork again - each time you 'fork' you split your process into two identical copies with exactly the same state, aside from the return code of fork. Doing this however, is a pretty good way to go exponential and 'fork bomb' if you're not REALLY careful. So better to stick with Parallel::ForkManager generally.
Thanks @Sobrique !

There's no real need to write any code (Perl or otherwise) to run your scripts in parallel. You can just use GNU Parallel and control how many run at a time, how many different servers the scripts run across, where the results go, and just about any other aspect.

So, if you have a file called params.txt which contains:

param1 param2
param3 param4

you can just do this in the Terminal:

parallel --colsep ' ' -a params.txt perl example.pl {1} {2}

If you want a progress bar, just add --bar:

parallel --bar ...

If you want to run exactly 8 at a time:

parallel -j 8 ...

If you want to see what it would do without actually doing anything:

parallel --dry-run ...

1 Comment

Thanks Mark, yes we can do it this way. I generally use this way while running scripts directly from bash prompt. This usecase that i want to implement is in Jenkins2.x where i want to run some build scripts in parallel. Unfortunately i'm facing issues in using the built in jenkins "parallel" command for this purpose.
