bash scripting: parallel for loops

Question

I am writing a bash script to loop through two identical directory trees and run diff on matching files. I just need to know the correct syntax to set up two loops that run in parallel. I'd also like to iterate through the directories recursively, so if there's an additional step to accomplish that, can you mention it?

#!/bin/bash
FILES1=/path/to/one
FILES2=/path/to/two
for f1 in $FILES1; f2 in $FILES2
do
  echo "Processing $f1 $f2 file..."
done

Do you actually want to process the first file from $FILES1 with the first from $FILES2, then second with second, and so on, or each file from $FILES1 with each from $FILES2 (i.e. a cross product)?
– Paŭlo Ebermann, Commented Nov 7, 2011 at 16:21
Paulo, I want the former: first with first, second with second, etc.
– ted.strauss, Commented Nov 7, 2011 at 16:49
jamessan, you solved my problem the best possible way, a one-liner. Why didn't you post it as an answer? I'll leave the question as it is b/c the loop question may be useful for others.
– ted.strauss, Commented Nov 7, 2011 at 20:01

jamessan · Accepted Answer · 2011-11-08 01:37:42Z

3

diff already has built-in support for comparing directories recursively. You can run diff -Naur /path/to/one /path/to/two.

-N treats a file that is missing from one tree as empty, so the full diff is shown for new files instead of just a note that the file exists in only one path
-a treats all files as text, so you may or may not need this
-u uses the common unified diff format
-r is the important flag in this case: it makes diff recurse into subdirectories
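
As an illustration of these flags, the sketch below builds two small throwaway trees and compares them. The temporary directory and the file names are made up for the example; only the diff invocation comes from the answer.

```shell
# Build two small throwaway trees (hypothetical names) to compare.
tmp=$(mktemp -d)
mkdir -p "$tmp/one/sub" "$tmp/two/sub"
printf 'same\n'  > "$tmp/one/a.txt"
printf 'same\n'  > "$tmp/two/a.txt"
printf 'old\n'   > "$tmp/one/sub/b.txt"
printf 'new\n'   > "$tmp/two/sub/b.txt"
printf 'extra\n' > "$tmp/two/sub/c.txt"   # exists only in the second tree

# diff exits 0 when the trees match, 1 when they differ, >1 on trouble.
status=0
out=$(diff -Naur "$tmp/one" "$tmp/two") || status=$?
printf '%s\n' "$out"
echo "diff exit status: $status"
rm -rf "$tmp"
```

Because of -N, the file that exists only in the second tree shows up as a full diff against an empty file rather than as an "Only in" message.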



Karoly Horvath · Accepted Answer · 2011-11-07 16:36:47Z

3

There is no guarantee that the files will be in the same order in the two directories.

Do it like this:

FILES1=/path/to/one
FILES2=/path/to/two
for f1 in "$FILES1"/*
do
  # Same file name, looked up in the second directory
  f2=$FILES2/$(basename "$f1")
  # ...
done

And if you need to do it recursively:

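cd "$FILES1" || exit 1
# Read one path per line so names containing spaces are not split
find . -type f | while IFS= read -r f1; do
  f2=$FILES2/$f1
  # ...
done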
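
A more defensive variant of the recursive loop might look like the sketch below. It is only one possible approach: the helper name compare_trees is made up for illustration, and the sketch assumes the first directory is passed without a trailing slash. It lets find pass the file names as arguments to a small inline script, so names with spaces or newlines are never split, and it reports files that have no counterpart in the second tree.

```shell
# compare_trees DIR1 DIR2 (hypothetical helper): diff each regular file under
# DIR1 against the file with the same relative path under DIR2.
compare_trees() {
  # find hands the file names to the inline script as arguments, so names
  # containing spaces or newlines are never split.
  find "$1" -type f -exec sh -c '
    dir1=$1 dir2=$2
    shift 2
    for f1 in "$@"; do
      rel=${f1#"$dir1"/}            # path relative to DIR1
      if [ -f "$dir2/$rel" ]; then
        diff -u "$f1" "$dir2/$rel"
      else
        printf "missing in %s: %s\n" "$dir2" "$rel"
      fi
    done
  ' sh "$1" "$2" {} +
}
```

It would be called as compare_trees /path/to/one /path/to/two. Files that exist only under the second directory are not reported; swapping the arguments covers that side.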



jamessan · Accepted Answer · 2011-11-07 17:19:01Z

The closest you can get is to nest the for loops, i.e.

   for f in "$FILES1"/* ; do
      for f2 in "$FILES2"/* ; do
         if [ "${f##*/}" = "${f2##*/}" ]; then
             diff "$f" "$f2"
             break
         fi
      done
   done
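
If both listings are sorted the same way and contain the same names, the two lists can also be walked in lockstep (first with first, second with second), which avoids comparing every pair. The following is a minimal sketch under those assumptions, with one file name per line and no newlines in names. The helper name pair_lines and the file descriptor numbers 3 and 4 are arbitrary choices for illustration.

```shell
# pair_lines LIST1 LIST2 (hypothetical helper): print the Nth line of LIST1
# next to the Nth line of LIST2, stopping when either list runs out.
pair_lines() {
  # Each read consumes one line from its own descriptor, so the two lists
  # advance together.
  while IFS= read -r f1 <&3 && IFS= read -r f2 <&4; do
    printf 'Processing %s %s\n' "$f1" "$f2"
  done 3< "$1" 4< "$2"
}

# Demo with two throwaway lists; real lists could come from
# (cd /path/to/one && find . -type f | sort) and the same for /path/to/two.
tmp=$(mktemp -d)
printf '%s\n' ./a.txt ./sub/b.txt > "$tmp/list1"
printf '%s\n' ./a.txt ./sub/b.txt > "$tmp/list2"
pair_lines "$tmp/list1" "$tmp/list2"
rm -rf "$tmp"
```

The pairing is only meaningful when both lists are in the same order, so a file that exists in just one tree shifts every later pair.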

To iterate through directories recursively, you could rely on find to return a list of files for processing:

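 PATH1=/path/to/one
 PATH2=/path/to/two

 # $(find ...) is split on whitespace, so this assumes no spaces in file names;
 # -type f keeps directories out of the comparison
 for f in $(find "$PATH1" -type f -print) ; do
    for f2 in $(find "$PATH2" -type f -print) ; do
       if [ "${f##*/}" = "${f2##*/}" ]; then
           diff "$f" "$f2"
           break
       fi
    done
 done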

Sorry, I don't have a way to test this right now, so the ${f##*/} may not be exactly right. The intention is to get just the basename of a file. (I guess you could use basename itself too, $(basename $f), at the expense of two extra processes for each comparison.)
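
For reference, a short sketch of what the expansion does. The path is a made-up example; the comments show the value each command prints.

```shell
f=/path/to/one/sub/report.txt   # hypothetical path

base=${f##*/}     # strip the longest prefix matching "*/"
dir=${f%/*}       # strip the shortest suffix matching "/*"

echo "$base"              # report.txt
echo "$dir"               # /path/to/one/sub
echo "$(basename "$f")"   # report.txt, but via an external process
```

The parameter expansion is done by the shell itself, which is why it is cheaper inside a loop than calling basename.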

Both of these are very expensive, because they check lots of non-matching filename pairs. The break inside the inner loop helps a little.

An alternate solution is to rely on one path to create the driver list and then use the supplied names to generate alternate relative paths, i.e.

for f in "$FILES1"/* ; do
   f2=../two/${f##*/}
   if [ -f "${f2}" ] ; then
       diff "$f" "$f2"
   else
       printf 'no alternate file found in ../two for f=%s\n' "$f"
   fi
done

If need be, you can duplicate that loop and reverse the location of f2 and f to check the 'other-side' of your system.
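
One way to avoid duplicating the loop is to wrap the existence check in a function and call it with the arguments swapped. This is a sketch only: the helper name report_missing is made up, it looks at the top level of each directory like the loop above, and it only reports missing counterparts rather than running diff a second time.

```shell
# report_missing SRC DST (hypothetical helper): list files in SRC that have
# no same-named regular file in DST.
report_missing() {
  for f in "$1"/*; do
    [ -f "$f" ] || continue          # skip directories and an unmatched glob
    [ -f "$2/${f##*/}" ] ||
      printf 'no alternate file found in %s for f=%s\n' "$2" "$f"
  done
}

# Check both sides by swapping the arguments.
report_missing /path/to/one /path/to/two
report_missing /path/to/two /path/to/one
```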

I hope this helps.
