How Can I Loop Edit Multiple Files in a Bash Script?

Question

I have 40 csv files that I need to edit. 20 have matching format and the names only differ by one character, e.g., docA.csv, docB.csv, etc. The other 20 also match and are named pair_docA.csv, pair_docB.csv, etc.

I have the code written to edit and combine docA.csv and pair_docA.csv, but I'm struggling to write a loop that calls both of the above files, edits them, combines them under the name combinedA.csv, and then goes on to the next pair.

Can anyone help with my rudimentary bash scripting? Here's what I have thus far. I've tried a single for loop, and now I'm trying 2 (probably 3) for loops. I'd prefer to keep it in a single loop.

set -x
DIR=/path/to/file/location

for file in `ls $DIR/doc?.csv`
do

#code to edit the doc*.csv files ie $file

done

for pairdoc in `ls $DIR/pair_doc?.csv`
do

#code to edit the pair_doc*.csv files ie $pairdoc

done

#still need to combine the files. I have the join written for a single iteration, 
#but how do I loop the code to save each join as a different file corresponding
#to combined*.csv

Do not parse ls.

– KamilCuk, Commented Feb 28, 2021 at 12:15

M. Nejat Aydin · Accepted Answer · 2021-02-28 13:02:26Z

3

Something along these lines:

#!/bin/bash

dir=/path/to/file/location
 
cd "$dir" || exit
for file in doc?.csv; do
    pair=pair_$file
    # "${file#doc}" deletes the prefix "doc"
    combined=combined_${file#doc}
    cat "$file" "$pair" >> "$combined" 
done

On principle, ls shouldn't be used in a shell script to iterate over files. It is intended for interactive use and is almost never needed within a script. Also, all-uppercase names shouldn't be used for ordinary variables, since they may collide with internal shell variables or environment variables.
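To make the point about ls concrete, here is a small sketch that counts loop iterations both ways. The throwaway directory and the file name containing a space are made up for the demonstration; they are not from the question.

```bash
#!/bin/bash
# Demo in a throwaway directory; these file names are invented for illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir/docA.csv" "$demo_dir/doc B.csv"

# Parsing ls: the unquoted command substitution is split on whitespace,
# so "doc B.csv" falls apart into two separate words.
ls_count=0
for file in $(ls "$demo_dir"); do
    ls_count=$((ls_count + 1))
done

# Globbing: each matching pathname arrives as exactly one word.
shopt -s nullglob   # an unmatched pattern expands to nothing instead of itself
glob_count=0
for file in "$demo_dir"/doc*.csv; do
    glob_count=$((glob_count + 1))
done

echo "ls loop saw $ls_count words; glob loop saw $glob_count files"
rm -r "$demo_dir"
```

Run as written, the ls loop should count 3 words and the glob loop 2 files, because "doc B.csv" is split at the space. The nullglob option is an extra safeguard that the loops above do not use; without it, a pattern that matches nothing is passed to the loop body literally.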

Below is a version without changing the directory.

#!/bin/bash

dir=/path/to/file/location

for file in "$dir/"doc?.csv; do
    basename=${file#"$dir/"}
    pair=$dir/pair_$basename
    combined=$dir/combined_${basename#doc}
    cat "$file" "$pair" >> "$combined"
done
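Since the question asks for the editing and the combining to happen in one loop, here is a rough sketch of how the editing step could slot in. The edit_csv function is a hypothetical placeholder (the question does not show the real editing code), the sample data is invented, and the temporary directory stands in for /path/to/file/location.

```bash
#!/bin/bash
# Hypothetical placeholder for the real editing code, which the question
# does not show. As a stand-in it drops the first (header) line.
edit_csv() {
    tail -n +2 "$1"
}

dir=$(mktemp -d)    # stand-in for /path/to/file/location
printf 'id,x\n1,a\n' > "$dir/docA.csv"        # invented sample data
printf 'id,y\n1,b\n' > "$dir/pair_docA.csv"

for file in "$dir"/doc?.csv; do
    name=${file##*/}                      # e.g. docA.csv
    pair=$dir/pair_$name
    combined=$dir/combined_${name#doc}    # e.g. combined_A.csv
    [ -e "$pair" ] || continue            # skip when the partner file is missing
    # Edit both files and write the result to one output file per pair.
    { edit_csv "$file"; edit_csv "$pair"; } > "$combined"
done

cat "$dir/combined_A.csv"
```

This sketch writes with > so that rerunning it replaces each combined file, whereas the loops above append with >>. If the combining step is a join rather than a plain concatenation, the brace group is where that command would go.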

edited Feb 28, 2021 at 13:02

answered Feb 28, 2021 at 0:11

M. Nejat Aydin


3 Comments

spencergadd Over a year ago

Bless you! Thank you. That makes sense. Is it a preferential thing to change directories rather than add the directory $DIR in the for loop setup, or is it seen as bad practice to call the $DIR like I did? Also, what if I need to append each output to combined_A.csv, combined_B.csv, etc.?

spencergadd Over a year ago

Actually, I'd rather append each output into a single combined.csv file

M. Nejat Aydin Over a year ago

@swgRrr Please see the updated answer. Changing the directory is not strictly necessary but it makes things easy for this particular task. Otherwise, some string manipulation via shell parameter expansions would be required in order to split the pathname into a path prefix (the directory name) and a basename (the non-directory portion of the pathname).
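The splitting described in this comment can be sketched with the standard prefix and suffix removal expansions. The variable names below are chosen for illustration only.

```bash
#!/bin/bash
path=/path/to/file/location/docA.csv

dir_part=${path%/*}       # strip the shortest suffix matching "/*"  -> directory
base_part=${path##*/}     # strip the longest prefix matching "*/"   -> basename
stem=${base_part%.csv}    # strip the ".csv" extension               -> docA
letter=${stem#doc}        # strip the "doc" prefix                   -> A

# Rebuild the partner and output names from the pieces.
pair=$dir_part/pair_$base_part
combined=$dir_part/combined_$letter.csv

printf '%s\n' "$dir_part" "$base_part" "$letter" "$pair" "$combined"
```

A single # or % removes the shortest match and a doubled ## or %% removes the longest, which is why ##*/ is used to get everything after the last slash.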

potong · Accepted Answer · 2021-02-28 01:06:30Z

0

This might work for you (GNU parallel):

parallel cat {1} {2} \> join_{1}_{2} ::: doc{A..T}.csv :::+ pair_doc{A..T}.csv

Change the cat commands to your chosen commands, where {1} represents the docX.csv files and {2} represents the pair_docX.csv files.

N.B. X represents the letters A through T.
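For readers unfamiliar with the colons: ::: introduces a list of arguments, and :::+ introduces a second list that is linked to the previous one position by position, so docA.csv is paired only with pair_docA.csv. A plain-bash sketch of the same pairing, which prints the commands rather than running them, might look like this. The array names are illustrative.

```bash
#!/bin/bash
# Brace expansion builds the same two lists that follow ::: and :::+ above.
docs=(doc{A..T}.csv)
pairs=(pair_doc{A..T}.csv)

# :::+ links the lists position by position, so walk both arrays with one index.
for i in "${!docs[@]}"; do
    printf 'cat %s %s > join_%s_%s\n' \
        "${docs[i]}" "${pairs[i]}" "${docs[i]}" "${pairs[i]}"
done
```

With a second plain ::: in place of :::+, parallel would run every combination of the two lists instead of the 20 matched pairs. Unlike this loop, parallel can also run the jobs concurrently.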

edited Feb 28, 2021 at 1:06

answered Feb 28, 2021 at 0:59

potong


2 Comments

spencergadd Over a year ago

Whoo, this one may be a bit above my current paygrade. I understand the flow and such, but could you explain what parallel does and what the series of colons do? If not, I'll look it up! Thanks for your input!

potong Over a year ago

@swgRrr GNU parallel is a tool worth investing time in. It provides the possibilities of loops in a one-liner format, and much more besides.
