passing bash array elements to awk regex inside loop

Question

I'm trying to search through a file using awk, by looping over the elements of a bash array. This is what I'm currently doing:

myarray[1] = 441
myarray[2] = 123

for i in "${myarray[@]}"
do
awk '{if ($4 == '"$i"') print $0}' myfile.txt > newfile.txt
done

Is it possible to access elements of a bash array in awk in this way?

You aren't accessing an array element there. You are accessing a normal shell variable. And yes, that works just fine assuming you quote the string in the awk context (i.e. you need "'"$i"'" instead of just '"$i"').
– Etan Reisner, Commented Aug 22, 2014 at 13:59
Doesn't seem to work. I've replaced "'"$i"'" with "441", which I know is in the file, and this works. So I think the problem is still with defining the element in the loop.
– tclarke, Commented Aug 22, 2014 at 14:06
Once the quoting is fixed, note that awk is still not "accessing shell variables". Rather, you are using the value of a shell variable in a string that you pass to awk. The value of $i is expanded by the shell before the awk command is executed.
– Mark Reed, Commented Aug 22, 2014 at 14:07
You have an extra single quote before the close-quote in your expansion "${myarray[@]}'", which could be what's messing you up.
– Mark Reed, Commented Aug 22, 2014 at 14:09
I've edited out the extra single quote; that wasn't what was messing me up.
– tclarke, Commented Aug 22, 2014 at 14:16
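
To make the point about shell expansion concrete, here is a minimal sketch that prints the program text awk would actually receive. The variable names prog, bare and quoted are only for illustration and do not come from the thread.

```shell
#!/usr/bin/env bash
# The shell glues three pieces together before awk ever runs:
#   '{if ($4 == '   +   "$i"   +   ') print $0}'
i=441
prog='{if ($4 == '"$i"') print $0}'
printf '%s\n' "$prog"      # prints: {if ($4 == 441) print $0}

# With a non-numeric value the difference between the two quoting styles shows up.
i=abc
bare='$4 == '"$i"          # awk sees abc as an (unset) awk variable
quoted='$4 == "'"$i"'"'    # awk sees "abc" as a string literal
printf '%s\n' "$bare"      # prints: $4 == abc
printf '%s\n' "$quoted"    # prints: $4 == "abc"
```

For a purely numeric value such as 441 both forms give awk a valid comparison, so quoting alone may not explain why a numeric match fails.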

anubhava · Accepted Answer · 2014-08-22 14:05:11Z

5

This is not the right way to pass a shell variable (or a BASH array element) to awk. Use the -v option instead:

myarray=(441 123)

for i in "${myarray[@]}"; do
   awk -v i="$i" '$4 == i' myfile.txt > newfile.txt
done

-v i="$i" makes the shell variable $i available inside awk as the awk variable i.
$4 == i is equivalent to {if ($4 == i) print $0}, since print $0 is the default action.
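
The equivalence of the two forms can be checked with a small sketch. The sample file contents and the names tmpdir, short and long are made up for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: the value to match sits in the fourth column.
tmpdir=$(mktemp -d)
printf '%s\n' 'a b c 441' 'd e f 999' > "$tmpdir/myfile.txt"

i=441
# Pattern with no action: awk prints the matching line by default.
short=$(awk -v i="$i" '$4 == i' "$tmpdir/myfile.txt")
# The same test written out as an explicit if/print.
long=$(awk -v i="$i" '{ if ($4 == i) print $0 }' "$tmpdir/myfile.txt")

printf '%s\n' "$short" "$long"
```

Both commands should select the same single line, the one whose fourth field is 441.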

answered Aug 22, 2014 at 14:05

anubhava


9 Comments

anubhava Over a year ago

Just a small note: use >> newfile.txt instead of > newfile.txt, otherwise you will see the output of the last awk command only.

tclarke Over a year ago

This doesn't work either. I can replace the i in '$4 == i' with 441 and get the output, but it doesn't recognize the i.

Mark Reed Over a year ago

You could also just move the > newfile.txt from the awk to the done so the file is only written once instead of written and then appended each time.

Ed Morton Over a year ago

@tclarke something to consider - rather than writing a shell loop, just pass the whole array contents to awk at once and let it do everything else.

iruvar Over a year ago

@anubhava, I think I may have contrived a way to deal with the embedded space problem.
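
The redirection point raised in these comments can be seen in a short sketch. The sample data and the file name last_only.txt are assumptions made for the demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical sample data.
tmpdir=$(mktemp -d)
printf '%s\n' 'a b c 441' 'd e f 999' 'g h i 123' > "$tmpdir/myfile.txt"
myarray=(441 123)

# '>' inside the loop truncates the file on every iteration,
# so only the output of the last awk call survives.
for i in "${myarray[@]}"; do
    awk -v i="$i" '$4 == i' "$tmpdir/myfile.txt" > "$tmpdir/last_only.txt"
done

# Redirecting the whole loop opens the file once and keeps every match.
for i in "${myarray[@]}"; do
    awk -v i="$i" '$4 == i' "$tmpdir/myfile.txt"
done > "$tmpdir/newfile.txt"

cat "$tmpdir/last_only.txt"   # prints: g h i 123
cat "$tmpdir/newfile.txt"     # prints: a b c 441, then g h i 123
```

Using >> inside the loop gives the same result as the second loop on a first run, but it keeps appending to whatever is already in the file on later runs.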


Tom Fenech · Accepted Answer · 2018-02-02 09:21:50Z

5

There's no need for a bash loop; you can do the whole thing in awk:

my_array=(441 123)
awk -varr="${my_array[*]}" 'BEGIN{split(arr,a); for(i in a)b[a[i]]} $4 in b' file

The contents of the shell array are passed to awk as a single string, with a space in between each element. split is used to create an awk array from the string. Array a looks like this:

a[1]=441; a[2]=123

The for loop creates an array b with two keys, b[441] and b[123].

Lines are printed when the 4th column matches one of the array keys.

Bear in mind that this approach fails when the elements in the array contain spaces.
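
One possible way around the space limitation, sketched here as an assumption rather than as part of the answer above, is to join the elements with a character that is known not to occur in them (a pipe in this example) and split on that character. The tab-separated sample file is also made up.

```shell
#!/usr/bin/env bash
# Hypothetical tab-separated data so that a field can contain a space.
tmpdir=$(mktemp -d)
printf '%s\t%s\t%s\t%s\n' a b c 'foo bar' d e f 999 g h i 441 > "$tmpdir/file"

my_array=('foo bar' 441)

# Join with '|' instead of a space; assumes no element contains '|'.
joined=$(IFS='|'; printf '%s' "${my_array[*]}")

awk -F'\t' -v arr="$joined" '
BEGIN {
    n = split(arr, a, "|")                 # a[1]="foo bar", a[2]="441"
    for (k = 1; k <= n; k++) b[a[k]]       # turn the values into keys of b
}
$4 in b' "$tmpdir/file" > "$tmpdir/newfile.txt"

cat "$tmpdir/newfile.txt"
```

Values passed with -v still go through awk's escape-sequence processing, so elements containing backslashes would need further care.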

edited Feb 2, 2018 at 9:21

answered Aug 22, 2014 at 14:37

Tom Fenech


iruvar · Accepted Answer · 2014-08-22 17:48:46Z

2

You can avoid looping through the bash array elements externally. In the following, the array elements are passed to awk in one shot and accessed within awk using ARGV. Also, there's no reason why awk cannot write to the output file directly:

awk -v len="${#myarray[@]}" '
BEGIN{t=ARGC; ARGC-=len; for(i=2; i<t; ++i) b[ARGV[i]]++ };
$4 in b { print > "newfile.txt"}' myfile.txt  "${myarray[@]}"
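
Here is an annotated run of the same idea on made-up data, including an element with an embedded space. The tab-separated file, the out variable and the names total and wanted are assumptions added for the demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical tab-separated data so that a field can contain a space.
tmpdir=$(mktemp -d)
printf '%s\t%s\t%s\t%s\n' a b c 'foo bar' d e f 999 g h i 441 > "$tmpdir/myfile.txt"

myarray=('foo bar' 441)

awk -F'\t' -v len="${#myarray[@]}" -v out="$tmpdir/newfile.txt" '
BEGIN {
    total = ARGC       # ARGV[0] is awk, ARGV[1] the input file, ARGV[2..] the values
    ARGC -= len        # hide the values so awk does not try to open them as files
    for (k = 2; k < total; k++) wanted[ARGV[k]]
}
$4 in wanted { print > out }' "$tmpdir/myfile.txt" "${myarray[@]}"

cat "$tmpdir/newfile.txt"
```

Because each element arrives as its own argument, no joining or splitting is involved, which is why spaces inside an element are not a problem here.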

edited Aug 22, 2014 at 17:48

answered Aug 22, 2014 at 15:32

iruvar


lihao · Accepted Answer · 2014-08-22 14:45:59Z

0

You can also construct an awk regex:

myarray=(441 123)
regex=$(IFS=\|;echo "^(${myarray[*]})\$")
awk -v regex="$regex" '$4 ~ regex' myfile.txt > newfile.txt

However, do be careful if there are regex metacharacters (e.g. '*', '\', '?') in any element of the array.
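
One way to guard against metacharacters, offered as a sketch and not as part of the answer above, is to backslash-escape each element before joining. The sed character class and the use of ENVIRON are choices made for this example; ENVIRON is used because values passed with -v go through escape-sequence processing, which may alter the added backslashes.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: "1.5" should match, "1x5" should not.
tmpdir=$(mktemp -d)
printf '%s\n' 'a b c 1.5' 'd e f 1x5' 'g h i 441' > "$tmpdir/myfile.txt"

myarray=(1.5 441)

# Put a backslash in front of every ERE metacharacter in each element.
escaped=()
for e in "${myarray[@]}"; do
    escaped+=("$(printf '%s' "$e" | sed 's/[][\.|$(){}?+*^]/\\&/g')")
done

# Join the escaped elements with '|' and anchor the alternation.
regex=$(IFS='|'; printf '%s' "^(${escaped[*]})\$")
printf '%s\n' "$regex"     # prints: ^(1\.5|441)$

# Hand the regex to awk through the environment.
regex="$regex" awk '$4 ~ ENVIRON["regex"]' "$tmpdir/myfile.txt" > "$tmpdir/newfile.txt"

cat "$tmpdir/newfile.txt"
```

Without the escaping step the pattern would be ^(1.5|441)$, where the dot matches any character, so the 1x5 line would be selected as well.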

answered Aug 22, 2014 at 14:45

lihao
