7

I'm trying to search through a file using awk, by looping over elements of a bash array. This is what I'm currently doing:

myarray[1] = 441
myarray[2] = 123

for i in "${myarray[@]}"
do
awk '{if ($4 == '"$i"') print $0}' myfile.txt > newfile.txt
done

Is it possible to access elements of a bash array in awk in this way?

5
  • 1
    You aren't accessing an array element there. You are accessing a normal shell variable. And yes, that works just fine assuming you quote the string in the awk context (i.e. you need "'"$i"'" instead of just '"$i"'). Commented Aug 22, 2014 at 13:59
  • Doesn't seem to work. I've replaced "'"$i"'" with "441", which I know is in the file, and this works. So I think the problem is still with defining the element in the loop. Commented Aug 22, 2014 at 14:06
  • Once the quoting is fixed, note that awk is still not "accessing shell variables". Rather, you are using the value of a shell variable in a string that you pass to awk. The value of $i is expanded by the shell before the awk command is executed. Commented Aug 22, 2014 at 14:07
  • You have an extra single quote before the close-quote in your expansion "${myarray[@]}'", which could be what's messing you up. Commented Aug 22, 2014 at 14:09
  • I've edited out the extra single quote, that wasn't what was messing me up. Commented Aug 22, 2014 at 14:16
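
To illustrate the point made in the comments, that the shell expands $i before awk ever runs, it can help to print the program text awk would receive. This is only a minimal sketch: the value 441 comes from the question, and the variable name prog is made up for illustration.

```shell
#!/usr/bin/env bash
i=441

# Same quoting as in the question: the single-quoted pieces are literal,
# and "$i" in the middle is expanded by the shell.
prog='{if ($4 == '"$i"') print $0}'

# Print the text that awk would be given as its program.
printf '%s\n' "$prog"
# prints: {if ($4 == 441) print $0}
```

awk only ever sees the literal 441 here. A non-numeric value pasted in the same way would be parsed as an awk variable name unless it were wrapped in double quotes inside the awk program.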

4 Answers

5

This is not the right way to pass a shell variable (or a BASH array element) to awk. Pass it with the -v option instead:

myarray=(441 123)

for i in "${myarray[@]}"; do
   awk -v i="$i" '$4 == i' myfile.txt > newfile.txt
done
  • -v i="$i" makes shell variable $i available inside awk as an awk variable i
  • $4 == i is equivalent to {if ($4 == i) print $0}, since print $0 is the default action
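
As a rough end-to-end sketch of the -v approach, the following builds a small sample file and runs the loop against it. The file contents and the workdir temporary directory are assumptions made for illustration, and the redirection is placed on the loop as a whole so that every iteration's matches end up in the output file.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; only the fourth column matters here.
workdir=$(mktemp -d)
cat > "$workdir/myfile.txt" <<'EOF'
a b c 441
d e f 999
g h i 123
EOF

myarray=(441 123)

for i in "${myarray[@]}"; do
    # -v copies the shell value into the awk variable i
    awk -v i="$i" '$4 == i' "$workdir/myfile.txt"
done > "$workdir/newfile.txt"   # redirect the whole loop, so the file is opened once

cat "$workdir/newfile.txt"
# prints:
# a b c 441
# g h i 123
```

The output follows the order of the array, not the order of the file, because the file is scanned once per element.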

9 Comments

Just a small note: use >> newfile.txt instead of > newfile.txt, otherwise you will only see the output of the last awk command.
This doesn't work either. I can replace the i in '$4 == i' with 441 and get the output, but it doesn't recognize the i.
You could also just move the > newfile.txt from the awk to the done so the file is only written once instead of written and then appended each time.
@tclarke something to consider - rather than writing a shell loop, just pass the whole array contents to awk at once and let it do everything else.
@anubhava, I think I may have contrived a way to deal with the embedded space problem.
5

There's no need for a bash loop; you can do the whole thing in awk:

my_array=(441 123)
awk -varr="${my_array[*]}" 'BEGIN{split(arr,a); for(i in a)b[a[i]]} $4 in b' file

The contents of the shell array are passed to awk as a single string, with a space in between each element. split is used to create an awk array from the string. Array a looks like this:

a[1]=441; a[2]=123

The for loop creates an array b with two keys, b[441] and b[123].

Lines are printed when the 4th column matches one of the array keys.

Bear in mind that this approach fails when the elements in the array contain spaces.
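
For a concrete run of the split-based approach, here is a small sketch against made-up data. The sample file, the workdir temporary directory, and the result variable are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; only the fourth column matters here.
workdir=$(mktemp -d)
cat > "$workdir/file" <<'EOF'
a b c 123
d e f 999
g h i 441
EOF

my_array=(441 123)

# "${my_array[*]}" joins the elements with the first character of IFS
# (a space by default), giving the single string "441 123".
result=$(awk -v arr="${my_array[*]}" '
    BEGIN {
        split(arr, a)           # a[1]="441", a[2]="123"
        for (i in a) b[a[i]]    # merely referencing b[...] creates the key
    }
    $4 in b                     # print lines whose 4th field is a key of b
' "$workdir/file")

printf '%s\n' "$result"
# prints:
# a b c 123
# g h i 441
```

Because the file is read only once, matches come out in file order rather than in array order.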

Comments

2

You can avoid looping through the bash array elements externally. In the following, the array elements are passed to awk in one shot and accessed within awk using ARGV. Also, there's no reason why awk cannot write to the output file directly:

awk -v len="${#myarray[@]}" '
BEGIN{t=ARGC; ARGC-=len; for(i=2; i<t; ++i) b[ARGV[i]]++ };
$4 in b { print > "newfile.txt"}' myfile.txt  "${myarray[@]}"
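
The ARGC/ARGV bookkeeping may be easier to follow in a spelled-out form. The sketch below uses made-up sample data and hypothetical names (workdir, total, wanted, result), and prints to standard output instead of writing newfile.txt from inside awk.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; only the fourth column matters here.
workdir=$(mktemp -d)
cat > "$workdir/myfile.txt" <<'EOF'
a b c 441
d e f 999
g h i 123
EOF

myarray=(441 123)

# ARGV[0] is the awk command name, ARGV[1] the input file,
# and ARGV[2] onwards are the array elements.
result=$(awk -v len="${#myarray[@]}" '
    BEGIN {
        total = ARGC
        ARGC -= len                       # stop awk from opening the elements as files
        for (i = ARGC; i < total; ++i)    # the hidden arguments are the elements
            wanted[ARGV[i]]
    }
    $4 in wanted
' "$workdir/myfile.txt" "${myarray[@]}")

printf '%s\n' "$result"
# prints:
# a b c 441
# g h i 123
```

Since each element arrives as its own argument, nothing is split on whitespace on the way in.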

Comments

0

You can also construct an awk regex:

myarray=(441 123)
regex=$(IFS=\|;echo "^(${myarray[*]})\$")
awk -v regex="$regex" '$4 ~ regex' myfile.txt > newfile.txt

However, do be careful if there are regex metacharacters (e.g. '*', '\', '?') in any element of the array.
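
To make the metacharacter caveat concrete, here is a small sketch with a hypothetical element 4.1, where the dot is read as "any character". The input lines and the variable names regex_hits, exact_hits, and want are made up for illustration.

```shell
#!/usr/bin/env bash
myarray=(4.1)   # hypothetical element containing the regex metacharacter "."

regex=$(IFS=\|; echo "^(${myarray[*]})\$")   # builds ^(4.1)$

# As a regex, "." matches any character, so 441 matches as well.
regex_hits=$(printf 'a b c 441\na b c 4.1\n' | awk -v regex="$regex" '$4 ~ regex')

# Appending "" forces a string comparison, which matches only the literal value.
exact_hits=$(printf 'a b c 441\na b c 4.1\n' | awk -v want="${myarray[0]}" '$4 == (want "")')

printf 'regex: %s\nexact: %s\n' "$regex_hits" "$exact_hits"
```

If the elements are meant as literal values rather than patterns, a string comparison or an array lookup sidesteps the escaping question entirely.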

Comments
