
I have thousands of .tsv files from which I am extracting the rows where column 2 is equal to column 6.

The bash script below does the filtering, but it drops the column names (the header line) from the output.

How can I keep the header?

for x in *.tsv; do
   awk '$2==$6' <"$x" >"$x.tmp"
   mv "$x.tmp" "$x"
done

1 Answer


If you want to print based on two conditions, say so:

awk 'FNR==1 || $2==$6' file

This prints every line that meets either of these conditions:

  • it satisfies $2==$6.
  • it is the first line of the file (FNR==1), that is, the header.
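To see the two conditions at work, here is a minimal sketch on a made-up sample file. The file name, the column names, and the data are all hypothetical, and the -F'\t' option is an addition not in the command above: it makes awk split on tabs only, which is probably what you want for .tsv files whose fields may contain spaces.

```shell
# Build a small tab-separated sample (hypothetical data) in a temp directory.
tmpdir=$(mktemp -d)
printf 'id\ta\tx\ty\tz\tb\n'     >  "$tmpdir/sample.tsv"
printf '1\tfoo\t0\t0\t0\tfoo\n' >> "$tmpdir/sample.tsv"
printf '2\tfoo\t0\t0\t0\tbar\n' >> "$tmpdir/sample.tsv"
printf '3\tbaz\t0\t0\t0\tbaz\n' >> "$tmpdir/sample.tsv"

# Keep the header (FNR==1) plus every row whose 2nd and 6th fields match.
# -F'\t' (an assumption about the data) splits on tabs instead of any whitespace.
awk -F'\t' 'FNR==1 || $2==$6' "$tmpdir/sample.tsv" > "$tmpdir/filtered.tsv"

# Show the result: the header line, then rows 1 and 3.
cat "$tmpdir/filtered.tsv"
```

The header survives even though its own second and sixth fields differ, because FNR==1 alone is enough to make the pattern true.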

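Also, note that you don't need to loop in bash; awk can process all the files itself and write each file's result to a matching .bk file. FNR restarts at 1 for every input file, so each header is kept. The parentheses around the output name matter: without them, some awk implementations parse the redirection differently.

awk '(FNR==1 || $2==$6) {print > (FILENAME ".bk")}' *.tsv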
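With thousands of files, one thing to watch is that awk keeps every output file open until it exits, and some implementations may hit the open-file limit. A hedged sketch of one way around that is below: it closes the previous output each time a new input file starts. The sample files, the variable name out, and the -F'\t' option are assumptions made for illustration, not part of the original command.

```shell
# Two small hypothetical .tsv files in a temp directory.
workdir=$(mktemp -d)
printf 'h1\th2\th3\th4\th5\th6\n1\tx\t0\t0\t0\tx\n2\tx\t0\t0\t0\ty\n' > "$workdir/a.tsv"
printf 'h1\th2\th3\th4\th5\th6\n3\tp\t0\t0\t0\tq\n4\tr\t0\t0\t0\tr\n' > "$workdir/b.tsv"

awk -F'\t' '
    FNR == 1 {
        if (out != "") close(out)   # release the previous output file
        out = FILENAME ".bk"        # one .bk file per input file
    }
    FNR == 1 || $2 == $6 { print > out }   # header or matching row
' "$workdir"/*.tsv

# a.tsv.bk now holds the header and row 1; b.tsv.bk holds the header and row 4.
cat "$workdir/a.tsv.bk" "$workdir/b.tsv.bk"
```

If you then want the filtered files to replace the originals, as in the question, you could rename each one back, for example with for f in *.tsv.bk; do mv "$f" "${f%.bk}"; done.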

5 Comments

Beautiful! Exactly what I needed. Thanks.
Also, how do you invert this equality condition? I mean, to extract the rows where the two columns are not equal.
@MAPK you can say $2 != $6.
Thanks! I am new to awk/bash. It looks similar to Perl, Python, or R for most regexes and conditions.
@MAPK it is sooooo nice. Interesting reading: Idiomatic awk
