
I have a situation in awk where I need to convert an input format into another format and later use the number of records processed separately. Is there any way I can use a shell variable to get the value of NR in the END section? Something like:

cat file1 | awk 'some processing END{SHELL_VARIABLE=NR}' > file2

Then later use SHELL_VARIABLE outside awk.

I do not want to process the file and then do a wc -l separately as the files are huge.

2 Answers

One way: do the redirection to file2 inside the awk script, print NR in the END block, and use command substitution to capture that value in a shell variable:

my_var=$(awk '{ some processing; print "your output" > "file2" } END { print NR }' file1)
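
As a rough, self-contained sketch of the same idea: the sample data, the temporary directory, and the field-swapping "processing" below are all made up for illustration, and `mktemp -d` is assumed to be available on your system. The transformed records go to a file through awk's own redirection, so the only thing left on stdout is the record count.

```shell
# Hypothetical sample input; the names and contents are placeholders.
workdir=$(mktemp -d)
printf 'a 1\nb 2\nc 3\n' > "$workdir/file1"

# awk writes the converted records to file2 itself (via the "out" variable),
# so stdout carries only NR, which the command substitution captures.
my_var=$(awk -v out="$workdir/file2" '{ print $2, $1 > out } END { print NR }' "$workdir/file1")

swapped=$(cat "$workdir/file2")
echo "records: $my_var"
echo "$swapped"

rm -r "$workdir"
```

The file is read only once, which is the point when the inputs are huge.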

No subprocess can affect the parent's variables, so awk cannot set a shell variable directly. What you can do is have awk write its output to the file directly, then have it print the value you want to stdout and capture that. Or, if you prefer, you could reverse that and have awk print the value to a file and read it back afterwards.

Incidentally, you have a UUOC (useless use of cat): awk can read file1 itself.

rows=$(awk '{ ...; print > "file2"} END {print NR}' file1)

Or

awk '... END{print NR > "rows"}' file1 >file2
rows=$(<rows)
rm rows
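
A minimal sketch of this second variant, under some assumptions of mine: the input data and the uppercase "processing" are invented, the side file lives in a directory created with `mktemp -d`, and `read -r` is used to read the count back because `$(<file)` is not available in every shell.

```shell
# Hypothetical sample input; names and contents are placeholders.
workdir=$(mktemp -d)
printf 'x\ny\nz\nw\n' > "$workdir/file1"
count_file="$workdir/rows"

# Normal output goes to stdout (redirected to file2 by the shell);
# the record count goes to the side file named by the "cnt" variable.
awk -v cnt="$count_file" '{ print toupper($0) } END { print NR > cnt }' "$workdir/file1" > "$workdir/file2"

# Read the single line back without spawning another process.
read -r rows < "$count_file"
echo "rows: $rows"

rm -r "$workdir"
```

Keeping the side file in its own temporary directory avoids name clashes with an existing file called rows.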

1 Comment

This is exactly what I am trying to do now. But the problem is that awk is not able to write the output to a /tmp/ file, although it works in my home directory. It gives me this error: awk: i/o error occurred on numlines
