Using shell commands within an awk script which must access the awk commands

Question

This is essentially the command I want. All of it works, except that I want to print something special in my third column that would use shell commands (or just more awk commands, I guess, but I don't know how I would fit that into the original awk statement). All I need help with is the pseudo command substitution between $2 and ar[$4,$1] in the print statement; I left the rest in for the sake of specificity.

awk 'NR==FNR{ar[$3,$2]=$1+ar[$3,$2]; }
     NR>FNR && ar[$4,$1] {print "hs"$1,$2,`awk '$1 == #$1 from outer awk command# file2 | tail -n 1 | awk '{print $3}'`, ar[$4,$1]}' file1 file2

file1 will look like

5   8       t11 
15  7       t12 
3   7       t14

file2 will look like

8 4520 5560 t11 
8 5560 6610 t12 
8 6610 7400 t13 
7 9350 10610 t11 
7 10610 11770 t12 
7 11770 14627 t13
7 14627 16789 t14

And output should look like

8 4520 7400 5
7 10610 16789 15
7 14627 16789 3

Thank you!

You're thinking about this wrong. Awk is a tool to manipulate text. It is not an environment from which to call tools (including other awk instances); that's what a shell is for. Edit your question to describe what you want to do to your input to create your output (as opposed to how you think you need to do it) so we can help you.
– Ed Morton, Commented Jun 9, 2016 at 22:59
I don't see how ar[$4,$1] can return the values in the last column of your sample output. For that matter, I don't see anything in the input that could generate the values 5 and 15. So, as EdM says, let's see some rules about how to process your data. You can almost certainly achieve your required results either just in awk, or by refactoring your shell code and how it calls awk. Please update your question. Good luck.
– shellter, Commented Jun 10, 2016 at 1:59
@shellter I'm sorry, I edited my question so that one of the input files has t12 in its third column instead of t11; the output should hopefully make more sense now. The part with ar[$4,$1] does work (I used an adjusted part of it successfully), but I can't figure this one part out.
– Sam, Commented Jun 13, 2016 at 17:19

agc · Accepted Answer · 2016-06-14 18:33:30Z

Non-awk, inefficient shell tools code:

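# In the sed script, each t and each closing } sits on its own line so that
# seds which read a branch label up to the end of the line also parse it.
while read -r a b c ; do
    echo -n "$b "
    egrep "^$b " file2 |
      grep -A 9999999 " $c" |
      cut -d' ' -f2,3 |
      sed '1{
             s/ .*//
             t
           }
           ${
             s/.* //
             t
           }
           d' |
      xargs echo -n
    echo " $a"
done < file1 |
  column -t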

Output:

8  4520  7400   5
7  10610 16789  15

The main loop reads file1, which controls what needs to be printed from file2. file1 has 3 fields, so read needs 3 variables: $a, $b, and $c. The output uses $b and $a, so those two variables come "for free" -- the first and last lines of the main loop (both echos) prefix $b and suffix $a to the two numbers in the middle of each line.
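As a small illustration of how read splits each line of file1, here is a sketch that recreates the question's sample file1 in a scratch directory (the temporary path and the `|` separators are only there for the demonstration):

```shell
# Recreate the question's file1 (sample data only, trailing blanks included).
tmp=$(mktemp -d)
printf '%s\n' '5   8       t11 ' '15  7       t12 ' '3   7       t14' > "$tmp/file1"

# read splits on whitespace: a = count, b = key, c = tag.
# Leading and trailing blanks are dropped, so c never carries a stray space.
out=$(while read -r a b c; do
    printf '%s|%s|%s\n' "$a" "$b" "$c"
done < "$tmp/file1")

printf '%s\n' "$out"
rm -rf "$tmp"
```

Each output line shows the three variables for one row, for example `5|8|t11` for the first row.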

The egrep prints every line in file2 that begins with $b, but of those lines we only want the one that ends in $c plus the lines after that, which is what grep -A ... prints. Only the middle two columns are needed, so cut prints just those columns. Now we have a two column block of numbers, and we only want the upper left corner, or the lower right corner, which the sed code prints...
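The filtering stage can be tried on its own. The sketch below recreates the question's sample file2 in a scratch directory and fixes b=7 and c=t12, the values read would supply for the second row of file1. It spells egrep as grep -E, which is the same thing; everything else is as in the loop.

```shell
# Recreate the question's file2 (sample data only).
tmp=$(mktemp -d)
cat > "$tmp/file2" <<'EOF'
8 4520 5560 t11 
8 5560 6610 t12 
8 6610 7400 t13 
7 9350 10610 t11 
7 10610 11770 t12 
7 11770 14627 t13
7 14627 16789 t14
EOF

b=7
c=t12
block=$(grep -E "^$b " "$tmp/file2" |
        grep -A 9999999 " $c" |
        cut -d' ' -f2,3)
# Stage 1 keeps rows whose first field is 7, stage 2 keeps the t12 row and
# everything after it, stage 3 keeps only the two middle columns.

printf '%s\n' "$block"
rm -rf "$tmp"
```

For these values the block has three rows, starting at `10610 11770` and ending at `14627 16789`.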

Any sed code automatically counts lines as it runs. When sed hits the first line, it runs what's in the first set of curly brackets ('1{<code>}'). If the line is not the first one, or the substitution there changed nothing, sed checks whether it is the last line ($ means last line); if it is, sed runs what's in the second set of curly brackets ('${<code>}'). A line that is neither first nor last reaches the d and is deleted.

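Inside those curly brackets, s/ .*// deletes everything from the first space onward, so it works just like cut -d' ' -f1 would. The closing t means 'branch to label if the preceding s made a substitution'; when there's no label, sed jumps to the end of the script, prints the line, and starts a new cycle with the next line. Without the t, the code would go on to run the d and print nothing. Likewise, with two fields, s/.* // works like cut -d' ' -f2.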
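The sed step can be checked in isolation. This sketch stores the same program in a variable (named `script` here purely for the demonstration) with one command per line, and feeds it a three-line stand-in for the block that cut produces:

```shell
# The answer's sed program, written with one command per line.
script='1{
  s/ .*//
  t
}
${
  s/.* //
  t
}
d'

# Stand-in for the two-column block produced by cut (sample values from file2).
block='10610 11770
11770 14627
14627 16789'

# First line keeps its left number, last line keeps its right number,
# and the middle line is deleted.
corners=$(printf '%s\n' "$block" | sed "$script")
printf '%s\n' "$corners"

# Edge case: a one-line block is both first and last. The first rule
# substitutes and branches, so the second rule never runs.
single=$(printf '%s\n' '14627 16789' | sed "$script")
printf '%s\n' "$single"
```

The edge case matters for a row such as t14 in the sample data, where the block has only one line: only the left number comes out.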

On each pass of the main while loop, sed prints two numbers, each on its own line. Piping that to xargs echo -n puts both numbers on the same line as the $b that was printed earlier.
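For comparison, the same result can probably be had from a single awk call, in the spirit of the comments under the question. This is only a sketch, not the method used above. It assumes each (key, tag) pair occurs at most once in file1 and that file2 is not empty, and the array names `start` and `last` are made up for the example. It reads file2 first, then file1.

```shell
# Recreate the question's sample files in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '5   8       t11 ' '15  7       t12 ' '3   7       t14' > "$tmp/file1"
cat > "$tmp/file2" <<'EOF'
8 4520 5560 t11 
8 5560 6610 t12 
8 6610 7400 t13 
7 9350 10610 t11 
7 10610 11770 t12 
7 11770 14627 t13
7 14627 16789 t14
EOF

result=$(awk '
NR == FNR {                 # first file on the command line: file2
    start[$1, $4] = $2      # start coordinate for this key/tag pair
    last[$1] = $3           # overwritten per row, so it ends as the final end coordinate
    next
}
($2, $3) in start {         # second file: file1, only rows with a match in file2
    print $2, start[$2, $3], last[$2], $1
}' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Because the last end coordinate per key is looked up from an array, a tag that sits on the final row of its group (t14 here) still gets both numbers.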

edited Jun 14, 2016 at 18:33

answered Jun 10, 2016 at 5:56

agc


9 Comments

Sam Over a year ago

Thank you. I realized I typed one of my input files wrong, though: it should have been t12 in column 3 instead of t11 in the row starting with 7, so that the output has 10610 in its second column, second row (that part was there before). Do you know how to adjust this code accordingly?

Sam Over a year ago

Also, when I run this I get sed: 1: "1{s/ .*//;t};${s/.* //; ...": unexpected EOF (pending }'s), and I'm not too familiar with sed.

agc Over a year ago

Copy the whole block, then paste it all at once to the command line; it should work.

agc Over a year ago

On row2,col3 of file1 being 't12', it makes no difference. The above code does not use column #3 of file1, or column #4 of file2.

Sam Over a year ago

I copied and pasted the whole thing and still got this: sed: 1: "1{s/ .*//;t};${s/.* //; ...": unexpected EOF (pending }'s) sed: 1: "1{s/ .*//;t};${s/.* //; ...": unexpected EOF (pending }'s)
