Using AWK and setting results to bash variables/arrays?

Question

I have a file that contains the output of MySQL's SHOW PROCESSLIST command. The file looks like this:

*************************** 1. row ***************************
Id: 1
User: system user
Host:
db: NULL
Command: Connect
Time: 1030455
State: Waiting for master to send event
Info: NULL
*************************** 2. row ***************************
Id: 2
User: system user
Host:
db: NULL
Command: Connect
Time: 1004
State: Has read all relay log; waiting for the slave
       I/O thread to update it
Info: NULL

The same structure repeats for a few more rows.

I want to use AWK to extract only these parameters: Time, Id, Command and State, and store each of them in a separate variable or array so that I can later use or print them in my bash shell.

The problem is that I am pretty bad with AWK. I don't know how to both separate the parameters I want from the file and also set them as bash variables or arrays.

Many thanks in advance for the help!

EDIT: Here is my code so far

echo "Enter age"
read age
cat data | awk 'BEGIN{ RS="row"
FS="\n"
OFS="\n"}
{ print $2,$7}
' | awk 'BEGIN{ RS="Id"}
{if ($4 > $age){print $2}}'

The file 'data' contains blocks like I have pasted above. The code should, if the 'age' entered is smaller than the Time parameter in the data file (which is $4 in my awk code), return the ID parameter, but it returns nothing.

If I remove the if statement and print $4 instead of $2 this is my output

Enter age
1

1030455
1004
2144
2086
0

So I was thinking maybe that blank line is somehow messing up my AWK print? Is there a simple way to ignore that blank line while keeping my other data?

Paste the output into the question, mark it with the mouse, then use the {} tool in the toolbar to mark it as literal code.
– Barmar, Commented Feb 8, 2015 at 6:51
awk can't set shell variables directly. What you can do is pipe the output of awk to a shell while read var1 var2 var3 loop.
– Barmar, Commented Feb 8, 2015 at 6:53
I did not really understand your solution though, could you elaborate?
– gotner, Commented Feb 8, 2015 at 7:28
You might want to switch to another language for this: perl or python, or... and ditch bash and awk altogether.
– gniourf_gniourf, Commented Feb 8, 2015 at 12:16

Ed Morton · Accepted Answer · 2015-02-08 16:40:27Z

This is how you'd use awk to produce the values you want as a set of tab-separated fields on each line per "row" block from the input:

$ cat tst.awk
BEGIN {
    RS="[*]+ [[:digit:]]+[.] row [*]+\n"
    FS="\n"
    OFS="\t"
}
NR>1 {
    sub(/\n$/,"")     # remove the trailing newline
    gsub(/\n\s+/," ") # compress all multi-line fields into single lines
    gsub(OFS," ")     # ensure the only OFS in the output IS between fields

    delete n2v
    for (i=1; i<=NF; i++) {
        name = gensub(/:.*/,"","",$i)
        value = gensub(/^[^:]+:\s+/,"","",$i)
        n2v[name] = value
    }

    if (n2v["Time"]+0 > age) {  # force a numeric comparison
        print n2v["Time"], n2v["Id"], n2v["Command"], n2v["State"]
    }
}

$ awk -v age=2000 -f tst.awk file
1030455 1       Connect Waiting for master to send event

If the target age is already stored in a shell variable just init the awk variable from the shell variable of the same name:

$ age="2000"
$ awk -v age="$age" -f tst.awk file

The above uses GNU awk for multi-char RS (which you already had), gensub(), \s, and delete array.
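As a side note on why the attempt in the question printed nothing: inside a single-quoted awk program the shell never expands $age, so awk sees its own unset variable age, and $age ends up meaning $0 (the whole record). The sketch below contrasts that with -v; the inline data is made up for illustration and is not the real file.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: one "id time" pair per line.
data=$'1 1030455\n2 1004\n3 2144'
age=2000

# Wrong: the shell does not expand $age inside single quotes, so awk reads
# it as field number "age" (unset, i.e. 0) and compares $2 against $0.
wrong=$(printf '%s\n' "$data" | awk '$2 > $age {print $1}')

# Right: pass the shell value in with -v and force a numeric comparison.
right=$(printf '%s\n' "$data" | awk -v age="$age" '$2+0 > age+0 {print $1}')

printf 'without -v: %s\n' "$wrong"
printf 'with -v:    %s\n' "$right"
```

With -v the filter selects ids 1 and 3, the two rows whose time exceeds 2000.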

When you say "and store every one of these parameters into a different variable or array" it could mean one of several things so I'll leave that part up to you but you might be looking for something like:

arr=( $(awk '...') )

or

awk '...' |
while IFS=$'\t' read -r Time Id Command State
do
    <do something with those 4 vars>
done

but by far the most likely situation is that you don't want to use shell at all but instead just stay inside awk.

Remember - every time you write a loop in shell just to manipulate text you have the wrong approach. UNIX shell is an environment from which to call UNIX tools and the UNIX tool for general text manipulation is awk.
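If the values really do need to end up in bash arrays, one possible shape is sketched below. It is an illustration under stated assumptions, not the script above: the parser is a swapped-in line-by-line version written in portable awk, and the sample file, the age threshold of 1000 and the array names are all made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: load awk's tab-separated output into parallel bash arrays.
# The sample file, the age threshold and the array names are illustrative.

data=$(mktemp)
cat > "$data" <<'EOF'
*************************** 1. row ***************************
Id: 1
User: system user
Host:
db: NULL
Command: Connect
Time: 1030455
State: Waiting for master to send event
Info: NULL
*************************** 2. row ***************************
Id: 2
User: system user
Host:
db: NULL
Command: Connect
Time: 1004
State: Has read all relay log; waiting for the slave
       I/O thread to update it
Info: NULL
EOF

times=() ids=() commands=() states=()

# Process substitution keeps the loop in the current shell, so the arrays
# are still populated after "done". Tab is IFS whitespace, so this relies
# on none of the four fields being empty.
while IFS=$'\t' read -r t i c s; do
    times+=("$t"); ids+=("$i"); commands+=("$c"); states+=("$s")
done < <(awk -v age=1000 '
    # Print the current block if it passes the filter, then reset it.
    function emit() {
        if (id != "" && time + 0 > age + 0)
            printf "%s\t%s\t%s\t%s\n", time, id, cmd, state
        id = time = cmd = state = last = ""
    }
    /^[*]+ [0-9]+[.] row [*]+$/ { emit(); next }
    /^[ \t]/ {                      # continuation of a wrapped value
        sub(/^[ \t]+/, "")
        if (last == "State") state = state " " $0
        next
    }
    {
        pos = index($0, ":")
        if (pos == 0) next
        last  = substr($0, 1, pos - 1)
        value = substr($0, pos + 1)
        sub(/^[ \t]+/, "", value)
        if      (last == "Id")      id    = value
        else if (last == "Time")    time  = value
        else if (last == "Command") cmd   = value
        else if (last == "State")   state = value
    }
    END { emit() }
' "$data")
rm -f "$data"

for n in "${!ids[@]}"; do
    printf 'Id=%s Time=%s Command=%s State=%s\n' \
        "${ids[n]}" "${times[n]}" "${commands[n]}" "${states[n]}"
done
```

The shell loop here only copies fields into arrays; all of the text processing still happens inside awk.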

Until you edit your question to tell us more about your problem though, we can't guess what the right solution is from this point on.

pawel7318 · Answer · 2015-02-08 11:02:47Z

At the top level you have your shell, which you use to run every other (child) process. A child process cannot modify its parent's environment. When you run your bash script file (which has the +x permission), it is spawned as a new child process. It can set its own environment, but when it exits you are back in the original (parent) environment, unchanged.

You can set variables in bash and export them to its environment, where they are inherited by its children. It does not work in the opposite direction: a parent cannot inherit from its child.
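A small sketch of that one-way inheritance. The variable name GREETING is hypothetical and exists only for this demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical variable name, used only for this demonstration.
export GREETING="from parent"

# The child process inherits the exported value, prints it, and then
# changes its own copy just before it exits.
child_saw=$(bash -c 'echo "$GREETING"; GREETING="changed in child"')

printf 'child saw:  %s\n' "$child_saw"
printf 'parent has: %s\n' "$GREETING"
```

Both lines report "from parent": the child received the value, and its later assignment never reached the parent.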

If you wish to execute some commands from the script file in the current bash's context you can source the script file. source ./your_script.sh or . ./your_script.sh will do that for you.
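The difference between running and sourcing can be seen with a throwaway script. This is only a sketch; the temporary file and the variable name answer are made up for the example.

```shell
#!/usr/bin/env bash
# Create a throwaway script that sets one (hypothetical) variable.
script=$(mktemp)
echo 'answer=42' > "$script"

unset answer
bash "$script"                   # runs in a child process
after_run=${answer-unset}        # still unset in this shell

. "$script"                      # runs in the current shell
after_source=${answer-unset}     # now set

rm -f "$script"
printf 'after running:  %s\n' "$after_run"
printf 'after sourcing: %s\n' "$after_source"
```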

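If you need to run awk to filter some data and keep the results in bash, you can do:

read -r foo < <(awk ...)

read is a shell builtin rather than an external process (run type read, help read, or man bash to check for yourself), so it can set variables in the current shell. The input is fed through process substitution rather than a pipe because bash runs each part of a pipeline in a subshell: with awk ... | read foo, the value of foo is lost as soon as the pipeline finishes (unless the lastpipe option is enabled).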

or:

foo=`awk ....`
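A few of these capture styles side by side, as a rough sketch rather than a recommendation. The three-line input is made up; note that in bash a plain pipeline into read runs read in a subshell by default, so the variable does not survive the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical input: three numbers, one per line.
input=$'10\n20\n30'

# Pipeline: read runs in a subshell here, so "piped" is still empty
# afterwards (unless the lastpipe option is enabled).
piped=""
printf '%s\n' "$input" | awk '{s += $1} END {print s}' | read -r piped

# Command substitution: the current shell captures awk's output.
total=$(printf '%s\n' "$input" | awk '{s += $1} END {print s}')

# Process substitution: read runs in the current shell.
read -r first < <(printf '%s\n' "$input" | awk 'NR == 1')

# mapfile (bash 4 or later): one array element per output line.
mapfile -t doubled < <(printf '%s\n' "$input" | awk '{print $1 * 2}')

printf 'piped=[%s] total=%s first=%s doubled=%s\n' \
    "$piped" "$total" "$first" "${doubled[*]}"
```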

There are many other constructs you can use. Whatever bash script you write, check your code against the Bash Pitfalls page.


One simple (though not very efficient) way to deal with empty lines is the strings command.
– pawel7318, Commented over a year ago
