Multiple awk output to csv columns

Question

I have a number of commands that each run well independently. The five awk commands below currently output the required data, separated by commas. I would like to combine their output into a single CSV file, but I am not having much luck.

The commands below simply search for lines containing particular words and extract the required fields. For example, I would like $4,$5,$6,$7,$8,$9 from the first command to fill the first 6 columns, followed by $2,$4 from the second command in the next 2 columns.

awk '/start/ { print $4,$5,$6,$7,$8,$9 }' 07.08.2014.txt | sed -e "s/ /,/g"
awk '/PacketLoss/ { print $2,$4 }' 07.08.2014.txt | sed -e "s/ /,/g"
awk '/PacketOutOfSequence/ { print $2,$4,$6 }' 07.08.2014.txt | sed -e "s/ /,/g"
awk '/JitterSD/ { print $3,$6,$9 }' 07.08.2014.txt | sed -e "s/ /,/g"
awk '/NumOfRTT/ { print $2,$4,$6,$8 }' 07.08.2014.txt | sed -e "s/ /,/g"

Example input data

Aggregation start time 11:45:47.893 BST Thu Aug 7 2014
NumOfRTT: 360000        RTTAvg: 145     RTTMin: 144     RTTMax: 171
PacketLossSD: 0 PacketLossDS: 0
PacketOutOfSequence: 3  PacketMIA: 0    PacketLateArrival: 0
Jitter Avg: 1   JitterSD Avg: 1 JitterDS Avg: 1

Example output

11:45:47.893,BST,Thu,Aug,7,2014,0,0,3,0,0,1,1,1,360000,145,144,171

It would also be nice to label each column as below, but that isn't critical if it is too complicated, as I can do it manually.

START_TIME,BST,DAY,MONTH,DATE,YEAR,PacketLossSD,PacketLossDS,PacketOutOfSeq,PacketMIA,PacketLateArrival,JitterAvg,JitterSD_Avg,JitterDS_Avg,NumOfRTT,RTTAvg,RTTMin,RTTMax
11:45:47.893,BST,Thu,Aug,7,2014,0,0,3,0,0,1,1,1,360000,145,144,171

Thanks in advance for any assistance :)

Your output doesn't match your input. And your column labels don't match either. (You have PacketLossSD as 2.)
– ooga, Commented Aug 8, 2014 at 14:51

ooga · Accepted Answer · 2014-08-08 23:52:34Z

awk '
  # Print the header row once, before any data.
  NR==1 { print "START_TIME,BST,DAY,MONTH,DATE,YEAR,PacketLossSD,PacketLossDS,PacketOutOfSeq,PacketMIA,PacketLateArrival,JitterAvg,JitterSD_Avg,JitterDS_Avg,NumOfRTT,RTTAvg,RTTMin,RTTMax" }
  # Save the wanted fields of each line type as it is read.
  /start/               { tm  = sprintf("%s,%s,%s,%s,%s,%s",$4,$5,$6,$7,$8,$9) }
  /PacketLoss/          { pl  = sprintf("%s,%s",$2,$4) }
  /PacketOutOfSequence/ { pos = sprintf("%s,%s,%s",$2,$4,$6) }
  /NumOfRTT/            { rtt = sprintf("%s,%s,%s,%s",$2,$4,$6,$8) }
  # The JitterSD line is assumed to close each block, so print the full row here.
  /JitterSD/            { printf("%s,%s,%s,%s,%s,%s,%s\n",tm,pl,pos,$3,$6,$9,rtt) }
' 07.08.2014.txt

The idea is to save the data in strings as the lines are read and only print it out when the last line (assumed to be the one containing "JitterSD") is read.
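To see the save-then-print idea in isolation, here is a small sketch that can be run without the original file. The `sample` data, the `to_csv` function name and the values in the second block are made up for illustration; only the first block comes from the question. As a variation, this version also clears the saved strings after each row, so that a block with a missing line would show empty columns rather than silently reusing values from the previous block.

```shell
# Hypothetical sample: two blocks in the layout shown in the question.
# The second block uses invented values.
sample='Aggregation start time 11:45:47.893 BST Thu Aug 7 2014
NumOfRTT: 360000        RTTAvg: 145     RTTMin: 144     RTTMax: 171
PacketLossSD: 0 PacketLossDS: 0
PacketOutOfSequence: 3  PacketMIA: 0    PacketLateArrival: 0
Jitter Avg: 1   JitterSD Avg: 1 JitterDS Avg: 1
Aggregation start time 11:50:47.893 BST Thu Aug 7 2014
NumOfRTT: 360000        RTTAvg: 146     RTTMin: 144     RTTMax: 180
PacketLossSD: 1 PacketLossDS: 0
PacketOutOfSequence: 0  PacketMIA: 0    PacketLateArrival: 0
Jitter Avg: 2   JitterSD Avg: 1 JitterDS Avg: 2'

# to_csv (hypothetical helper): reads blocks on stdin, writes one CSV row per block.
to_csv() {
  awk '
    /start/               { tm  = $4 "," $5 "," $6 "," $7 "," $8 "," $9 }
    /PacketLoss/          { pl  = $2 "," $4 }
    /PacketOutOfSequence/ { pos = $2 "," $4 "," $6 }
    /NumOfRTT/            { rtt = $2 "," $4 "," $6 "," $8 }
    /JitterSD/ {
      # Last line of the block: emit the row, then forget the saved pieces.
      print tm "," pl "," pos "," $3 "," $6 "," $9 "," rtt
      tm = pl = pos = rtt = ""
    }
  '
}

rows=$(printf '%s\n' "$sample" | to_csv)
printf '%s\n' "$rows"
```

The order of the lines inside a block does not matter as long as the JitterSD line comes last, because every other line only stores its fields.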

Alternate idea:

awk '
  BEGIN { RS=""; FS="\n"; OFS="," }
  {
    split($1,a," "); L1=a[4]","a[5]","a[6]","a[7]","a[8]","a[9]
    split($2,a," "); L2=a[2]","a[4]","a[6]","a[8]
    split($3,a," "); L3=a[2]","a[4]
    split($4,a," "); L4=a[2]","a[4]","a[6]
    split($5,a," "); L5=a[3]","a[6]","a[9]
    print L1, L3, L4, L5, L2
  }
' 07.08.2014.txt

RS="" is a special setting that splits records on one or more blank lines, used for multiline records.

FS="\n" sets $1, $2, etc. to line 1, line 2, etc. of the multiline record.

split($1,a," ") splits the line into fields separated by spaces, putting them in the array a.
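A minimal sketch of how these three pieces interact, using a made-up two-block input (the `blocks` and `summary` variable names and the second block's values are hypothetical). Note the assumption that blocks are separated by a blank line; without one, the whole file would be read as a single record.

```shell
# Hypothetical input: two shortened blocks separated by a blank line.
blocks='Aggregation start time 11:45:47.893 BST Thu Aug 7 2014
PacketLossSD: 0 PacketLossDS: 0

Aggregation start time 11:50:47.893 BST Thu Aug 7 2014
PacketLossSD: 1 PacketLossDS: 0'

summary=$(printf '%s\n' "$blocks" | awk '
  BEGIN { RS = ""; FS = "\n" }   # one record per block, one field per line
  {
    split($1, a, " ")            # words of line 1; a[4] is the time
    split($2, b, " ")            # words of line 2; b[2] is PacketLossSD
    print NR, NF, a[4], b[2]     # record number, lines in record, two values
  }
')
printf '%s\n' "$summary"
```

Each output line reports the record number and the number of lines awk found in that record, which is a quick way to check that the blank-line separation in a real file is what the script expects.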
