I have reports that are generated in the following tab-delimited format:

UNIT  TC    CC    PC    TCP   FTX   FRX   
HOUSE 55    65    75    85    95    105
CAR   100   200   300   400   500   600
H2    5     10    15    20    25    30
C2    10    20    30    40    50    60

I need to change them to the following format:

HOUSE TC    55
HOUSE CC    65
HOUSE PC    75
HOUSE TCP   85
HOUSE FTX   95
HOUSE FRX   105
CAR   TC    100
CAR   CC    200
CAR   PC    300
CAR   TCP   400
CAR   FTX   500
CAR   FRX   600

And so on.

I would like to use standard tools such as sed, awk, or bash, but any suggestions are welcome. The code will be inserted into a bash script that I'm already using to parse and concatenate the data beforehand. The number of entries will always be the same; the reports don't change.

1 Answer

Try:

$ awk 'BEGIN { FS="\t" } NR==1 { split($0,header,"\t") ; next } { for(i=2;i<=NF;i++) print $1,header[i],$i }' data
HOUSE TC 55
HOUSE CC 65
HOUSE PC 75
HOUSE TCP 85
HOUSE FTX 95
HOUSE FRX 105
CAR TC 100
CAR CC 200
CAR PC 300
CAR TCP 400
CAR FTX 500
CAR FRX 600
H2 TC 5
H2 CC 10
H2 PC 15
H2 TCP 20
H2 FTX 25
H2 FRX 30
C2 TC 10
C2 CC 20
C2 PC 30
C2 TCP 40
C2 FTX 50
C2 FRX 60

The one-liner, broken into pieces:

Set the tab character as the input field separator:

BEGIN { FS="\t" }

For the first line (NR==1), split it into fields and store them in the array header. This is simply shorter than copying the fields $1, $2, ... into the array one by one in a for loop. The next command skips the rest of the program for line 1, because the following block is meant for the other lines only. (Using FS instead of "\t" in the split call would have been more consistent.)

NR==1 { split($0,header,"\t") ; next }
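To see what split leaves in the array, here is a small sketch that feeds a shortened, made-up header line to awk and lists each stored element with its index. The function name list_header_fields is hypothetical and used only for illustration.

```shell
# list_header_fields: hypothetical helper, for illustration only.
# split() cuts the line at each tab, stores the pieces in header[1..n],
# and returns n, the number of pieces.
list_header_fields() {
    awk '{ n = split($0, header, "\t")
           for (i = 1; i <= n; i++) print i, header[i] }'
}

# Sample header (assumed tab-delimited, like the reports in the question).
printf 'UNIT\tTC\tCC\n' | list_header_fields
```

With this input, header[1] is UNIT, so the column names that belong to the data fields start at header[2]. That is why the printing loop in the answer starts at i=2.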

For each other line (NR!=1) print all fields ($2...$NF) prefixed by $1 and the field's name (header[i]).

{ for(i=2;i<=NF;i++) print $1,header[i],$i }

Setting OFS=FS="\t" in the BEGIN block makes print put a tab between the fields instead of a space. I did not change this in the answer because I would have had to reformat all of the output lines shown above as well.
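Since the code is meant to go into an existing bash script, the tab-delimited variant could be wrapped in a function along these lines. This is only a sketch: the function name unpivot and the shortened sample data are assumptions, not part of the original reports.

```shell
# unpivot: hypothetical helper name. Reads a tab-delimited report from the
# files given as arguments (or from stdin if none are given) and prints one
# tab-delimited "UNIT <tab> column name <tab> value" line per data field.
unpivot() {
    awk 'BEGIN { FS = OFS = "\t" }                 # tabs on input and output
         NR == 1 { split($0, header, FS); next }   # remember the column names
         { for (i = 2; i <= NF; i++) print $1, header[i], $i }' "$@"
}

# Sample input (assumed tab-delimited, shortened to two value columns).
printf 'UNIT\tTC\tCC\nHOUSE\t55\t65\nCAR\t100\t200\n' | unpivot
```

Inside the script it could then be called as `unpivot report.txt > long.txt`, where both file names are placeholders.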

  • This works almost perfectly. It does output in space-delimited format, instead of tab-delimited, but I can use it. An explanation as to how it works its magic would be appreciated. Thanks. Commented Jun 12, 2014 at 13:13
  • Replace BEGIN { FS="\t" } by BEGIN { OFS=FS="\t" }. Commented Jun 12, 2014 at 13:35
