Adding columns to a csv table with AWK from multiple files

Question

I'm looking to build a csv table by getting values from several files with AWK. I have it working with two files, but I can't scale it beyond that. I'm currently taking the output of the second file, and appending the third, and so on.

Here are example files:

#file1  #file2  #file3  #file4
100     45      1       5
200     23      1       2
300     29      2       1
400     0       1       2
500     74      4       5

This is the goal:

#data.csv
1,100,45,1,5
2,200,23,1,2
3,300,29,2,1
4,400,0,1,2
5,500,74,4,5

This is what I have working:

awk 'FNR==NR { a[FNR""] = NR", " $0","; next } { print a[FNR""], $0}' $file1 $file2

With the result:

But when I try and get it to work on 3 or more files, like so:

awk 'FNR==NR { a[FNR""] = NR", " $0","; next } { print a[FNR""], $0; next } { print a[FNR""], $0}' $file1 $file2 $file3

I get this output:

The line count in the first column restarts, and the second column repeats the first file. The third and subsequent files are added as new rows in the third column, where I would expect them to be added as new columns. No new rows should be required.

Any help would be greatly appreciated. I have learned most of my AWK from Stack Exchange, and I know I'm missing something fundamental here. Thanks,

thanasisp · Accepted Answer · 2017-11-21 20:22:00Z

5

As already answered, you can use paste. To get the exact output with comma-delimited line numbering, you can do this:

paste -d, file{1..4} | nl -s, -w1

-s, sets the number separator to a comma (the default is a tab).
-w1 sets the number width to 1, so there are no leading spaces (the default width is 6).
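To see what the two flags do, here is a small sketch that runs the pipeline with and without -w1 on scratch copies of the question's sample data. The temporary directory and the variable names padded and tight are only illustrative, and the files are listed explicitly rather than with file{1..4} so that it also runs in shells without brace expansion.

```shell
# Scratch copies of the question's sample files (directory name is arbitrary).
tmpdir=$(mktemp -d)
printf '%s\n' 100 200 300 400 500 > "$tmpdir/file1"
printf '%s\n' 45 23 29 0 74 > "$tmpdir/file2"
printf '%s\n' 1 1 2 1 4 > "$tmpdir/file3"
printf '%s\n' 5 2 1 2 5 > "$tmpdir/file4"

# Same pipeline twice: once with nl's default number width, once with -w1.
padded=$(cd "$tmpdir" && paste -d, file1 file2 file3 file4 | nl -s,)
tight=$(cd "$tmpdir" && paste -d, file1 file2 file3 file4 | nl -s, -w1)

printf 'default width:\n%s\n' "$padded"
printf 'with -w1:\n%s\n' "$tight"
rm -rf "$tmpdir"
```

With the default width each number is right-justified in a padded field; with -w1 the rows start directly with the number.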

Another solution, with awk:

awk    '{a[FNR]=a[FNR] "," $0} 
    END {for (i=1;i<=length(a);i++) print i a[i]}' file{1..4}
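Not every awk implementation accepts length() on an array, so a variant that tracks the highest row number itself may be a safer starting point. This is only a sketch of the same idea: the names row and max, the scratch directory, and the 12-row test data generated with seq are assumptions made for illustration.

```shell
tmpdir=$(mktemp -d)
# 12 rows per file, so the row numbers run past 9.
seq 101 112 > "$tmpdir/file1"
seq 201 212 > "$tmpdir/file2"
seq 301 312 > "$tmpdir/file3"

merged=$(awk '
    { row[FNR] = row[FNR] "," $0 }      # append this line to the row with the same number
    FNR > max { max = FNR }             # remember the longest file seen so far
    END { for (i = 1; i <= max; i++) print i row[i] }
' "$tmpdir/file1" "$tmpdir/file2" "$tmpdir/file3")

printf '%s\n' "$merged"
rm -rf "$tmpdir"
```

Because the END loop counts from 1 to max numerically, the rows are printed in file order regardless of how the awk implementation stores the array.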

edited Nov 21, 2017 at 20:22

answered Nov 20, 2017 at 18:51


4 Comments

dawg Over a year ago

Both of these are great solutions. Use this.

karakfa Over a year ago

One note on the distinction between {1..4} and [1-4]: the former will complain if any of the files is missing; the latter will make do with whatever is there.

Campbell McGrouther Over a year ago

@thanasisp these are both great! I wasn't aware of the paste command.

Campbell McGrouther Over a year ago

The -s, flag gives it the line numbers, but only single digits. Cycling back to 0 after 9. If I use -s it gives me the line numbers correctly but adds the other flag.

8,477,3461
9,3474,6100
0,471,6602
1,3662,3628

And the Awk example is great too. However, is there a way to get the lines sorted correctly? They are numbered right, but the rows get muddled. See example below of the first 5 rows of a 23 row table.

22,485,6871,6183,465,3954
2,458,3178,2621,477,707
23,951,5063,3163,476,707
3,454,4128,2323,462,1009
4,454,3715,2873,447,666

Thanks!

Yoda · Accepted Answer · 2017-11-20 18:28:37Z

1

Why don't you use paste and then simply number each row?

paste -d"," file1 file2 file3 file4
100,45,1,5
200,23,1,2
300,29,2,1
400,0,1,2
500,74,4,5
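The numbering step is not shown above. One possible way to do it, sketched here under the assumption that the four files hold the question's sample values, is to pipe the paste output through awk and prefix each row with NR. The scratch directory and the variable name numbered are illustrative.

```shell
# Scratch copies of the question's sample files (directory name is arbitrary).
tmpdir=$(mktemp -d)
printf '%s\n' 100 200 300 400 500 > "$tmpdir/file1"
printf '%s\n' 45 23 29 0 74 > "$tmpdir/file2"
printf '%s\n' 1 1 2 1 4 > "$tmpdir/file3"
printf '%s\n' 5 2 1 2 5 > "$tmpdir/file4"

# paste builds the columns; awk prepends the running record number NR.
numbered=$(cd "$tmpdir" && paste -d, file1 file2 file3 file4 | awk '{ print NR "," $0 }')

printf '%s\n' "$numbered"
rm -rf "$tmpdir"
```

Since awk reads a single stream here, NR simply counts the pasted rows, so no width or padding option is involved.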

answered Nov 20, 2017 at 18:28


randomir · Accepted Answer · 2017-11-20 18:53:43Z

1

An awk solution for a variable number of files:

awk '{ !line[FNR] && line[FNR]=FNR; line[FNR]=line[FNR]","$0 }
     END { for (i=1; i<=length(line); i++) print line[i] }' file1 file2 ... fileN

For example:

$ awk '{ !line[FNR] && line[FNR]=FNR; line[FNR]=line[FNR]","$0 }
      END { for (i=1; i<=length(line); i++) print line[i] }' \
      <(seq 1 5) <(seq 11 15) <(seq 21 25) <(seq 31 35)
1,1,11,21,31
2,2,12,22,32
3,3,13,23,33
4,4,14,24,34
5,5,15,25,35
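The expression `!line[FNR] && line[FNR]=FNR` is compact: it seeds each row with its row number the first time that row is seen. A more spelled-out sketch of what I understand it to do is below. The `in` test, the rows counter, and the scratch files a, b and c are my own illustrative choices, not part of the answer above.

```shell
tmpdir=$(mktemp -d)
seq 1 5 > "$tmpdir/a"
seq 11 15 > "$tmpdir/b"
seq 21 25 > "$tmpdir/c"

table=$(awk '
    !(FNR in line) { line[FNR] = FNR }   # first time this row number appears: start the row with its number
    { line[FNR] = line[FNR] "," $0 }     # every file: append its value for this row
    FNR > rows { rows = FNR }            # count rows without relying on length(array)
    END { for (i = 1; i <= rows; i++) print line[i] }
' "$tmpdir/a" "$tmpdir/b" "$tmpdir/c")

printf '%s\n' "$table"
rm -rf "$tmpdir"
```

Testing `FNR in line` checks for the key itself rather than for a non-empty value, which is the only behavioral difference from the one-liner.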

answered Nov 20, 2017 at 18:53


Niall Cosgrove · Accepted Answer · 2017-11-20 21:07:06Z

1

Here is a beginner-friendly solution. If you need to manipulate the data on the way in, you can clearly see which file is being read.
ARGIND is gawk-specific; it tells us which file we are processing. We fill two arrays, a and b, from file1 and file2, and then print your desired output while processing file3.

awk '
ARGIND == 1 { a[FNR] = $0 ; next }
ARGIND == 2 { b[FNR] = $0 ; next }
ARGIND == 3 { print FNR "," a[FNR] "," b[FNR] "," $0 }
' file1 file2 file3

Output:

1,100,45,1
2,200,23,1
3,300,29,2
4,400,0,1
5,500,74,4
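If gawk is not available, a similar per-file structure can be sketched in portable awk by counting files yourself: FNR resets to 1 at the start of each input file, so a counter bumped on `FNR == 1` can stand in for ARGIND. The counter name fileno and the scratch directory are hypothetical, and the trick assumes no input file is empty, since an empty file never produces a line with FNR equal to 1.

```shell
# Scratch copies of the first three sample files (directory name is arbitrary).
tmpdir=$(mktemp -d)
printf '%s\n' 100 200 300 400 500 > "$tmpdir/file1"
printf '%s\n' 45 23 29 0 74 > "$tmpdir/file2"
printf '%s\n' 1 1 2 1 4 > "$tmpdir/file3"

out=$(awk '
    FNR == 1 { fileno++ }               # a new input file starts whenever FNR resets to 1
    fileno == 1 { a[FNR] = $0; next }   # first file: remember each line
    fileno == 2 { b[FNR] = $0; next }   # second file: remember each line
    fileno == 3 { print FNR "," a[FNR] "," b[FNR] "," $0 }
' "$tmpdir/file1" "$tmpdir/file2" "$tmpdir/file3")

printf '%s\n' "$out"
rm -rf "$tmpdir"
```

The counter rule has to come first so that it runs before the per-file rules, two of which end with next.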

edited Nov 20, 2017 at 21:07

answered Nov 20, 2017 at 19:17
