Split csv file based on column from command line

Question

I have some CSV data in a file, in this form:

ID,DATE,EARNING
1,12 May 2018,5
1,13 May 2018,15
2,12 May 2018,25

I want to split this into multiple files such that file_1_May_report contains:

ID,DATE,EARNING
1,12 May 2018,5
1,13 May 2018,15

and another file file_2_May_report that contains:

ID,DATE,EARNING
2,12 May 2018,25

I have tried:

awk -F, '{print >> $1}' input.csv

However, I only get one file, named 1, containing a single record: the last record in the input file. How do I split the input into multiple files based on ID?

anubhava · Accepted Answer · 2018-05-24 19:57:33Z

You may use this awk:

awk -F, 'NR==1{hdr=$0; next} !seen[$1]++{fn="file_" $1 "_May_report"; print hdr > fn} {print > fn}' input.csv

Or, in a more readable format:

awk -F, 'NR == 1 {
   hdr = $0
   next
}
!seen[$1]++ {
   fn = "file_" $1 "_May_report"
   print hdr > fn
}
{
   print > fn
}' input.csv
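
The script above sets fn only the first time an ID is seen, which fits the sample because its rows are already grouped by ID. A possible variant, assuming rows for the same ID may be interleaved (an assumption, not something stated in the question), recomputes fn for every row. The interleaved sample data below is made up for illustration:

```shell
# Hypothetical sample input with interleaved IDs, modeled on the question.
printf '%s\n' 'ID,DATE,EARNING' '1,12 May 2018,5' '2,12 May 2018,25' '1,13 May 2018,15' > input.csv

awk -F, '
NR == 1 { hdr = $0; next }            # remember the header line
{
   fn = "file_" $1 "_May_report"      # recompute the target file for every row
   if (!seen[$1]++) print hdr > fn    # first row for this ID: write the header
   print > fn                         # write the row itself
}' input.csv
```

Within a single awk run, > truncates a file only when it is first opened, so later rows for the same ID are appended. With a very large number of distinct IDs, some awk implementations may run out of open file descriptors; in that case closing files with close(fn) and reopening with >> would likely be needed.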


3 Comments

anubhava Over a year ago

I had properly tested this awk before posting. Can you clarify what didn't work?

user2759617 Over a year ago

It doesn't create any files.

user2759617 Over a year ago

This worked. The problem was that my file had the wrong line terminators (carriage returns, displayed as ^M). I had to run tr '\r' '\n' < input.csv > unix-input.csv first.
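
A minimal sketch of the line-ending conversion described in this comment. The file names and sample contents are hypothetical. The first command covers files whose lines end in a bare carriage return (the case in the comment); the second covers CRLF-terminated files, where deleting the carriage return is probably the better fit:

```shell
# Hypothetical sample: lines terminated by a bare carriage return (CR).
printf 'ID,DATE,EARNING\r1,12 May 2018,5\r2,12 May 2018,25\r' > cr-input.csv

# CR-only terminators: translate each CR into a newline.
tr '\r' '\n' < cr-input.csv > unix-input.csv

# Hypothetical sample: CRLF terminators (Windows style).
printf 'ID,DATE,EARNING\r\n1,12 May 2018,5\r\n' > crlf-input.csv

# CRLF terminators: delete the CR and keep the existing newline.
tr -d '\r' < crlf-input.csv > unix-input2.csv
```

Running file input.csv first usually reports which kind of terminator a file uses, which helps pick between the two commands.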
