2

I am attempting to split a csv file based on unique column values into multiple files using awk. I am able to split the csv successfully with awk -F\, '{print > $2".csv"}' example.csv however it is committing the header column from the new files.

For example:

example.csv

Color,Car,Make
Red,Kia,Spectra
Orange,Kia,Sportage
Green,Ford,Explorer
Black,Ford,F-150

Result:

Kia.csv

Red,Kia,Spectra
Orange,Kia,Sportage
___________________
Ford.csv

Green,Ford,Explorer
Black,Ford,F-150

My desired output:

Kia.csv

Color,Car,Make
Red,Kia,Spectra
Orange,Kia,Sportage
___________________
Ford.csv

Color,Car,Make
Green,Ford,Explorer
Black,Ford,F-150

To attempt getting the header column passed to the new files, I attempted something like this awk -F'|' 'FNR==1{hdr=$0;next} {if (!seen[$1]++) print hdr>$2; print>$2}' example.csv but unfortunately this did not have the intended result.

2 Answers 2

2

You are almost there. Would you please try:

awk -F, '
    FNR==1 {header = $0; next}
    !seen[$2]++ {print header > $2".csv"}
    {print > $2".csv"}
' example.csv

If you have many varieties of car makes, "too many open files error" may occur. In such cases, please close files referring @RavinderSingh13's answer.

Sign up to request clarification or add additional context in comments.

Comments

2

1st solution: With your shown samples, please try following awk code.

awk -F, '
FNR==NR{
  header=$0
  next
}
{
  outputFile=$2".csv"
}
prev!=$2".csv" || !prev{
  close(prev)
  print header > (outputFile)
}
{
  print $0 > (outputFile)
  prev=outputFile
}
' <(head -1 Input_file) <(tail -n +2 Input_file | sort -t, -k2)


2nd solution: Adding solution with only 1 pass of reading Input_file.

awk -F, -v header=$(head -1 Input_file) '
{
  outputFile=$2".csv"
}
prev!=$2".csv" || !prev{
  close(prev)
  print header > (outputFile)
}
{
  print $0 > (outputFile)
  prev=outputFile
}
' <(tail -n +2 Input_file | sort -t, -k2)

1 Comment

It is a good idea (and is mandatory in most cases) to close files to avoid too many open files error.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.