I need to parse a CSV file whose columns change without any notice. Sometimes I'll receive a file with this header:

"No,Latitude,Longitude,Name,Altitude,Satellites,Date,Time"

and other times with this one:

"No,Latitude,Longitude,Name,Altitude,Date,Time"

or maybe another variant. The columns always come from the same set, but sometimes there are 5 of them, sometimes 6, and so on. In other words, sometimes Name is missing, sometimes Satellites, sometimes both.

So I need to read the header, work out what each column contains, and produce a fixed CSV that always has the same columns:

"No,Latitude,Longitude,Name,Altitude,Satellites,Date,Time"

To do that, I was thinking of:

  1. Read the header and assign its values to an array, one per column:

IFS=, read -r a[1] a[2] a[3] a[4] a[5] a[6] a[7] a[8] a[9] a[10] a[11] < source.csv

This reads the header of the source file.

The same process applies to the header I need to create (the one I know):

IFS=, read -r b[1] b[2] b[3] b[4] b[5] b[6] b[7] b[8] <<< "No,Latitude,Longitude,Name,Altitude,Satellites,Date,Time"

  2. Run a loop over each non-empty value of the a array, trying to match it against the headers in the b array:

for i in {1..11}
do
for x in {1..8}
do
if [[ -n ${a[i]} && ${a[i]} == "${b[x]}" ]]
then
# Column i of my source file belongs in column x of my output file
output[$x]=$i
fi
done
done
  3. Store this in my new file, with something like this:

while IFS=, read -r a[1] a[2] a[3] a[4] a[5] a[6] a[7] a[8] a[9] a[10] a[11]
do
# An unset output[x] evaluates to index 0, which is never filled, so missing columns come out empty
echo "${a[output[1]]},${a[output[2]]},${a[output[3]]},${a[output[4]]},${a[output[5]]},${a[output[6]]},${a[output[7]]},${a[output[8]]}" >> new.csv
done < source.csv
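The three steps above can also be put together as one function. This is only a minimal sketch, assuming bash 4 or later (for associative arrays) and simple CSV data with no quoted fields containing commas. The name normalize_csv and the default fill value of 0 are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: rewrite a CSV so it always has the same columns, in the same order.
# normalize_csv is a hypothetical name; the fill value "0" is an assumption.
# Reads the source CSV on stdin and writes the fixed CSV to stdout.

normalize_csv() {
    local target="No,Latitude,Longitude,Name,Altitude,Satellites,Date,Time"
    local fill="${1:-0}"
    local -a want src row out
    local -A pos=()          # header name -> column index in the source
    local i name line

    IFS=, read -r -a want <<< "$target"

    # Step 1: read the source header and remember where each column sits.
    IFS= read -r line || return 1
    line=${line%$'\r'}       # tolerate CRLF line endings
    IFS=, read -r -a src <<< "$line"
    for i in "${!src[@]}"; do
        [[ -n ${src[i]} ]] && pos["${src[i]}"]=$i
    done

    # Always emit the full target header.
    printf '%s\n' "$target"

    # Steps 2 and 3: for every data row, pick each wanted column by name.
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line%$'\r'}
        IFS=, read -r -a row <<< "$line"
        out=()
        for name in "${want[@]}"; do
            if [[ -n ${pos[$name]+set} ]]; then
                out+=("${row[${pos[$name]}]}")   # column exists in the source
            else
                out+=("$fill")                   # column is missing: use the filler
            fi
        done
        (IFS=,; printf '%s\n' "${out[*]}")
    done
}
```

It would be called as normalize_csv < source.csv > new.csv. Looking columns up by name avoids the fixed a[1] to a[11] slots, so the number of source columns no longer matters.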

I think this should work, since it writes the appropriate source column into the desired output column.

But I'm sure there are easier and more elegant ways of doing this.

Any help?

Thanks

  • if name and satellites are both missing, as you say, then where are you reading their values? Commented Jun 12, 2022 at 1:20
  • what's the ultimate objective ... add a null/empty field for 'missing' columns, with the intention being to end up with a final csv that has, what, 7 total columns? Commented Jun 12, 2022 at 2:04
  • Are the column headings valid shell variable names? If so, KamilCuk's answer to "How to process all or selected rows in a csv file where column headers and order are dynamic?" (mostly the first part) might work (though I'd recommend using a plain redirect, instead of piping from cat). Also see rici's answer to "Use array elements as names of the variables i want to read to from a file". Commented Jun 12, 2022 at 3:34
  • the main idea is not to produce a truncated CSV. I need latitude, longitude and time to be there in any case, but the rest are not really relevant. If one of those values is missing, I can paste in a 0 or a null. Commented Jun 12, 2022 at 11:25
  • Thanks for the comments.. I'll review KamilCuk's answer Commented Jun 12, 2022 at 11:26

1 Answer

Miller has the template verb for exactly this kind of task:

mlr template [options]
     Places input-record fields in the order specified by list of column names.
     If the input record is missing a specified field, it will be filled with the fill-with.
     If the input record possesses an unspecified field, it will be discarded.
Options:
     -f {a,b,c} Comma-separated field names for template.
     -t {filename} CSV file whose header line will be used for template.
     --fill-with {filler string} What to fill absent fields with. Defaults to the empty string.

So you could do:

mlr --csv template \
    -f 'No,Latitude,Longitude,Name,Altitude,Satellites,Date,Time' \
    --fill-with '0' \
    file.csv

Or use the header of another file as a reference:

mlr --csv template -t header.csv --fill-with '0' file.csv
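Since the comments say that Latitude, Longitude and Time must always be present, it may be worth checking the header before filling anything in. The following is a rough sketch of such a check in bash; has_required_columns is a hypothetical helper, not part of Miller, and it assumes an unquoted, comma-separated header line.

```shell
#!/usr/bin/env bash
# Sketch: fail early when a mandatory column is absent from the CSV header.
# has_required_columns is a hypothetical helper, not part of Miller.
# Usage: has_required_columns FILE NAME [NAME...]

has_required_columns() {
    local file=$1
    shift
    local header="" name

    # Read only the first line; accept a header without a trailing newline.
    IFS= read -r header < "$file" || [[ -n $header ]] || return 1
    header=${header%$'\r'}   # tolerate CRLF line endings

    for name in "$@"; do
        # Wrap both sides in commas so "Time" does not match "DateTime".
        [[ ",$header," == *",$name,"* ]] || return 1
    done
}

# Possible use together with the command above:
#   if has_required_columns file.csv Latitude Longitude Time; then
#       mlr --csv template -t header.csv --fill-with '0' file.csv
#   else
#       echo "file.csv lacks a mandatory column" >&2
#   fi
```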