
Right now I can extract one column (column 6) from the CSV file. How could I edit the script below to extract more than one column? Let's say I want to extract columns 9 and 10 as well as column 6. Column 6 should end up in the 1st column of the output file, column 9 in the 2nd column, and column 10 in the 3rd column.

ruby -rcsv -e 'CSV.foreach(ARGV.shift) { |row| puts row[5] }' input.csv &> output.csv
  • Create a normal script that reads from one file and writes to another instead of using a shell one-liner; otherwise you'll run into issues with escaping quotes, commas, etc. Commented Sep 13, 2019 at 4:25
  • Unrelated to your question, but maybe useful to you: if your file has headers, you can pass headers: true and then select the columns you want by name rather than by index, e.g. row["first_name"], row["last_name"] (or row[:first_name], row[:last_name] if you also pass header_converters: :symbol). Commented Sep 13, 2019 at 18:58
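
The header-based approach from the comment above could look roughly like the following. This is only a sketch: the sample data and the column names first_name, last_name and age are made up for illustration, and header_converters: :symbol is what makes the symbol keys work.

```ruby
require 'csv'

# Hypothetical data with a header row; the column names are invented.
data = "first_name,last_name,age\nAda,Lovelace,36\nAlan,Turing,41\n"

# headers: true yields CSV::Row objects keyed by header;
# header_converters: :symbol turns the header strings into symbols.
rows = CSV.parse(data, headers: true, header_converters: :symbol)

# Pick columns by name, in the order they should appear in the output.
names = rows.map { |row| row.values_at(:last_name, :first_name) }
p names
```

Without header_converters: :symbol the keys stay strings, so the lookup would be row.values_at("last_name", "first_name") instead.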

2 Answers

2

Since row is an array, your question boils down to how to pick certain elements from an array; this is not related to CSV.

You can use values_at:

row.values_at(5,6,9,10)

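returns the fields at indices 5, 6, 9 and 10. Indices are zero-based, so for columns 6, 9 and 10 from the question you would call row.values_at(5, 8, 9).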
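
As a small illustration of values_at on a plain array, here is a sketch with a made-up ten-field row standing in for what CSV.foreach would yield:

```ruby
# Hypothetical sample row with 10 fields (the values are placeholders).
row = %w[c1 c2 c3 c4 c5 c6 c7 c8 c9 c10]

# Zero-based indices 5, 8 and 9 correspond to columns 6, 9 and 10.
picked = row.values_at(5, 8, 9)
p picked

# An index past the end of a (short) row yields nil instead of raising.
short = %w[a b].values_at(0, 5)
p short
```

The nil for a missing field matters for ragged CSV files: a short input row produces an empty field in the output rather than an error.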

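values_at returns the fields in the order of its arguments, so row.values_at(9, 5) already reorders them. If you instead want to place fields at specific positions of the output row, you can assign each index explicitly: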

output_row = Array.new(row.size) # Or row.dup, depending on your needs
output_row[1] = row[6]
# Or, if you have used row.dup and want to swap the two fields:
output_row[1],output_row[6] = row[6],row[1]
# and so on
out_csv.puts(output_row)

This assumes that you have previously defined

out_csv = CSV.new(STDOUT)

since you want the new CSV to be written to standard output.
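
Putting the pieces of this answer together, a complete script might look like the sketch below. The helper name extract_columns and the script name extract.rb are hypothetical, and the indices [5, 8, 9] are the zero-based positions of columns 6, 9 and 10 from the question.

```ruby
require 'csv'

# Hypothetical helper: copy the fields at the given zero-based indices
# from every row of in_path to out_io, in the order the indices are listed.
def extract_columns(in_path, indices, out_io = $stdout)
  out_csv = CSV.new(out_io)
  CSV.foreach(in_path) do |row|
    out_csv << row.values_at(*indices)
  end
end

# Usage (assumed script name): ruby extract.rb input.csv > output.csv
extract_columns(ARGV[0], [5, 8, 9]) if __FILE__ == $PROGRAM_NAME && ARGV[0]
```

Writing through a CSV object rather than plain puts means fields containing commas or quotes are quoted correctly in the output.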


1

Let's first create a (header-less) CSV file:

require 'csv'

enum = 1.step
FNameIn = 't_in.csv'

CSV.open(FNameIn, "wb") { |csv| 3.times { csv << 5.times.map { enum.next } } }
  #=> 3   

I've assumed the file contains string representations of integers.

The file contains the three lines:

File.read(FNameIn).each_line { |line| p line }
"1,2,3,4,5\n"
"6,7,8,9,10\n"
"11,12,13,14,15\n"

Now let's extract the columns at indices 1 and 3. These columns are to be written to the output file in that order.

cols = [1, 3]

Read the input file, pick those columns from each row, and then write the result to the CSV output file.

arr = CSV.read(FNameIn, converters: :integer).
          map { |row| row.values_at(*cols) }
  #=> [[2, 4], [7, 9], [12, 14]] 

FNameOut = 't_out.csv'
CSV.open(FNameOut, 'wb') { |csv| arr.each { |row| csv << row } }

We have written three lines:

File.read(FNameOut).each_line { |line| p line }
"2,4\n"
"7,9\n"
"12,14\n"

which we can read back into an array:

CSV.read(FNameOut, converters: :integer)
  #=> [[2, 4], [7, 9], [12, 14]]

These operations can be adapted in a straightforward way to run from the command line.
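
One possible command-line adaptation is sketched here. The script name pick_cols.rb, the helper pick_columns and the argument layout (input file, output file, then 1-based column numbers) are all assumptions made for illustration.

```ruby
require 'csv'

# Hypothetical script (say, pick_cols.rb); assumed invocation:
#   ruby pick_cols.rb input.csv output.csv 6 9 10
# Column numbers are 1-based on the command line, as in the question.
def pick_columns(in_path, out_path, column_numbers)
  indices = column_numbers.map { |n| Integer(n) - 1 } # convert to zero-based
  CSV.open(out_path, 'wb') do |out|
    CSV.foreach(in_path) { |row| out << row.values_at(*indices) }
  end
end

if __FILE__ == $PROGRAM_NAME && ARGV.size >= 3
  in_path, out_path, *columns = ARGV
  pick_columns(in_path, out_path, columns)
end
```

Unlike CSV.read, CSV.foreach processes one row at a time, so the whole input file does not need to fit in memory. If a one-liner is still preferred, something like ruby -rcsv -e 'CSV.foreach(ARGV.shift) { |row| puts row.values_at(5, 8, 9).to_csv }' input.csv > output.csv should give the same result for the question's columns.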
