I have a CSV file whose first column gives the weekday. The problem is that I need to use the CSV in MATLAB, which does not allow string values. So I want to convert the weekday names to numbers in the CSV: Monday to 1, ..., Sunday to 7.

But I can't seem to find a way to do that in Ruby, and I can't open the file in Excel to do it manually because it is huge. Specifically, I'm having trouble figuring out the syntax for accessing the specific column I want in Ruby. For example, if the file had loaded in MATLAB and I still needed to convert the weekdays into numbers for some reason, I would have written simple code like this:

for i=1:length(Columns(:,1))
    if Columns(i,1)=='sunday'
        Columns(i,1)=7
    elseif Columns(i,1)=='saturday'
        Columns(i,1)=6
    elseif Columns(i,1)=='friday'
        Columns(i,1)=5
    elseif Columns(i,1)=='thursday'
        Columns(i,1)=4
    elseif Columns(i,1)=='wednesday'
        Columns(i,1)=3
    elseif Columns(i,1)=='tuesday'
        Columns(i,1)=2
    elseif Columns(i,1)=='monday'
        Columns(i,1)=1
    end
end

So, I am having trouble figuring out the Ruby equivalent of this statement:

for i=1:length(Columns(:,1))

Any help is appreciated. Thanks.

1 Answer

There are two good CSV libraries in Ruby. Based on what you said, I'm assuming the CSV file has no header (if it did, SmarterCSV would make things a bit easier).

That said, you wanted to get the first column:

require 'csv'

your_csv = CSV.open("your_csv.csv")
# This is the line you wanted:
first_column = your_csv.map(&:first)

# Then to do the weekday conversion (with a Hash, Monday = 1 ... Sunday = 7 as in the question):
convert_weekdays = { "monday" => 1, "tuesday" => 2, "wednesday" => 3, "thursday" => 4, "friday" => 5, "saturday" => 6, "sunday" => 7 }

converted = first_column.map { |row| convert_weekdays[row] }

I'm not sure if that's exactly what you wanted; there are a lot of ways to work with CSV files in Ruby.

To save the CSV, you'll want to open a new (or the same) CSV file using CSV.open:

your_csv = CSV.open("your_csv.csv")
CSV.open("saved_csv.csv", "w") do |csv|
  your_csv.each { |row| csv << [convert_weekdays[row.first], *row[1...row.size]] }
end

Sorry it's a bit inelegant; writing CSV in Ruby is not always the easiest thing! Note that the curly braces {} are equivalent to do ... end, but are conventionally used when the block fits on one line.
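
For a very large file, one option is to stream rows rather than hold them all at once. The following is only a minimal sketch, not the only way to do it. It assumes the file has no header row, that weekday names may vary in case, and that the Monday = 1 through Sunday = 7 numbering from the question is wanted. The names WEEKDAY_NUMBERS and convert_weekday_column are made up for illustration.

```ruby
require 'csv'

# Hypothetical mapping, following the question's numbering (Monday = 1 ... Sunday = 7).
WEEKDAY_NUMBERS = {
  "monday" => 1, "tuesday" => 2, "wednesday" => 3, "thursday" => 4,
  "friday" => 5, "saturday" => 6, "sunday" => 7
}.freeze

# Hypothetical helper: copies input_path to output_path one row at a time,
# replacing the weekday name in the first column with its number.
def convert_weekday_column(input_path, output_path)
  CSV.open(output_path, "w") do |out|
    CSV.foreach(input_path) do |row|
      # Normalize case and whitespace before the lookup.
      key = row.first.to_s.strip.downcase
      # Assumption: values that are not weekday names are left unchanged.
      out << [WEEKDAY_NUMBERS.fetch(key, row.first), *row.drop(1)]
    end
  end
end
```

Calling convert_weekday_column("your_csv.csv", "saved_csv.csv") would then write the converted rows to a new file and leave the original untouched.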

Edit: this method is perhaps a bit faster, at the cost of loading the whole file into memory:

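# CSV.read loads the file into an array of rows (CSV.parse expects CSV text, not a path)
your_csv = CSV.read("your_csv.csv")
# Monday = 1 ... Sunday = 7, as in the question
convert_weekdays = { "monday" => 1, "tuesday" => 2, "wednesday" => 3, "thursday" => 4, "friday" => 5, "saturday" => 6, "sunday" => 7 }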
by_columns = your_csv.transpose
by_columns.first.map! { |row| convert_weekdays[row] }
CSV.open("saved_csv.csv", "w") do |csv|
  by_columns.transpose.each { |row| csv << row }
end

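This way, you load the CSV as a two-dimensional array, transpose it so that the first column becomes the first row, and operate only on that row before transposing back. Note that transpose raises an IndexError if the rows differ in length.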
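
To see what the transposition does, here is a small sketch on an in-memory array rather than a real file. The sample rows and the partial Monday = 1, Sunday = 7 mapping are assumptions for illustration only.

```ruby
# Illustrative rows standing in for parsed CSV data.
rows = [
  ["monday", "10", "a"],
  ["sunday", "20", "b"]
]
weekday_numbers = { "monday" => 1, "sunday" => 7 } # partial mapping, just for this sample

by_columns = rows.transpose
# by_columns is now [["monday", "sunday"], ["10", "20"], ["a", "b"]]

# Replace the weekday names in what is now the first row.
by_columns.first.map! { |name| weekday_numbers[name] }

converted = by_columns.transpose
# converted is now [[1, "10", "a"], [7, "20", "b"]]
```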

10 Comments

Hello. Many thanks for that, I am running the code, let's see if it works. What is &: though, and how does it work? And is it going to edit in the same csv, or create a new csv with some default name? I ask because as of now, when the code is running, I don't see any change in the "date modified" section for the file when I refresh the folder.
Nope, this is just going to create a local object. You have to resave the CSV on your own. Try using CSV.open. This comment thing doesn't really let me elaborate, I'll add to my answer.
As for &, this stands for symbol to proc. You can pass it as a block in Ruby, similar to saying your_csv.map { |row| row.first }. The &: form is a handy shortcut when you are calling a single method on each object: :first is a symbol that names the first method, and & makes map treat it as a block.
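
As a small illustration of that equivalence (the sample rows are made up):

```ruby
rows = [["monday", "10"], ["sunday", "20"]]

with_block    = rows.map { |row| row.first } # explicit block
with_shortcut = rows.map(&:first)            # symbol-to-proc shortcut

# Both calls produce ["monday", "sunday"].
same = with_block == with_shortcut
```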
Wow. Thanks a lot. It works. I'll need some time to understand it. But if I understand correctly, there's no way to index arrays in Ruby, right? I mean, one can do it in the way you mentioned above, but no Array(i,j) operation is possible here, right? In other words, one can't access only some specific values of an array without having some kind of common condition relevant to those specific values. Am I correct?
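
For reference, Ruby arrays can be indexed directly. A parsed CSV is an array of rows, so rows[i][j] plays roughly the role of MATLAB's Columns(i,j), except that indices start at 0. A minimal sketch with made-up data:

```ruby
rows = [
  ["monday", "10", "a"],
  ["sunday", "20", "b"]
]

first_cell      = rows[0][0]               # row 0, column 0
second_row_last = rows[1][2]               # row 1, column 2
first_column    = rows.map { |row| row[0] } # column 0 of every row

# Cells can also be assigned by index.
rows[1][0] = 7
```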
And by the way, I have just noticed that the new file gets created faster if your_csv is small, and much more slowly if your_csv is huge. I mean, if you refresh the folder and watch the file size change, it's much faster when the original CSV is smaller, and vice versa. For example, it takes 10 times as long for a CSV 3 times the size. Why is that so?