I want to loop over a CSV file using CSV.foreach, read the data, perform some operation on it, and write the result to the last column of that row, using the Row object.

Say I have a CSV whose rows I need to save to a database with Rails ActiveRecord. I validate each record: if it is valid, I write true in the last column; if not, I write the errors.

Example csv:

id,title
1,some title
2,another title
3,yet another title

Code:

CSV.foreach(path, "r+", headers: true) do |row|
  archive = Archive.new(
    title: row["title"]
  )
  archive.save!
  row["valid"] = true
  
rescue ActiveRecord::RecordInvalid => e
  row["valid"] = archive.errors.full_messages.join(";")
end

When I run the code, it reads the data but does not write anything to the CSV. Is it possible to write to the same CSV file while reading it?

Using:

  • Ruby 3.0.4

3 Comments

  • The row is just a copy of the data; changing it won't do anything to the original file. You need to open a second file for writing and output each row to it once you know what it should look like. (Commented Sep 14, 2022 at 21:53)
  • @DaveSlutzkin I understand that the row is just an in-memory object. But isn't there a way to read and write in the same loop to the same file? (Commented Sep 16, 2022 at 19:02)
  • How would you expect that to work? Changing a file while you're reading from it isn't a very well-defined procedure. If you naively wrote back to the file, it would either append a row to the end of it (not what you want) or overwrite the entire file with the new row (also not what you want until you've finished reading the original rows). There's a reason Unix pipes create a new abstracted file at each step of the pipeline. (Commented Sep 17, 2022 at 22:14)
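
The point made in these comments can be checked with a small sketch. It mutates every row inside CSV.foreach and then compares the file contents before and after. The method name file_changed_after_row_mutation? is made up for this illustration and is not part of the CSV library.

```ruby
require 'csv'

# Hypothetical helper: mutates every row while iterating and reports
# whether the file on disk differs afterwards.
def file_changed_after_row_mutation?(path)
  before = File.read(path)

  CSV.foreach(path, headers: true) do |row|
    row['valid'] = 'true' # only changes the in-memory CSV::Row
  end

  File.read(path) != before
end
```

Calling it on any readable CSV file should return false, since nothing in the loop writes to disk.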

2 Answers

The row variable in your iterator exists only in memory. You need to write the information back to the file like this:

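require 'csv'

# Start with the new header line; each processed row is appended as a CSV string.
new_csv = ["id,title,valid\n"]

CSV.foreach(path, 'r', headers: true) do |row|
  row["valid"] = 'foo'  # appends a "valid" field to the in-memory row
  new_csv << row.to_s
end

# Only after reading has finished, overwrite the file with the collected lines.
File.open(path, 'w') do |f|
  f.write new_csv.join
end

[EDIT] An earlier version of this code passed 'r+' to foreach, as in the question. That option is not valid here; it should be 'r'.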
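
For reference, here is a self-contained sketch of the same read-everything-then-overwrite idea using CSV.read and CSV::Table#to_csv, so the header line does not have to be hard-coded. The names validation_result and annotate_csv are invented for this example, and the blank-title check is only a stand-in for the ActiveRecord validation in the question.

```ruby
require 'csv'

# Hypothetical stand-in for the ActiveRecord validation in the question:
# returns "true" for a usable title, otherwise an error message.
def validation_result(title)
  title.to_s.strip.empty? ? 'title is blank' : 'true'
end

# Reads the whole file into memory, adds a "valid" column to every row,
# then overwrites the original file with the updated table.
def annotate_csv(path)
  table = CSV.read(path, headers: true)
  table.each { |row| row['valid'] = validation_result(row['title']) }
  File.write(path, table.to_csv)
end
```

Usage would be annotate_csv('archives.csv'). Because the whole table is held in memory, this suits small files; for large files, streaming to a second file is likely the better fit.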

4 Comments

So it is not possible to do this on the same CSV while we are reading? So what's the point of mode r+ in CSV.foreach? If you could update your answer with this, I would be very grateful.
You chose to put r+ in the options, but such a value is not documented in the Ruby docs (ruby-doc.org/stdlib-3.0.4/libdoc/csv/rdoc/…), so I would suggest that it is not a valid option. I should not have included it in my answer.
I believe it is, take a look at Open Mode. I guess it should work with CSV.foreach.
Well I think the fact that the CSV documentation indicates the mode should be 'open for reading' combined with the fact that, as you discovered, it doesn't work like the IO.open 'r+' option, is pretty convincing to me that it's not a legit option! But, yeah if you find a way, I'd love to see it. The file is writable inside the iterator, but you don't have access to the csv object that facilitates writing, only the row object, which you can mutate but it doesn't write to the file.
1

This may be over-engineering things a bit, but I would do the following:

  1. Read the original CSV file.
  2. Create a temporary CSV file.
  3. Insert the updated headers into the temporary CSV file.
  4. Insert the updated records into the temporary CSV file.
  5. Replace the original CSV file with the temporary CSV file.

require 'csv'
require 'fileutils'
require 'securerandom'

csv_path = 'archives.csv'
input_csv = CSV.read(csv_path, headers: true)
input_headers = input_csv.headers

# use a UUID in the temporary file name to prevent file conflicts
tmp_csv_path = "#{csv_path}.#{SecureRandom.uuid}.tmp"
output_headers = input_headers + %w[errors]

CSV.open(tmp_csv_path, 'w', write_headers: true, headers: output_headers) do |output_csv|
  input_csv.each do |archive_data|
    values  = archive_data.values_at(*input_headers)
    archive = Archive.new(archive_data.to_h)

    archive.valid?
    # error_messages is an empty string if there are no errors
    error_messages = archive.errors.full_messages.join(';')

    output_csv << values + [error_messages]
  end
end

FileUtils.move(tmp_csv_path, csv_path)
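
If the file is too large to load with CSV.read, the same temporary-file steps can be sketched in a streaming form, reading with CSV.foreach and writing each row to the temporary file in the same loop. The names rewrite_with_errors and error_messages_for are assumptions for this example, and the blank-title check only stands in for archive.errors.full_messages.join(';').

```ruby
require 'csv'
require 'fileutils'
require 'securerandom'

# Hypothetical stand-in for the model validation: returns an empty string
# when there are no errors, mirroring the convention used above.
def error_messages_for(row)
  row['title'].to_s.strip.empty? ? 'title is blank' : ''
end

# Streams rows from csv_path into a temporary file with an extra "errors"
# column, then replaces the original file with the temporary one.
def rewrite_with_errors(csv_path)
  tmp_csv_path = "#{csv_path}.#{SecureRandom.uuid}.tmp"

  CSV.open(tmp_csv_path, 'w') do |output_csv|
    # return_headers: true makes foreach yield the header row first,
    # so the header line can be extended in the same loop.
    CSV.foreach(csv_path, headers: true, return_headers: true) do |row|
      extra = row.header_row? ? 'errors' : error_messages_for(row)
      output_csv << row.fields + [extra]
    end
  end

  FileUtils.move(tmp_csv_path, csv_path)
end
```

Only one row is held in memory at a time, and the original file is replaced only after every row has been written to the temporary file.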
