2

I exported tables and queries from SQL, where some of the fields are multi-line.

The Ruby (1.9+) way to read CSV appears to be:

require 'csv'

CSV.foreach("exported_mysql_table.csv", {:headers=>true}) do |row|
    puts row
end

Which works great if my data is like this:

"id","name","email","potato"
1,"Bob","[email protected]","omnomnom"
2,"Charlie","[email protected]","andcheese"
4,"Doug","[email protected]","usemeltattack"

(The first line is the header/attributes)

But if I have:

"id","name","address","email","potato"
1,"Bob","--- 
- 101 Cottage row
- Lovely Village
- \"\"
","[email protected]","omnomnom"
2,"Charlie","--- 
- 102 Flame Street
- \"\"
- \"\"
","[email protected]","andcheese"
4,"Doug","--- 
- 103 Dark Cave
- Next to some geo dude
- So many bats
","[email protected]","usemeltattack"

Then I get the error:

.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1894:in `block (2 levels) in shift': Missing or stray quote in line 2 (CSV::MalformedCSVError)

This seems to be because the end of the line doesn't have a close quote, as it spans several lines.

(I tried 'FasterCSV', that gem became 'csv' since ruby 1.9)

1
  • First you have to reconstruct spanned fields before passing to csv parser. Commented Oct 16, 2012 at 14:54

1 Answer 1

5

Your problem is not the multiline but malformed CSV.

Replace the \" and end space after a line end like this:

require 'csv' 

ml = %q{"id","name","address","email","potato" 
1,"Bob","---  
- 101 Cottage row 
- Lovely Village 
- \"\" 
","[email protected]","omnomnom" 
2,"Charlie","---  
- 102 Flame Street 
- \"\" 
- \"\" 
","[email protected]","andcheese" 
4,"Doug","---  
- 103 Dark Cave 
- Next to some geo dude 
- So many bats 
","[email protected]","usemeltattack"}

ml.gsub!(/\" \n/,"\"\n").gsub!(/\\\"/,"__")

CSV.parse(ml, {:headers=>true}) do |row|
  puts row
end

This gives:

"id","name","address","email","potato"
1,"Bob","---  
- 101 Cottage row 
- Lovely Village 
- ____
","[email protected]","omnomnom"
etc

If you have no control over the program that delivers the CSV, you have to open the file, read the contents, do a replace and then parse the CSV. I use __ here but you can use other non-conflicting characters.

Sign up to request clarification or add additional context in comments.

2 Comments

Thanks it worked! But I think it should be gsub rather than gsub!
you could do with ml.gsub!(/\" \n/,"\"\n").gsub(/\\\"/,"__") but the first is necessary or you would have to use the non ruby ml = ml.gsub

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.