4

I'm trying to allow users to upload a CSV/Excel document, and parse it using Roo (The most suggested one I've seen), but I'm having a bit of issues figuring it out.

Current Script

require 'roo'

xlsx = Roo::Excelx.new("./TestAppXL.xlsx")
xlsx.each_row_streaming do |row|
  puts row.inspect # Array of Excelx::Cell objects
end

This was the only one I was able to get work - It returns what looks to be JSONB.

What I'm trying to do is a few part process:

A) User Uploads a list of 'cards' to my website.(Trying to allow as many options as possible. CSV, Excel, etc)

B) It instantly returns a list of the headers and asks 'Which header is name, quantity, etc etc?'

C) I parse the data for specifics headers and do 'X'.

B is what I primarily need help with. I'm struggling to figure out Roo exactly. I won't have control over the headers so I can't use numerical column numbers.

(Adding in Rails tag since this will be in a controller in the end, maybe an easier way to do it.)

Updated Script

I've actually made a lot of progress. Still trying to get closer to my original request.

require 'roo'
require 'roo-xls'

xlsx = Roo::Spreadsheet.open('Demo.xls')
headers =  xlsx.first_row
puts xlsx.row(headers)
puts "Which number header is the Card Name?"
CardName = gets
puts xlsx.column(CardName.to_i)
# => Returns basic info about the spreadsheet file

Need a lot more logic on the gets but currently if I put in '3' it will return all content of Column 'CardName'. Working on iterating over the rows now.

Psuedo working script

require 'roo'
require 'roo-xls'

xlsx = Roo::Spreadsheet.open('Demo.xls')
headers =  xlsx.first_row
puts xlsx.row(headers)
puts "Which number header is the Card Name?"
CardName = gets.to_i
specHeader = xlsx.cell(headers,CardName)
xlsx.column(CardName).drop(0).each_with_index do |item, index|
    if index == 0

    else
        puts item
    end
end

This is actually performing as expected, and I can start feeding the file into a Rake job now. Still working on some of the iteration but this is very close.

4
  • no control over the headers ? can't you ask the users to name the header name and quantity etc ? it will be easier for them and surely for you to keep to that convention Commented Dec 10, 2016 at 20:36
  • @peter Maybe poor wording on my behalf. I could require them to put "CardName" specifically, but I would rather take in the xlsx.row(1) and return the choices, and them specifically tell me 'which is which', or put in logic on my side so if it's like 'name' it's probably CardName Commented Dec 10, 2016 at 20:48
  • thats what I always dislike if I import a textfile in Excel, also don't count on being row 1 the headerrow, I get many headers on the 2th to 5th row, there is often a general header with a title. I can give you example of reading with correct headernames, but not by letting them choose, you do realise that the puts and gets in your example are not going to work on a Rails site ? Commented Dec 10, 2016 at 21:26
  • @peter Any example would be good. As I'm doing the research I'm starting to lean more towards forcing them to match -my- Header name requirements. The .first_row is a helper for Roo that actually goes to the correct first row. In this Demo file my header row is on row 5 (Specifically to test it) Commented Dec 10, 2016 at 21:28

1 Answer 1

4

I made you a generic way to extract data out of a Roo spreadsheet based on a few header names which would be the convention to use by your uploaders.

require 'roo'
require 'roo-xls'

xlsx = Roo::Spreadsheet.open('Demo.xls')
first_row = xlsx.first_row
headers = ['CardName', 'Item']
headers.each{|h|Kernel.const_set(h, xlsx.row(first_row).index{|e| e =~ /#{h}/i})}
begin
  xlsx.drop(first_row).each do |row|
    p [row[CardName], row[Item]]
  end
rescue
  # the required headers are not all present
end

I suppose the only line that needs explaining is headers.each{|h|Kernel.const_set(h, xlsx.row(first_row).index{|e| e =~ /#{h}/i})}

for each headername assign to it with const_set the index of it in xlsx.row(first_row) (our headerrow) where the regular expression /h/i returns an index, the #{} around h is to expand the h into its value, 'CardName' in the first case, the i at the end of /h/i means the case is to be ignored, so the constant CardName is assigned the index of the string CardName in the headerrow. Instead of the rather clumsy begin rescue structure you could check if all required constants are present with const_get and act upon that instead of catching the error.

EDIT

instead of the p [row[CardName], row[Item]] you could check and do anything, only keep in mind that if this is going to be part of a Rails or other website the interaction with the user is going to be tickier than your puts and get example. Eg something like

headers = ['CardName', 'Item', 'Condition', 'Collection']
...
xlsx.drop(first_row).each do |row|
  if row[CardName].nil? || row[Item].nil?
    # let the user know or skip
  else
    condition, collection = row[Condition], row[Collection]
    # and do something with it
  end
end
Sign up to request clarification or add additional context in comments.

3 Comments

Wow, that's really interesting. The .index line definitely threw me off but I kind of understand what it's doing. Similar to the .each |item, index| I was doing before - Well, similar result. How could I amend this so I do 'X' if 'Y' header is there? For instance: 'CardName', and 'Count' is 100% required, but a line like 'Condition' 'Collection' etc etc is optional, but I might want? If you could give me a singular example I could scale it.
did add some to my answer, not sure if it's what you meant, otherwise elaborate please, see also what I said about checking if each header is present or not
Ahh, true. Thanks a lot for the additional explaination! I'll have to work on moving it into a controller, but the Ruby version is working great.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.