3

I am dealing with a CSV file similar to this one

foo; val1; position1
bar; name1; address1; phone_nbr1
bar; name2; address2; phone_nbr2
foo; val2; position2
bar; name3; address3; phone_nbr3
bar; name4; address4; phone_nbr4
bar; name5; address5; phone_nbr5
bar; name6; address6; phone_nbr6
foo; val3; position3

Needless to say, I cannot modify the CSV.

Instances displayed in foo lines are different from the ones with bar lines (notice they don't even have the same number of fields)

I need simply reading this data, no need to write it.

My first idea was to separate the file into two temporary files and then read each one separately with a csv.DictReader, however I really don't like this approach.

Is there a simpler way to do this? I would like to avoid if possible having to write files to disk.

For the record, I'm using Python2.7 on a Solaris 10 machine.

6 Answers 6

6

You could collect the records from a csv.reader in two different lists, depending on their length (or whatever criterion you use to distinguish the two streams):

list1 = []
list2 = []
with open("input.csv", "rb") as f:
    for record in csv.reader(f, delimiter=";"):
        if len(record) == 3:
            list1.append(record)
        else:
            list2.append(record)
Sign up to request clarification or add additional context in comments.

Comments

4

csv.reader() has no problem with this:

import csv
foo = []
bar = []
with open("test.csv", 'r') as f:
    c = csv.reader(f, delimiter = ";")
    for row in c:
        if row[0] == "foo":
            foo.append(row[1:])
        elif row[0] == "bar":
            bar.append(row[1:])
print(foo)
print(bar)

results in

[[' val1', ' position1'], [' val2', ' position2'], [' val3', ' position3']]
[[' name1', ' address1', ' phone_nbr1'], [' name2', ' address2', ' phone_nbr2'], [' name3', ' address3', ' phone_nbr3'], [' name4', ' address4', ' phone_nbr4'], [' name5', ' address5', ' phone_nbr5'], [' name6', ' address6', ' phone_nbr6']]

Comments

1

What about just using str.split on each line?

items = line.split(";")

Then if the first item in the items list is foo you do one thing, and if it's bar you do something else.

Comments

0

The fact that the lines are different is not a problem for csv module, but you'll have to analyze line content differently depending on first 'cell'.

Example of code:

with open(input_file, 'rb') as fin:
    c = csv.reader(fin)
    for line in c:
         if line[0] == 'foo':
              # do some treatment
         elif line[0] == 'bar':
              # do something else
    c.close()

Comments

0

It's not clear from your question what it is you actually want to achieve, but I'm not sure you need the csv module here.

for row in myfile.readlines():
    cols = [r.strip() for r in row.split(';')]
    if (cols[0] == "foo"):
        # Do something for foo
    elif (cols[0] == "bar"):
        # Do something for bar

Comments

0

What about something like:

foos = []
bars = []
for line in csv.reader(open("file.csv","rb"), delimiter=";"):
  if line[0] == "foo":
    foos.append(Foo(line[1], line[2]))
  else:
    bars.append(Bar(line[1], line[2], line[3]))

Assuming you have a Foo and a Bar class taking the rest of your line cells as arguments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.