3

I am attempting to merge several rows of csv data into one long row, given two cells contain the same data. For instance, take the following csv:

one, two, three
1, 2, 3
4, 5, 6
7, 8, 9
1, 1, 1
4, 4, 4

If two rows share the same value at row[0], I want the second row appended to the first. So my end product should look like this:

one, two, three
1, 2, 3, 1, 1, 1
4, 5, 6, 4, 4, 4
7, 8, 9

Here is my attempt so far:

import csv

uniqueNum = []
uniqueMaster = []
count = -1
with open("Test.csv", "rb") as source:
    reader = csv.reader(source)
    header = next(reader)
    for row in reader:
        if row[0] not in uniqueNum:
            uniqueMaster.append(row)
            uniqueNum.append(row[0])
            count = count + 1
            for row in reader:
                if row[0] in uniqueNum:
                    uniqueMaster[count].append(row)

with open("holding.csv","wb") as result:
    writer = csv.writer(result)
    writer.writerow(header)
    for row in uniqueMaster:
        writer.writerow(row) 

Things LOOK ok to me, but my script only outputs the following:

one, two, three
1, 2, 3, ['1', '1', '1']

This is obviously wrong for two reasons. First, it doesn't iterate through the entire csv, and second, the appended values are being squeezed into one cell, rather than individual cells. If anyone had any advice on getting this to work right I'd highly appreciate it!

0

2 Answers 2

4

Use a dictionary instead. Starting from the middle of your code(assume I have declared a dict called my_dict):

 for row in reader:
    if row[0] in my_dict.keys():
       my_dict[row[0]].extend(row)
    else:
       my_dict[row[0]]=row
  #...now we are at the bottom of your code, writing to the csv
 for v in my_dict.values():
    writer.writerow(v)
Sign up to request clarification or add additional context in comments.

3 Comments

Thanks for the response. This nearly does it - the rows fall into the right spots, but they're not formatted properly. For example, the first key/value pairs are: {'1': ['1', '2', '3', ['1', '1', '1']]} when they should look like this: {'1': ['1', '2', '3', '1', '1', '1']}. Any advice on fixing this last part?
Fixed it. It should use extend() instead of append(). For more information see this post. Specifically Aaron Hall gives a great explanation with runtimes included.
That did it! Thanks for the crash course in dictionaries!
1
import csv

csv_dict = {}

with open("Test.csv", "r") as source:
    reader = csv.reader(source)
    header = next(reader)

    for row in reader:
        if row[0] in csv_dict:
            csv_dict[row[0]] += row
        else:
            csv_dict[row[0]] = row

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.