Write data from one csv to another python

Question

I have three CSV files with the attributes Product_ID, Name, Cost, and Description. Each file contains Product_ID. I want to combine Name (file 1), Cost (file 2), and Description (file 3) into a new CSV file that has Product_ID and all three attributes. I need efficient code, as the files contain over 130,000 rows.

After combining all the data into the new file, I have to load it into a dictionary, with Product_ID as the key and Name, Cost, Description as the value.

And what have you tried so far? Show us your code, so we might be able to help you better.
– Oliver W., Commented Apr 8, 2016 at 21:46
All I have tried is to combine the data from the three files into a dictionary and then write it out, but I am getting an error. In the code below I read one file into a dictionary with row[1] as the key and row[2], row[3] as the value, but I am not able to append another file to the same dictionary.

with open('train_1.csv', 'r', encoding="utf8") as file:
    text_file = csv.reader(file)
    next(text_file)
    for rows in text_file:
        maindict[rows[1]] = rows[2], rows[3]
– Sameer, Commented Apr 8, 2016 at 21:56
@Sameer You may want to edit your question to include that code; comments aren't exactly easy on the eyes.
– kirkpatt, Commented Apr 8, 2016 at 22:01
I am using this approach for feature extraction; after that I have to apply Multinomial Naive Bayes. I have no experience with this method yet, but I am learning it.
– Sameer, Commented Apr 8, 2016 at 22:01

Steffi Keran Rani J · Accepted Answer · 2017-11-07 14:41:01Z

1

It might be more efficient to read each input .csv into a dictionary before creating your aggregated result.

Here's a solution for reading in each file and storing the columns in a dictionary with Product_IDs as the keys. I assume that each Product_ID value exists in each file and that headers are included. I also assume that there are no duplicate columns across the files aside from Product_ID.

import csv
from collections import defaultdict

entries = defaultdict(list)
files = ['names.csv', 'costs.csv', 'descriptions.csv']
headers = ['Product_ID']

for filename in files:
   with open(filename, newline='') as f:   # Open each file (newline='' as the csv docs advise)
      reader = csv.reader(f)            # Create a reader to iterate csv lines
      heads = next(reader)              # Grab first line (headers)

      pk = heads.index(headers[0])      # Get the position of 'Product_ID' in
                                        # the list of headers
      # Add the rest of the headers to the list of collected columns (skip 'Product_ID')
      headers.extend([x for i,x in enumerate(heads) if i != pk])

      for row in reader:
         # For each line, add new values (except 'Product_ID') to the
         # entries dict with the line's Product_ID value as the key
         entries[row[pk]].extend([x for i,x in enumerate(row) if i != pk])

# Python 3: open in text mode with newline='' (not 'wb'), and close the file when done
with open('result.csv', 'w', newline='') as out:
   writer = csv.writer(out)
   writer.writerow(headers)            # Write the headers first
   for key, value in entries.items():
      writer.writerow([key] + value)   # Write the product ID followed by its other values
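The question also asks for the combined file to be loaded back into a dictionary keyed by Product_ID. A minimal sketch of that step, assuming Python 3, UTF-8 input, and a header row like the one written by the code above; the function name load_products is hypothetical and not part of the original answer:

```python
import csv

def load_products(path):
    """Return {Product_ID: {column: value, ...}} for the CSV file at `path`."""
    products = {}
    with open(path, newline='', encoding='utf8') as f:
        for row in csv.DictReader(f):
            key = row.pop('Product_ID')   # the remaining columns become the value
            products[key] = row
    return products
```

Calling load_products('result.csv') would then give one entry per product, with every value kept as a string. If the same Product_ID appears on more than one line, the last line wins.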

edited Nov 7, 2017 at 14:41

Steffi Keran Rani J

answered Apr 8, 2016 at 22:09

dnix

7 Comments

Sameer Over a year ago

If I want to append more than one column from a CSV, the above code won't work. Suppose names.csv contains Product_ID, Names, Tags. What if I want to append both of the extra columns?

dnix Over a year ago

You didn't include much information about your CSV columns, so I assumed that the files contained no other data. You can read the headers from the first line, rather than skipping them, to find the correct column indices for the key and for the values to append. To clarify, you want every column from each file added, with the product ID as the key?

dnix Over a year ago

I've edited my answer to include every column from each file.

Sameer Over a year ago

Thanks for the help, I will look into the code you have provided. Will comment further if needed.

Sameer Over a year ago

With the above code, I am getting an error:

heads = reader.next()
AttributeError: '_csv.reader' object has no attribute 'next'
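This error is what Python 3 reports for Python 2 style code: csv reader objects there have no .next() method, and the built-in next() is used instead (as in heads = next(reader) in the answer's code). A small sketch, using an in-memory stand-in for a file; the sample data is made up for illustration:

```python
import csv
import io

# In-memory stand-in for a CSV file (hypothetical sample data).
sample = io.StringIO("Product_ID,Name\n1,Widget\n")
reader = csv.reader(sample)

heads = next(reader)   # built-in next(): works on Python 2.6+ and Python 3
rows = list(reader)    # the remaining data rows
```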

gboffi · Accepted Answer · 2016-04-08 22:26:39Z

0

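A general solution, one that produces a record (possibly incomplete) for every id it encounters while processing the three files, needs a data structure to hold the partial results. Fortunately, a plain list with a preassigned number of slots is enough. Note that the code below reads each line as exactly two comma-separated fields (id, value) and does not skip a header row: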

d = {id:[name,None,None] for id, name in [line.strip().split(',') for line in open(fn1)]}
for line in open(fn2):
    id, cost = line.strip().split(',')
    if id in d:
        d[id][1] = cost
    else:
        d[id] = [None, cost, None]
for line in open(fn3):
    id, desc = line.strip().split(',')
    if id in d:
        d[id][2] = desc
    else:
        d[id] = [None, None, desc]

for id in d:
    if all(d[id]):
        print(','.join([id] + d[id]))
    else:
        # The record for this id is incomplete; what to do with it
        # is up to you, so here it is simply skipped.
        pass

If you are sure that you don't want to process incomplete records any further, the code above can be simplified:

d = {id:[name] for id, name in [line.strip().split(',') for line in open(fn1)]}
for line in open(fn2):
    id, cost = line.strip().split(',')
    if id in d: d[id].append(cost)
for line in open(fn3):
    id, desc = line.strip().split(',')
    if id in d: d[id].append(desc)

for id in d:
    if len(d[id])==3: print(','.join([id]+d[id]))
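Both versions above split each line on commas by hand, which breaks as soon as a field (a description, say) contains a quoted comma. A hedged sketch of the same fixed-slot idea using the csv module, assuming Python 3, UTF-8 input, and two-column (id, value) files without a header row; the names merge_by_id and complete_records are hypothetical:

```python
import csv

def merge_by_id(paths):
    """Merge two-column (id, value) CSV files into {id: [v1, v2, ...]}.

    Slot i holds the value from paths[i], or None when that file
    has no row for the id.
    """
    merged = {}
    for slot, path in enumerate(paths):
        with open(path, newline='', encoding='utf8') as f:
            for row in csv.reader(f):
                if not row:          # skip blank lines
                    continue
                pid, value = row     # raises ValueError if not exactly two fields
                merged.setdefault(pid, [None] * len(paths))[slot] = value
    return merged

def complete_records(merged):
    """Keep only the ids for which every slot was filled."""
    return {pid: vals for pid, vals in merged.items() if None not in vals}
```

With the three files passed in the order names, costs, descriptions, complete_records(merge_by_id(paths)) keeps only the products present in all three files. Unlike the all(d[id]) check above, an empty string counts as a filled slot here.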

answered Apr 8, 2016 at 22:26

gboffi

3 Comments

Sameer Over a year ago

@gboffi, I will look into the code today. Thanks for the help.

user10753862 Over a year ago

could you please, check this question out? stackoverflow.com/questions/54192260/…

gboffi Over a year ago

@Barbie I've checked that question of yours but I have no working knowledge of pandas and I do not clearly understand the issue, so I'm afraid that I cannot help you, sorry...
