0

I would like to add a new column to my CSV in Python. I understand that in Python I will have to run the CSV and rewrite a new one as an output to add a column.

I want to run though each UserID and assign a UniqueID to it.

This is my input:

UserID,name
a,alice
a,alice
b,ben
c,calvin
c,calvin
c,calvin

This is my desired output:

UniqueID,UserID,name
1,a,alice
1,a,alice
2,b,ben
3,c,calvin
3,c,calvin
3,c,calvin

I am new to Python and was wondering if anyone can show me how this can be done. Thanks.

Here is my code so far: 
import csv
import operator

temp_index = 0

with open("./coordinates.csv") as all_coordinates_csv:
    coordinate_reader = csv.reader(all_coordinates_csv, delimiter=",")

    sort = sorted(coordinate_reader,key=operator.itemgetter(0))

with open("./sorteduserid.csv","wb") as sorteduser_csv:
    csv_writer = csv.writer(sorteduser_csv,delimiter=",")
    csv_writer.writerows(sort)
5
  • 3
    Welcome to Stack Overflow. We expect you to show some effort. A good start would be reading the documentation of the csv module. Commented Dec 15, 2017 at 3:04
  • Is the only difference between the UserID and the UniqueID that UniqueID is a number? So UserID is already unique, but a string and you want an integer as ID? Commented Dec 15, 2017 at 3:04
  • Thanks for your reply. I figured out how to import my CSV using python and has sorted my UserIDs using the sorted function. I know I need to create an index, but can't seem to figure out how to proceed. The UniqueID runs through the UserIDs and assigns a unique number to it, so all a's are 1, all b's are 2, etc. Commented Dec 15, 2017 at 3:09
  • Nice, so from a UserID you now have a unique integer that you want to use as UniqueID? Then, for each row in sort, you can add the unique id to the row with something like csv_writer.writerow([unique_id] + row). Note the difference between writerow() and writerows(). Here, a singel row from your example would be e.g. ['a', 'alice']. Commented Dec 15, 2017 at 3:26
  • This question already exists with great answers over here: stackoverflow.com/questions/11070527/… Commented Aug 7, 2024 at 5:06

2 Answers 2

3

Take a try with my code:

import csv
import uuid

is_first = True
with open('test.csv', newline='') as input_file:  
    with open('output.csv', 'w', newline='') as output_file:
        writer = csv.writer(output_file)
        reader = csv.reader(input_file)
        for row in reader:
            if is_first:
                row.insert(0, 'UniqueID')
                is_first = False
            else:
                row.insert(0, str(uuid.uuid4()))

            writer.writerow(row)
Sign up to request clarification or add additional context in comments.

Comments

0

This solution is based on your own attempt:

import csv
import operator
from itertools import groupby

with open("./coordinates.csv") as all_coordinates_csv:
    coordinate_reader = csv.reader(all_coordinates_csv, delimiter=",")

    sort = sorted(coordinate_reader,key=operator.itemgetter(0))

grouped = []
for key, group in groupby(sort, lambda x: x[0]):
    grouped.append(list(group))

data_out = [];
data_out.append(['Unique_ID', (grouped[0])[0][0], (grouped[0])[0][1]])

user_id = 1
for group in grouped[1:]:
    for user in group:
        data_out.append([user_id,user[0],user[1]])
    user_id += 1

with open("./sorteduserid.csv","wb") as sorteduser_csv:
    csv_writer = csv.writer(sorteduser_csv,delimiter=",")
    csv_writer.writerows(data_out)

After you sort your input the program uses groupby to group the values by UserID in sort. It then uses those grouped values in the loops to assign a unique ID to each UserID. Before the loops, it extends the header with User_ID entry.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.