Using DictReader and DictWriter, I need to find matching values between file1.csv and file2.csv. If a match is found, the matching row should be removed from file1.csv.

file1.csv

UserName,LastIP,LastLogon
Jessica_Alba,10.10.10.11,11/14/2019
Karen_Edwards,10.10.10.12,11/14/2019
Tracy_Chung,10.10.10.25,11/15/2019

file2.csv

Department,UserName,LastPasswordReset,LastIP
IT,Jessica_Alba,9/14/2019,10.10.10.11
Accounting,Karen_Edwards,9/14/2019,10.10.10.12

Expected output: after comparing the two files, file1.csv is updated by removing the matching users:

UserName,LastIP,LastLogon
Tracy_Chung,10.10.10.25,11/15/2019

However, that doesn't seem to be the case with my code. What am I doing wrong?

import csv

data3 = []

with open("file1.csv","r") as in_file1, open("file2.csv", "r") as in_file2:
    reader1 = csv.DictReader(in_file1)
    reader2 = csv.DictReader(in_file2)
    for row2 in reader2:
        for row1 in reader1:
            print(row1['UserName'])
            if row2['UserName'] != row1['UserName']:
                data3.append(row1)

print(data3)
  • For the first row in reader2 your code is iterating over all rows of reader1 and row2 is appended to data3 every time, except for the one time the usernames are equal. Commented Sep 29, 2020 at 17:55
  • @Wups I made a typo in the code. It's supposed to be row1. Commented Sep 29, 2020 at 18:10
  • What @Wups is trying to say is that there is a more fundamental error in your logic. You would be adding the same user many times, if it weren't for the fact that the loop is empty for other reasons (see my answer). Commented Sep 29, 2020 at 18:15

1 Answer

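An open file handle is a stream: you can keep reading as long as there are unread lines, but once you have read them all, you are at the end of the stream, and further reads produce nothing. In your code, the inner loop over reader1 consumes all of file1.csv during the first iteration of the outer loop, so for every later row2 the inner loop body never runs.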
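To see the exhaustion for yourself, here is a small sketch that uses an in-memory `io.StringIO` as a stand-in for file1.csv (the sample rows are copied from the question; the variable names are just for illustration). The second pass over the same reader yields nothing. Rewinding with `seek(0)` and building a fresh `DictReader` would work, but it is rarely the nicest fix.

```python
import csv
import io

# In-memory stand-in for file1.csv (an assumption, so the sketch runs anywhere).
sample = io.StringIO(
    "UserName,LastIP,LastLogon\n"
    "Jessica_Alba,10.10.10.11,11/14/2019\n"
    "Tracy_Chung,10.10.10.25,11/15/2019\n"
)

reader = csv.DictReader(sample)
first_pass = [row["UserName"] for row in reader]
second_pass = [row["UserName"] for row in reader]  # stream is already exhausted

print(first_pass)   # ['Jessica_Alba', 'Tracy_Chung']
print(second_pass)  # []

# Rewinding the underlying stream and creating a NEW reader starts over.
sample.seek(0)
third_pass = [row["UserName"] for row in csv.DictReader(sample)]
print(third_pass)   # ['Jessica_Alba', 'Tracy_Chung']
```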

Rather than attempt to loop over the inner file more than once, read it into memory, then loop as many times as you like over the data structure you have in memory ... or better yet, produce a data structure in memory which lets you directly see if the user was present in the second file, so you don't have to loop at all to search for a user.

import csv


data3 = []

with open("file2.csv", "r") as in_file2:
    reader2 = csv.DictReader(in_file2)
    # Create a set of users
    users = {row2['UserName'] for row2 in reader2}

with open("file1.csv","r") as in_file1:
    reader1 = csv.DictReader(in_file1)
    for row1 in reader1:
        if row1['UserName'] not in users:
            data3.append(row1)

print(data3)
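The code above only collects the surviving rows in data3. Since the question also asks for file1.csv to be updated with DictWriter, here is one possible sketch of the write-back step. The function name `remove_matching_users` and its `key` parameter are hypothetical, not part of the original code, and it assumes file1.csv is small enough to hold in memory and has a header row.

```python
import csv


def remove_matching_users(file1_path, file2_path, key="UserName"):
    """Rewrite file1 without the rows whose `key` value also appears in file2.

    Hypothetical helper for illustration; assumes both files have a header row.
    """
    # Collect the users from the second file into a set for fast lookups.
    with open(file2_path, newline="") as in_file2:
        users = {row[key] for row in csv.DictReader(in_file2)}

    # Read the first file fully BEFORE reopening it for writing.
    with open(file1_path, newline="") as in_file1:
        reader1 = csv.DictReader(in_file1)
        fieldnames = reader1.fieldnames
        kept = [row for row in reader1 if row[key] not in users]

    # Overwrite the first file with the header plus the rows that were kept.
    with open(file1_path, "w", newline="") as out_file:
        writer = csv.DictWriter(out_file, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(kept)

    return kept
```

Calling `remove_matching_users("file1.csv", "file2.csv")` on the sample files from the question should leave only the header and the Tracy_Chung row in file1.csv. Opening the files with `newline=""` follows the recommendation in the csv module documentation.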