Python CSV Reader - Compare Each Row with Each Other Row Within One Column

Question

I want to compare each row of a CSV file with itself and every other row within a column.
For example, if the column values are like this:

Value_1
Value_2
Value_3

The code should pick Value_1 and compare it with Value_1 (yes, with itself too), Value_2 and then with Value_3. Then it should pick up Value_2 and compare it with Value_1, Value_2, Value_3, and so on.

I've written the following code for this purpose:

import csv

csvfile = r"c:\temp\temp.csv"
with open(csvfile, newline='') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        for compare_row in reader:
            if row == compare_row:
                print(row, 'is equal to', compare_row)
            else:
                print(row, 'is not equal to', compare_row)

The code gives the following output:

['Value_1'] is not equal to ['Value_2']
['Value_1'] is not equal to ['Value_3']

The code compares Value_1 to Value_2 and Value_3 and then stops. The outer loop never picks up Value_2 or Value_3. In short, the outer loop appears to iterate over only the first row of the CSV file before stopping.

Also, I can't compare Value_1 to itself using this code. Any suggestions for the solution?

Your indentation looks odd, but I assume it is not like this in your real code. Could you try creating a new reader inside the first loop for compare_row instead of using the same reader for both loops?
– Yanick Nedderhoff, Commented Oct 4, 2015 at 0:57
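
The comment's suggestion can be sketched roughly as below. A csv.reader is a one-pass iterator, so when both loops share it, the inner loop consumes every remaining row and the outer loop has nothing left to read. The function names and the temporary sample file are assumptions made for illustration; they are not part of the question.

```python
import csv
import os
import tempfile


def pairs_shared_reader(path):
    """Reproduce the question's behaviour: both loops share one reader."""
    pairs = []
    with open(path, newline='') as f:
        reader = csv.reader(f, delimiter=',')
        for row in reader:
            # The inner loop drains the same iterator, so the outer loop
            # finds nothing left after its first row.
            for compare_row in reader:
                pairs.append((row, compare_row))
    return pairs


def pairs_fresh_reader(path):
    """Open a second reader for every outer row, as the comment suggests."""
    pairs = []
    with open(path, newline='') as f_outer:
        for row in csv.reader(f_outer, delimiter=','):
            with open(path, newline='') as f_inner:
                for compare_row in csv.reader(f_inner, delimiter=','):
                    pairs.append((row, compare_row))
    return pairs


# Hypothetical sample file standing in for the question's CSV.
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='',
                                 delete=False) as tmp:
    tmp.write('Value_1\nValue_2\nValue_3\n')
    sample_path = tmp.name

shared = pairs_shared_reader(sample_path)
fresh = pairs_fresh_reader(sample_path)
os.remove(sample_path)

print(len(shared))  # 2 pairs: Value_1 against Value_2 and Value_3
print(len(fresh))   # 9 pairs: every row against every row, itself included
```

Reopening the file for every outer row is simple but costs one open() call per row; rewinding a second file handle avoids that.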

Martin Swanepoel · Accepted Answer · 2015-10-04 01:32:06Z

3

I would have suggested loading the CSV into memory, but that is not an option given the file's size.

Instead, think of it like a SQL cross join: every row in the left table is matched against every row in the right table. You scan through the left table only once, and rescan the right table from the start for each left row, until the left side reaches EOF.

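import csv

# csvfile is the path defined in the question.
with open(csvfile, newline='') as f_left:
    reader_left = csv.reader(f_left, delimiter=',')
    with open(csvfile, newline='') as f_right:
        reader_right = csv.reader(f_right, delimiter=',')
        for row in reader_left:
            for compare_row in reader_right:
                if row == compare_row:
                    print(row, 'is equal to', compare_row)
                else:
                    print(row, 'is not equal to', compare_row)
            # Rewind the right-hand file once the inner loop finishes,
            # so the next left row is compared against every row again.
            f_right.seek(0)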
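
The same left/right scan can also be wrapped in a generator so the comparison results are reusable instead of printed. This is only a sketch: the name cross_compare and the temporary sample file are assumptions for illustration. Note where the rewind sits: once per left row, outside the inner loop. If seek(0) were placed inside the inner loop, the right-hand side would restart on every row and the loop would never finish.

```python
import csv
import os
import tempfile


def cross_compare(path):
    """Yield (row, compare_row, is_equal) for every ordered pair of rows."""
    with open(path, newline='') as f_left, open(path, newline='') as f_right:
        for row in csv.reader(f_left, delimiter=','):
            # Rewind once per left row, before a full pass over the right side.
            f_right.seek(0)
            for compare_row in csv.reader(f_right, delimiter=','):
                yield row, compare_row, row == compare_row


# Hypothetical sample file standing in for the question's CSV.
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='',
                                 delete=False) as tmp:
    tmp.write('Value_1\nValue_2\nValue_3\n')
    sample_path = tmp.name

results = list(cross_compare(sample_path))
os.remove(sample_path)

for row, compare_row, equal in results:
    print(row, 'is equal to' if equal else 'is not equal to', compare_row)
```

Only two rows are held in memory at a time, at the cost of rereading the file once per row.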


2 Comments

Rishabh Over a year ago

Thanks for the reply, but the program goes into an infinite loop as a result of the f_right.seek(0) call. I tried to find a workaround but couldn't. Can you please suggest what the issue could be?

Rishabh Over a year ago

I am really sorry, I was putting the seek(0) in the wrong place. Thanks a lot for the answer! Will mark green once I am done with testing. :)

Ajay Gupta · Accepted Answer · 2015-10-04 03:27:54Z

Try the built-in itertools module from Python's standard library:

from itertools import product

with open("abcTest.txt") as inputFile:
    # One list entry per line of the file
    aList = inputFile.read().splitlines()
    # Every ordered pair (a, b), including each line paired with itself
    aProduct = product(aList, aList)
    for aElem, bElem in aProduct:
        if aElem == bElem:
            print(aElem, 'is equal to', bElem)
        else:
            print(aElem, 'is not equal to', bElem)

What you are describing is a Cartesian product: each row of data is compared with itself and with every other row.

If you read from the source multiple times, performance will suffer significantly when the file is big. Alternatively, you can store the data in a list and iterate over it multiple times, but that also carries a large performance overhead.

The itertools module is useful in this case, as it is optimized for this kind of problem.
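
Since the question is about a single column of a CSV file, the same idea can be combined with csv.reader. The sketch below is illustrative only: the helper column_values and the two-column sample data are assumptions. Keep in mind that itertools.product reads its inputs fully into memory before yielding pairs, so this approach fits only when the column fits in RAM.

```python
import csv
import io
from itertools import product


def column_values(lines, column=0):
    """Collect one column from CSV input, skipping blank lines."""
    return [row[column] for row in csv.reader(lines) if row]


# Hypothetical two-column data; only the first column is compared.
sample = io.StringIO('Value_1,a\nValue_2,b\nValue_3,c\n')
values = column_values(sample)

# product(values, repeat=2) yields every ordered pair, self-pairs included.
comparisons = [(left, right, left == right)
               for left, right in product(values, repeat=2)]

for left, right, equal in comparisons:
    print(left, 'is equal to' if equal else 'is not equal to', right)
```

For a real file, the StringIO object would be replaced by a file opened with newline='', as in the question.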
