Pandas - Faster way to find same value in different columns in CSV file?

Question

I need to find rows that contain circular references in a CSV input file like:

start,end,weather
california,arizona,hot
colorado,kansas,cold
arizona,california,hot

The above should detect that the 1st and 3rd rows form a circular reference. I'm currently loading the CSV into a database and running a self-join query to detect circular references, but I'd like to know whether there is a way to handle this with Python pandas.

Thanks!

How about california -> arizona, arizona -> kansas, kansas -> california? Do you need to handle this loop?
– awesoon, Commented Aug 13, 2018 at 6:51
No, only first-level circular references, not transitive loops. Thanks!
– user2727704, Commented Aug 13, 2018 at 6:54
Does right / left matter? What if the last row has the right relation?
– awesoon, Commented Aug 13, 2018 at 6:56
Yes, it needs to be the same relation. I updated the sample in the question.
– user2727704, Commented Aug 13, 2018 at 7:00

Charles R · Accepted Answer · 2018-08-13 08:26:30Z


You can filter the rows where the value of the df.start Series is contained in the df.end Series. Then apply a second filter to keep the rows where the value of the df.end Series is contained in the df.start Series:

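# Keep rows whose start value appears somewhere in the end column.
df = df.loc[df.start.isin(df.end), :]
# Of those, keep rows whose end value appears in the remaining start column.
df = df.loc[df.end.isin(df.start), :].copy()
# Order-independent key: A -> B and B -> A get the same sorted pair.
df["way"] = df.apply(lambda x: sorted([x["start"], x["end"]]), axis=1)
print(df)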

The output will contain rows 0 and 2.
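Note that the isin filters test membership column by column, not pair by pair, so on larger inputs they could also keep rows that belong to a longer chain. As an alternative, here is a minimal sketch of the self-join from the question done in pandas. It is not the code from the answer above; the function name find_reciprocal_rows and the inline CSV_TEXT are assumptions made for illustration, and weather is included in the join keys so that both directions must agree on the third column.

```python
import io
import pandas as pd

# Sample data copied from the question; CSV_TEXT is an illustrative name.
CSV_TEXT = """start,end,weather
california,arizona,hot
colorado,kansas,cold
arizona,california,hot
"""

def find_reciprocal_rows(df):
    """Return rows whose (start, end) pair also occurs reversed with the same weather."""
    # Swap the start/end labels so each row describes the reverse direction.
    reversed_df = df.rename(columns={"start": "end", "end": "start"})
    # Distinct reversed keys, so repeated input rows do not multiply the result.
    keys = reversed_df[["start", "end", "weather"]].drop_duplicates()
    # Inner merge acts as a semi-join: a row survives only if its reverse exists.
    merged = df.reset_index().merge(keys, on=["start", "end", "weather"], how="inner")
    # Restore the original row labels and order.
    return merged.set_index("index").sort_index().rename_axis(None)

df = pd.read_csv(io.StringIO(CSV_TEXT))
result = find_reciprocal_rows(df)
print(result)
```

With this sketch, a row whose start equals its end would match itself; filter such rows out first if they should not count as circular.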

edited Aug 13, 2018 at 8:26

answered Aug 13, 2018 at 7:19


2 Comments

user2727704 Over a year ago

Is there any way to ensure the 3rd column is the same?

Charles R Over a year ago

I updated my answer by adding a new Series that does the job
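One way the sorted-pair key might be used to also check the third column is sketched below. This is an assumption about how to apply the idea, not the answerer's code: the texas/nevada rows are hypothetical additions to the question's sample that show a reversed pair with mismatched weather, and the key is stored as a "|"-joined string (assuming place names never contain "|").

```python
import pandas as pd

# First three rows come from the question; the last two are hypothetical
# and exist only to show a reversed pair whose weather differs.
df = pd.DataFrame({
    "start": ["california", "colorado", "arizona", "texas", "nevada"],
    "end": ["arizona", "kansas", "california", "nevada", "texas"],
    "weather": ["hot", "cold", "hot", "hot", "dry"],
})

# Order-independent key: A -> B and B -> A both map to the same "A|B" string.
df["way"] = ["|".join(sorted(pair)) for pair in zip(df["start"], df["end"])]

grouped = df.groupby("way")
# Both directions are present when the pair has two distinct start values...
both_directions = grouped["start"].transform("nunique") == 2
# ...and the third column matches when the pair has a single weather value.
same_weather = grouped["weather"].transform("nunique") == 1

circular = df[both_directions & same_weather]
print(circular)
```

Here the california/arizona rows are kept, while the texas/nevada pair is dropped because its weather values differ.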
