What is the best way to figure out how two dataframes differ based on a combination of multiple columns? Suppose I have the following:

df1:

  A B C
0 1 2 3
1 3 4 2

df2:

  A B C
0 1 2 3
1 3 5 2

I want to show all rows where there is a difference, such as (3,4,2) vs. (3,5,2) in the example above. I tried pd.merge(), thinking that an outer join using all columns as the key would give me a dataframe that shows what I want, but it doesn't turn out that way.

Thanks to EdChum, I was able to use a mask from a boolean comparison as below, but first I had to make sure the indexes were comparable.

df1 = df1.set_index('A')
df2 = df2.set_index('A')  # this gave me a nice index using one of the keys
df1 = df1.reindex_like(df2)  # rows in df2 that are missing from df1 become nulls
df1[~(df1 == df2).all(axis=1)]  # this gave me all rows that differed
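For anyone who wants to run the steps above end to end, here is a minimal sketch using the two example frames from the question. The names `left`, `right`, and `changed` are only for illustration, and it assumes the key column `A` is unique in both frames.

```python
import pandas as pd

# Example frames from the question.
df1 = pd.DataFrame({"A": [1, 3], "B": [2, 4], "C": [3, 2]})
df2 = pd.DataFrame({"A": [1, 3], "B": [2, 5], "C": [3, 2]})

# Use the key column as the index so rows line up by key, not by position.
left = df1.set_index("A")
right = df2.set_index("A")

# Give `left` the same index and columns as `right`; keys missing from
# `left` become all-NaN rows.
left = left.reindex_like(right)

# Keep the rows where at least one column differs.
changed = left[~(left == right).all(axis=1)]
print(changed)
```

Since NaN never compares equal to anything, any row that `reindex_like` had to fill in is also reported as different.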

1 Answer

We can use .all with axis=1 to compare row by row, then negate the resulting boolean index with ~ to show the rows that differ:

In [43]:

df[~(df==df1).all(axis=1)]
Out[43]:
   A  B  C
1  3  4  2

Breaking this down:

In [44]:

df==df1
Out[44]:
      A      B     C
0  True   True  True
1  True  False  True
In [45]:

(df==df1).all(axis=1)
Out[45]:
0     True
1    False
dtype: bool

We can then invert the above using ~ and pass it to df as a boolean index.
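To see both versions of each differing row together, such as (3,4,2) vs. (3,5,2), the same mask can be applied to both frames. This is only a sketch: it assumes the two frames are identically labelled, and the `mask` and `side_by_side` names and the `keys` labels are just for illustration.

```python
import pandas as pd

# Same data as the question; named df and df1 to match the session above.
df = pd.DataFrame({"A": [1, 3], "B": [2, 4], "C": [3, 2]})
df1 = pd.DataFrame({"A": [1, 3], "B": [2, 5], "C": [3, 2]})

# True for rows where at least one column differs.
mask = ~(df == df1).all(axis=1)

# Put the differing rows from each frame next to each other,
# labelling the column groups by their source frame.
side_by_side = pd.concat([df[mask], df1[mask]], axis=1, keys=["df", "df1"])
print(side_by_side)
```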

6 Comments

The only thing is that my two dataframes are not identically labelled, so I get "Can only compare identically-labeled DataFrame objects". Is there a quick solution to this? I was thinking of reindex_like, perhaps?
So what exactly will be different: the column names? The number of rows?
The rows would be different. The columns are the same.
In what way are the rows different? More rows, fewer rows, or either?
It could be either fewer or more. Basically, I get a new version of a dataset every month and want to get a sense of how the records have shifted or changed in any way.
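For the case raised in these comments, where the two versions can have different numbers of rows, one possible approach is an outer merge on all columns with indicator=True, which records which side each row came from. The sketch below is an illustration, not something from the answer: the `old` and `new` names and the extra row (7, 8, 9) are made up to show a row that exists in only one version.

```python
import pandas as pd

# Hypothetical monthly versions; `old` has one extra row compared to `new`.
old = pd.DataFrame({"A": [1, 3, 7], "B": [2, 4, 8], "C": [3, 2, 9]})
new = pd.DataFrame({"A": [1, 3], "B": [2, 5], "C": [3, 2]})

# Outer join on every column; the added `_merge` column is
# "both", "left_only", or "right_only" for each row.
merged = old.merge(new, how="outer", on=["A", "B", "C"], indicator=True)

# Rows that appear in only one of the two versions.
diff = merged[merged["_merge"] != "both"]
print(diff)
```

A changed record shows up twice here, once as left_only with its old values and once as right_only with its new values, while rows that were dropped or added show up once.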