There are quite a few similar questions out there, but I am not sure if there is one that tackles both index and row values. (relevant to binary classification df)
So what I am trying to do is compare the columns with the same name to have the same values and index. If not, simply return an error.
Let's say DataFrame df has columns a, b and c and df_orginal has columns from a to z.
How can we first find the columns that have the same name between those 2 DataFrames, and then check the contents of those columns such that they match row by row in value and index between a, b and c from df and df_orginal
The contents of all the columns are numerical, that's why I want to compare the combination of index and values
Demo:
In [1]: df
Out[1]:
a b c
0 0 1 2
1 1 2 0
2 0 1 0
3 1 1 0
4 3 1 0
In [3]: df_orginal
Out[3]:
a b c d e f g ......
0 4 3 1 1 0 0 0
1 3 1 2 1 1 2 1
2 1 2 1 1 1 2 1
3 3 4 1 1 1 2 1
4 0 3 0 0 1 1 1
In the above example, for those columns that have the same column name, compare the combination of index and value and flag an error if the combination of index and value is not correct