I need to compare three CSV files on three columns (the columns have the same names in all three files) and count 1) what is duplicated and 2) what is different (counts alone are fine).

For example, colB in csv1 needs to be checked against colB in csv2 and colB in csv3 to get the total count of duplicates (values matched in csv2 or csv3) and the total count of differences (values not matched in csv2 or csv3).

All three CSVs have the same column names: colB holds IP addresses, colC holds hash values, and colD holds domain names.

I tried this as a test for matching on colB, without success:

print(df[~df.colB.isin(df1.colB)]) # prints the non-matching rows of df with all columns, not a count

I then tried adding count():

print(df[~df.colB.isin(df1.colB).count()]) # raises multiple traceback errors
  • Are the indexes in your dataframes the same? That is, are they the usual 0, 1, 2, 3, etc.? Commented Mar 29, 2021 at 16:10
  • Yes. They are all the same. Commented Mar 29, 2021 at 17:05

2 Answers

1

Try value_counts(); it gives you the counts of True and False values.

df.colB.isin(df1.colB).value_counts()
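
To make the two counts explicit and to bring a third file into the lookup, something along these lines may work. This is only a minimal sketch: the sample IP values are invented, `df2` stands in for the third CSV, and it assumes "duplicated" means "present in at least one of the other two files". Here `True` marks the rows of `df` whose colB value appears in `df1` or `df2`, and `False` marks the rows that appear in neither.

```python
import pandas as pd

# Hypothetical sample data standing in for the three CSV files.
df = pd.DataFrame({"colB": ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]})
df1 = pd.DataFrame({"colB": ["10.0.0.1", "10.0.0.9"]})
df2 = pd.DataFrame({"colB": ["10.0.0.3", "10.0.0.1"]})

# True where a colB value from df appears anywhere in df1 or df2.
# isin() matches on values, so row order and index do not matter.
in_others = df["colB"].isin(df1["colB"]) | df["colB"].isin(df2["colB"])

matched = int(in_others.sum())       # rows of df found in df1 or df2
unmatched = int((~in_others).sum())  # rows of df found in neither

print(in_others.value_counts())
print(matched, unmatched)
```

The same pattern can be repeated for colC and colD by swapping the column name.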

I hope this is what you are looking for.


1 Comment

Thanks Karthik. Are the "False" value counts the duplicated counts? Can this be extended with a third CSV lookup, i.e. df2 along with df1?
1

Let's call the dataframes df1, df2, df3.

Each column in a dataframe is a series, so you can compare them to get a boolean series:

checkB12 = (df1.colB == df2.colB)

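This gives a pandas Series of booleans, e.g. (True, True, False, ...), with one entry per row. Note that == compares row by row, so it requires both Series to have the same index and only reports a match when the same value sits at the same position in both dataframes.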

Similarly,

checkB13 = (df1.colB == df3.colB)

Then,

duplicated = checkB12 | checkB13

This gives you a Series of boolean values that is True wherever the df1 value matches df2 or df3 at the same row. duplicated.sum() then gives the total number of True values, i.e. the number of rows in df1 that are duplicated in at least one of df2 and df3.
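
Put together, the row-by-row comparison might look like the sketch below. It assumes all three dataframes have the same length and the default 0, 1, 2, ... index (as stated in the comments on the question), and the sample values are invented. The two checks are combined with the element-wise `|` operator, since Python's `or` raises a ValueError when applied to a Series. Counting the inverted mask as "different" is one possible reading of the question, not something the question confirms.

```python
import pandas as pd

# Hypothetical sample data; all frames share the default 0..3 index.
df1 = pd.DataFrame({"colB": ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]})
df2 = pd.DataFrame({"colB": ["10.0.0.1", "10.0.0.7", "10.0.0.3", "10.0.0.8"]})
df3 = pd.DataFrame({"colB": ["10.0.0.9", "10.0.0.2", "10.0.0.3", "10.0.0.6"]})

# Element-wise comparison: True where the same row holds the same value.
checkB12 = df1["colB"] == df2["colB"]  # True, False, True, False
checkB13 = df1["colB"] == df3["colB"]  # False, True, True, False

# Element-wise OR: True where df1 matches df2 or df3 at that row.
duplicated = checkB12 | checkB13

n_duplicated = int(duplicated.sum())     # rows matched at least once
n_different = int((~duplicated).sum())   # rows matched in neither

print(n_duplicated, n_different)
```

If the files are not aligned row by row, a value-based lookup such as Series.isin() is likely the better fit, because == never matches values that sit in different rows.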

I don't really understand what you mean by "what is different" between the dataframes, so I can't be sure what code you need.
