1

Let's say that I have a df1 like (there are more columns but only this one is relatable):

A
a1
a2
a3

and a df2 like:

A
a1
a3
a4
a7

The case is that df2 contains in column A (column names are the same both in df1 and df2) some of the values in df1, but not all of them. Now, I'd like to add a column "Found in df2?" to a df1, representing if the value was found or not. Example:

df1
A  Found in df2?
a1       Y
a2       N
a3       Y

I've tried np.where and some merging magic but couldn't wrap my head around this.

3 Answers 3

4

You can use isin:

df['found in df2'] = df['A'].isin(df2['A'].values)

print(df)

    A   found in df2
0   a1  True
1   a2  False
2   a3  True

Setup

df = pd.DataFrame({'A':['a1','a2','a3']})
df2 = pd.DataFrame({'A':['a1','a3','a4','a7']})
Sign up to request clarification or add additional context in comments.

Comments

1

If using pandas cannot solve your problem, I suggest you to turn them into list/arrays such that it is easier to work with.

Here is one way to work around:

import pandas as pd
df1 = pd.DataFrame({
    'A': [1,2,3]
})
df2 = pd.DataFrame({
    'A': [1,3,4,7]
})

df1['Found in df2?'] = ['Y' if x in df2['A'].values else 'N' for x in df1['A'].values]
display(df1)

Comments

1

Pandas merge has an indicator feature that tells you in which sides of the merge a value is found.

Like this:

dfm = pd.merge(df1, df2[['A']], how="left", indicator=True)

dfm['Found in df2'] = dfm["_merge"] == "both"

You get a column _merge that tells you "left_only" or "both" depending on where the A values are found. Use how="outer" if you want to know which values are only in df2 too.

Now, merge may reorder rows in df1 based on the order of A. See other questions on SO how to counteract that, if needed.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.