Pandas: finding the index of a string from one DataFrame column within a list of strings in another DataFrame

Question

I have a data frame that looks like this:

id	tag	cnt
123	Lorem	34
123	Ipsum	12
456	Ipsum	10
456	Dolor	2

And another data frame that looks like this:

id	tags
123	['Ipsum','Lorem']
456	['Lorem', 'Dolor']

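For each row of the first data frame, I need the 1-based position of its tag within the list of tags that the second data frame holds for the same id. If the tag is not in the list, the rank should be left empty. The updated first data frame would look like this: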

id	tag	cnt	Rank
123	Lorem	34	2
123	Ipsum	12	1
456	Ipsum	10
456	Dolor	2	2
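
For anyone who wants to reproduce the problem, the two frames above can be built roughly as follows. This is a sketch that assumes the tags column holds real Python lists; the ast.literal_eval step is an assumption that only matters if the lists were loaded as strings (for example from a CSV file).

```python
import ast

import pandas as pd

df1 = pd.DataFrame({
    'id': [123, 123, 456, 456],
    'tag': ['Lorem', 'Ipsum', 'Ipsum', 'Dolor'],
    'cnt': [34, 12, 10, 2],
})

df2 = pd.DataFrame({
    'id': [123, 456],
    'tags': [['Ipsum', 'Lorem'], ['Lorem', 'Dolor']],
})

# Assumption: if the lists arrived as strings such as "['Ipsum','Lorem']",
# parse them into real lists; values that are already lists pass through.
df2['tags'] = df2['tags'].apply(
    lambda v: ast.literal_eval(v) if isinstance(v, str) else v
)
```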

jezrael · Accepted Answer · 2022-05-19 11:54:51Z

1

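Use DataFrame.explode to turn each list in df2 into one row per tag, rename the column to tag so it matches df1, number the tags within each id with GroupBy.cumcount (adding 1 for a 1-based rank), and attach the result to df1 with a left join: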

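# One row per (id, tag), in list order
df = df2.explode('tags').rename(columns={'tags':'tag'})
# 1-based position of each tag within its id
df['Rank'] = df.groupby('id').cumcount().add(1)

# The left join matches on the shared columns id and tag
df = df1.merge(df, how='left')

print(df)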
    id    tag  cnt  Rank
0  123  Lorem   34   2.0
1  123  Ipsum   12   1.0
2  456  Ipsum   10   NaN
3  456  Dolor    2   2.0

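Tags that are missing from the list get NaN, which turns Rank into a float column. Cast it to the nullable Int64 dtype to get integers back:

df['Rank'] = df['Rank'].astype('Int64')
print(df)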
    id    tag  cnt  Rank
0  123  Lorem   34     2
1  123  Ipsum   12     1
2  456  Ipsum   10  <NA>
3  456  Dolor    2     2
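
The steps above can be wrapped into a self-contained sketch. The function name add_rank and the drop_duplicates guard are additions made here, not part of the original answer: the guard assumes that, if a tag appears more than once in a list, the first position is the one wanted, and it keeps the left join from duplicating rows of df1.

```python
import pandas as pd


def add_rank(df1, df2):
    """Attach the 1-based position of each tag within its id's tag list."""
    # One row per (id, tag), in list order.
    lookup = df2.explode('tags').rename(columns={'tags': 'tag'})
    lookup['Rank'] = lookup.groupby('id').cumcount() + 1
    # Assumption: keep the first position when a tag repeats in a list.
    lookup = lookup.drop_duplicates(['id', 'tag'])
    out = df1.merge(lookup, on=['id', 'tag'], how='left')
    # NaN from unmatched rows forces float; switch to nullable integers.
    out['Rank'] = out['Rank'].astype('Int64')
    return out


df1 = pd.DataFrame({
    'id': [123, 123, 456, 456],
    'tag': ['Lorem', 'Ipsum', 'Ipsum', 'Dolor'],
    'cnt': [34, 12, 10, 2],
})
df2 = pd.DataFrame({
    'id': [123, 456],
    'tags': [['Ipsum', 'Lorem'], ['Lorem', 'Dolor']],
})

result = add_rank(df1, df2)
print(result)
```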

edited May 19, 2022 at 11:54

answered May 19, 2022 at 11:48

jezrael


Hamza · Answer · 2022-05-19 12:04:55Z

0

You can do this via a simple lambda function as follows:

import numpy as np

df = df1.merge(df2, on='id')
df['Rank'] = df.apply(lambda x: x.tags.index(x.tag)+1 if x.tag in x.tags else np.nan, axis=1).astype('Int64')

The resulting dataframe looks like this:

    id    tag  cnt            tags  Rank
0  123  Lorem   34  [Ipsum, Lorem]     2
1  123  Ipsum   12  [Ipsum, Lorem]     1
2  456  Ipsum   10  [Lorem, Dolor]  <NA>
3  456  Dolor    2  [Lorem, Dolor]     2

Drop the tags column if you no longer need it (drop returns a new dataframe, so assign the result back):

df = df.drop(columns=['tags'])

The resulting dataframe then looks like this:

    id    tag  cnt  Rank
0  123  Lorem   34     2
1  123  Ipsum   12     1
2  456  Ipsum   10  <NA>
3  456  Dolor    2     2
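
A variation on the same idea skips merging the list column into df1 and looks each list up in a dictionary instead. This is only a sketch and not part of the answer above; tags_by_id and rank_of are names invented here, and it assumes each id appears once in df2.

```python
import pandas as pd

df1 = pd.DataFrame({
    'id': [123, 123, 456, 456],
    'tag': ['Lorem', 'Ipsum', 'Ipsum', 'Dolor'],
    'cnt': [34, 12, 10, 2],
})
df2 = pd.DataFrame({
    'id': [123, 456],
    'tags': [['Ipsum', 'Lorem'], ['Lorem', 'Dolor']],
})

# Map each id to its tag list (assumes ids are unique in df2).
tags_by_id = df2.set_index('id')['tags'].to_dict()


def rank_of(row_id, tag):
    """Return the 1-based position of tag in the id's list, or pd.NA."""
    tags = tags_by_id.get(row_id, [])
    return tags.index(tag) + 1 if tag in tags else pd.NA


# Build a nullable integer column directly, so no float round trip is needed.
df1['Rank'] = pd.array(
    [rank_of(i, t) for i, t in zip(df1['id'], df1['tag'])],
    dtype='Int64',
)
print(df1)
```

Because nothing is merged, there is no tags column to drop afterwards, and ids missing from df2 simply get an empty rank.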

answered May 19, 2022 at 12:04

Hamza

