I have a data frame that looks like this:

id tag cnt
123 Lorem 34
123 Ipsum 12
456 Ipsum 10
456 Dolor 2

And another data frame that looks like this:

id tags
123 ['Ipsum','Lorem']
456 ['Lorem', 'Dolor']

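I need to find the position of each tag in the first data frame within the list of tags for the same id in the second data frame. Positions are 1-based, and the rank is left empty when the tag is not in the list. So the new first data frame would look like this: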

id tag cnt Rank
123 Lorem 34 2
123 Ipsum 12 1
456 Ipsum 10
456 Dolor 2 2

2 Answers

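Use DataFrame.explode to turn each list in df2 into one row per tag, rename the tags column to tag so it matches df1, add a Rank column with GroupBy.cumcount (plus 1, so ranks start at 1), and attach it to df1 with a left join: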

df = df2.explode('tags').rename(columns={'tags':'tag'})
df['Rank'] = df.groupby('id').cumcount().add(1)

df = df1.merge(df, how='left')

print (df)
    id    tag  cnt  Rank
0  123  Lorem   34   2.0
1  123  Ipsum   12   1.0
2  456  Ipsum   10   NaN
3  456  Dolor    2   2.0

df['Rank'] = df['Rank'].astype('Int64')
print (df)
    id    tag  cnt  Rank
0  123  Lorem   34     2
1  123  Ipsum   12     1
2  456  Ipsum   10  <NA>
3  456  Dolor    2     2
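
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The construction of df1 and df2 is reconstructed from the tables in the question, and the names lookup and result are only illustrative. It resets the index after explode and passes the join keys explicitly, which is a small variation on the snippet above rather than a change in technique.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed column dtypes).
df1 = pd.DataFrame({
    "id": [123, 123, 456, 456],
    "tag": ["Lorem", "Ipsum", "Ipsum", "Dolor"],
    "cnt": [34, 12, 10, 2],
})
df2 = pd.DataFrame({
    "id": [123, 456],
    "tags": [["Ipsum", "Lorem"], ["Lorem", "Dolor"]],
})

# One row per (id, tag), kept in list order; reset the index so it is unique.
lookup = (
    df2.explode("tags")
       .rename(columns={"tags": "tag"})
       .reset_index(drop=True)
)

# cumcount numbers rows within each id from 0, so add 1 for a 1-based rank.
lookup["Rank"] = lookup.groupby("id").cumcount().add(1)

# Left join keeps every row of df1; tags missing from the list get NaN.
result = df1.merge(lookup, on=["id", "tag"], how="left")

# Nullable integer dtype so missing ranks show as <NA> instead of forcing floats.
result["Rank"] = result["Rank"].astype("Int64")
print(result)
```

Naming the keys in on= is optional here, since merge defaults to the columns both frames share, but it makes the join condition explicit.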

You can do this via a simple lambda function as follows:

import numpy as np

df = df1.merge(df2, on='id')
df['Rank'] = df.apply(lambda x: x.tags.index(x.tag)+1 if x.tag in x.tags else np.nan, axis=1).astype('Int64')

The resulting data frame looks like this:

     id   tag  cnt            tags  Rank
0   123 Lorem   34  [Ipsum, Lorem]  2
1   123 Ipsum   12  [Ipsum, Lorem]  1
2   456 Ipsum   10  [Lorem, Dolor]  <NA>
3   456 Dolor   2   [Lorem, Dolor]  2

If you don't need the tags column, drop it. Note that drop returns a new data frame, so assign the result back:

df = df.drop(columns=['tags'])

The resulting data frame then looks like this:

     id   tag  cnt  Rank
0   123 Lorem   34  2
1   123 Ipsum   12  1
2   456 Ipsum   10  <NA>
3   456 Dolor   2   2
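
A possible variation on the same idea, sketched under a few assumptions: if you would rather not merge the list column into df1 and drop it afterwards, you can look the lists up through a dictionary instead. This assumes each id appears only once in df2. The names tags_by_id, tag_rank, and result are hypothetical helpers introduced for this example, and the sample data is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed column dtypes).
df1 = pd.DataFrame({
    "id": [123, 123, 456, 456],
    "tag": ["Lorem", "Ipsum", "Ipsum", "Dolor"],
    "cnt": [34, 12, 10, 2],
})
df2 = pd.DataFrame({
    "id": [123, 456],
    "tags": [["Ipsum", "Lorem"], ["Lorem", "Dolor"]],
})

# Map each id to its tag list (assumes ids are unique in df2).
tags_by_id = df2.set_index("id")["tags"].to_dict()


def tag_rank(row_id, tag):
    """Return the 1-based position of tag in the list for row_id, or None."""
    tags = tags_by_id.get(row_id, [])
    return tags.index(tag) + 1 if tag in tags else None


result = df1.copy()
# Building a nullable Int64 array directly turns None into <NA>.
result["Rank"] = pd.array(
    [tag_rank(i, t) for i, t in zip(df1["id"], df1["tag"])],
    dtype="Int64",
)
print(result)
```

Like the apply version, this still does a linear list.index search per row, so it is a convenience rather than a performance improvement.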
