I have two DataFrames:

df1:

       node        ids
0   ab          [978]
1   bc          [978, 121]

df2:

       name        id
0   alpha          978
1   bravo          121
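
For anyone who wants to reproduce this, the two frames can be rebuilt roughly as follows. This is a minimal sketch that assumes the ids are plain Python integers and that the ids column holds real lists rather than strings.

```python
import pandas as pd

# Rebuild the two frames shown above; ids are assumed to be ints.
df1 = pd.DataFrame({"node": ["ab", "bc"], "ids": [[978], [978, 121]]})
df2 = pd.DataFrame({"name": ["alpha", "bravo"], "id": [978, 121]})

print(df1)
print(df2)
```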

I would like to add a new column called names to df1 that holds, for each row, the list of names from df2 corresponding to the values in the ids column, like this:

   node            ids             names
0   ab            [978]            [alpha]
1   bc            [978, 121]       [alpha,bravo]

I would appreciate any help.

  • Is it possible that a value has no match? Commented Feb 20, 2020 at 12:19
  • Sorry, I don't really understand what you meant to ask. Commented Feb 20, 2020 at 12:23
  • I mean, for example, if the first row ab [978] were changed to ab [10], where 10 is not in df2['id']. Commented Feb 20, 2020 at 12:25
  • No, it works fine. I'm accepting your answer. Thank you. Commented Feb 20, 2020 at 12:33

2 Answers

Use this if the values in df1['ids'] and df2['id'] have the same type (both integers, or both strings):

d = df2.set_index('id')['name'].to_dict()
df1['names'] = [[d.get(y) for y in x] for x in df1['ids']]
print (df1)
  node         ids           names
0   ab       [978]         [alpha]
1   bc  [978, 121]  [alpha, bravo]
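
To illustrate why the types have to agree, here is a small sketch of a hypothetical variation where df2['id'] holds strings while the lists in df1['ids'] hold integers. That mismatch is an assumption made for the example and is not in the question's data. Casting one side before building the dictionary is one way to line the keys up.

```python
import pandas as pd

df1 = pd.DataFrame({"node": ["ab", "bc"], "ids": [[978], [978, 121]]})
# Hypothetical variation: the ids in df2 are stored as strings.
df2 = pd.DataFrame({"name": ["alpha", "bravo"], "id": ["978", "121"]})

# Without a cast, int keys never equal string keys, so every lookup gives None.
d_str = df2.set_index("id")["name"].to_dict()
unmatched = [[d_str.get(y) for y in x] for x in df1["ids"]]

# Casting df2['id'] to int first makes the key types agree.
d = df2.assign(id=df2["id"].astype(int)).set_index("id")["name"].to_dict()
df1["names"] = [[d.get(y) for y in x] for x in df1["ids"]]
print(df1)
```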

If a value in a list might not match any value in df2['id'], it can be replaced by a placeholder such as 'no match':

d = df2.set_index('id')['name'].to_dict()
df1['names'] = [[d.get(y, 'no match') for y in x] for x in df1['ids']]
print (df1)
  node         ids              names
0   ab   [978, 10]  [alpha, no match]
1   bc  [978, 121]     [alpha, bravo]

Or it is possible to omit these values:

d = df2.set_index('id')['name'].to_dict()
df1['names'] = [[d[y] for y in x if y in d.keys()] for x in df1['ids']]
print (df1)
  node         ids           names
0   ab   [978, 10]         [alpha]
1   bc  [978, 121]  [alpha, bravo]

3 Comments

If you are using the get method, you do not really need the if y in d.keys() check, right?
@Ev.Kounis - yes, it depends on what should happen if there is no match.
@Ev.Kounis - I added 2 possible ideas, thank you for pointing it out.

How about trying this alternative solution, which uses explode and merge?

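# Explode the lists, look up each id in df2, then collect one list per original row.
# Note the double brackets: selecting several groupby columns needs a list, not a tuple.
names = (df1['ids'].explode().reset_index()
         .merge(df2, how='left', left_on='ids', right_on='id')
         .groupby('index')[['name', 'ids']].agg(lambda x: list(x))
         .reset_index())
# Join the collected lists back onto df1 and drop the helper columns.
df1 = (df1.reset_index()
       .merge(names, how='left', on='index')
       .drop(columns=['index', 'ids_y'])
       .rename(columns={'ids_x': 'ids'}))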
print(df1)

Output:

  node         ids            name
0   ab       [978]         [alpha]
1   bc  [978, 121]  [alpha, bravo]
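
A shorter variant of the same explode idea may be easier to read. This is a sketch rather than a drop-in replacement: it swaps the two merges for Series.map with a dictionary, and it assumes df1 has a unique index and that no list in ids is empty.

```python
import pandas as pd

df1 = pd.DataFrame({"node": ["ab", "bc"], "ids": [[978], [978, 121]]})
df2 = pd.DataFrame({"name": ["alpha", "bravo"], "id": [978, 121]})

# Build an id -> name lookup once.
d = df2.set_index("id")["name"].to_dict()

df1["names"] = (
    df1["ids"]
    .explode()          # one row per id, original index repeated
    .map(d)             # look up each id; unmatched ids become NaN
    .groupby(level=0)   # regroup by the original row index
    .agg(list)          # collect the names back into a list
)
print(df1)
```

Unlike d.get(y, 'no match'), unmatched ids show up here as NaN inside the lists, so the list comprehension above is the simpler choice when missing ids need special handling.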
