I'm new to Python and currently working with pandas. I have a DataFrame with two columns, id and parent:

id   | parent
1    | A
2    | B
3    | C
4    | A
5    | A
6    | C
A    | NaN
B    | NaN
C    | NaN

The expected output is shown in the table below:

id   | parent | child
1    | A      | NaN
2    | B      | NaN
3    | C      | NaN
4    | A      | NaN
5    | A      | NaN
6    | C      | NaN
A    | NaN    | 1 ; 4 ; 5
B    | NaN    | 2 
C    | NaN    | 3 ; 6

I tried using the fillna() function, but couldn't get the expected result.

1 Answer

I think you can do this with groupby and merge. Starting from your DataFrame:

print(df1)

  id parent
0  1      A
1  2      B
2  3      C
3  4      A
4  5      A
5  6      C
6  A    NaN
7  B    NaN
8  C    NaN

Then collect the children of each parent (groupby skips the rows whose parent is NaN):

df2 = df1.groupby('parent').agg({'id': lambda x: x.tolist()}).reset_index()
print(df2)

  parent         id
0      A  [1, 4, 5]
1      B        [2]
2      C     [3, 6]
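
If you want the children as a "1 ; 4 ; 5" style string, as in the expected output from the question, rather than a Python list, one option is to join the ids inside the aggregation. This is only a sketch: the DataFrame is rebuilt here from the question's table, on the assumption that the numeric ids are ints and the parent labels are strings, and the name children is mine.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed dtypes: int ids, string labels).
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, "A", "B", "C"],
    "parent": ["A", "B", "C", "A", "A", "C", np.nan, np.nan, np.nan],
})

# Group by parent and join each group's ids into one string.
# Rows whose parent is NaN are dropped by groupby by default.
children = (
    df1.groupby("parent")["id"]
       .agg(lambda ids: " ; ".join(str(i) for i in ids))
       .rename("child")
       .reset_index()
)
print(children)
```

The result has one row per parent, with the columns parent and child, so it can be renamed and merged in the same way as the list version.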

Finally, rename the columns and merge the result back onto the original frame:

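# Rename so the parent label becomes the merge key 'id'
df2.columns = ['id', 'child']
# A left merge keeps every row of df1; ids with no children get NaN
df3 = pd.merge(df1, df2, on='id', how='left')
print(df3)
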
  id parent      child
0  1      A        NaN
1  2      B        NaN
2  3      C        NaN
3  4      A        NaN
4  5      A        NaN
5  6      C        NaN
6  A    NaN  [1, 4, 5]
7  B    NaN        [2]
8  C    NaN     [3, 6]
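
As an alternative to the merge, the whole thing can also be sketched with Series.map, which looks up each id in the grouped result and leaves NaN where there is no match. The helper name add_child_column and its sep parameter are hypothetical, and the sample data again assumes int ids and string parent labels.

```python
import numpy as np
import pandas as pd


def add_child_column(df, sep=" ; "):
    """Return a copy of df with a 'child' column listing each id's children."""
    # One joined string per parent; rows with a NaN parent form no group.
    child_by_parent = df.groupby("parent")["id"].agg(
        lambda ids: sep.join(str(i) for i in ids)
    )
    out = df.copy()
    # Look up each id in the grouped index; ids that are not parents get NaN.
    out["child"] = out["id"].map(child_by_parent)
    return out


df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, "A", "B", "C"],
    "parent": ["A", "B", "C", "A", "A", "C", np.nan, np.nan, np.nan],
})
result = add_child_column(df1)
print(result)
```

This avoids renaming columns, at the cost of assuming that the id values are unique enough to serve as lookup keys.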