Filling null values within a Pandas DataFrame with values from the same column that have a matching value in another column

Question

I have a DataFrame with nulls in one column (Company). Another column (ID) in the same rows holds repeating non-null values. What is the proper way to fill those nulls, using the ID column as the reference, with native Pandas functions?

Thank you for your help.

Original:

    Company  ID
    AAA      100
    BBB      200
    CCC      150
    NULL     100
    FFF      375
    NULL     150

Formatted:

    AAA      100
    BBB      200
    CCC      150
    AAA      100
    FFF      375
    CCC      150

Quang Hoang · Accepted Answer · 2019-08-01 21:23:07Z


You can try:

    df['Company'] = df.groupby('ID')['Company'].transform('first')

As noted in the comments, the above replaces every Company value, not just the NaN ones, so it can give a wrong result if an ID has several distinct Company values. To fill only the nulls, you can do:

    df['Company'] = df['Company'].fillna(df.groupby('ID')['Company'].transform('first'))
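For reference, here is a minimal sketch of the `fillna` approach applied to the sample data from the question. It assumes the `NULL` entries are real missing values (`NaN`) rather than the literal string `"NULL"`; the variable name `first_per_id` is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question, assuming NULL means a missing value (NaN).
df = pd.DataFrame({
    "Company": ["AAA", "BBB", "CCC", np.nan, "FFF", np.nan],
    "ID": [100, 200, 150, 100, 375, 150],
})

# For each ID, take the first non-null Company and broadcast it
# back to every row of that group (same length and index as df).
first_per_id = df.groupby("ID")["Company"].transform("first")

# Only rows where Company is null receive the group value;
# existing values are left untouched.
df["Company"] = df["Company"].fillna(first_per_id)

print(df)
```

If an ID has no non-null Company at all, there is nothing to fill from and those rows should stay null.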

edited Aug 1, 2019 at 21:23

answered Aug 1, 2019 at 21:11


2 Comments

Ayoub ZAROU Over a year ago

I think this would transform all the rows having the same value, and not necessarily just those having a null value.

Quang Hoang Over a year ago

@AyoubZAROU That is true. I was under the impression that Company should be unique for each ID. But this can be fixed easily.
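To illustrate the point raised in these comments, the sketch below compares the two approaches on made-up data where one ID has two different Company values. The frame `df_multi` and the value `ZZZ` are hypothetical and not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: ID 100 is associated with two different companies.
df_multi = pd.DataFrame({
    "Company": ["AAA", "ZZZ", np.nan, "BBB"],
    "ID": [100, 100, 100, 200],
})

# First non-null Company per ID, aligned to the original rows.
first_per_id = df_multi.groupby("ID")["Company"].transform("first")

# Plain assignment of the transform result would overwrite "ZZZ" with "AAA".
overwritten = first_per_id

# fillna keeps "ZZZ" and only fills the missing row.
filled = df_multi["Company"].fillna(first_per_id)

print(pd.DataFrame({"overwritten": overwritten, "filled": filled}))
```

With `fillna`, existing values survive, but the null row for ID 100 still gets `AAA` simply because it appears first; whether that is the right choice depends on the data.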
