I have a DataFrame with nulls in one column (Company). In the same rows, another column (ID) holds repeating non-null values, so every null row shares its ID with a row where Company is populated. What is the proper way to fill those nulls, using the ID column as the reference, with native pandas functions?

Thank you for your help.

Original:

    Company  ID
    AAA      100
    BBB      200
    CCC      150
    NULL     100
    FFF      375
    NULL     150

Formatted:

    AAA      100
    BBB      200
    CCC      150
    AAA      100
    FFF      375
    CCC      150

1 Answer

You can try:

    df['Company'] = df.groupby('ID')['Company'].transform('first')
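
For reference, here is a minimal sketch of that line applied to the data from the question. The DataFrame construction is an assumption: the `NULL` entries are taken to be `NaN`, and `ID` is taken to be an integer column. It relies on the groupby `first` aggregation returning the first non-null value in each group.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; NULL is assumed to mean NaN.
df = pd.DataFrame({
    "Company": ["AAA", "BBB", "CCC", np.nan, "FFF", np.nan],
    "ID": [100, 200, 150, 100, 375, 150],
})

# 'first' picks the first non-null Company per ID, and transform
# broadcasts that value back to every row of the group.
df["Company"] = df.groupby("ID")["Company"].transform("first")

print(df)
```

The two rows that were null should now read `AAA` and `CCC`, matching the "Formatted" table in the question.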

As commented, the line above replaces every Company value, not just the NaN ones, so it may give the wrong result if an ID has several Company values. To fill only the missing values, you can do:

    df['Company'] = df['Company'].fillna(df.groupby('ID')['Company'].transform('first'))
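
To see how the two approaches differ, here is a small sketch with made-up data in which one ID carries two different Company values. The sample values (`AAB` in particular) and the names `overwritten` and `filled` are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: ID 100 has two distinct companies plus a missing one.
df = pd.DataFrame({
    "Company": ["AAA", "AAB", np.nan, "CCC", np.nan],
    "ID": [100, 100, 100, 150, 150],
})

# First non-null Company per ID, broadcast to every row of the group.
first_per_id = df.groupby("ID")["Company"].transform("first")

# Approach 1: assign directly. Every row is replaced, so "AAB" is lost.
overwritten = first_per_id

# Approach 2: use it only as a fill value. Existing values are kept.
filled = df["Company"].fillna(first_per_id)

print(pd.DataFrame({"ID": df["ID"], "overwritten": overwritten, "filled": filled}))
```

With `fillna`, the `AAB` row keeps its value and only the NaN rows are filled, which is usually what you want when Company is not guaranteed to be unique per ID.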

2 Comments

I think this would transform all the rows having the same value, and not necessarily just those having a null value.
@AyoubZAROU That is true. I was under the impression that Company should be unique for each ID. But this can be fixed easily.