I have a data frame:

import pandas as pd

df = pd.DataFrame({'Color': 'Red Red Blue'.split(), 'Value': [100, 150, 50]})
>>> df
  Color  Value
0   Red    100
1   Red    150
2  Blue     50

I have a second data frame, dfmain:

dfmain = pd.DataFrame({'Color': ["Red","Blue","Yellow"]})
>>> dfmain
    Color
0     Red
1    Blue
2  Yellow

I want a result data frame with the sum of Value for each color in dfmain. My expected result is:

>>> result
    Color  sum
0     Red  250
1    Blue   50
2  Yellow    0

Right now I am using a loop, which gets slow on large data sets. I would like an idiomatic pandas (or NumPy) solution for this.

1 Answer

You can use groupby to aggregate with sum, then reindex by dfmain.Color so that colors missing from df are filled with 0:

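# Sum Value per color, align to the colors in dfmain (missing ones get 0),
# then turn the Color index back into a regular column.
result = df.groupby('Color')['Value'].sum().reindex(dfmain.Color, fill_value=0).reset_index()
print(result)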

    Color  Value
0     Red    250
1    Blue     50
2  Yellow      0
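
If you want the column to be called sum, as in the expected result from the question, one possible variation is to look up the per-color totals with Series.map instead of reindex. This is only a sketch under the assumption that the data looks like the question's sample frames; the names totals and result are illustrative, not part of the original code.

```python
import pandas as pd

# Sample frames copied from the question.
df = pd.DataFrame({'Color': 'Red Red Blue'.split(), 'Value': [100, 150, 50]})
dfmain = pd.DataFrame({'Color': ["Red", "Blue", "Yellow"]})

# Per-color totals as a Series indexed by Color.
totals = df.groupby('Color')['Value'].sum()

# Look up each color of dfmain in the totals. Colors absent from df
# come back as NaN, so fill them with 0 and restore the integer dtype.
result = dfmain.copy()
result['sum'] = result['Color'].map(totals).fillna(0).astype(int)
print(result)
```

This keeps the row order of dfmain and leaves dfmain itself untouched, since the new column is added to a copy.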