
I want to group by multiple columns and sum the en column within each group.

Data

country  type  en  start
Japan    aa    25  8/1/2022
Japan    cc    1   9/1/2022
US       bb    5   8/1/2022
US       bb    5   8/1/2022

Desired

country  type  en  start
Japan    aa    25  8/1/2022
Japan    cc    1   9/1/2022
US       bb    10  8/1/2022

Doing

df.groupby(['country','type','date'])['en'].sum()

However, the output appears to contain some blank rows. Any suggestions are appreciated.

  • Hi @Naveed, the date is part of the groupby: if rows share the same country, date, and type, we sum the en column. Commented Nov 2, 2022 at 23:45
  • Are you sure the rows are blank and not just displayed as a hierarchical index? If it's the latter, setting the styler.sparse.index option to False would repeat the index labels on every row. Commented Nov 3, 2022 at 0:21

1 Answer

out = df.groupby(['country', 'type', 'start'], as_index=False).agg({'en': 'sum'})
out
  country type     start  en
0   Japan   aa  8/1/2022  25
1   Japan   cc  9/1/2022   1
2      US   bb  8/1/2022  10
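
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and flattens the result in two ways. The variable names (keys, flat, flat_via_reset) are only illustrative, and the column is assumed to be called start, as in the posted data.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "country": ["Japan", "Japan", "US", "US"],
    "type": ["aa", "cc", "bb", "bb"],
    "en": [25, 1, 5, 5],
    "start": ["8/1/2022", "9/1/2022", "8/1/2022", "8/1/2022"],
})

keys = ["country", "type", "start"]

# as_index=False keeps the group keys as ordinary columns.
flat = df.groupby(keys, as_index=False)["en"].sum()

# Alternative: aggregate to a MultiIndexed Series, then flatten it.
flat_via_reset = df.groupby(keys)["en"].sum().reset_index()

print(flat)
print(flat_via_reset)
```

Either form should give one row per (country, type, start) combination, with the group keys as regular columns rather than index levels.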

4 Comments

I wonder, could you group by date as well? I think that is more accurate, because we group by date, type, and country, then take the sum of the en column.
Just updated the solution.
@Lynn, what you had as a solution was just as good as mine. Why was it failing?
Yes, but my rows were coming up blank: it grouped by the country, but it produced blank rows.
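
A possible explanation for the "blank rows", sketched under the assumption that the original call returned a Series with a three-level MultiIndex: by default pandas prints a repeated outer index label only once, so the later rows look partly empty even though no data is missing. The display.multi_sparse option controls this for the plain text display; the names summed, sparse_text, and dense_text are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "country": ["Japan", "Japan", "US", "US"],
    "type": ["aa", "cc", "bb", "bb"],
    "en": [25, 1, 5, 5],
    "start": ["8/1/2022", "9/1/2022", "8/1/2022", "8/1/2022"],
})

# Without as_index=False the group keys become a MultiIndex.
summed = df.groupby(["country", "type", "start"])["en"].sum()

# Default display: a repeated outer label is printed only once.
sparse_text = summed.to_string()

# With multi_sparse disabled, every index label is printed on every row.
with pd.option_context("display.multi_sparse", False):
    dense_text = summed.to_string()

print(sparse_text)
print(dense_text)
```

If the two printouts differ only in the repeated labels, the rows were never blank and calling reset_index() (or passing as_index=False) is enough to get a flat table.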
