
I am trying to learn how to work with pandas DataFrames. My DataFrame has four columns: A, B, C and D.

For each combination of (A, B, C) there are multiple values of D. I want to merge these rows and sum their values of D.

I have:

╔═══╦═══╦═══╦═══╦═══╗
║   ║ A ║ B ║ C ║ D ║   
╠═══╬═══╬═══╬═══╬═══╣
║ 1 ║ 1 ║ 2 ║ 3 ║ 5 ║
║ 1 ║ 1 ║ 2 ║ 3 ║ 3 ║
║ 2 ║ 1 ║ 5 ║ 4 ║ 2 ║
║ 2 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 3 ║
╚═══╩═══╩═══╩═══╩═══╝

I want to get:

╔═══╦═══╦═══╦═══╦═══╗
║   ║ A ║ B ║ C ║ D ║
╠═══╬═══╬═══╬═══╬═══╣
║ 1 ║ 1 ║ 2 ║ 3 ║ 8 ║
║ 2 ║ 1 ║ 5 ║ 4 ║ 2 ║
║ 2 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 5 ║
╚═══╩═══╩═══╩═══╩═══╝

I tried to do it this way:

df=df.groupby(['A','B','C'])['D'].sum()

But this returns a Series instead of a DataFrame.

  • So do you want df=df.groupby(['A','B','C'])['D'].sum().reset_index() or df=df.groupby(['A','B','C'], as_index=False)['D'].sum()? Commented Jun 29, 2016 at 7:39
  • Yes, it works :) You can post it as an answer and I will mark it as accepted. Commented Jun 29, 2016 at 7:41

1 Answer


If you want to keep the group keys as columns after the groupby, you can call reset_index on the result:

In [185]:
df.groupby(['A','B','C'])['D'].sum().reset_index()

Out[185]:
   A  B  C  D
0  1  2  3  8
1  1  2  4  7
2  1  5  4  2

Alternatively, pass as_index=False to groupby so that the group keys stay as ordinary columns instead of becoming the index.
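A minimal sketch of the as_index=False variant, assuming the DataFrame holds only the A, B, C and D columns from the question's table (the unnamed first column is left out, and the variable names `result` and `via_reset` are illustrative only):

```python
import pandas as pd

# Columns A-D copied from the question's table; the unnamed first
# column is omitted here (an assumption made for this sketch).
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 1],
    "B": [2, 2, 5, 2, 2, 2],
    "C": [3, 3, 4, 4, 4, 4],
    "D": [5, 3, 2, 2, 2, 3],
})

# as_index=False keeps the group keys A, B, C as regular columns,
# so the result is a DataFrame rather than a Series.
result = df.groupby(["A", "B", "C"], as_index=False)["D"].sum()

# The reset_index route: group normally, then move the MultiIndex
# levels back into columns.
via_reset = df.groupby(["A", "B", "C"])["D"].sum().reset_index()

print(result)
```

Both variants should give the same three rows as Out[185] above. as_index=False avoids building the MultiIndex in the first place, while reset_index is handy when you already have the grouped Series.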
