I am trying to learn how to work with pandas DataFrames. My DataFrame has 4 columns: A, B, C, D.
For each combination of (A, B, C) there are multiple values of D. I want to merge these rows and sum the values of D.
I have:
╔═══╦═══╦═══╦═══╦═══╗
║ ║ A ║ B ║ C ║ D ║
╠═══╬═══╬═══╬═══╬═══╣
║ 1 ║ 1 ║ 2 ║ 3 ║ 5 ║
║ 1 ║ 1 ║ 2 ║ 3 ║ 3 ║
║ 2 ║ 1 ║ 5 ║ 4 ║ 2 ║
║ 2 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 3 ║
╚═══╩═══╩═══╩═══╩═══╝
I want to get:
╔═══╦═══╦═══╦═══╦═══╗
║ ║ A ║ B ║ C ║ D ║
╠═══╬═══╬═══╬═══╬═══╣
║ 1 ║ 1 ║ 2 ║ 3 ║ 8 ║
║ 2 ║ 1 ║ 5 ║ 4 ║ 2 ║
║ 2 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 5 ║
╚═══╩═══╩═══╩═══╩═══╝
I tried to do it this way:
df=df.groupby(['A','B','C'])['D'].sum()
But it gives me a Series instead of a DataFrame.
df=df.groupby(['A','B','C'])['D'].sum().reset_index()
or
df=df.groupby(['A','B','C'], as_index=False)['D'].sum()
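To see that both forms return a DataFrame with A, B, C as ordinary columns, here is a minimal sketch. The sample data is an assumption loosely modeled on the question (the unnamed leading column is left out), and the names `as_series`, `via_reset` and `via_flag` are only illustrative.

```python
import pandas as pd

# Hypothetical sample data, loosely based on the question's table.
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1],
    "B": [2, 2, 5, 2, 2],
    "C": [3, 3, 4, 4, 4],
    "D": [5, 3, 2, 2, 3],
})

# Selecting a single column and summing yields a Series
# whose index is a MultiIndex built from (A, B, C).
as_series = df.groupby(["A", "B", "C"])["D"].sum()

# Option 1: turn the group keys back into regular columns.
via_reset = df.groupby(["A", "B", "C"])["D"].sum().reset_index()

# Option 2: tell groupby not to move the keys into the index at all.
via_flag = df.groupby(["A", "B", "C"], as_index=False)["D"].sum()

print(type(as_series).__name__)
print(via_reset)
print(via_flag)
```

Both options should give the same result here; `as_index=False` just avoids building the MultiIndex and then undoing it. Note that groupby sorts the group keys by default, so the row order can differ from the input.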