Sum only certain rows in a given column of pandas dataframe

Question

I can sum the first 310 rows in a 5 column pandas dataframe and get a tidy summary by using:

df[0:310].sum()

Is there an easy way whereby I can sum the first 310 rows in a certain column of my choosing? I just can't figure out how to combine a column selection and row slice selection in the expression. It would be ideal to specify the column by column name, but column index is fine too.

In an attempt to sum the 1st 310 rows of the 5th column, I tried

df.iloc[0:310, 4].sum

but just got a printout of 310 rows from that column. Thank you.

jezrael · Accepted Answer · 2018-03-16 06:18:29Z

4

I think you need DataFrame.iloc to select rows by position, together with get_indexer to convert column names into positions:

import pandas as pd

# data borrowed from Akshay Nevrekar's answer, but with changed index values
data = {'x': [1, 2, 3, 4, 5],
        'y': [2, 5, 7, 9, 11],
        'z': [2, 6, 7, 3, 4]}
df = pd.DataFrame(data, index=list('abcde'))
print(df)
   x   y  z
a  1   2  2
b  2   5  6
c  3   7  7
d  4   9  3
e  5  11  4

a = df.iloc[:3, df.columns.get_indexer(['x','z'])].sum()

This is the same as:

a = df.iloc[:3, [0,2]].sum()

print (a)
x     6
z    15
dtype: int64

Detail:

print (df.iloc[:3, df.columns.get_indexer(['x','z'])])
   x  z
a  1  2
b  2  6
c  3  7

If you want only one column, use get_loc to get its position:

b = df.iloc[:3, df.columns.get_loc('x')].sum()

This is the same as:

b = df.iloc[:3, 0].sum()

print (b)
6

Detail:

print (df.iloc[:3, df.columns.get_loc('x')])
a    1
b    2
c    3
Name: x, dtype: int64
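
The two steps above (look up the column position by name, then slice rows by position) can be wrapped in a small helper. This is only a sketch: the name `sum_first_rows` and its parameters are hypothetical and not part of pandas, and it assumes the column label is unique in `df.columns`.

```python
import pandas as pd


def sum_first_rows(df, column, n):
    """Sum the first n rows (by position) of one column chosen by name.

    Hypothetical helper for illustration only; not a pandas API.
    """
    # get_loc turns the column label into an integer position for iloc
    # (assumes the label occurs exactly once in df.columns).
    col_pos = df.columns.get_loc(column)
    return df.iloc[:n, col_pos].sum()


data = {'x': [1, 2, 3, 4, 5],
        'y': [2, 5, 7, 9, 11],
        'z': [2, 6, 7, 3, 4]}
df = pd.DataFrame(data, index=list('abcde'))

total_x = sum_first_rows(df, 'x', 3)  # same as df.iloc[:3, 0].sum()
total_y = sum_first_rows(df, 'y', 3)  # same as df.iloc[:3, 1].sum()
print(total_x, total_y)  # 6 14
```

Selecting the column first and then slicing by position, as in `df['x'].iloc[:3].sum()`, should give the same result.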

edited Mar 16, 2018 at 6:18

answered Mar 16, 2018 at 6:12


3 Comments

Sarfraaz Ahmed Over a year ago

I am curious to know what is different in the proposed solution b = df.iloc[:3, 0].sum() compared to what the question says was tried, df.iloc[0:310, 4].sum. The brackets at the end are missing? Is that the only difference? Am I missing something here?

jezrael Over a year ago

@SarfraazAhmed - Do you mean b = df.iloc[:3, 0].sum() vs b = df.iloc[0:3, 0].sum()? I think they are the same ;)

Sarfraaz Ahmed Over a year ago

Yes, they are the same. Funny thing is, I tried running that code without the ending brackets and got a bunch of rows printed! I think the only mistake in what the user tried was a missing pair of brackets after sum. Am I right?
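
The behaviour discussed in these comments can be reproduced with a short sketch (the small frame here is made up for illustration). Without the parentheses, `.sum` is only a reference to the method and is never called. In an interactive session the representation of that method object includes the Series it is bound to, which is why the rows seem to be printed.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4, 5]})

no_call = df.iloc[0:3, 0].sum      # a bound method object; nothing is summed yet
with_call = df.iloc[0:3, 0].sum()  # the parentheses actually call the method

print(callable(no_call))  # True: it is still waiting to be called
print(with_call)          # sum of the first three rows
print(no_call())          # calling it later gives the same sum
```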

Sociopath · Accepted Answer · 2018-03-16 05:49:12Z

2

You need something like this:

import pandas as pd
data = {'x':[1,2,3,4,5], 'y':[2,5,7,9,11], 'z':[2,6,7,3,4]}
df=pd.DataFrame(data)

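Use a list of columns along with the row slice. Note that .loc slices by label and includes the end label, so with a default integer index 0:309 covers the first 310 rows:

df.loc[0:309, ['x','z']].sum()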

output:

x    15
z    22
dtype: int64
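
As a minimal sketch of how the slice end point is treated, the same frame can be sliced once by label with `.loc` and once by position with `.iloc`. It assumes the default integer index, where labels and positions happen to coincide; the variable names are only illustrative.

```python
import pandas as pd

data = {'x': [1, 2, 3, 4, 5], 'y': [2, 5, 7, 9, 11], 'z': [2, 6, 7, 3, 4]}
df = pd.DataFrame(data)  # default integer index 0..4

# .loc slices by label and includes the end label: rows 0, 1 and 2
by_label = df.loc[0:2, ['x', 'z']].sum()

# .iloc slices by position and excludes the stop: rows 0 and 1 only
by_position = df.iloc[0:2][['x', 'z']].sum()

print(by_label)
print(by_position)
```

With a non-integer index such as the letters used in the other answer, `.loc[0:2]` would not select rows by position at all, so `.iloc` is the safer choice when "the first N rows" is what is wanted.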

answered Mar 16, 2018 at 5:49
