I can sum the first 310 rows in a 5 column pandas dataframe and get a tidy summary by using:

df[0:310].sum()

Is there an easy way whereby I can sum the first 310 rows in a certain column of my choosing? I just can't figure out how to combine a column selection and row slice selection in the expression. It would be ideal to specify the column by column name, but column index is fine too.

In an attempt to sum the 1st 310 rows of the 5th column, I tried

df.iloc[0:310, 4].sum

but just got a printout of 310 rows from that column. Thank you.

2 Answers

I think you need DataFrame.iloc to select rows by position, combined with get_indexer to convert column names to positions:

import pandas as pd

# data borrowed from Akshay Nevrekar's answer, with the index values changed
data = {'x': [1, 2, 3, 4, 5],
        'y': [2, 5, 7, 9, 11],
        'z': [2, 6, 7, 3, 4]}
df = pd.DataFrame(data, index=list('abcde'))
print (df)
   x   y  z
a  1   2  2
b  2   5  6
c  3   7  7
d  4   9  3
e  5  11  4

a = df.iloc[:3, df.columns.get_indexer(['x','z'])].sum()

This is the same as:

a = df.iloc[:3, [0,2]].sum()

print (a)
x     6
z    15
dtype: int64

Detail:

print (df.iloc[:3, df.columns.get_indexer(['x','z'])])
   x  z
a  1  2
b  2  6
c  3  7

If you want only one column, use get_loc to get its position:

b = df.iloc[:3, df.columns.get_loc('x')].sum()

This is the same as:

b = df.iloc[:3, 0].sum()

print (b)
6

Detail:

print (df.iloc[:3, df.columns.get_loc('x')])
a    1
b    2
c    3
Name: x, dtype: int64
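
Applied to the case in the question (first 310 rows, 5th column), the same idea might look like the sketch below. The 400-row frame and the column names c0 to c4 are made up for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 400 rows, 5 columns; the names c0..c4 are illustrative only.
df = pd.DataFrame(np.arange(2000).reshape(400, 5),
                  columns=['c0', 'c1', 'c2', 'c3', 'c4'])

# First 310 rows of the 5th column, selected purely by position.
by_position = df.iloc[:310, 4].sum()

# Same rows, but the column is given by name and translated to a position.
by_name = df.iloc[:310, df.columns.get_loc('c4')].sum()

print(by_position, by_name)
```

Both expressions select the same cells, so the two sums should agree.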

3 Comments

I am curious: what is the difference between the proposed solution b = df.iloc[:3, 0].sum() and what the question says was tried, df.iloc[0:310, 4].sum? The brackets at the end are missing. Is that the only difference, or am I missing something?
@SarfraazAhmed - Do you mean b = df.iloc[:3, 0].sum() vs b = df.iloc[0:3, 0].sum()? I think they are the same ;)
Yes, they are the same. The funny thing is that I tried running that code without the trailing brackets and got a bunch of rows printed! I think the only mistake in what the user tried was the missing pair of brackets after sum. Am I right?
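
The point raised in these comments can be checked with a small sketch. The five-value frame here is a stand-in, not the asker's data. Without parentheses, `sum` is only referenced, not called, and the printed representation of a bound method includes the Series it belongs to, which is presumably why the rows showed up.

```python
import pandas as pd

# Small stand-in frame; the values are illustrative only.
df = pd.DataFrame({'x': [1, 2, 3, 4, 5]})

# No parentheses: this is the bound method object itself, nothing is summed yet.
method = df.iloc[0:3, 0].sum

# With parentheses the method is called and a number comes back.
total = df.iloc[0:3, 0].sum()

print(callable(method))  # the method can still be called later
print(method())          # calling it gives the same result as `total`
print(total)
```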

You need something like this:

import pandas as pd
data = {'x':[1,2,3,4,5], 'y':[2,5,7,9,11], 'z':[2,6,7,3,4]}
df=pd.DataFrame(data)

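Use a list of columns together with the row slice in a single .loc call. Note that .loc slices by label and includes the end label, so with the default integer index the first 310 rows are 0:309:

df.loc[0:309, ['x','z']].sum()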

output:

x    15
z    22
dtype: int64
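
One thing that is easy to trip over is that .loc and .iloc treat the end of a slice differently. A minimal sketch on a small frame with the default integer index (only the x and z columns from the data above are used):

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'z': [2, 6, 7, 3, 4]})

# .loc slices by label and includes the end label: rows 0, 1 and 2.
label_sum = df.loc[0:2, ['x', 'z']].sum()

# .iloc slices by position and excludes the end position: rows 0 and 1.
position_sum = df.iloc[0:2][['x', 'z']].sum()

print(label_sum)
print(position_sum)
```

So the same `0:2` covers three rows with .loc but only two with .iloc, which matters when the goal is exactly the first N rows.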
