Python/Pandas Iterating through columns

Question

I have a DataFrame which looks like this (with many additional columns):

      age1  age2  age3  age4
Id#
1001     5     6     2     8
1002     7     6     1     0
1003    10     9     7     5
1004     9    12     5     9

I am trying to write a loop that sums each column with all the columns before it and stores the result in a new DataFrame. I have started out, simply, with this:

New = pd.DataFrame()
New[0] = SFH2.ix[:,0]
for x in SFH2:
    ls = [x,x+1]
    B = SFH2[ls].sum(axis=1)
    New[x] = B

print(New)

and the error I get is

    ls = [x,x+1]

TypeError: Can't convert 'int' object to str implicitly

I know that int and str are different objects, but how can I overcome this, or is there a different way to iterate through columns? Thanks!

In other words, do you want each column to be the sum of all the columns to the left, or simply that column plus a single column to the left (or right)?
– juanpa.arrivillaga, Commented Aug 3, 2016 at 9:26
I want each column to be the sum of all the columns to the left.
– cmf05, Commented Aug 3, 2016 at 9:27
@cmf05 - I think the best option is to add the desired output to the question; maybe you can do that in another question ;)
– jezrael, Commented Aug 3, 2016 at 9:33

juanpa.arrivillaga · Accepted Answer · 2016-08-03 09:28:58Z

It sounds like cumsum is what you are looking for:

In [5]: df
Out[5]: 
      age1  age2  age3  age4
Id#                         
1001     5     6     2     8
1002     7     6     1     0
1003    10     9     7     5
1004     9    12     5     9

In [6]: df.cumsum(axis=1)
Out[6]: 
      age1  age2  age3  age4
Id#                         
1001     5    11    13    21
1002     7    13    14    14
1003    10    19    26    31
1004     9    21    26    35
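For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt by hand from the values shown in the question; the variable names `df` and `running` are only illustrative.

```python
import pandas as pd

# Rebuild the sample frame from the question (values copied from the post).
df = pd.DataFrame(
    {
        "age1": [5, 7, 10, 9],
        "age2": [6, 6, 9, 12],
        "age3": [2, 1, 7, 5],
        "age4": [8, 0, 5, 9],
    },
    index=pd.Index([1001, 1002, 1003, 1004], name="Id#"),
)

# axis=1 accumulates left to right across the columns of each row.
running = df.cumsum(axis=1)
print(running)
```

With `axis=1` each column holds the sum of itself and every column to its left, which is what the OP described in the comments.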


3 Comments

cmf05 Over a year ago

Ah thank you! Clearly I need to get a bit more familiar with pandas.

juanpa.arrivillaga Over a year ago

@piRSquared Well, OP was a bit ambiguous. The code seemed to imply a rolling sum with a window of 2, but the description of the desired output implied cumsum.

juanpa.arrivillaga Over a year ago

@cmf05 If you find yourself writing for-loops to work with pandas objects, then there is almost always a better way.
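As a rough illustration of that point, and of why the original loop failed: iterating over a DataFrame yields its column labels (strings here), so `x + 1` cannot work. The sketch below is one possible way to write the loop by position instead, under the assumption that the frame matches the sample in the question; `looped` and `labels` are illustrative names. It ends up equal to `cumsum`, which is why the loop is unnecessary.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "age1": [5, 7, 10, 9],
        "age2": [6, 6, 9, 12],
        "age3": [2, 1, 7, 5],
        "age4": [8, 0, 5, 9],
    },
    index=pd.Index([1001, 1002, 1003, 1004], name="Id#"),
)

# Iterating a DataFrame gives column labels, not integer positions.
labels = list(df)

# Loop by position: sum every column up to and including the current one.
looped = pd.DataFrame(index=df.index)
for pos, label in enumerate(df.columns):
    looped[label] = df.iloc[:, : pos + 1].sum(axis=1)

# The explicit loop and the vectorized call agree.
same = (looped.values == df.cumsum(axis=1).values).all()
print(labels)
print(same)
```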

jezrael · Accepted Answer · 2016-08-03 09:26:13Z

You can use add with a shifted DataFrame:

print (df.shift(-1,axis=1))
      age1  age2  age3  age4
Id#                         
1001   6.0   2.0   8.0   NaN
1002   6.0   1.0   0.0   NaN
1003   9.0   7.0   5.0   NaN
1004  12.0   5.0   9.0   NaN

print (df.add(df.shift(-1,axis=1), fill_value=0))
      age1  age2  age3  age4
Id#                         
1001  11.0   8.0  10.0   8.0
1002  13.0   7.0   1.0   0.0
1003  19.0  16.0  12.0   5.0
1004  21.0  17.0  14.0   9.0

If you need to shift by 1 (the default, so the parameter is omitted):

print (df.shift(axis=1))
      age1  age2  age3  age4
Id#                         
1001   NaN   5.0   6.0   2.0
1002   NaN   7.0   6.0   1.0
1003   NaN  10.0   9.0   7.0
1004   NaN   9.0  12.0   5.0

print (df.add(df.shift(axis=1), fill_value=0))
      age1  age2  age3  age4
Id#                         
1001   5.0  11.0   8.0  10.0
1002   7.0  13.0   7.0   1.0
1003  10.0  19.0  16.0  12.0
1004   9.0  21.0  17.0  14.0
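The last result is a rolling sum with a window of 2 across the columns. As a hedged alternative sketch (not part of the original answer), the same numbers can be obtained by transposing and using `rolling`; the frame is rebuilt from the question's sample and the names `pairwise` and `rolled` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "age1": [5, 7, 10, 9],
        "age2": [6, 6, 9, 12],
        "age3": [2, 1, 7, 5],
        "age4": [8, 0, 5, 9],
    },
    index=pd.Index([1001, 1002, 1003, 1004], name="Id#"),
)

# Approach from the answer: add each column to the one on its left.
pairwise = df.add(df.shift(axis=1), fill_value=0)

# Alternative: roll over the transposed frame, then transpose back.
# min_periods=1 keeps the first column instead of turning it into NaN.
rolled = df.T.rolling(window=2, min_periods=1).sum().T

print(rolled)
```

Changing `window` generalizes this to sums over more than two neighbouring columns, which the shift-and-add form does not do directly.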
