
I want to add a column to a df. The values of this new column will depend on the values of the other columns, e.g.

import pandas as pd

dc = {'A': [0, 9, 4, 5], 'B': [6, 0, 10, 12], 'C': [1, 3, 15, 18]}
df = pd.DataFrame(dc)
   A   B   C
0  0   6   1
1  9   0   3
2  4  10  15
3  5  12  18

Now I want to add another column D whose values will depend on the values of A, B and C. So, for example, if I were iterating through the df I would just do:

for idx, row in df.iterrows():
    if row['A'] != 0 and row['B'] != 0:
        df.loc[idx, 'D'] = (float(row['A']) / float(row['B'])) * row['C']
    elif row['C'] == 0 and row['A'] != 0 and row['B'] == 0:
        df.loc[idx, 'D'] = 250.0
    else:
        df.loc[idx, 'D'] = 20.0

Is there a way to do this without the for loop, e.g. using the where() or apply() functions?

Thanks

3 Answers


apply should work well for you:

In [20]: def func(row):
            if (row == 0).all():
                return 250.0
            elif (row[['A', 'B']] != 0).all():
                return (float(row['A']) / row['B'] ) * row['C']
            else:
                return 20

In [21]: df['D'] = df.apply(func, axis=1)

In [22]: df
Out[22]: 
   A   B   C     D
0  0   6   1  20.0
1  9   0   3  20.0
2  4  10  15   6.0
3  5  12  18   7.5

[4 rows x 4 columns]

5 Comments

Great, thanks! Also, would something like this work as well: elif (row[['A', 'B']] != 0 and row['C'] != None).all()? I have to check for the None condition as well.
You probably should convert those Nones to NaNs. You'll get better performance since it will be a float dtype instead of column, and pandas operations are NaN aware.
Ohh ok, thanks! So something like this: elif (row[['A', 'B']] != 0 and row['C'] != NaN).all()?
It just depends on how you want to treat NaNs. If you return NaN when row['C'] is NaN, then you won't even need this case, since x * NaN is NaN. If you want to return 0, you can do a fillna(0) after applying func. Also, for various reasons np.nan == np.nan is always False, so your way wouldn't quite work. Pandas provides the pd.isnull function to check for NaNs.
Got it, thanks. I just wanted to see if the syntax was correct. Thanks for the help.
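Putting the comment thread above together, here's a minimal sketch of the NaN-aware version (the NaN in column C is made up for the example; the rest follows the apply answer):

```python
import numpy as np
import pandas as pd

# Same data as the question, but with a missing value in C
# (Nones are converted to NaN once the column is float dtype)
df = pd.DataFrame({'A': [0, 9, 4, 5],
                   'B': [6, 0, 10, 12],
                   'C': [1, 3, np.nan, 18]})

def func(row):
    # np.nan == np.nan is always False, so use pd.isnull, not a == check
    if pd.isnull(row['C']):
        return np.nan
    elif (row[['A', 'B']] != 0).all():
        return (row['A'] / row['B']) * row['C']
    else:
        return 20.0

df['D'] = df.apply(func, axis=1)
# NaNs propagate through the result; fill afterwards if a default is wanted
df['D'] = df['D'].fillna(0)
```

As noted in the comments, returning NaN and filling afterwards is usually cleaner than adding an extra branch for every missing-value case.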

.where can be much faster than .apply, so if all you're doing is if/elses then I'd aim for .where. As you're returning scalars in some cases, np.where will be easier to use than pandas' own .where.

import pandas as pd
import numpy as np
df['D'] = np.where((df.A!=0) & (df.B!=0), ((df.A/df.B)*df.C),
          np.where((df.C==0) & (df.A!=0) & (df.B==0), 250,
          20))

   A   B   C     D
0  0   6   1  20.0
1  9   0   3  20.0
2  4  10  15   6.0
3  5  12  18   7.5

For a tiny df like this, you wouldn't need to worry about speed. However, on a 10000-row df of randn, this is almost 2000 times faster than the .apply solution above: 3 ms vs 5850 ms. That said, if speed isn't a concern, .apply can often be easier to read.
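As an aside, nested np.where calls can get hard to read with more than two conditions; np.select takes parallel lists of conditions and choices instead. A sketch using the question's data (same logic as the nested np.where above):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [0, 9, 4, 5],
                   'B': [6, 0, 10, 12],
                   'C': [1, 3, 15, 18]})

# Conditions are checked in order; first match wins, like if/elif
conditions = [
    (df.A != 0) & (df.B != 0),                # ratio case
    (df.C == 0) & (df.A != 0) & (df.B == 0),  # special case -> 250
]
choices = [(df.A / df.B) * df.C, 250.0]
df['D'] = np.select(conditions, choices, default=20.0)  # else case
```

This stays fully vectorized, so the speed advantage over .apply is the same as with np.where.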



Here's a start:

import numpy as np

df['D'] = np.nan
mask = (df.A != 0) & (df.B != 0)
df.loc[mask, 'D'] = df.A / df.B.astype(float) * df.C

Edit: you should probably just go ahead and cast the whole thing to floats unless you really care about integers for some reason:

df = df.astype(float)

and then you don't have to keep converting inside the calculation itself.
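For completeness, a sketch of how the remaining two cases from the question could be filled in with the same boolean-mask approach (the condition logic here is taken from the question, not from this answer):

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 9, 4, 5],
                   'B': [6, 0, 10, 12],
                   'C': [1, 3, 15, 18]}).astype(float)

df['D'] = 20.0  # default (else) case first
ratio = (df.A != 0) & (df.B != 0)
df.loc[ratio, 'D'] = df.A / df.B * df.C
df.loc[(df.C == 0) & (df.A != 0) & (df.B == 0), 'D'] = 250.0
```

Assigning the default first and then overwriting the masked rows avoids the NaN initialization, and using df.loc[mask, 'D'] directly sidesteps chained-assignment warnings.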

