New to Python Data Science

Below is my raw data:

import numpy as np
import pandas as pd

raw_data = {'var1': ['true','false','ture'],'var2': [10,20,50], 'var3':['eggs','milk','eggs']}
df = pd.DataFrame(raw_data, columns = ['var1','var2','var3'])

This is the code I experimented with, but it is not working:

def my_fun (var1,var2,var3,var4):
    df[var4]= np.where((df[var1] == 'true', 
                       df[var3] == 'eggs',
                       df[var2] < 10),
                       'hello',
                       'zello')
return df

Here I would like to combine conditions on var1, var2 and var3 and get a single conditional result. Please help.

  • "but not working" - please include the complete error message. Also, you should probably use True and False instead of "true" and "false". Commented Mar 23, 2018 at 2:21
  • Not to mention True instead of "ture". Commented Mar 23, 2018 at 2:43

1 Answer

First, use Boolean True / False values rather than strings to simplify your logic. To apply this conversion to the series 'var1':

df['var1'] = df['var1'] == 'true'
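
As a small illustration of this conversion on the question's sample data (a sketch only): the misspelled 'ture' does not equal 'true', so it becomes False like any other non-matching string.

```python
import pandas as pd

raw_data = {'var1': ['true', 'false', 'ture'],
            'var2': [10, 20, 50],
            'var3': ['eggs', 'milk', 'eggs']}
df = pd.DataFrame(raw_data, columns=['var1', 'var2', 'var3'])

# Elementwise string comparison yields a Boolean series
df['var1'] = df['var1'] == 'true'

converted = df['var1'].tolist()
print(converted)  # [True, False, False]
```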

You can then use the bitwise operator & to combine Boolean series elementwise:

def my_fun (var1,var2,var3,var4):
    df[var4]= np.where(df[var1] & df[var3].eq('eggs') & df[var2].lt(10),
                       'hello', 'zello')
    return df
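
A possible usage sketch with the question's sample data follows. The column name 'result' and the later change of the first var2 value to 5 are assumptions made for illustration; neither comes from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'var1': ['true', 'false', 'ture'],
                   'var2': [10, 20, 50],
                   'var3': ['eggs', 'milk', 'eggs']})
df['var1'] = df['var1'] == 'true'

def my_fun(var1, var2, var3, var4):
    # All three conditions must hold for a row to get 'hello'
    df[var4] = np.where(df[var1] & df[var3].eq('eggs') & df[var2].lt(10),
                        'hello', 'zello')
    return df

# 'result' is a hypothetical name for the new column
first = my_fun('var1', 'var2', 'var3', 'result')['result'].tolist()
print(first)  # ['zello', 'zello', 'zello'] since no var2 value is below 10

# Hypothetical change so that the first row satisfies every condition
df.loc[0, 'var2'] = 5
second = my_fun('var1', 'var2', 'var3', 'result')['result'].tolist()
print(second)  # ['hello', 'zello', 'zello']
```

Note that with the original data every row yields 'zello', because the smallest var2 value is 10 and the test is strictly less than 10.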

A less efficient alternative is to use np.logical_and.reduce:

def my_fun (var1,var2,var3,var4):
    conds = (df[var1], df[var3] == 'eggs', df[var2] < 10)
    df[var4]= np.where(np.logical_and.reduce(conds), 'hello', 'zello')
    return df
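
To check that the two approaches agree, here is a rough sketch on made-up data (the four rows below are hypothetical, chosen so that exactly one row meets every condition). The last line is one way to read the numpy.all suggestion from the comments, stacking the conditions as plain arrays first.

```python
import numpy as np
import pandas as pd

# Hypothetical data, with var1 already converted to Booleans
df = pd.DataFrame({'var1': [True, False, False, True],
                   'var2': [5, 20, 50, 10],
                   'var3': ['eggs', 'milk', 'eggs', 'eggs']})

conds = (df['var1'], df['var3'] == 'eggs', df['var2'] < 10)

# Chained bitwise AND on the Series
mask_and = (conds[0] & conds[1] & conds[2]).to_numpy()

# Reduction over the tuple of conditions
mask_reduce = np.asarray(np.logical_and.reduce(conds))

# numpy.all over the stacked conditions, one row per condition
mask_all = np.all(np.vstack([c.to_numpy() for c in conds]), axis=0)

print(mask_and.tolist())  # [True, False, False, False]
print(np.array_equal(mask_and, mask_reduce), np.array_equal(mask_and, mask_all))
```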

4 Comments

You could also use numpy.all with an axis argument, or just & twice.
@jpp For a tiny amount of data like this there's probably no difference, but with any significant amount, np.logical_and would waste noticeable time dealing with the Series objects during reduction (presumably because it isn't optimized for this the way Pandas bitwise-ands are). You could of course avoid this by using .values everywhere, but that gets pretty verbose.
@jpp Works great. Can I add an additional condition to np.where? For example: (df[var1].values == 'true') & (df[var3].values == 'eggs') & (df[var2].values == 10), 'equal'
@jpp I am accepting your answer. I was able to add the new condition by adding an additional np.where. Thank you so much.
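
The extra condition from the last two comments could look roughly like the sketch below, assuming var1 has already been converted to Booleans as in the answer. The nested np.where follows the commenter's description; np.select is a different technique that nobody in the thread proposed, shown only as a possible alternative. The names base, nested and selected are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'var1': ['true', 'false', 'ture'],
                   'var2': [10, 20, 50],
                   'var3': ['eggs', 'milk', 'eggs']})
df['var1'] = df['var1'] == 'true'

# Part of the test shared by both outcomes
base = df['var1'] & df['var3'].eq('eggs')

# Nested np.where: test for 'equal' first, then fall back to the original rule
nested = np.where(base & df['var2'].eq(10), 'equal',
                  np.where(base & df['var2'].lt(10), 'hello', 'zello'))

# np.select: the first matching condition wins, otherwise the default is used
selected = np.select([base & df['var2'].eq(10), base & df['var2'].lt(10)],
                     ['equal', 'hello'], default='zello')

print(nested.tolist())    # ['equal', 'zello', 'zello']
print(selected.tolist())  # ['equal', 'zello', 'zello']
```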
