I am trying to add a new column to a dataframe based on an if statement that depends on the values of two columns, i.e. if column x is None then use column y, else use column x.
Below is the script I have written, but it doesn't work. Any ideas?
dfCurrentReportResults['Retention'] = dfCurrentReportResults.apply(lambda x : x.Retention_y if x.Retention_x == None else x.Retention_x)
Also I got this error message: AttributeError: ("'Series' object has no attribute 'Retention_x'", u'occurred at index BUSINESSUNIT_NAME')
FYI: BUSINESSUNIT_NAME is the name of the first column.
Additional Info:
My data printed out looks like this. I want to add a third column that takes whichever of the two values is present, and stays NaN if both are missing.
   Retention_x  Retention_y
0            1          NaN
1          NaN     0.672183
2          NaN     1.035613
3          NaN     0.771469
4          NaN     0.916667
5          NaN          NaN
6          NaN          NaN
7          NaN          NaN
8          NaN          NaN
9          NaN          NaN
UPDATE: In the end, my problem was how I was testing for null values in the dataframe. The final line of code I used, which checks with pd.isnull and also includes axis = 1, answered my question.
dfCurrentReportResults['RetentionLambda'] = dfCurrentReportResults.apply(lambda x : x['Retention_y'] if pd.isnull(x['Retention_x']) else x['Retention_x'], axis = 1)
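For anyone landing here with the same error, below is a minimal sketch of why the final line works. The small frame `df` is a hypothetical stand-in for dfCurrentReportResults, built from a few of the rows printed above. Without axis = 1, apply hands the lambda one column at a time, so there is no Retention_x attribute to look up on it (which is why the error names the first column). Separately, a missing value stored as NaN does not compare equal to None, so pd.isnull is the safer test.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the frame in the question (not the real data set).
df = pd.DataFrame({
    "Retention_x": [1.0, np.nan, np.nan, np.nan],
    "Retention_y": [np.nan, 0.672183, 1.035613, np.nan],
})

# NaN is not equal to None, so an `== None` test never fires on missing floats.
nan_equals_none = (np.nan == None)  # noqa: E711, deliberate comparison for illustration

# axis=1 makes apply pass each row to the lambda, so column labels can be looked up.
df["RetentionLambda"] = df.apply(
    lambda row: row["Retention_y"] if pd.isnull(row["Retention_x"]) else row["Retention_x"],
    axis=1,
)

print(df)
```

Rows where both columns are missing stay NaN, which matches what I wanted.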
Thanks @EdChum, @strim099 and @aus_lacy for all your input. As my data set gets larger I may switch to the np.where option if I notice performance issues.
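For reference, here is a rough sketch of what the vectorized np.where version might look like. The frame `df` and the new column names are hypothetical, chosen only for illustration, and I have not benchmarked this on my real data. Series.fillna is shown as another option that should give the same result for this case.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the frame in the question (not the real data set).
df = pd.DataFrame({
    "Retention_x": [1.0, np.nan, np.nan, np.nan],
    "Retention_y": [np.nan, 0.672183, 1.035613, np.nan],
})

# Vectorized: where Retention_x is missing take Retention_y, otherwise keep Retention_x.
df["RetentionWhere"] = np.where(
    df["Retention_x"].isnull(), df["Retention_y"], df["Retention_x"]
)

# Alternative: fill the gaps in Retention_x with the aligned values from Retention_y.
df["RetentionFilled"] = df["Retention_x"].fillna(df["Retention_y"])

print(df)
```

Both versions work on whole columns at once instead of calling a Python lambda per row, which is the reason they tend to scale better.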
Is None a string or a NaN? And could you provide a sample set of your data frame so we can better debug any issues?
... apply on? A sample of your data would help you get an answer much quicker.