I need help with one issue. I have a DataFrame df with three columns of 'object' dtype: opt1, opt2 and opt3.

I need to create a new column var, defined as follows:

  1. if opt2 and opt3 are None/Null/Empty and opt1 is not None/Null/Empty: then df['var'] = df['opt1']
  2. elif opt1 and opt3 are None/Null/Empty and opt2 is None/Null/Empty: then df['var'] = df['opt1'] + '|' + df['opt3']
  3. elif opt1 and opt2 are None/Null/Empty and opt3 is None/Null/Empty: then df['var'] = df['opt1'] + '|' + df['opt2']
  4. else: df['var'] = df['opt1'] + '|' + df['opt2'] + '|' + df['opt3']

Please suggest how to implement the above conditions in Python 3.6, or share a better approach.

  • Your points 2 and 3 are missing some "not"s... Commented May 15, 2018 at 6:28
  • Actually, I am unable to design a correct if-elif-else condition for this scenario; this snippet is only meant to illustrate my test cases. If possible, please share a correct version of the whole condition. Commented May 15, 2018 at 6:32

1 Answer


I think you need:

import numpy as np
import pandas as pd

df = pd.DataFrame({'opt1': ['', np.nan, 'a', 'a', 'a', np.nan],
                   'opt2': [np.nan, 'b', np.nan, 'b', 'b', np.nan],
                   'opt3': ['c', 'Null', np.nan, 'c', np.nan, np.nan]})

print (df)
  opt1 opt2  opt3
0       NaN     c
1  NaN    b  Null
2    a  NaN   NaN
3    a    b     c
4    a    b   NaN
5  NaN  NaN   NaN

# replace the strings 'Null' and '' with NaN
df1 = df.mask(df.isin(['Null', '']))
# join the values of each row, skipping NaNs
df['var'] = df1.apply(lambda x: '|'.join(x.dropna()), axis=1)
print (df)
  opt1 opt2  opt3    var
0       NaN     c      c
1  NaN    b  Null      b
2    a  NaN   NaN      a
3    a    b     c  a|b|c
4    a    b   NaN    a|b
5  NaN  NaN   NaN       
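
If df has other columns besides opt1, opt2 and opt3, the apply above would join those too. A minimal sketch of the same idea restricted to chosen columns follows; the helper name join_non_missing and its parameters are hypothetical, not part of pandas, and the placeholder strings are an assumption about how missing values are spelled in the data.

```python
import numpy as np
import pandas as pd


def join_non_missing(df, cols, sep='|', placeholders=('Null', '', 'None')):
    """Join the non-missing values of `cols` row by row (hypothetical helper)."""
    sub = df[cols]
    # Turn placeholder strings into NaN so dropna() treats them as missing.
    cleaned = sub.mask(sub.isin(list(placeholders)))
    # For each row, drop NaN/None and join whatever is left.
    return cleaned.apply(lambda row: sep.join(row.dropna().astype(str)), axis=1)


df = pd.DataFrame({'opt1': ['', np.nan, 'a', 'a', 'a', np.nan],
                   'opt2': [np.nan, 'b', np.nan, 'b', 'b', np.nan],
                   'opt3': ['c', 'Null', np.nan, 'c', np.nan, np.nan]})

df['var'] = join_non_missing(df, ['opt1', 'opt2', 'opt3'])
print(df['var'].tolist())
```

Rows where every option is missing end up with an empty string, the same as in the output above.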

5 Comments

df['var'] = df.apply(lambda x: '|'.join(x.dropna()), 1) gives me an error: KeyError: "['opt1' 'opt2' 'opt3'] not in index"
Currently opt1 has values, but the other two columns contain only None. I cannot remove these two columns because they may contain data in the future.
It removes values dynamically: only None or NaN are dropped, everything else is kept.
Instead of NaN or Null, the opt2 and opt3 columns contain None. Will this work in that case too? I am getting KeyError: "['opt1' 'opt2' 'opt3'] not in index"
Yes, dropna removes both None and NaN, since both are missing values. If None is a string, you need df1 = df.mask(df.isin(['Null','', 'None']))
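
To illustrate the difference between a real None and the string 'None', here is a small sketch on made-up data (the two-row frame is an assumption for demonstration only):

```python
import pandas as pd

# opt2 holds real None objects; opt3 holds one real None and one string 'None'.
df = pd.DataFrame({'opt1': ['a', 'b'],
                   'opt2': [None, None],
                   'opt3': [None, 'None']})

# dropna() removes real None values, but the string 'None' survives.
real_none = df.apply(lambda x: '|'.join(x.dropna()), axis=1)

# Masking the placeholder strings first removes the string 'None' as well.
df1 = df.mask(df.isin(['Null', '', 'None']))
cleaned = df1.apply(lambda x: '|'.join(x.dropna()), axis=1)

print(real_none.tolist())
print(cleaned.tolist())
```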
