0

I have this data:

enter image description here

I want to make a new column where if we have an 'X' in delinquency or Suspect LTV has a 'LTV < 10%' Then I will have a 'Yes' in an entry in a new column.

Example of what the output should look like using this data:

enter image description here

Now if there is no X and no LTV < 10% then I will just have a 'No'. I am going to be applying this across many columns but I just focused on two to not make it more complicated.

This is what I have tried:

def exceptions(column):
    for i in column:
        if i == 'X':
            return 'Yes'
        elif i == 'LTV < 10%':
            return 'Yes'
        else:
            return 'No'


df1['Exceptions'] = df1.apply(lambda x : exceptions(x['Delinquency 2+ Month Change']), axis = 1)
df1['Exceptions'] = df1.apply(lambda x : exceptions(x['Suspect LTV']), axis = 1)
1
  • Don't need a for loop for this, use: df1['Exceptions'] = np.where(df1['Delinquency 2+ Month Change'].eq('X') | df1['Suspect LTV'].eq('LTV < 10%', 'Yes', 'No') Commented Sep 13, 2019 at 23:41

3 Answers 3

1

The reason that isn't working is because your "exceptions" function needs /both/ columns in order to make a decision, and you're only passing it one of those columns. Instead, you can pass the entire row to your lambda function, and index the columns you need within that function.

def exceptions(row):
    if row['Delinquency 2+ Month Change'] == 'X':
        return 'Yes'
    if row['Suspect LTV'] == 'LTV < 10%':
        return 'Yes'
    return 'No'


df1['Exceptions'] = df1.apply(exceptions, axis = 1)

I changed your exceptions function to accept one entire row instead of a single cell. There was no need to have a for loop inside it. I also changed how you call the apply function. This new code is essentially the same as df1.apply(lambda x : exceptions(x), axis = 1), but you don't need to use lambda if you're passing your data directly to a function and using the result.

Sign up to request clarification or add additional context in comments.

Comments

0

You shouldn't iterate over the rows of a dataframe for tasks like this (and apply is just another way of iterating).

You can do simple boolean arithmetic and apply the result to where:


mcve:

import pandas as pd
df = pd.DataFrame({'A': list('xx x '), 'B': list('y y  ')})

df['check'] = 'yes'
df.check.where((df.A=='x') | (df.B=='y'), 'no', inplace=True)

#    A  B check
# 0  x  y   yes                                              
# 1  x      yes                                                   
# 2     y   yes                                              
# 3  x      yes                                              
# 4          no                                          

Comments

0

As @Erfan mention in the comment section, you don't really need a for loop or lambda to achieve this, You need np.where.

  1. One way with np.where
  2. Or you can use np.where with np.logical_or

DataFrame:

>>> df
  Delinquency 2+ Month Change Suspect LTV
0                           X
1                           X
2                           X
3                           X
4                           X
5                           X   LTV < 10%

Result:

1- Use OR operator as pipe | with np.where

>>> df['Exceptions'] = np.where((df['Delinquency 2+ Month Change'] == 'X') |  (df['Suspect LTV'] == 'LTV < 10%'), 'Yes', 'No')
>>> df
  Delinquency 2+ Month Change Suspect LTV Exceptions
0                           X                    Yes
1                           X                    Yes
2                           X                    Yes
3                           X                    Yes
4                           X                    Yes
5                           X   LTV < 10%        Yes

2- Second method as i mention at the start with np.where with np.logical_or

>>> df['Exceptions'] = np.where(np.logical_or(df['Delinquency 2+ Month Change'] == 'X', df['Suspect LTV'] == 'LTV < 10%'), 'yes', 'no')
>>> df
  Delinquency 2+ Month Change Suspect LTV Exceptions
0                           X                    yes
1                           X                    yes
2                           X                    yes
3                           X                    yes
4                           X                    yes
5                           X   LTV < 10%        yes

Same but alternative with .eq rather ==

df['Exceptions'] = np.where(np.logical_or(df['Delinquency 2+ Month Change'].eq('X') ,df['Suspect LTV'].eq('LTV < 10%')), 'Yes', 'No')

OR

df['Exceptions'] = np.where(np.logical_or(df['Delinquency 2+ Month Change'].eq('X') ,df['Suspect LTV'].eq('LTV < 10%')), 'Yes', 'No')

In case, you are interested about the way with using defining a function and applying that across ..

Function definition...

def exceptions(row):
  if row['Delinquency 2+ Month Change'] == 'X':
    return 'Yes'
  if row['Suspect LTV'] == 'LTV < 10%':
    return 'Yes'
  else:
    return 'No'

Use the Function with DataFrame.apply() as follows..

df['Exceptions'] = df.apply(exceptions, axis=1)
print(df)
  Delinquency 2+ Month Change Suspect LTV Exceptions
0                           X                    Yes
1                           X                    Yes
2                           X                    Yes
3                           X                    Yes
4                           X                    Yes
5                           X   LTV < 10%        Yes

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.