3

In my dataframe I have:

Name    Sex    Height
Jackie   F       Small
John     M       Tall

I wrote the following function to create a new column based on combinations of Height and Sex:

def genderfunc(x,y):
    if x =='Tall' & y=='M':
        return 'T Male'
    elif x =='Medium' & y=='M':
        return 'Male'
    elif x =='Small' & y=='M':
        return 'Male'
    elif x =='Tall' & y=='F':
        return 'T Female'
    elif x =='Medium' & y=='F':
        return 'Female'
    elif x =='Small' & y=='F':
        return 'Female'
    else:
        return y

My line of code to apply this function:

df['GenderDetails'] = df.apply(genderfunc(df['Height'],df['Sex']))

and I get the following:

TypeError: Cannot perform 'rand_' with a dtyped [object] array and scalar of type [bool]

Any ideas on what I'm doing wrong here? This is my first go at using a function.

Thanks!

4 Answers

6

Here is another approach, using map.

map_ = {"TallM": "T Male", "SmallF": "Female"}

df['GenderDetails'] = (df['Height'] + df['Sex']).str.strip().map(map_)

     Name Sex Height GenderDetails
0  Jackie   F  Small        Female
1    John   M   Tall        T Male
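
The two-entry dictionary above only covers the rows in the sample data. As a minimal sketch, assuming you want all six Height/Sex combinations from the question plus the same fallback to Sex that the original function's else branch provides, the map can be extended and followed by fillna. The name label_map and the third row ("Sam", "Unknown") are made up for illustration.

```python
import pandas as pd

# Sample frame from the question, plus one hypothetical row ("Sam")
# whose Height matches no key, to show the fallback.
df = pd.DataFrame({
    "Name": ["Jackie", "John", "Sam"],
    "Sex": ["F", "M", "M"],
    "Height": ["Small", "Tall", "Unknown"],
})

# Keys are Height and Sex concatenated, as in the answer above.
label_map = {
    "TallM": "T Male",
    "MediumM": "Male",
    "SmallM": "Male",
    "TallF": "T Female",
    "MediumF": "Female",
    "SmallF": "Female",
}

# map() yields NaN for keys missing from the dictionary;
# fillna() then falls back to the row's Sex value.
df["GenderDetails"] = (df["Height"] + df["Sex"]).map(label_map).fillna(df["Sex"])
print(df)
```

Without the fillna step, any combination missing from the dictionary would be left as NaN.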

5

Or you can use np.select if performance is a concern:

import numpy as np

condlist = [(df['Height'] == 'Tall') & (df['Sex'] == 'M'),
            (df['Height'] == 'Medium') & (df['Sex'] == 'M'),
            (df['Height'] == 'Small') & (df['Sex'] == 'M'),
            (df['Height'] == 'Tall') & (df['Sex'] == 'F'),
            (df['Height'] == 'Medium') & (df['Sex'] == 'F'),
            (df['Height'] == 'Small') & (df['Sex'] == 'F')]
choicelist = [
    'T Male',
    'Male',
    'Male',
    'T Female',
    'Female',
    'Female'
]

# rows that match no condition fall back to the value in Sex
df['GenderDetails'] = np.select(condlist, choicelist, df['Sex'])

2

You need to replace & with and.
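
This applies once the function receives plain strings, for example when it is called row by row. A minimal sketch of why & fails there: & is the bitwise operator and binds more tightly than ==, so the condition is not grouped the way it reads. The variable names and values below are made up for illustration.

```python
height, sex = "Tall", "M"

# `and` combines the results of the two comparisons, as intended.
both = height == "Tall" and sex == "M"

# With `&`, Python groups the expression as
#   height == ("Tall" & sex) == "M"
# and `&` is not defined between two strings, so a TypeError is raised.
try:
    height == "Tall" & sex == "M"
    amp_error = None
except TypeError as exc:
    amp_error = type(exc).__name__

print(both, amp_error)
```

The same grouping explains the 'rand_' in the question's error message: with a Series on the right, 'Tall' & y is dispatched to the Series' reflected "and" operation.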

1 Comment

@pythonic833 I completely disagree with you. The EXACT recommendation was followed by the selected answer. The only difference is that the selected answer had the corrected code. Nevertheless, the answer here is correct.
1

You are close. You need a lambda function with axis=1 so the function is called once per row, and because each call then works on scalars, use and instead of &:

def genderfunc(x,y):
    if x =='Tall' and y=='M':
        return 'T Male'
    elif x =='Medium' and y=='M':
        return 'Male'
    elif x =='Small' and y=='M':
        return 'Male'
    elif x =='Tall' and y=='F':
        return 'T Female'
    elif x =='Medium' and y=='F':
        return 'Female'
    elif x =='Small' and y=='F':
        return 'Female'
    else:
        return y

df['GenderDetails'] = df.apply(lambda x: genderfunc(x['Height'],x['Sex']), axis=1)
print(df)
     Name Sex Height GenderDetails
0  Jackie   F  Small        Female
1    John   M   Tall        T Male

A non-loop solution is possible with a helper DataFrame and a left join:

import pandas as pd

# set the values as needed
L = [('Tall','M','T Male'), ('Small','F','Female')]
df1 = pd.DataFrame(L, columns=['Height','Sex','GenderDetails'])

# merge on the shared columns (Height and Sex)
df = df.merge(df1, how='left')
print(df)
     Name Sex Height GenderDetails
0  Jackie   F  Small        Female
1    John   M   Tall        T Male
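
One thing to watch with the left join: a Height/Sex pair that is missing from the helper DataFrame comes back as NaN, whereas the original function returns the Sex value in that case. A minimal sketch of restoring that fallback, assuming that behavior is wanted. The names lookup and merged and the third row ("Sam") are made up for illustration.

```python
import pandas as pd

# Sample frame from the question, plus one hypothetical row ("Sam")
# whose Height/Sex pair is not in the lookup table.
df = pd.DataFrame({
    "Name": ["Jackie", "John", "Sam"],
    "Sex": ["F", "M", "M"],
    "Height": ["Small", "Tall", "Medium"],
})

lookup = pd.DataFrame(
    [("Tall", "M", "T Male"), ("Small", "F", "Female")],
    columns=["Height", "Sex", "GenderDetails"],
)

# Left join keeps every row of df; unmatched pairs get NaN in GenderDetails.
merged = df.merge(lookup, on=["Height", "Sex"], how="left")

# Fall back to the Sex column where the lookup had no entry.
merged["GenderDetails"] = merged["GenderDetails"].fillna(merged["Sex"])
print(merged)
```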
