python create a new column using multiple conditions

Question

I'm just starting with Python and I have a big list of subjects with their body mass index (BMI), along with a lot of other data. I need to create a new column (called OMS) stating whether each subject is "normal", "overweight", "obese", etc.

But I just can't find the correct way to do it. I tried np.where, but that only works with 2 conditions.

I tried the if, elif, else without success and also the:

df['oms'] = np.nan

df['oms'].loc[(df['IMC'] <=18.5 )] = "slim"

df['oms'].loc[(df['IMC'] >= 18.5) & (df['IMC'] <25 )] = "normal"

df['oms'].loc[(df['IMC'] >= 25) & (df['IMC'] <=30 )] = "overweight"

df['oms'].loc[(df['IMC'] > 30)] = "obese"

any ideas? I'm stuck.

Just for the sake of fun and learning, I tried all 4 solutions you guys suggested and, with the proper adjustments, all worked.
– Micaela De León, Commented Dec 29, 2019 at 23:08

Alexander · Accepted Answer · 2019-12-29 22:38:28Z

1

df.loc[df['IMC'].lt(18.5), 'oms'] = "slim"
df.loc[df['IMC'].ge(18.5) & df['IMC'].lt(25), 'oms'] = "normal"
df.loc[df['IMC'].ge(25) & df['IMC'].lt(30), 'oms'] = "overweight"
df.loc[df['IMC'].ge(30), 'oms'] = "obese"

You can also use pd.cut.

import pandas as pd

bins = [0, 18.5, 25, 30, 9999]
labels = ['slim', 'normal', 'overweight', 'obese']

df = pd.DataFrame({'IMC': [15, 20, 27, 40]})
df['oms'] = pd.cut(df['IMC'], bins, labels=labels)
>>> df
   IMC         oms
0   15        slim
1   20      normal
2   27  overweight
3   40       obese
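One caveat, sketched here under the assumption that you want the same cutoffs as the df.loc version above: by default pd.cut closes each bin on the right, so a BMI of exactly 18.5 is labelled "slim" rather than "normal". Passing right=False makes the bins left-closed instead. Using np.inf as the last edge (my choice, not part of the original answer) avoids relying on a sentinel like 9999.

```python
import numpy as np
import pandas as pd

# Sample data chosen to sit exactly on the bin edges.
df = pd.DataFrame({'IMC': [15, 18.5, 25, 30, 40]})

# With right=False the bins are [0, 18.5), [18.5, 25), [25, 30), [30, inf).
bins = [0, 18.5, 25, 30, np.inf]
labels = ['slim', 'normal', 'overweight', 'obese']

df['oms'] = pd.cut(df['IMC'], bins=bins, labels=labels, right=False)
print(df)
```

The resulting column has a categorical dtype; call .astype(str) on it if you need plain strings.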

edited Dec 29, 2019 at 22:38

answered Dec 29, 2019 at 22:24

Georgina Skibinski · Answer · 2019-12-29 22:26:59Z

0

Try maybe:

df['oms'] = ""  # keep it object dtype

df.loc[(df['IMC'] <=18.5 ), 'oms'] = "slim"
df.loc[(df['IMC'] >= 18.5) & (df['IMC'] <25 ), 'oms'] = "normal"
df.loc[(df['IMC'] >= 25) & (df['IMC'] <=30 ), 'oms'] = "overweight"
df.loc[(df['IMC'] > 30), 'oms'] = "obese"

answered Dec 29, 2019 at 22:26

Nathan Furnal · Answer · 2019-12-29 22:41:53Z

0

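Use numpy.select. I like this alternative because it's very versatile: you can easily add or remove conditions.

import numpy as np

# The first bound is < 18.5 so that values between 18 and 18.5 are not left unmatched.
condlist = [df["IMC"] < 18.5,
           (df["IMC"] >= 18.5) & (df['IMC'] < 25),
           (df["IMC"] >= 25) & (df['IMC'] <= 30),
            df["IMC"] > 30]

condchoice = ["slim", "normal", "overweight", "obese"]

# A string default keeps the result dtype consistent (the built-in default is the integer 0).
df["oms"] = np.select(condlist, condchoice, default="")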
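For anyone who wants to try this without an existing DataFrame, here is a minimal self-contained sketch. The sample values and the "unknown" default are my own assumptions, not part of the answer. It relies on the fact that np.select uses the first condition that is True for each row, so each test only needs the upper bound of its range.

```python
import numpy as np
import pandas as pd

# Made-up sample data, including values that sit exactly on the cutoffs.
df = pd.DataFrame({"IMC": [15, 18.5, 27, 30, 40]})

# Conditions are evaluated in order and the first match wins,
# so a row with IMC 20 fails the first test and is caught by the second.
condlist = [
    df["IMC"] < 18.5,
    df["IMC"] < 25,
    df["IMC"] < 30,
    df["IMC"] >= 30,
]
choicelist = ["slim", "normal", "overweight", "obese"]

# Rows that match no condition (for example NaN) receive the default.
df["oms"] = np.select(condlist, choicelist, default="unknown")
print(df)
```

Because missing values compare as False against every bound, they end up as "unknown" instead of being silently put in one of the real categories.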

answered Dec 29, 2019 at 22:41

merit_2 · Answer · 2019-12-29 22:51:04Z

0

You can use a lambda function with apply on a pandas DataFrame column.

I created a dummy data file:

bmi,height
20,72
22,73
26,77
5,66
13,60

imported the data file

df = pd.read_csv('data.txt', header=0)

created a column like you did of NaNs (but you don't have to)

df["oms"] = np.nan

and then used a lambda to compare 'bmi' column to some criteria

df['oms'] = df['bmi'].apply(lambda x: 'slim' if x < 18.5 else ('normal' if x<25 else ('overweight' if x<30 else 'obese')))

the data looks like this,

print(df.head())

   bmi  height         oms
0   20      72      normal
1   22      73      normal
2   26      77  overweight
3    5      66        slim
4   13      60        slim
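If the nested lambda gets hard to read, the same logic can be written as a small named function. This is only a sketch: classify_bmi is a hypothetical helper name, and returning None for missing values is my assumption about how you would want NaN handled.

```python
import pandas as pd

def classify_bmi(bmi):
    """Hypothetical helper: map a single BMI value to a label."""
    if pd.isna(bmi):
        return None  # assumption: leave missing values unlabelled
    if bmi < 18.5:
        return "slim"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"

# Same dummy data as above, built inline instead of read from data.txt.
df = pd.DataFrame({"bmi": [20, 22, 26, 5, 13],
                   "height": [72, 73, 77, 66, 60]})

df["oms"] = df["bmi"].apply(classify_bmi)
print(df)
```

Keep in mind that apply calls the function once per row, so on a large table the vectorized approaches in the other answers will usually be faster.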

edited Dec 29, 2019 at 22:51

answered Dec 29, 2019 at 22:39
