0

I'm just starting with Python and I have a big list of subjects with their body mass index (BMI), along with a lot of other data. I need to create a new column (called OMS) that states whether each subject is "normal", "overweight", "obese", etc.

But I just can't find the correct way to do it. I tried np.where, but that only works with 2 conditions.

I tried if/elif/else without success, and also the following:

df['oms'] = np.nan

df['oms'].loc[(df['IMC'] <=18.5 )] = "slim"

df['oms'].loc[(df['IMC'] >= 18.5) & (df['IMC'] <25 )] = "normal"

df['oms'].loc[(df['IMC'] >= 25) & (df['IMC'] <=30 )] = "overweight"

df['oms'].loc[(df['IMC'] > 30)] = "obese"

Any ideas? I'm stuck.

  • Thanks guys!!! I'll work on it and let you know. Commented Dec 29, 2019 at 22:51
  • Just for fun and to learn, I tried all 4 solutions you suggested and, with the proper adjustments, they all worked. Commented Dec 29, 2019 at 23:08

4 Answers

1
df.loc[df['IMC'].lt(18.5), 'oms'] = "slim"
df.loc[df['IMC'].ge(18.5) & df['IMC'].lt(25), 'oms'] = "normal"
df.loc[df['IMC'].ge(25) & df['IMC'].lt(30), 'oms'] = "overweight"
df.loc[df['IMC'].ge(30), 'oms'] = "obese"
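
For context on why the attempt in the question misbehaves: `df['oms'].loc[mask] = ...` is chained indexing, so pandas may end up assigning to a temporary object rather than to `df` itself. A minimal, self-contained sketch of the single-step form, assuming a small made-up `IMC` column (the mask variable names are only illustrative):

```python
import pandas as pd

# Hypothetical sample data, including the boundary values 18.5, 25 and 30
df = pd.DataFrame({'IMC': [17.0, 18.5, 25.0, 30.0]})

# Build each boolean mask once
is_slim = df['IMC'] < 18.5
is_normal = (df['IMC'] >= 18.5) & (df['IMC'] < 25)
is_overweight = (df['IMC'] >= 25) & (df['IMC'] < 30)
is_obese = df['IMC'] >= 30

# Select rows and column in ONE .loc call so the assignment hits df directly;
# the first assignment also creates the 'oms' column
df.loc[is_slim, 'oms'] = 'slim'
df.loc[is_normal, 'oms'] = 'normal'
df.loc[is_overweight, 'oms'] = 'overweight'
df.loc[is_obese, 'oms'] = 'obese'

print(df)
```

Because the four masks do not overlap and cover every non-missing value, the order of the assignments does not matter here.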

You can also use pd.cut.

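import pandas as pd

bins = [0, 18.5, 25, 30, 9999]
labels = ['slim', 'normal', 'overweight', 'obese']

df = pd.DataFrame({'IMC': [15, 20, 27, 40]})
# Bins are right-inclusive by default, e.g. (18.5, 25]; pass right=False for [18.5, 25)
df['oms'] = pd.cut(df['IMC'], bins, labels=labels)
>>> df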
   IMC         oms
0   15        slim
1   20      normal
2   27  overweight
3   40       obese
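
One detail worth checking with pd.cut is which edge of each bin is inclusive. A minimal sketch, assuming you want the same half-open ranges as the first snippet in this answer (18.5 counts as "normal", 30 as "obese"); the sample values and the use of `float('inf')` as the top edge are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

# Hypothetical sample data that includes the boundary values
df = pd.DataFrame({'IMC': [15, 18.5, 24.9, 25, 30, 40]})

bins = [0, 18.5, 25, 30, float('inf')]
labels = ['slim', 'normal', 'overweight', 'obese']

# right=False closes each bin on the left: [0, 18.5), [18.5, 25), [25, 30), [30, inf)
df['oms'] = pd.cut(df['IMC'], bins=bins, labels=labels, right=False)

print(df)
```

The result is a categorical column; call `.astype(str)` on it if plain strings are easier to work with downstream.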

0

Try maybe:

df['oms'] = ""  # keep it object dtype

# Use < 18.5 for "slim" so the first two ranges no longer overlap at exactly 18.5
df.loc[df['IMC'] < 18.5, 'oms'] = "slim"
df.loc[(df['IMC'] >= 18.5) & (df['IMC'] < 25), 'oms'] = "normal"
df.loc[(df['IMC'] >= 25) & (df['IMC'] <= 30), 'oms'] = "overweight"
df.loc[df['IMC'] > 30, 'oms'] = "obese"


0

Use numpy.select. I like this alternative because it's very versatile: you can easily add or remove conditions.

import numpy as np

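# Conditions are checked in order; the first one that is True wins
condlist = [df["IMC"] < 18.5,
            (df["IMC"] >= 18.5) & (df["IMC"] < 25),
            (df["IMC"] >= 25) & (df["IMC"] <= 30),
            df["IMC"] > 30]

condchoice = ["slim", "normal", "overweight", "obese"]

# A string default keeps the result dtype consistent with the string choices
df["oms"] = np.select(condlist, condchoice, default="")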
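
Since np.select takes the first condition that matches, the lower bounds can be dropped when the conditions are listed in ascending order. A minimal sketch of that idea, assuming made-up sample data with one missing value and an illustrative "unknown" label for rows that match nothing:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row has a missing BMI
df = pd.DataFrame({'IMC': [17.0, 18.2, 22.0, 27.5, 31.0, np.nan]})

# Ordered from lowest to highest threshold: the first True condition wins,
# so "< 25" only applies to rows that did not already match "< 18.5"
condlist = [
    df['IMC'] < 18.5,
    df['IMC'] < 25,
    df['IMC'] < 30,
    df['IMC'] >= 30,
]
choicelist = ['slim', 'normal', 'overweight', 'obese']

# Comparisons against NaN are False, so the NaN row falls through to the default
df['oms'] = np.select(condlist, choicelist, default='unknown')

print(df)
```

This keeps each threshold in exactly one place, which makes it harder to leave a gap between ranges.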


0

You can use lambda functions and apply with pandas DataFrames.

I created a dummy data file:

bmi,height
20,72
22,73
26,77
5,66
13,60

Imported the data file:

import numpy as np
import pandas as pd

df = pd.read_csv('data.txt', header=0)

Created a column of NaNs like you did (but you don't have to):

df["oms"] = np.nan

And then used a lambda to compare the 'bmi' column against the thresholds:

df['oms'] = df['bmi'].apply(lambda x: 'slim' if x < 18.5 else ('normal' if x<25 else ('overweight' if x<30 else 'obese')))

The data looks like this:

print(df.head())

   bmi  height         oms
0   20      72      normal
1   22      73      normal
2   26      77  overweight
3    5      66        slim
4   13      60        slim
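
If the nested conditional expression gets hard to read, the same logic can be moved into a named function and passed to apply. A minimal sketch, assuming the same dummy data as above built inline instead of read from a file; `classify_bmi` is a hypothetical helper name:

```python
import pandas as pd


def classify_bmi(bmi):
    """Return a category label for a single BMI value (same thresholds as the lambda)."""
    if bmi < 18.5:
        return 'slim'
    if bmi < 25:
        return 'normal'
    if bmi < 30:
        return 'overweight'
    return 'obese'


# Same values as the dummy data file, built inline so the example is self-contained
df = pd.DataFrame({'bmi': [20, 22, 26, 5, 13],
                   'height': [72, 73, 77, 66, 60]})

# apply calls classify_bmi once per value in the 'bmi' column
df['oms'] = df['bmi'].apply(classify_bmi)

print(df.head())
```

Note that apply runs Python code row by row, so on very large tables the vectorized approaches in the other answers are likely to be faster.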
