Background
I have a dataset with the following columns:
product_title      price
Women's Pant       20.00
Men's Shirt        30.00
Women's Dress      40.00
Blue 4" Shorts     30.00
Blue Shorts        35.00
Green 2" Shorts    30.00
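To make this reproducible, the sample data can be built as a DataFrame roughly like this (a minimal sketch; I am assuming price is stored as a float and that the frame is named df, as in my code further down):

```python
import pandas as pd

# Sample rows copied from the table above.
# Titles containing a double quote are wrapped in single quotes.
df = pd.DataFrame({
    "product_title": [
        "Women's Pant",
        "Men's Shirt",
        "Women's Dress",
        'Blue 4" Shorts',
        "Blue Shorts",
        'Green 2" Shorts',
    ],
    "price": [20.00, 30.00, 40.00, 30.00, 35.00, 30.00],
})

print(df)
```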
I created a new column called gender that contains the value women, men, or unisex, depending on which strings appear in product_title.
The output looks like this:
product_title      price    gender
Women's Pant       20.00    women
Men's Shirt        30.00    men
Women's Dress      40.00    women
Blue 4" Shorts     30.00    women
Blue Shorts        35.00    unisex
Green 2" Shorts    30.00    women
Approach
I created the new column with chained if/else expressions inside a list comprehension:
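# Titles are lowercased before matching, so every pattern must be lowercase too.
# 'women' is checked before 'men' because "women" contains "men" as a substring.
df['gender'] = ['women' if 'women' in word or 'blue 4"' in word or 'green 2"' in word
                else 'men' if 'men' in word
                else 'unisex'
                for word in df.product_title.str.lower()]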
Although this approach works, it becomes very long when I have many conditions for labeling women vs. men vs. unisex. Is there a cleaner way to do this? Is there a way to pass a list of strings instead of writing a long chain of or conditions?
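To make the question concrete, here is a rough sketch of the kind of thing I am imagining, using Python's built-in any() over lists of keywords. The names women_keywords, men_keywords, and label_gender are placeholders I made up, and I do not know whether this is the idiomatic pandas way to do it:

```python
import pandas as pd

df = pd.DataFrame({
    "product_title": [
        "Women's Pant", "Men's Shirt", "Women's Dress",
        'Blue 4" Shorts', "Blue Shorts", 'Green 2" Shorts',
    ],
})

# Hypothetical keyword lists; kept lowercase because titles are lowercased before matching.
women_keywords = ["women", 'blue 4"', 'green 2"']
men_keywords = ["men"]

def label_gender(title):
    """Return 'women', 'men', or 'unisex' for a single product title."""
    title = title.lower()
    # Check the women list first: "women" contains "men" as a substring.
    if any(keyword in title for keyword in women_keywords):
        return "women"
    if any(keyword in title for keyword in men_keywords):
        return "men"
    return "unisex"

df["gender"] = df["product_title"].apply(label_gender)
print(df)
```

With something like this, adding a new condition would only mean appending a string to one of the lists rather than extending the chain of or conditions.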
I would really appreciate any help, as I am new to Python and the pandas library.