I have a column named Ingredients that holds multiple items. How do I separate each of them into its own column?

For example:

Type      Ingredients
Hybrid    18.7% THC
          1.62% Total Terpenes
          0.61% Myrcene
Indica    0.61% Myrcene
          0.35% Ocimene
          0.18% Limonene

I want to split the Ingredients column into multiple columns (THC, Myrcene, Ocimene, Limonene, etc.), with each column holding that ingredient's percentage.

1 Answer

I think you need Series.str.split on the percent sign followed by \s+ (one or more whitespace characters), assigning the two parts to new columns. Then forward fill the missing values in the Type column, and finally reshape with DataFrame.pivot:

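import numpy as np

# split "18.7% THC" into the number and the ingredient name
df[['per','ingr']] = df['Ingredients'].str.split(r'%\s+', expand=True)
# blank Type cells become NaN, then take the last value above them
df['Type'] = df['Type'].replace('', np.nan).ffill()

# pivot arguments are keyword-only in pandas 2.0+
df = df.pivot(index='Type', columns='ingr', values='per').astype(float)
print (df)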
ingr   Limonene Myrcene Ocimene   THC Total Terpenes
Type                                                
Hybrid      NaN    0.61     NaN  18.7           1.62
Indica     0.18    0.61    0.35   NaN            NaN
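
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample frame is an assumption: it is rebuilt by hand from the example in the question, with empty strings in the blank Type cells, and the name wide is only illustrative.

```python
import numpy as np
import pandas as pd

# Assumed sample data, reconstructed from the question's example.
# Blank Type cells are assumed to be empty strings.
df = pd.DataFrame({
    "Type": ["Hybrid", "", "", "Indica", "", ""],
    "Ingredients": [
        "18.7% THC", "1.62% Total Terpenes", "0.61% Myrcene",
        "0.61% Myrcene", "0.35% Ocimene", "0.18% Limonene",
    ],
})

# Split each item once on "%" plus whitespace: number on the left, name on the right.
df[["per", "ingr"]] = df["Ingredients"].str.split(r"%\s+", n=1, expand=True)

# Turn blank Type cells into NaN and carry the previous Type downward.
df["Type"] = df["Type"].replace("", np.nan).ffill()

# One row per Type, one column per ingredient, values converted to float.
wide = df.pivot(index="Type", columns="ingr", values="per").astype(float)
print(wide)
```

Note that pivot raises an error if the same ingredient appears twice for one Type; in that case DataFrame.pivot_table with an aggregation function may be a better fit.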

2 Comments

It didn't work for me. I think it's because it also has '\n'
@HellyBhalodia - Can you create sample data by d = df.head().to_dict('list') ?
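
The real data was never posted, so the following is only a guess at the '\n' case: it assumes there is one row per Type and that each Ingredients cell holds all of its items joined by newlines. Under that assumption, a sketch could split on the newline and explode to one item per row before applying the same split and pivot. The names long and wide are illustrative.

```python
import pandas as pd

# Hypothetical layout (not confirmed by the asker): one row per Type,
# with all items packed into a single newline-separated string.
df = pd.DataFrame({
    "Type": ["Hybrid", "Indica"],
    "Ingredients": [
        "18.7% THC\n1.62% Total Terpenes\n0.61% Myrcene",
        "0.61% Myrcene\n0.35% Ocimene\n0.18% Limonene",
    ],
})

# Split each cell into a list of items, then give every item its own row.
long = (
    df.assign(Ingredients=df["Ingredients"].str.split("\n"))
      .explode("Ingredients")
      .reset_index(drop=True)
)

# Drop stray whitespace, then separate the number from the ingredient name.
long["Ingredients"] = long["Ingredients"].str.strip()
long[["per", "ingr"]] = long["Ingredients"].str.split(r"%\s+", n=1, expand=True)

# Reshape to one row per Type and one column per ingredient.
wide = long.pivot(index="Type", columns="ingr", values="per").astype(float)
print(wide)
```

No forward fill is needed here because explode repeats the Type value for every item that came from the same cell.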
