I have the following DataFrame:
import pandas as pd
test = {'title': ['Undeclared milk in Burnbrae', 'Undeclared milk in certain Bumble', 'Certain cheese products may contain listeria', 'Ocean brand recalled due to Salmonella', 'IQF Raspberries due to Listeria']}
example = pd.DataFrame(test)
example
   title
0  Undeclared milk in Burnbrae
1  Undeclared milk in certain Bumble
2  Certain cheese products may contain listeria
3  Ocean brand recalled due to Salmonella
4  IQF Raspberries due to Listeria
I want to extract part of each title into a new hazard column in the same DataFrame. I want my result to look like this:
test = {'hazard': ['Undeclared milk', 'Undeclared milk', 'listeria', 'Salmonella', 'Listeria'], 'title': ['Undeclared milk in Burnbrae', 'Undeclared milk in certain Bumble', 'Certain cheese products may contain listeria', 'Ocean brand recalled due to Salmonella', 'IQF Raspberries due to Listeria']}
example2 = pd.DataFrame(test)
example2
   hazard           title
0  Undeclared milk  Undeclared milk in Burnbrae
1  Undeclared milk  Undeclared milk in certain Bumble
2  listeria         Certain cheese products may contain listeria
3  Salmonella       Ocean brand recalled due to Salmonella
4  Listeria         IQF Raspberries due to Listeria
Essentially, my separators are "in", "may contain" and "due to": the hazard is the text before "in", or the text after "may contain" or "due to".
# text before " in"
example['hazard'] = example['title'].str.extract(r'^(.*?) in\b')
# text after " may contain "
example['hazard'] = example['title'].str.extract(r'\b may contain (.*)$')
# text after " due to "
example['hazard'] = example['title'].str.extract(r'\b due to (.*)$')
I wrote the code above to test each separator on its own, but each line overwrites the hazard column. I would like to extract all three cases into the same column.
How can I do this?
I appreciate all the help
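One direction that might work, as a minimal sketch rather than a definitive solution: join the three patterns above into a single alternation, so that each alternative keeps its own capture group, and then take whichever group matched. The names pattern and parts are placeholders I made up for illustration. This assumes each title contains only one of the separators; if a title contained " in" as well as another separator, the "in" rule would win.

```python
import pandas as pd

example = pd.DataFrame({'title': [
    'Undeclared milk in Burnbrae',
    'Undeclared milk in certain Bumble',
    'Certain cheese products may contain listeria',
    'Ocean brand recalled due to Salmonella',
    'IQF Raspberries due to Listeria',
]})

# The three original patterns joined with "|"; each keeps its own capture group.
pattern = r'^(.*?) in\b|\b may contain (.*)$|\b due to (.*)$'

# With several groups, str.extract returns one column per group (0, 1, 2);
# groups that did not take part in the match are NaN.
parts = example['title'].str.extract(pattern)

# Take the first group that matched for each row.
example['hazard'] = parts[0].fillna(parts[1]).fillna(parts[2])

print(example[['hazard', 'title']])
```

Titles that match none of the three separators would end up with NaN in hazard.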