How to rearrange my dataframe according to column names while searching for specific strings in cells?
My dataframe:
| 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| apple pie | banana bread | orange juice | nan | nan |
| apple cookies | orange lemonade | nan | nan | nan |
| banana muffin | orange ice | berry candy | nan | nan |
| berry juice | nan | nan | nan | nan |
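I want to rearrange each row so that every cell ends up under the column whose name appears in the cell's text. The column names come from a list of search strings, and the desired result looks like this: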
| apple | banana | orange | berry | lemon |
|---|---|---|---|---|
| apple pie | banana bread | orange juice | nan | nan |
| apple cookies | nan | orange lemonade | nan | nan |
| nan | banana muffin | orange ice | berry candy | nan |
| nan | nan | nan | berry juice | nan |
I have tried to create a column/list for each fruit by searching for the matching string and keeping the cell when it matches. However, I do not know how to iterate through the dataframe and assign the values; I just get a column of NaNs.
col_names = ['apple', 'banana', 'orange', 'berry', 'lemon']
apples = np.where(df_fruits.str.contains("apple", case=False, na=False), df_fruits, np.nan)
bananas = np.where(df_fruits.str.contains("banana", case=False, na=False), df_fruits, np.nan)
etc...
Edit: I read the dataframe from a CSV file, so the original data consists of rows of strings such as "apple pie, banana bread, orange juice, nan, nan".
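One possible approach, given as a minimal sketch rather than a definitive solution: process each row, and for every non-missing cell place it under the first column name that occurs in its text. The sample frame below is rebuilt by hand from the first table, the keyword list uses the headers of the desired table, and `rearrange_row` is a hypothetical helper name. It assumes the match should be case-insensitive and that a cell matching several keywords (for example "orange lemonade") goes to the keyword that comes first in the list, which is what the desired table suggests.

```python
import numpy as np
import pandas as pd

# Keywords in priority order; taken from the headers of the desired table.
col_names = ["apple", "banana", "orange", "berry", "lemon"]

# Sample data rebuilt by hand from the first table in the question.
df_fruits = pd.DataFrame([
    ["apple pie", "banana bread", "orange juice", np.nan, np.nan],
    ["apple cookies", "orange lemonade", np.nan, np.nan, np.nan],
    ["banana muffin", "orange ice", "berry candy", np.nan, np.nan],
    ["berry juice", np.nan, np.nan, np.nan, np.nan],
])

def rearrange_row(row):
    """Hypothetical helper: map each cell to the first keyword it contains."""
    out = dict.fromkeys(col_names, np.nan)
    for cell in row.dropna():
        text = str(cell).lower()
        for name in col_names:
            if name in text:
                out[name] = cell
                break  # first matching keyword wins
    return pd.Series(out)

# axis=1 applies the helper to each row; the returned Series become the new rows.
result = df_fruits.apply(rearrange_row, axis=1)
print(result)
```

Cells that contain none of the keywords, including a literal "nan" string left over from the CSV, are simply dropped by this sketch. If two cells in one row match the same keyword, the later one overwrites the earlier one.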