I have the following dataframe and would like to print the unique values of the colors column.
import pandas as pd
df = pd.DataFrame({'colors': ['green', 'green', 'purple', ['yellow , red'], 'orange'], 'names': ['Terry', 'Nor', 'Franck', 'Pete', 'Agnes']})
Output:
           colors   names
0           green   Terry
1           green     Nor
2          purple  Franck
3  [yellow , red]    Pete
4          orange   Agnes
df.colors.unique() would work fine if it weren't for the [yellow , red] row. As it is, I keep getting TypeError: unhashable type: 'list', which is understandable.
Is there a way to still get the unique values without taking this row into account?
I tried the following, but none of them worked:
df = df[~df.colors.str.contains(',', na=False)] # Nothing happens
df = df[~df.colors.str.contains('[', na=False)] # Output: error: unterminated character set at position 0
df = df[~df.colors.str.contains(']', na=False)] # Nothing happens
df.loc[~df.colors.str.contains('[', na=False, regex=False), 'colors'].unique()
@MahendraSingh error: unterminated character set at position 0
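For reference, here is a minimal sketch of one possible way to skip that row. It assumes the problematic cell holds an actual Python list (as in the constructor above) rather than a string containing brackets, which is presumably why the string-based filters never flag it. The names is_list and unique_colors are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'colors': ['green', 'green', 'purple', ['yellow , red'], 'orange'],
    'names': ['Terry', 'Nor', 'Franck', 'Pete', 'Agnes'],
})

# Boolean mask: True where the 'colors' cell is a list instead of a string.
is_list = df['colors'].apply(lambda value: isinstance(value, list))

# Keep only the non-list rows, then take the unique values of that column.
unique_colors = df.loc[~is_list, 'colors'].unique()
print(unique_colors)
```

Testing the type of each cell avoids both the regex issue with '[' and the hashing error, since the list never reaches unique(). If other unhashable types (dicts, sets) could appear, the isinstance check would need to cover them as well.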