I have some data that I want to insert into a DataFrame with columns ['Title', 'Category']. Each title has one or more categories, and I decided to store the categories as a list. So my df looks like this:

In [39]: title_cat_df
Out[39]: 
    Title      Category
0  Title1  [Cat1, Cat2]
1  Title3        [Cat5]
2  Title2  [Cat3, Cat4]
...
...
...

However, I don't know whether this is a pythonic/pandaionic(?!) approach, since I have run into problems such as looking for specific categories using isin:

In [41]: test_df['Category'].isin(cat_list)
Out[41]: TypeError: unhashable type: 'list'

What would be a better way to represent categories in this case, and hopefully be able to look for titles in a specific category or categories?

1 Answer

Convert the column to sets and use & to intersect each one with the list, which is also converted to a set:

cat_list = ['Cat1', 'Cat2', 'Cat4']
print(test_df['Category'].apply(set) & set(cat_list))
0     True
1    False
2     True
Name: Category, dtype: bool

Finally, filter by boolean indexing:

test_df = test_df[test_df['Category'].apply(set) & set(cat_list)]
print(test_df)
    Title      Category
0  Title1  [Cat1, Cat2]
2  Title2  [Cat3, Cat4]
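
The same "shares at least one category" test can also be written as an explicit boolean mask, which may help if the & operator on a Series of sets behaves differently in your pandas version. This is only a sketch: the sample frame is rebuilt from the printed output in the question, and the names wanted, mask and filtered are made up for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question's printed output (an assumption).
test_df = pd.DataFrame({
    "Title": ["Title1", "Title3", "Title2"],
    "Category": [["Cat1", "Cat2"], ["Cat5"], ["Cat3", "Cat4"]],
})

cat_list = ["Cat1", "Cat2", "Cat4"]
wanted = set(cat_list)  # hypothetical helper name

# True where a row's list shares at least one category with cat_list.
mask = test_df["Category"].apply(lambda cats: not wanted.isdisjoint(cats))

# Boolean indexing keeps only the matching rows.
filtered = test_df[mask]
print(filtered)
```

Because set.isdisjoint accepts any iterable, the lists do not need to be converted to sets first.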
2 Comments

This works great with my current approach. I have also tested it with my main dataframe of 5 million rows. It did freeze my laptop for a few minutes, but it came through in the end, so thanks. However, from what I gathered, using a list inside a dataframe isn't very idiomatic. What else can be done when there isn't a fixed number of categories for each entry?
Yes, I agree: lists or sets are not a native format in pandas. The only solution would be to create scalars from the lists, but the disadvantage is a larger DataFrame, e.g. the 3-row sample becomes 5 rows :(
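
One possible way to "create scalars from lists", as described in the comment above, is DataFrame.explode (available from pandas 0.25). The sketch below assumes the sample frame from the question. The names long_df, matches and titles are illustrative only.

```python
import pandas as pd

# Sample frame rebuilt from the question's printed output (an assumption).
test_df = pd.DataFrame({
    "Title": ["Title1", "Title3", "Title2"],
    "Category": [["Cat1", "Cat2"], ["Cat5"], ["Cat3", "Cat4"]],
})

cat_list = ["Cat1", "Cat2", "Cat4"]

# One row per (Title, Category) pair: the 3-row sample becomes 5 rows.
long_df = test_df.explode("Category")

# Category now holds plain strings, so isin no longer hits unhashable lists.
matches = long_df[long_df["Category"].isin(cat_list)]

# Collapse back to the distinct titles that matched.
titles = matches["Title"].unique().tolist()
print(len(long_df))
print(titles)
```

The trade-off is the one mentioned in the comment: the long format repeats each title once per category, so the frame grows, but ordinary column operations such as isin and groupby work on it directly.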
