I have a dataframe containing a few columns with arrays. Here's a sample of one of the columns:

   key            arraylist
0  PROJECT-13051  [value1, value2, value4]
1  PROJECT-13050  [value2, value3, value4]
2  PROJECT-13049  [value1, value2, value3]
3  PROJECT-13048  [value3, value4, value5]
4  PROJECT-13047  [value1, value2, value5]

I pull this data from a SQL database as comma-separated strings, then use the following to convert each one to a list:

df['arraylist'] = df['arraylist'].apply(literal_eval)

I'd like to group by the arraylist column and count how often each value occurs across the arrays:

df.groupby('arraylist').size()

This results in the error TypeError: unhashable type: 'list'
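
For context, groupby needs hashable keys, and Python lists are not hashable. The following is only a minimal sketch of that behaviour using plain Python (the variable names are made up for illustration); it does not reproduce what pandas does internally:

```python
# Hypothetical sample value standing in for one cell of the arraylist column.
tags = ['value1', 'value2', 'value4']

try:
    hash(tags)              # lists are mutable, so Python refuses to hash them
    list_hashable = True
    message = ''
except TypeError as exc:
    list_hashable = False
    message = str(exc)      # same wording as the error quoted in the question

# A tuple with the same items is immutable and can be hashed.
tuple_hash = hash(tuple(tags))

print(list_hashable, message)
```

This is why the whole list cannot act as a group key, and why the values need to be split into separate rows before counting.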

I'd like to get an output like so:

arraylist
value1      3
value2      4
value3      3
value4      3
value5      2
dtype: int64

Any help would be greatly appreciated!

1 Answer

Try with explode + value_counts:

df['arraylist'].explode().value_counts()
value2    4
value1    3
value4    3
value3    3
value5    2
Name: arraylist, dtype: int64

Optionally, add sort_index to sort the result as in the question:

df['arraylist'].explode().value_counts().sort_index()
value1    3
value2    4
value3    3
value4    3
value5    2
Name: arraylist, dtype: int64

Or use natsorted (from the third-party natsort package) for natural alphanumeric ordering:

from natsort import natsorted

df['arraylist'].explode().value_counts().loc[lambda s: natsorted(s.index)]
value1    3
value2    4
value3    3
value4    3
value5    2
Name: arraylist, dtype: int64
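
If installing natsort is not an option, a rough equivalent can be sketched with the standard library. The natural_key helper below is a hypothetical name, not part of natsort or pandas, and it only handles simple labels made of text and digit runs:

```python
import re

def natural_key(label):
    # Split into alternating text and digit runs; digit runs compare as integers.
    # 'value14' -> ['value', 14, '']
    return [int(part) if part.isdigit() else part
            for part in re.split(r'(\d+)', label)]

# Hypothetical labels, chosen so that plain and natural ordering differ.
labels = ['value14', 'value2', 'value1', 'value5']

plain_sorted = sorted(labels)                      # lexicographic order
natural_sorted = sorted(labels, key=natural_key)   # numeric-aware order

print(plain_sorted)
print(natural_sorted)
```

With labels like value1 to value5 the two orderings coincide, so the difference only shows up once a multi-digit suffix such as value14 appears.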

DataFrame and Imports Used:

from ast import literal_eval

import pandas as pd

df = pd.DataFrame({
    'key': ['PROJECT-13051', 'PROJECT-13050', 'PROJECT-13049',
            'PROJECT-13048', 'PROJECT-13047'],
    'arraylist': ['["value1", "value2", "value4"]',
                  '["value2", "value3", "value4"]',
                  '["value1", "value2", "value3"]',
                  '["value3", "value4", "value5"]',
                  '["value1", "value2", "value5"]']
})
df['arraylist'] = df['arraylist'].apply(literal_eval)
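
The groupby form from the question should also work once the lists are exploded into one row per value. A minimal sketch, assuming the sample data shown in the question (the lists are written directly here rather than parsed with literal_eval):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'key': ['PROJECT-13051', 'PROJECT-13050', 'PROJECT-13049',
            'PROJECT-13048', 'PROJECT-13047'],
    'arraylist': [['value1', 'value2', 'value4'],
                  ['value2', 'value3', 'value4'],
                  ['value1', 'value2', 'value3'],
                  ['value3', 'value4', 'value5'],
                  ['value1', 'value2', 'value5']]
})

# explode gives each list element its own row, so the group keys are
# plain strings (hashable) instead of lists.
counts = df.explode('arraylist').groupby('arraylist').size()
print(counts)
```

Because groupby sorts its keys by default, the result comes back ordered by value, matching the layout requested in the question.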
1 Comment

Thanks Henry, looks like the explode function was what I was after!