Value Count String Occurrences for Pandas Column of Lists type in Python

Question

I have a pandas column in which each value is a list of strings. If a list has multiple strings, they are separated by a comma and a newline (" ,\n "). Otherwise, the notation is simply: [\n "string" \n] (notice how each string has a \n preceding it).

Is it possible to count, across the entire column, the number of times each string occurs?

     Outcomes
0   [\n "springs"\n]
1   [\n "to_do"\n]
2   [\n "replace"\n]
3   [\n "null"\n]
4   [\n "finance"\n]
5   [\n "finance"\n]
6   [\n "project_management" ,\n "sprints...
7   [\n "to_do" ,\n "finance...
8   [\n "remote"\n]
9   [\n "get_it_done"\n]
10  [\n "get_it_done" ,\n "remote...

Target output should be like the following:

Outcomes      Value_count
springs            21
to_do              12
replace            2
null               1
finance            24
project_management 12
get_it_done        22

I tried something like the following, but I get an error because the object type is not iterable:

pd.Series([x for item in df['Outcomes'] for x in item]).value_counts()

jezrael · Accepted Answer · 2021-05-11 04:47:44Z

Use Series.str.split with Series.explode and Series.str.strip first:

s = df['Outcomes'].str.split(',').explode().str.strip('[] ').value_counts()
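The one-liner above strips only brackets and spaces. If the cells also contain real newline characters and double quotes, as the question's display suggests, a variant that strips those too might look like the sketch below. The sample DataFrame is made up for illustration, and the final step that builds the two-column Outcomes / Value_count table is an assumption about the desired shape.

```python
import pandas as pd

# Hypothetical sample data mimicking the question's format
# (each cell is a string containing real newlines and quoted items).
df = pd.DataFrame({
    "Outcomes": [
        '[\n "springs"\n]',
        '[\n "to_do"\n]',
        '[\n "finance"\n]',
        '[\n "to_do" ,\n "finance"\n]',
    ]
})

counts = (
    df["Outcomes"]
    .str.split(",")          # one list of raw pieces per row
    .explode()               # one raw piece per row
    .str.strip('[]" \n')     # drop brackets, quotes, spaces and newlines
    .value_counts()
)

# Reshape into a two-column frame like the target output.
result = counts.rename_axis("Outcomes").reset_index(name="Value_count")
print(result)
```

Splitting on a bare comma assumes that no individual string contains a comma itself; if that can happen, parsing the cell as a list is the safer route.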

Or convert the values to lists with ast.literal_eval:

import ast
pd.Series([x.strip() for item in df['Outcomes'] for x in ast.literal_eval(item)]).value_counts()
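A runnable sketch of the ast.literal_eval route, assuming each cell is a string that is a valid Python list literal and that missing cells are None/NaN (one possible cause of a "not iterable" error, though the question does not confirm it). The sample data is hypothetical.

```python
import ast

import pandas as pd

# Hypothetical sample data; the None row stands in for a missing value.
df = pd.DataFrame({
    "Outcomes": [
        '[\n "springs"\n]',
        '[\n "to_do" ,\n "finance"\n]',
        None,
        '[\n "finance"\n]',
    ]
})

# Drop missing cells first, then parse each remaining string into a real list.
parsed = df["Outcomes"].dropna().apply(ast.literal_eval)

# One string per row, trimmed of stray whitespace, then counted.
counts = parsed.explode().str.strip().value_counts()
print(counts)
```

This only works when the newlines in the cells are real newline characters. If the cells instead hold a literal backslash followed by n, ast.literal_eval would most likely fail to parse them, and the split/strip approach is the more forgiving choice.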

edited May 11, 2021 at 4:47

answered May 11, 2021 at 4:32
