Let's say I have a Pandas DataFrame like the following.

In [31]: frame = pd.DataFrame({'a' : ['A/B/C/D', 'A/B/C', 'A/E','D/E/F']})

In [32]: frame
Out[32]: 
         a
0  A/B/C/D
1    A/B/C
2      A/E
3    D/E/F

And I have a list of strings like the following.

In [33]: mylist =['A/B/C/D', 'A/B/C', 'A/B']

Two of the three patterns in mylist appear in my DataFrame, so I need output saying 2/3*100 = 67%.

In [34]: pattern = '|'.join(mylist)
In [35]: frame.a.str.contains(pattern).count()

This is not working. How can I get my expected output?

1 Answer

You can do it this way:

In [1]: len(frame[frame.a.isin(mylist)])/float(len(mylist)) * 100
Out[1]: 66.66666666666666

Or with your method:

In [2]: pattern = '|'.join(mylist)
In [3]: count = frame.a.str.contains(pattern).sum() # sums the True values
In [4]: count/float(len(mylist))*100
Out[4]: 66.666666666666
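
A note on why the original attempt fails: Series.count() counts non-null entries, so it returns 4 here whether or not a row matched, while sum() adds up only the True values. Also keep in mind that both snippets above count matching rows of the DataFrame, not matching entries of mylist, and str.contains does substring (regex) matching, so 'A/B' matches 'A/B/C' as well. The two counts happen to agree for this data. If the goal is the share of mylist entries that occur as exact values in the column, one possible sketch (the names found and percent_found are only illustrative) checks the list against the column instead:

```python
import pandas as pd

frame = pd.DataFrame({'a': ['A/B/C/D', 'A/B/C', 'A/E', 'D/E/F']})
mylist = ['A/B/C/D', 'A/B/C', 'A/B']

# For each entry of mylist, check whether it occurs as an exact value in column 'a'.
found = pd.Series(mylist).isin(frame['a'])

# The mean of a boolean Series is the fraction of True values.
percent_found = found.mean() * 100
print(round(percent_found, 2))
```

Unlike the row-based count, this stays at or below 100% even when the column contains duplicate values.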