6

I have the following DataFrame (df_hvl) with the column name "FzListe" and the following data:

FzListe
7MA1, 7OS1
7MA1, 7ZJB
7MA2, 7MA3, 7OS1
76G1, 7MA1, 7OS1
7MA1, 7OS1
71E5, 71E6, 7MA1, FSS1
71E4, 7MA1, 7MB1, 7OS1
71E6, 7MA1, 7OS1
7MA1
7MA1, 7MB1, 7OS1
7MA1
7MA1, 7MA2, 7OS1
04, 7MA1
76G1, 7MA1, 7OS1
76G1, 7MA1, 7OS1
7MA1, 7OS1
7MA1
76G1, 7MA1, 7OS1
76G1, 7MA1, 7OS1
71E6, 7MA1
7MA1, 7MA2, 7OS1
7MA1
7MA1
7MA1
7MA1, 7OS1
76G1, 7MA1

I want to search for the substring "7MA" and count how often it appears in the column. (The real data is much longer than this snippet.) I don't want to search only for 7MA1, because a row can also contain 7MA2 and/or 7MA3, and so on.

The DataFrame is called df_hvl. I searched for a solution but didn't find one.

2
  • What is the desired output? Commented Feb 28, 2017 at 10:11
  • A count of how often 7MA appears in the column (including 7MA1, 7MA2, 7MA3 and so on). Commented Feb 28, 2017 at 10:11

4 Answers

12

I think you need str.count with sum:

substr = '7MA'
print (df_hvl.FzListe.str.count(substr))
0     1
1     1
2     2
3     1
4     1
5     1
6     1
7     1
8     1
9     1
10    1
11    2
12    1
13    1
14    1
15    1
16    1
17    1
18    1
19    1
20    2
21    1
22    1
23    1
24    1
25    1
Name: FzListe, dtype: int64

substr = '7MA'
print (df_hvl.FzListe.str.count(substr).sum())
29
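
For anyone who wants to try this without the full data, here is a minimal self-contained sketch. It assumes a small frame built from just the first five rows of the question's column, so the total differs from the 29 above.

```python
import pandas as pd

# First five rows of the question's column (a subset, not the full data)
df_hvl = pd.DataFrame({
    "FzListe": [
        "7MA1, 7OS1",
        "7MA1, 7ZJB",
        "7MA2, 7MA3, 7OS1",
        "76G1, 7MA1, 7OS1",
        "7MA1, 7OS1",
    ]
})

substr = "7MA"

# Number of occurrences of the substring in each row
per_row = df_hvl["FzListe"].str.count(substr)

# Total number of occurrences across the column
total = int(per_row.sum())

print(per_row.tolist())
print(total)
```

The third row contributes 2 because it contains both 7MA2 and 7MA3.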

1 Comment

nice it is really elegant
2

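This will probably work too, if you only need the number of rows that contain "7MA" (a row with both 7MA1 and 7MA2 is counted once):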

df_hvl.FzListe.map(lambda d: "7MA" in d).sum()
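
A small sketch of how this differs from str.count: the membership test yields one True per matching row, while str.count adds up every occurrence. The three-row Series is made up for illustration; its last row (without any 7MA code) does not come from the question.

```python
import pandas as pd

# Illustrative data; the last row is a hypothetical row without "7MA"
s = pd.Series(["7MA1, 7OS1", "7MA2, 7MA3, 7OS1", "76G1, 7OS1"])

# Rows that contain the substring at least once
rows_with_match = int(s.map(lambda d: "7MA" in d).sum())

# Total occurrences of the substring across all rows
occurrences = int(s.str.count("7MA").sum())

print(rows_with_match, occurrences)
```

Which number is wanted depends on whether 7MA2 and 7MA3 in the same row should count once or twice.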

1 Comment

This worked for me, and also seems to work with a series column (strings inside brackets) which is cool.
0

I would try something like this I think

b = 0
for index in df_hvl.index:
    # split the row's comma-separated string into individual codes
    A = df_hvl.loc[index, 'FzListe'].split(',')
    for element in A:
        if '7MA' in element:
            b += 1
print(b)
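
The same idea can be sketched without an explicit loop, assuming a pandas version that has Series.explode (0.25 or later). Note that this variant checks whether each code starts with "7MA", which is slightly stricter than the `in` test in the loop; the sample rows are taken from the question.

```python
import pandas as pd

s = pd.Series(["7MA1, 7OS1", "7MA2, 7MA3, 7OS1", "04, 7MA1"])

# One code per row: split on commas, flatten, strip surrounding spaces
tokens = s.str.split(",").explode().str.strip()

# Count the codes that begin with the prefix
count = int(tokens.str.startswith("7MA").sum())

print(count)
```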


-2

You need to use Series.str.count, which accepts a regex pattern as its first argument and, optionally, regex flags as its second argument to modify the matching behavior:

import re
substr = '7MA'
df_hvl['FzListe'].str.count(re.escape(substr))
# enable case-insensitive matching:
df_hvl['FzListe'].str.count(re.escape(substr), re.I)

You need to use re.escape because Series.str.count treats substr as a regex: it can fail or miscount if substr contains special regex metacharacters.
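
To illustrate the effect of escaping, here is a minimal sketch. The Series and the search string "7MA." are hypothetical, chosen only because "." is a regex metacharacter that matches any character.

```python
import re
import pandas as pd

# Hypothetical data containing a literal dot in one code
s = pd.Series(["7MA1, 7MA.2", "7MA3"])

substr = "7MA."  # hypothetical search string with a metacharacter

# Unescaped: "." matches any character, so 7MA1 and 7MA3 are counted too
unescaped = int(s.str.count(substr).sum())

# Escaped: only the literal text "7MA." is counted
escaped = int(s.str.count(re.escape(substr)).sum())

# Case-insensitive literal match via the flags argument
case_insensitive = int(s.str.count(re.escape("7ma"), flags=re.I).sum())

print(unescaped, escaped, case_insensitive)
```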

Related posts:

In case you need to match a whole word...

