88

I am trying to figure out how many times a string occurs in a string. For example:

nStr = '000123000123'

Say the string I want to find is 123. Obviously it occurs twice in nStr but I am having trouble implementing this logic into Python. What I have got at the moment:

pattern = '123'
count = a = 0
while pattern in nStr[a:]:
    a = nStr[a:].find(pattern)+1
    count += 1
return count

The answer it should return is 2. I'm stuck in an infinite loop at the moment.

I was just made aware that count is a much better way to do it but out of curiosity, does anyone see a way to do it similar to what I have already got?

14 Answers

130

Use str.count:

>>> nStr = '000123000123'
>>> nStr.count('123')
2
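As a small aside, str.count also accepts optional start and end indices, which restrict the search to a slice without copying it. A minimal sketch using the string from the question (the variable names total, tail and head are only illustrative):

```python
nStr = '000123000123'

total = nStr.count('123')       # whole string
tail = nStr.count('123', 6)     # only nStr[6:]
head = nStr.count('123', 0, 6)  # only nStr[0:6]

print(total, tail, head)  # 2 1 1
```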

A working version of your code:

nStr = '000123000123'
pattern = '123'
count = 0
flag = True
start = 0

while flag:
    # find() returns -1 if the pattern is not found; start is the index
    # where the search begins (default 0).
    a = nStr.find(pattern, start)
    if a == -1:  # pattern not found: stop the loop
        flag = False
    else:  # pattern found: count it and resume the search at a + 1
        count += 1
        start = a + 1
print(count)

1 Comment

str.count does not count overlapping occurrences. For example, with pattern = "323" and nStr = "10032323000123", 323 appears twice (the two occurrences overlap), but count returns 1. The second solution counts both.
36

The problem with count() and other methods shown here is in the case of overlapping substrings.

For example: "aaaaaa".count("aaa") returns 2

If you want it to return 4 [(aaa)aaa, a(aaa)aa, aa(aaa)a, aaa(aaa)] you might try something like this:

def count_substrings(string, substring):
    string_size = len(string)
    substring_size = len(substring)
    count = 0
    for i in xrange(0,string_size-substring_size+1):
        if string[i:i+substring_size] == substring:
            count+=1
    return count

count_substrings("aaaaaa", "aaa")
# 4

Not sure if there's a more efficient way of doing it, but I hope this clarifies how count() works.
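To make the difference concrete, here is a rough sketch (not how CPython implements count internally) in which the only thing that changes between the two behaviours is where the search resumes after a match. The function name count_matches and its overlapping flag are made up for this illustration, and a non-empty substring is assumed:

```python
def count_matches(string, substring, overlapping=False):
    # Hypothetical helper: assumes substring is non-empty.
    if not substring:
        raise ValueError("substring must be non-empty")
    # Non-overlapping: resume after the end of the match (what count() reports).
    # Overlapping: resume one character after the start of the match.
    step = 1 if overlapping else len(substring)
    count = 0
    i = string.find(substring)
    while i != -1:
        count += 1
        i = string.find(substring, i + step)
    return count

print(count_matches("aaaaaa", "aaa"))                    # 2, same as "aaaaaa".count("aaa")
print(count_matches("aaaaaa", "aaa", overlapping=True))  # 4
```

Letting find() jump straight to the next match also avoids building a slice at every index.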

1 Comment

Note that xrange() was renamed to range() in Python 3.
7
import re

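pattern = '123'
string = '000123000123'  # sample input; the original snippet left string undefined

# re.escape keeps the pattern literal if it contains regex metacharacters
n = re.findall(re.escape(pattern), string)

The substring pattern appears len(n) times in string.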
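One caveat: re.findall interprets its first argument as a regular expression, so a substring containing characters such as . or * may match more than the literal text. A small sketch with made-up sample text (text, raw_hits and literal_hits are illustrative names) showing the effect of re.escape:

```python
import re

text = '1.3 123 1x3'
pattern = '1.3'

# Unescaped, the dot matches any character, so all three tokens match.
raw_hits = re.findall(pattern, text)
# Escaped, only the literal '1.3' matches.
literal_hits = re.findall(re.escape(pattern), text)

print(len(raw_hits), len(literal_hits))  # 3 1
```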

2 Comments

This computes the count WITHOUT overlaps!
@BCuster good point. See my answer below which uses regex to compute the count WITH overlaps.
4

In case you are looking for a way to count overlapping occurrences:

s = 'azcbobobegghaklbob'
sub = 'bob'  # renamed from str to avoid shadowing the built-in
results = 0
sub_len = len(sub)
for i in range(len(s)):
    if s[i:i+sub_len] == sub:
        results += 1
print(results)

This prints 3 because of these matches: [azc(bob)obegghaklbob] [azcbo(bob)egghaklbob] [azcbobobegghakl(bob)]


1

I'm pretty new, but I think this is a good solution? maybe?

def count_substring(string, sub_str):
    count = 0
    for i, _ in enumerate(string):
        # compare a window as long as sub_str (the original hardcoded a length of 2)
        if sub_str == string[i:i+len(sub_str)]:
            count += 1
    return count


0

string.count(substring) is not useful in case of overlapping.

My approach:

def count_substring(string, sub_string):

    length = len(string)
    counter = 0
    for i in range(length):
        for j in range(length):
            if string[i:j+1] == sub_string:
                counter +=1
    return counter


0

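find() returns an index relative to the slice nStr[a:], so assigning that result to a resets the position instead of advancing it, and the loop never ends. You should put: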

a += nStr[a:].find(pattern)+1

...instead of:

a = nStr[a:].find(pattern)+1
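For reference, a minimal sketch of the loop from the question with that one-line change applied. It is wrapped in a function so that return is valid; the name count_from_slices is only illustrative, and a non-empty pattern is assumed (an empty one would loop forever):

```python
def count_from_slices(nStr, pattern):
    # Hypothetical wrapper around the question's loop; assumes pattern is non-empty.
    count = a = 0
    while pattern in nStr[a:]:
        # find() is relative to the slice, so add it to the running offset.
        a += nStr[a:].find(pattern) + 1
        count += 1
    return count

print(count_from_slices('000123000123', '123'))  # 2
```

Because the offset only moves one character past the start of each match, this version also counts overlapping occurrences.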


0
def count_substring(string, sub_string):
    c = 0
    l = len(sub_string)
    for i in range(len(string)):
        if string[i:i+l] == sub_string:
            c = c + 1
    return c

string = input().strip()
sub_string = input().strip()

count = count_substring(string, sub_string)
print(count)


0

As mentioned by @João Pesce and @gaurav, count() is not useful in the case of overlapping substrings, try this out...

def count_substring(string, sub_string):
    c=0
    for i in range(len(string)):
        if(string[i:i+len(sub_string)]==sub_string):
            c = c+1
    return c


0
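def countOccurance(text, pat):
    # Note: this counts whitespace-separated words that contain pat,
    # not the total number of occurrences of pat in text.
    count = 0
    wordList = text.split()
    for word in wordList:
        if pat in word:
            count += 1
    return count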


0

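I usually use enumerate for this kind of problem:

def count_substring(string, sub_string):
    count = 0
    for i, _ in enumerate(string):
        # compare a window of len(sub_string) characters starting at i
        # (the original hardcoded a window of 3)
        if string[i:i+len(sub_string)] == sub_string:
            count = count + 1
    return count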

1 Comment

@ruddy_simonpour you might have added that count_substring(nStr, pattern) with nStr = '000123000123' and pattern = '123' yields 2, which is correct.
0

Only one approach here uses regex, and that approach doesn't work for overlaps.

Here is how to use regex with "lookaheads" to find overlapping matches also:

import re

nStr = '00012312310001231'
regex_pattern = '(?=(1231))'

matches = re.findall(regex_pattern, nStr)
print(len(matches))

This prints 3: the lookahead finds both overlapping matches of 1231 in 1231231, plus the final 1231.
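If the positions of the matches are also of interest, the same lookahead idea works with re.finditer. A minimal sketch, assuming a non-empty literal substring (the helper name overlapping_positions is made up here; re.escape keeps the substring from being read as a regex):

```python
import re

def overlapping_positions(text, sub):
    # Hypothetical helper: assumes sub is a non-empty literal string.
    # The lookahead is zero-width, so each match consumes no characters
    # and the scan can find a new match starting at the very next index.
    regex = re.compile('(?=' + re.escape(sub) + ')')
    return [m.start() for m in regex.finditer(text)]

positions = overlapping_positions('00012312310001231', '1231')
print(positions)       # [3, 6, 13]
print(len(positions))  # 3
```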


-1

def count(sub_string, string):
    count = 0
    ind = string.find(sub_string)

    while True:
        if ind > -1:
            count += 1
            # resume one character after the last match so overlaps are counted
            ind = string.find(sub_string, ind + 1)
        else:
            break
    return count


-1
def count_substring(string, sub_string):
    count = 0
    len_sub = len(sub_string)
    for i in range(0,len(string)):
        if(string[i:i+len_sub] == sub_string):
            count+=1
    return count

1 Comment

What advantage does this offer over other answers?
