2

The count() function returns the number of times a substring occurs in a string, but it fails in case of overlapping strings.

Let's say my input is:

^_^_^-_-

I want to find how many times ^_^ occurs in the string.

mystr=input()
happy=mystr.count('^_^')
sad=mystr.count('-_-')
print(happy)
print(sad)

Output is:

1
1

I am expecting:

2
1

How can I achieve the desired result?

3 Answers 3

3

New Version

You can solve this problem without writing any explicit loops using regex. As @abhijith-pk's answer cleverly suggests, you can search for the first character only, with the remainder being placed in a positive lookahead, which will allow you to make the match with overlaps:

def count_overlapping(string, pattern):
    regex = '{}(?={})'.format(re.escape(pattern[:1]), re.escape(pattern[1:]))
    # Consume iterator, get count with minimal memory usage
    return sum(1 for _ in re.finditer(regex, string))

[IDEOne Link]

Using [:1] and [1:] for the indices allows the function to handle the empty string without special processing, while using [0] and [1:] for the indices would not.

Old Version

You can always write your own routine using the fact that str.find allows you to specify a starting index. This routine will not be very efficient, but it should work:

def count_overlapping(string, pattern):
    count = 0
    start = -1
    while True:
        start = string.find(pattern, start + 1)
        if start < 0:
            return count
        count += 1

[IDEOne Link]

Usage

Both versions return identical results. A sample usage would be:

>>> mystr = '^_^_^-_-'
>>> count_overlapping(mystr, '^_^')
2
>>> count_overlapping(mystr, '-_-')
1
>>> count_overlapping(mystr, '')
9
>>> count_overlapping(mystr, 'x')
0

Notice that the empty string is found len(mystr) + 1 times. I consider this to be intuitively correct because it is effectively between and around every character.

Sign up to request clarification or add additional context in comments.

Comments

3

you can use regex for a quick and dirty solution :

import re
mystr='^_^_^-_-'
print(len(re.findall('\^(?=_\^)',mystr)))

6 Comments

This is awesome. Search for first character, positive lookahead on the rest. May I steal and generalize it?
@MadPhysicist Yes , Please and thanks for for explaining it nicely :)
I knew one of the regex ninjas will come out and post better answer :-)
@Abhijithpk. See the update to my answer. Seems to work like the crap I posted before, but so much more elegant. Thanks for showing me this
@AyushGoyal Have a look at MadPhysicist 's updated answer ,He is explaining how positive lookahead works and he uses re.escape to avoid escaping regex special characters like ^ instead of my crude backslash escape sequence
|
1

You need something like this

def count_substr(string,substr):
    n=len(substr)
    count=0
    for i in range(len(string)-len(substr)+1):
        if(string[i:i+len(substr)] == substr):      
            count+=1
    return count

mystr=input()
print(count_substr(mystr,'121'))

Input: 12121990

Output: 2

6 Comments

This is a totally functional solution, but it has a number of inefficiencies. The main one is that string[i:i+len(substr)] creates a new string object every single time.
totally agree with you. It's not an efficient solution but only thing I could come up with. will surly improve :-)
It's completely functional, so +1. I really don't think mine is much better, but I couldn't think of a better way either :)
I especially like that we get the same result for an empty patter without any special processing.
I think it is absolutely ideal to do that :) I even addressed it in my answer.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.