0

I'm a relatively inexperienced programmer (but pretty experienced in general) and am looking to improve my Python skills (my language of choice). I've written some useful tools with Python but really want to take my programming/scripting to the next level. I understand the logic but lack familiarity with much of the library. I've been practicing simple programming tasks in Python, and my most recent practice example is a function that takes a string and a substring and outputs the number of occurrences of the substring within the string:

from re import match

def MyFunc(string, substring):
    n = len(substring)
    substring_count = 0
    x = 0
    for char in string:
        if match(substring, string[x:x+n]):
            substring_count = substring_count + 1
        x = x + 1
    return substring_count

Is this an efficient way of doing this? Is my code Pythonic? I also tried another solution without using regex but wasn't nearly as successful.

2
  • This code can definitely be improved, but I'm not sure that StackOverflow is the right site for this sort of question... Commented Feb 25, 2015 at 16:23
  • 1
    I know Vivek Sable has answered the question, but here are some general tips: regex hurts performance and readability, so don't use it if you have a built-in alternative. Also, try x += 1 instead of x = x + 1. Commented Feb 25, 2015 at 16:25
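One concrete risk with the regex approach in the question, shown as a minimal sketch: the substring is passed to `match` as a pattern, so regex metacharacters such as `.` are interpreted rather than matched literally. The names `count_raw` and `count_escaped` are hypothetical; `count_raw` is meant to mirror the logic of the question's `MyFunc`, and `count_escaped` differs only in calling `re.escape` first.

```python
import re

def count_raw(string, substring):
    # Mirrors the question's approach: substring is used directly as a regex.
    n = len(substring)
    return sum(1 for x in range(len(string))
               if re.match(substring, string[x:x + n]))

def count_escaped(string, substring):
    # Same loop, but the substring is escaped so it is matched literally.
    n = len(substring)
    pattern = re.escape(substring)
    return sum(1 for x in range(len(string))
               if re.match(pattern, string[x:x + n]))

print(count_raw("a.b", "."))      # 3, because "." matches any character
print(count_escaped("a.b", "."))  # 1, only the literal dot
```

For plain substrings without metacharacters, both versions should agree, which is why the problem is easy to miss in simple tests.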

3 Answers

2

Use the string count method to get the number of occurrences of a substring in the main string.

Description:

string.count(s, sub[, start[, end]])

Return the number of (non-overlapping) occurrences of substring sub in string s[start:end]. Defaults for start and end and interpretation of negative values are the same as for slices.

e.g.

>>> a = "aabbbffgghhtt"
>>> a.count("ab")
1
>>> a.count("b")
3
>>> a.count("x")
0
>>> 
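The optional start and end arguments from the description can be illustrated with a short sketch. It reuses the sample string from the example above under the assumed variable name `text`; the slices shown in the comments are what each call effectively searches.

```python
text = "aabbbffgghhtt"

print(text.count("b"))        # whole string → 3
print(text.count("b", 3))     # searches text[3:] == "bbffgghhtt" → 2
print(text.count("b", 0, 3))  # searches text[0:3] == "aab" → 1
print(text.count("t", -2))    # negative start, as in slices: text[-2:] == "tt" → 2
```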

3 Comments

This is not the same as the OP's code, because it treats overlaps differently. For example, MyFunc("aaaa","aa") == 3, but "aaaa".count("aa") == 2.
That was my goal -- I wanted to catch the overlaps, not just distinct instances of each.
@devOpsEv then juniorcompressor's answer is the best one
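To make the difference discussed in these comments concrete, here is a minimal sketch of an overlap-aware count that stays with built-in string methods. The function name `count_overlapping` is hypothetical. It uses `str.find` and restarts the search one character after each hit, instead of skipping past the whole match the way `count` does.

```python
def count_overlapping(s, sub):
    """Count occurrences of sub in s, including overlapping ones."""
    if not sub:
        # Assumption: an empty substring is treated as an error here.
        raise ValueError("sub must be a non-empty string")
    count = 0
    start = s.find(sub)
    while start != -1:
        count += 1
        # Advance by one character only, so overlapping matches are found.
        start = s.find(sub, start + 1)
    return count

print(count_overlapping("aaaa", "aa"))  # 3
print("aaaa".count("aa"))               # 2
```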
1

Using regular expressions for non-overlapping searches:

import re

def MyFunc(s, sub):
    return len(re.compile(re.escape(sub)).findall(s))

For overlapping:

def MyFunc(s, sub):
    n, m = len(sub), len(s)
    return sum(sub == s[i:i + n] for i in range(m - n + 1))

The problem you want to solve is the one the Knuth-Morris-Pratt algorithm solves more efficiently.
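As a rough illustration of that remark, the following is a sketch of Knuth-Morris-Pratt adapted to count overlapping occurrences. The function name `kmp_count` and the handling of an empty substring are assumptions, not part of the original answer. The slicing version above compares up to len(sub) characters at every position, while this version scans s once after precomputing a failure table for sub.

```python
def kmp_count(s, sub):
    """Count overlapping occurrences of sub in s using Knuth-Morris-Pratt."""
    if not sub:
        # Assumption: an empty substring is treated as an error here.
        raise ValueError("sub must be a non-empty string")

    # fail[i] = length of the longest proper prefix of sub[:i + 1]
    # that is also a suffix of it.
    fail = [0] * len(sub)
    k = 0
    for i in range(1, len(sub)):
        while k and sub[i] != sub[k]:
            k = fail[k - 1]
        if sub[i] == sub[k]:
            k += 1
        fail[i] = k

    count = 0
    k = 0  # number of characters of sub currently matched
    for ch in s:
        while k and ch != sub[k]:
            k = fail[k - 1]
        if ch == sub[k]:
            k += 1
        if k == len(sub):
            count += 1
            # Fall back instead of resetting, so overlaps are counted.
            k = fail[k - 1]
    return count

print(kmp_count("aaaa", "aa"))      # 3
print(kmp_count("abababa", "aba"))  # 3
```

For short inputs like the ones in the question, the one-line slicing version is likely easier to read and fast enough; the table-based scan mainly pays off on long strings with repetitive content.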


0

If you want your own function that uses only built-in Python functionality, use this:

def MyFunc(string, substring):
    return string.count(substring)

1 Comment

As stated in comments to other answers, count behaves differently as it does not catch the overlapping substrings.
