Counting occurrences of a sub-string without using a built-in function

Question

My teacher challenged me of finding a way to count the occurences of the word "bob" in any random string variable without str.count(). So I did,

a = "dfjgnsdfgnbobobeob bob"
compteurDeBob = 0
for i in range (len(a) - 1):
   if a[i] == "b":
       if a[i+1] == "o":
           if a[i+2] == "b":
               compteurDeBob += 1
print(compteurDeBob)

but I wanted to find a way to do that with a word of any length as shown below, but I have no clue on how to do that...

a = input("random string: ")
word = input("Wanted word: ")
compteurDeBob = 0
for i in range (len(a)-1):

   #... i don't know... 

print(compteurDeBob)

Just so you know, you can accept only one answer. Be sure you select the one you believe is the best. — Jerry
– Jerry, Commented Oct 7, 2018 at 16:38

Giorgos Myrianthous · Accepted Answer · 2018-10-07 16:42:23Z

2

a = input("random string: ")
word = input("Wanted word: ")

count = 0
for i in range(len(a)-len(word)):
    if a[i:i+len(word)] == word:
        count += 1
print(count)

If you want your search to be case-insensitive, then you can use lower() function:

a = input("random string: ").lower()
word = input("Wanted word: ").lower()

count = 0
for i in range(len(a)):
    if a[i:i+len(word)] == word:
        count += 1
print(count)

For the user input

Hi Bob. This is bob

the first approach will output 1 and the second approach will output 2

edited Oct 7, 2018 at 16:42

answered Oct 7, 2018 at 16:23

Giorgos Myrianthous

40.4k21 gold badges156 silver badges175 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Eugene Yarmash · Accepted Answer · 2018-10-07 16:31:32Z

2

To count all overlapping occurrences (like in your example) you could just slice the string in a loop:

a = input("random string: ")
word = input("Wanted word: ")    
cnt = 0

for i in range(len(a)-len(word)+1):
    if a[i:i+len(word)] == word:
        cnt += 1

print(cnt)

answered Oct 7, 2018 at 16:31

Eugene Yarmash

152k44 gold badges346 silver badges391 bronze badges

Comments

jpp · Accepted Answer · 2018-10-07 17:01:22Z

2

You can use string slicing. One way to adapt your code:

a = 'dfjgnsdfgnbobobeob bob'

counter = 0
value = 'bob'
chars = len(value)

for i in range(len(a) - chars + 1):
    if a[i: i + chars] == value:
        counter += 1

A more succinct way of writing this is possible via sum and a generator expression:

counter = sum(a[i: i + chars] == value for i in range(len(a) - chars + 1))

This works because bool is a subclass of int in Python, i.e. True / False values are considered 1 and 0 respectively.

Note str.count won't work here, as it only counts non-overlapping matches. You could utilise str.find if built-ins are allowed.

edited Oct 7, 2018 at 17:01

answered Oct 7, 2018 at 16:22

jpp

166k37 gold badges301 silver badges363 bronze badges

Comments

willeM_ Van Onsem · Accepted Answer · 2018-10-07 16:55:37Z

0

The fastest way to calculate overlapping matches is the Knuth-Morris-Pratt algorithm [wiki] which runs in O(m+n) with m the string to match, and n the size of the string.

The algorithm first builds a lookup table that acts more or less as the description of a finite state machine (FSM). First we construct such table with:

def build_kmp_table(word):
    t = [-1] * (len(word)+1)
    cnd = 0
    for pos in range(1, len(word)):
        if word[pos] == word[cnd]:
            t[pos] = t[cnd]
        else:
            t[pos] = cnd
            cnd = t[cnd]
            while cnd >= 0 and word[pos] != word[cnd]:
                cnd = t[cnd]
        cnd += 1
    t[len(word)] = cnd
    return t

Then we can count with:

def count_kmp(string, word):
    n = 0
    wn = len(word)
    t = build_kmp_table(word)
    k = 0
    j = 0
    while j < len(string):
        if string[j] == word[k]:
            k += 1
            j += 1
            if k >= len(word):
                n += 1
                k = t[k]
        else:
            k = t[k]
            if k < 0:
                k += 1
                j += 1
    return n

The above counts overlapping instances in linear time in the string to be searched, which was an improvements of the "slicing" approach that was earlier used, that works in O(m×n).

answered Oct 7, 2018 at 16:55

willeM_ Van Onsem

482k33 gold badges483 silver badges624 bronze badges

1 Comment

toin Over a year ago

maybe that's right but far too complicated for me right now, thank you though !

Collectives™ on Stack Overflow

Counting occurrences of a sub-string without using a built-in function

4 Answers 4

Comments

Comments

Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

Comments

Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related