4

I'm trying to check whether a string occurs in another string without using Python's predefined functions such as find and index.

Right now my function takes two strings as parameters: the first is the string to search, and the second is the string we are looking for in the first.

If the second string exists in the first, I want my function to return all the positions at which it occurs in the first string.

Right now, my function is able to find the first occurrence and return its index. However, I want to find every occurrence instead of just the first one.

Below is my code:

def multi_find (s, r):

    s_len = len(s)
    r_len = len(r)

    if s_len < r_len:
        n = -1
    else:
        m = s_len - r_len
        n = -1  # assume r is not yet found in s
        i = 0

        while n == -1 and i < m:
            # search for r in s until not enough characters are left
            if s[i:i + r_len] == r:
                n = i
            else:
                i = i + 1
    print (n)

multi_find("abcdefabc. asdli! ndsf acba saa abe?", "abc")

Right now, this outputs just "0" because that's where abc first occurs. How can I get it to return "0" and "6" (the beginning of the second occurrence), that is, keep checking after it has found one?

I was thinking of creating a list of all the places it occurs and appending i to that list, but when I tried that, nothing worked for me.

7 Answers

9

You can do:

>>> haystack = "abcdefabc. asdli! ndsf acba saa abe?"
>>> needle = "abc"
>>> for i, _ in enumerate(haystack):
...     if haystack[i:i + len(needle)] == needle:
...         print (i)
...
0
6
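
If you need the positions back as a list rather than printed, the same loop can be wrapped in a function. This is only a sketch: the name find_all is made up for illustration, and it assumes the needle is non-empty.

```python
def find_all(haystack, needle):
    # find_all is a hypothetical name, not part of the answer above.
    # Assumes needle is non-empty.
    positions = []
    for i, _ in enumerate(haystack):
        # Compare the slice starting at i with the needle.
        if haystack[i:i + len(needle)] == needle:
            positions.append(i)
    return positions


print(find_all("abcdefabc. asdli! ndsf acba saa abe?", "abc"))  # [0, 6]
```

Returning the list leaves it to the caller to decide whether to print the positions or use them further.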

3 Comments

Can you explain to me the "_" in the 'for i, _ in enumerate(haystack):' line? Not really sure what this does.
@JacobMammoliti: it means you're ignoring the variable that's there. enumerate() allows you to iterate over the positions and characters of the string but we're not using the characters. Hence, we're only iterating over each of the positions in the string. You can also write for i, c in enumerate(haystack): to iterate over each position i and each character c (at the same time) of the string.
Note that the "_" is just a convention to make it clear to human readers - it doesn't act any differently than if you had used "x".
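
To make the point about "_" concrete, here is a small sketch comparing the two spellings. The variable names positions_only and pairs are just illustrative.

```python
haystack = "abc"

# "_" receives each character but is never used; only the index i matters here.
positions_only = [i for i, _ in enumerate(haystack)]

# Naming the second variable lets you use the character as well.
pairs = [(i, c) for i, c in enumerate(haystack)]

print(positions_only)  # [0, 1, 2]
print(pairs)           # [(0, 'a'), (1, 'b'), (2, 'c')]
```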
3

Another alternative using regex:

>>> import re
>>> haystack = "abcdefabc. asdli! ndsf acba saa abe?"
>>> needle = "abc"
>>> [m.start() for m in re.finditer(r'{}'.format(re.escape(needle)), haystack)]
[0, 6]

The above solution does not find overlapping substrings. For example, 'aa' occurs three times in 'aaaa' (at positions 0, 1 and 2), but only the non-overlapping matches are returned. If you want to find overlapping matches as well, use a lookahead:

>>> haystack = "bobob"
>>> needle = "bob"
>>> [m.start() for m in re.finditer(r'(?={})'.format(re.escape(needle)), haystack)]
[0, 2]
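
The difference between the two patterns can be seen side by side on the 'aaaa' case. This is a small sketch; the variable names plain and overlapping are only illustrative.

```python
import re

haystack = "aaaa"
needle = "aa"

# Plain pattern: each match consumes its characters, so matches cannot overlap.
plain = [m.start() for m in re.finditer(re.escape(needle), haystack)]

# Lookahead pattern: each match is zero-width, so every start position is tested.
overlapping = [
    m.start()
    for m in re.finditer("(?={})".format(re.escape(needle)), haystack)
]

print(plain)        # [0, 2]
print(overlapping)  # [0, 1, 2]
```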

2 Comments

I love the use of re but since the goal is "to return all the positions it occurs in the first string", I think this fails to find some interesting cases where r occurs multiple times in s but its instances overlap. Consider the case: multi_find("bobob","bob"). Using your implementation, the string "bob" definitely occurs at position 2 in "bobob" but it is not returned. I love the one-liner but I thought I'd throw that in as a possible concern. It may not matter to @Jacob.
@DarrenStone Good point, added another solution that works for overlapping matches as well.
1
def multi_find(s, r):

    s_len = len(s)
    r_len = len(r)

    _complete = []  # start positions of every match found

    # range() advances i on its own, so no manual increment is needed
    for i in range(s_len):
        # compare the slice starting at i with r
        if s[i:i + r_len] == r:
            _complete.append(i)
    print(_complete)

multi_find("abcdefabc. asdli! ndsf abc saa abe?", "abc")

1
def multi_find (s, r):
    s_len = len(s)
    r_len = len(r)
    n = [] # assume r is not yet found in s

    if s_len >= r_len:
        m = s_len - r_len
        i = 0

        while i <= m:  # <= so a match ending at the last character of s is found
            # search for r in s until not enough characters are left
            if s[i:i + r_len] == r:
                n.append(i)
            i = i + 1
    print (n)

multi_find("abcdefabc. asdli! ndsf acba saa abe?", "abc")

The main change is to replace n with a list so you can keep adding positions to it as you find them. You also need to increment i even when a match is found; your original loop would have run forever on a match, except that the while n == -1 condition made it stop as soon as one was found.
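
One detail worth checking in a loop like this is the upper bound: the last position where r can still fit is len(s) - len(r), so that index needs to be included. A minimal sketch with a for loop that returns the list instead of printing it (the name multi_find_list is only illustrative):

```python
def multi_find_list(s, r):
    # multi_find_list is a hypothetical name used only for this sketch.
    positions = []
    # The last valid start index is len(s) - len(r); the + 1 makes range() include it.
    # If r is longer than s, the range is empty and nothing is searched.
    for i in range(len(s) - len(r) + 1):
        if s[i:i + len(r)] == r:
            positions.append(i)
    return positions


print(multi_find_list("xyzabc", "abc"))  # [3]
```

The "xyzabc" case puts the match at the very end of the string, which is exactly where an off-by-one in the bound would show up.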

1

Probably the best way to do this is to keep calling the find function (this is also the fastest):

def multifind(string, value, start = 0, stop = None):
    values = []
    while True:
        found = string.find(value, start, stop)
        if found == -1:
            break
        values.append(found)
        start = found + 1
    return values

print(multifind('hello abc abc', 'abc'))

Output:

[6, 10]

1 Comment

I know it's late to comment, 8 years later, but the question was about doing this without calling pre-defined functions like find(), so this is not an answer to the question as asked.
1

@Jacob, I hope you'll find this one very short yet still easy to understand.

def multi_find(s, r):
    return [pos for pos in range(len(s)) if s.startswith(r, pos)]
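
As a quick check of how the one-liner behaves, including on overlapping matches, it can be run on the string from the question and on "bobob". The function is repeated here only so the snippet runs on its own.

```python
def multi_find(s, r):
    # Test every position in s as a possible start of r.
    return [pos for pos in range(len(s)) if s.startswith(r, pos)]


print(multi_find("abcdefabc. asdli! ndsf acba saa abe?", "abc"))  # [0, 6]
print(multi_find("bobob", "bob"))                                 # [0, 2]
```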

0

Note: I think this answer is still a good "teaching answer", but I have submitted a better solution, without recursion, elsewhere in this thread.

def multi_find(s, r, start=0):
    if start >= len(s): 
        return []
    if s.startswith(r, start):
        return [start] + multi_find(s, r, start+1)
    else:
        return multi_find(s, r, start+1)

This allows you to pass an optional start position to begin the search in s.

This solution is recursive, which may or may not be the fastest implementation, but it is correct, and I believe it makes it easy to identify each of the three possibilities at each position of s:

  1. end of s
  2. found another r
  3. didn't find another r
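
Because the function makes one recursive call per character of s, a long string can run into Python's recursion limit (1000 frames by default in CPython). The same three cases can be sketched as a loop instead. The name multi_find_iter is hypothetical and used only for this illustration.

```python
def multi_find_iter(s, r, start=0):
    # multi_find_iter is a hypothetical name for a loop-based variant.
    positions = []
    pos = start
    while pos < len(s):            # case 1: stop at the end of s
        if s.startswith(r, pos):   # case 2: found another r
            positions.append(pos)
        pos += 1                   # cases 2 and 3 both move to the next position
    return positions


print(multi_find_iter("bobob", "bob"))           # [0, 2]
print(multi_find_iter("bobob", "bob", start=1))  # [2]
```

The loop keeps the optional start argument, so it can be called the same way as the recursive version.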
