How to find index positions of a substring using Python

Question

Very new to Python here, and struggling. Any help is appreciated! Confession: this is obviously a request for help with homework, but my course ends tomorrow and the instructor takes too long to return a message, so I'm afraid if I wait I won't get this finished in time.

I'm using a learning module from Cornell University called introcs. It's documented here: http://cs1110.cs.cornell.edu/docs/index.html

I am trying to write a function that returns a tuple of all indexes of a substring within a string. I feel like I'm pretty close, but just not quite getting it. Here's my code:


import introcs 

def findall(text,sub):
    result = ()
    x = 0
    pos = introcs.find_str(text,sub,x)

    for i in range(len(text)):
        if introcs.find_str(text,sub,x) != -1:
            result = result + (introcs.find_str(text,sub,x), )
            x = x + 1 + introcs.find_str(text,sub,x)

    return result

On the call findall('how now brown cow', 'ow') I want it to return (1, 5, 10, 15) but instead it lops off the last result and returns (1, 5, 10) instead.

Any pointers would be really appreciated!

"I'm afraid if I wait I won't get this finished in time." Whose fault is it that you waited until the last minute?
– Barmar, Commented Sep 12, 2022 at 22:05
Your code relies heavily on a function find_str() from a module called introcs, which isn't a common module and you didn't provide the code. What does find_str() do exactly? Why even use it?
– Grismar, Commented Sep 12, 2022 at 22:05
It looks like introcs.find_str() is the same as the built-in str.find() method.
– Barmar, Commented Sep 12, 2022 at 22:06
Instead of making excuses for a reason to post homework, you can just post it, as long as you stick to the rules for posting homework questions. That gets you answers quicker and avoids snarky comments.
– Grismar, Commented Sep 12, 2022 at 22:07
@barmar - "looks like" but that's not the point of my question to OP, of course.
– Grismar, Commented Sep 12, 2022 at 22:07

walker · Accepted Answer · 2022-09-12 22:08:09Z

1

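You can use re to do it. Pass the substring through re.escape() so that characters such as "." or "*" are matched literally rather than as regex syntax:

import re

# m.start() is the index at which each (non-overlapping) match begins
found = [m.start() for m in re.finditer(re.escape(substring), string)]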
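
One caveat that may be worth sketching: re.finditer() reports non-overlapping matches only. The snippet below is a minimal illustration rather than part of the original answer, and the helper names find_nonoverlapping and find_overlapping are made up here. It uses re.escape() for literal matching and a zero-width lookahead to pick up occurrences that overlap.

```python
import re

def find_nonoverlapping(text, sub):
    # re.escape() makes characters like "." or "*" match literally
    return [m.start() for m in re.finditer(re.escape(sub), text)]

def find_overlapping(text, sub):
    # a lookahead (?=...) matches without consuming characters,
    # so the scan can report matches that overlap each other
    pattern = "(?=" + re.escape(sub) + ")"
    return [m.start() for m in re.finditer(pattern, text)]

print(find_nonoverlapping('how now brown cow', 'ow'))  # [1, 5, 10, 15]
print(find_nonoverlapping('abababa', 'aba'))           # [0, 4]
print(find_overlapping('abababa', 'aba'))              # [0, 2, 4]
```

If a tuple is required, as in the question, wrapping the result in tuple(...) should be enough.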

edited Sep 12, 2022 at 22:08

answered Sep 12, 2022 at 22:06

walker


walker Over a year ago

fixed the code to include start index which is located in the match object

Barmar · Accepted Answer · 2022-09-12 22:15:58Z

0

You don't need to loop over all the characters in text. Just keep calling introcs.find_str() until it can't find the substring and returns -1.

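Your calculation of the new value of x is wrong. find_str() returns an index into the whole string, not an offset from x, so adding x to it overshoots. The new x should just be 1 more than the index of the previous match.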

Make result a list rather than a tuple so you can use append() to add to it. If you really need to return a tuple you can use return tuple(result) at the end to convert it.

def findall(text,sub):
    result = []
    x = 0
    while True:
        pos = introcs.find_str(text,sub,x)
        if pos == -1:
            break
        result.append(pos)
        x = pos + 1

    return result
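
To see why the original update overshoots, the following sketch traces both update rules on the example from the question. It assumes introcs.find_str(text, sub, x) behaves like the built-in text.find(sub, x), as suggested in the comments; the trace() helper and its buggy flag exist only for this illustration.

```python
def trace(text, sub, buggy):
    """Collect match positions using either the original or the fixed update of x."""
    result = []
    x = 0
    while True:
        pos = text.find(sub, x)   # stand-in for introcs.find_str(text, sub, x)
        if pos == -1:
            break
        result.append(pos)
        # original: x + 1 + pos counts the already-skipped prefix twice
        # fixed:    pos + 1 resumes just past the start of the last match
        x = x + 1 + pos if buggy else pos + 1
    return tuple(result)

print(trace('how now brown cow', 'ow', buggy=True))   # (1, 5, 10)
print(trace('how now brown cow', 'ow', buggy=False))  # (1, 5, 10, 15)
```

With the original rule, x goes 0, 2, 8, 19, and 19 is already past the end of the 17-character string, so the match at index 15 is never searched for.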

answered Sep 12, 2022 at 22:15

Barmar


garet Over a year ago

Thank you, this was tremendously helpful! I had to tweak it a bit to fit within my instructor's preferences (e.g., he hates break statements?), but you got me past the wall, so thanks again, I really appreciate it!

Barmar Over a year ago

IMHO he's wrong. Doing this without a break statement probably requires you to duplicate the code that sets pos.

garet Over a year ago

Probably a difference of opinion lost on me at this stage of my learning. I was able to introduce a variable, "loop" set to True and then create an else statement that changed loop to False, and that seemed to satisfy all my test cases.

Barmar Over a year ago

Ugh, that's even worse! Now I know why I see that style of coding in so many questions: professors like this.

garet Over a year ago

Can you help me understand why your way is preferred?
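
For readers following the comment thread, here is a rough sketch of what break-free variants might look like. It again assumes find_str() behaves like str.find(), and the function names are invented for this illustration. The first version repeats the call that sets pos, as mentioned in the comments; the second uses the assignment expression (:=) available in Python 3.8 and later, which avoids both the break and the repetition.

```python
def findall_no_break(text, sub):
    result = []
    pos = text.find(sub, 0)            # first search
    while pos != -1:
        result.append(pos)
        pos = text.find(sub, pos + 1)  # same call repeated for the next search
    return tuple(result)

def findall_walrus(text, sub):
    result = []
    x = 0
    # assign pos and test it in the loop condition (Python 3.8+)
    while (pos := text.find(sub, x)) != -1:
        result.append(pos)
        x = pos + 1
    return tuple(result)

print(findall_no_break('how now brown cow', 'ow'))  # (1, 5, 10, 15)
print(findall_walrus('how now brown cow', 'ow'))    # (1, 5, 10, 15)
```

Which form reads best is largely a matter of taste; all of them stop as soon as the search returns -1.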


Grismar · Accepted Answer · 2022-09-12 22:28:45Z

Your code shows evidence of three separate attempts at keeping track of where you are in the string:

you loop over it with i
you put the position a sub was found at in pos
you compute an x

The question here is what do you want to happen in this case:

findall('abababa', 'aba')

Do you expect [0, 4] or [0, 2, 4] as a result? Assuming find_str works just like the standard str.find() and you want the [0, 2, 4] result, you can start each search one position after the previously found position, beginning at the start of the string. Also, instead of adding tuples together, why not build a list:

# this replaces your import, since we don't have access to it
class introcs:
    @staticmethod
    def find_str(text, sub, x):
        # assuming find_str is the same as str.find()
        return text.find(sub, x)


def findall(text,sub):
    result = []
    pos = -1

    while True:
        pos = introcs.find_str(text, sub, pos + 1)
        if pos == -1:
            break
        result.append(pos)

    return result


print(findall('abababa', 'aba'))

Output:

[0, 2, 4]

If you only want to match each character once, this works instead:

def findall(text,sub):
    result = []
    pos = -len(sub)

    while True:
        pos = introcs.find_str(text, sub, pos + len(sub))
        if pos == -1:
            break
        result.append(pos)

    return result

Output:

[0, 4]
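
One edge case to be aware of: with an empty sub, len(sub) is 0, so the non-overlapping version never advances and loops forever. The sketch below is one possible way to fold both variants into a single function. The overlap parameter and the max(..., 1) guard are assumptions added here, and str.find() stands in for introcs.find_str().

```python
def findall(text, sub, overlap=True):
    # step 1 allows overlapping matches; len(sub) skips past each match.
    # max(..., 1) keeps the search moving even when sub is empty.
    step = 1 if overlap else max(len(sub), 1)
    result = []
    pos = text.find(sub, 0)
    while pos != -1:
        result.append(pos)
        pos = text.find(sub, pos + step)
    return result

print(findall('abababa', 'aba'))                 # [0, 2, 4]
print(findall('abababa', 'aba', overlap=False))  # [0, 4]
```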
