A "string index out of range" Python error

Question

I've searched for other "string index out of range" cases, but they did not help me, so I am asking here.

The program has to do this: "Write a function kth_word(s, k) that given a string s and an integer k≥ 1 returns the kth word in string s. If s has less than k words it returns the empty string. We assume all characters of s are letters and spaces. Warning: do not use the split string method."

Here is my code:

def kth_word(s, k):
    new =""
    word_count = 0
    for i in range(0, len(s)):
        if s[i] == " " and s[i+1] != " ":
            word_count+=1
            #try to find how many characters to print until the space
        if word_count == k-1:
            while i!= " " and i<=len(s): #if it is changed to i<len(s), the output is strange and wrong
                new+=s[i]
                i=i+1
                print(new) #check how new is doing, normally works good
    return new

print(kth_word('Alea iacta est', 2))

(I tried my best to implement the code correctly, but I do not know how.)

Depending on where return new is placed, the program gives either an error or just an empty answer.

So, you're not allowed to use str.split. What about str.rsplit? Or str.partition, re.split, bytes.split, re.findall, …?
– abarnert, Commented Jul 13, 2018 at 0:23
@ggorlen Exactly. I think find and index probably are within the spirit. On the other hand, rsplit definitely is not.
– abarnert, Commented Jul 13, 2018 at 0:41

Adam Smith · Accepted Answer · 2018-07-13 00:12:29Z

2

You iterate from 0 to len(s)-1 in your first for loop, but you're addressing i+1 which, on the last iteration, is len(s).

s[len(s)] is an IndexError -- it is out of bounds.

Additionally your while loop is off-by-one.

while i!= " " and i<=len(s):
    # do something referencing s[i]

Your first condition makes no sense (i is a number, how could it be " "?) and your second introduces the same off-by-one error as above, where i is maximally len(s) and s[len(s)] is an error.

Your logic is a bit off here, too, since you're wrapping this inside the for loop which is already referencing i. This appears to be a takewhile loop, but isn't really doing that.
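The off-by-one can be reproduced in isolation. The following is a minimal sketch, not the asker's code: the helper name count_word_starts and the sample string with a trailing space are assumptions made for illustration.

```python
s = "Alea iacta est "  # illustrative input; note the trailing space

# Valid indices run from 0 to len(s) - 1, so s[len(s)] always raises.
try:
    s[len(s)]
except IndexError as exc:
    message = str(exc)  # CPython reports "string index out of range"


def count_word_starts(s):
    """Count spaces that are directly followed by a non-space character."""
    count = 0
    # Stop one index early so that s[i + 1] is always a valid index.
    for i in range(len(s) - 1):
        if s[i] == " " and s[i + 1] != " ":
            count += 1
    return count
```

Stopping the range at len(s) - 1 is safe here because the last character can never be followed by the start of a word.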

edited Jul 13, 2018 at 0:12

answered Jul 13, 2018 at 0:00

Adam Smith


1 Comment

abarnert Over a year ago

This is true, but it's actually only going to cause an IndexError for strings that end with spaces, which I'd be willing to bet the OP hasn't tested, so it's not actually the error they're asking about (although of course it is an error they need to fix anyway).

jpp · Accepted Answer · 2018-07-13 00:08:02Z

1

Warning: do not use the split string method.

So groupby / islice from itertools should work:

from itertools import groupby, islice

def kth_word(s, k):
    g = (j for i, j in groupby(s, key=lambda x: x==' ') if not i)
    return ''.join(next(islice(g, k-1, k), ''))

words = 'Alea iacta est'

res = kth_word(words, 2)  # 'iacta'

We handle StopIteration errors by setting the optional parameter in next to ''.
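To see why this works, it can help to look at what groupby produces for the sample string and how the default argument of next behaves. This is only an illustrative sketch; the variable names runs, empty and fallback are made up here.

```python
from itertools import groupby

s = "Alea iacta est"

# groupby yields one (key, group) pair per run of consecutive characters
# that share the same key; here the key is "is this character a space?".
runs = [
    (is_space, "".join(chars))
    for is_space, chars in groupby(s, key=lambda c: c == " ")
]
# runs == [(False, 'Alea'), (True, ' '), (False, 'iacta'),
#          (True, ' '), (False, 'est')]

# next() with a second argument returns that default instead of raising
# StopIteration when the iterator is exhausted.
empty = iter([])
fallback = next(empty, "")  # '' rather than an exception
```

Filtering out the pairs whose key is True leaves exactly the words, which is what the generator expression in kth_word does.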

edited Jul 13, 2018 at 0:08

answered Jul 13, 2018 at 0:05

jpp


3 Comments

Adam Smith Over a year ago

interesting use of groupby. I was considering an approach using takewhile.

jpp Over a year ago

@ggorlen, It may help when OP moves on to itertools :)

abarnert Over a year ago

If you're going to weasel around the assignment, why not just return s.rsplit()[k-1]?

abarnert · Accepted Answer · 2018-07-13 00:38:26Z

You're not allowed to use str.split. If you could, the answer would just be:

def kth_word(s, k):
    words = s.split()
    # k is 1-based; return '' when s has fewer than k words
    return words[k-1] if k <= len(words) else ''

But if you could write a function that does the same thing str.split does, you could call that instead. And that would certainly show that you understand everything the assignment was testing for—how to loop over strings, and do character-by-character operations, and so on.

You can write a version with only the features of Python usually taught in the first week:

def split(s):
    words = []
    current = ''
    for ch in s:
        if ch.isspace():
            if current:
                words.append(current)
            current = ''
        else:
            current += ch
    if current:
        words.append(current)
    return words

If you know additional Python features, you can improve it in a few ways:

Build current as a list instead of a str and ''.join it.
Change those append calls to yield so it splits the string lazily (even better than str.split).
Use str.find or str.index or re.search to find the next space instead of searching character by character.
Abstract out the space-finding part into a general-purpose generator—or, once you realize what you want, find that function in itertools.
Add all of the features we're missing from str.split, like the ability to pass a custom delimiter instead of breaking on any whitespace.
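As a rough sketch of the first two improvements (building the current word as a list and yielding words lazily), something like the following could work. The names iter_words and the islice-based kth_word are assumptions for illustration, not part of the assignment.

```python
from itertools import islice


def iter_words(s):
    """Lazily yield the whitespace-separated words of s, one at a time."""
    current = []  # characters of the word being built
    for ch in s:
        if ch.isspace():
            if current:
                yield "".join(current)
                current = []
        else:
            current.append(ch)
    if current:  # the string ended in the middle of a word
        yield "".join(current)


def kth_word(s, k):
    """Return the k-th word (1-based), or '' if s has fewer than k words."""
    # islice skips the first k - 1 words; next() falls back to ''.
    return next(islice(iter_words(s), k - 1, None), "")
```

Because iter_words is a generator, kth_word stops scanning as soon as the k-th word is complete instead of splitting the whole string first.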

But I think even the basic version—assuming you understand it and can explain how it works—ought to be enough to get an A on the assignment.

And, more importantly, you're practicing the best way to solve problems: reduce them to simpler problems. split is actually easier to write than kth_word, but once you write split, kth_word becomes trivial.
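For example, with the basic split from this answer in place, kth_word could look roughly like this. The split function is repeated so the snippet runs on its own; the bounds check reflects the assignment's rule that k starts at 1 and that a missing word gives the empty string.

```python
def split(s):
    """Hand-written stand-in for str.split() with no arguments."""
    words = []
    current = ''
    for ch in s:
        if ch.isspace():
            if current:
                words.append(current)
            current = ''
        else:
            current += ch
    if current:
        words.append(current)
    return words


def kth_word(s, k):
    words = split(s)
    # k is 1-based, so the k-th word lives at index k - 1.
    if k <= len(words):
        return words[k - 1]
    return ''  # fewer than k words


print(kth_word('Alea iacta est', 2))  # iacta
```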

abarnert · Accepted Answer · 2018-07-13 00:21:24Z

You actually have at least five problems here, and you need to fix all of them.

First, as pointed out by Adam Smith, this is wrong:

for i in range(0, len(s)):
    if s[i] == " " and s[i+1] != " ":

This loops with i over all the values up to but not including len(s), which is good, but then, if s[i] is a space, it tries to access s[i+1]. So, if your string ended with a space, you would get an IndexError here.

Second, as ggorlen pointed out in a comment, this is wrong:

while i!= " " and i<=len(s):
    new+=s[i]

When i == len(s), you're going to try to access s[i], which will be an IndexError. In fact, this is the IndexError you're seeing in your example.

You seem to realize that's a problem, but refuse to fix it, based on this comment:

#if it is changed to i<len(s), the output is strange and wrong

Yes, the output is strange and wrong, but that's because fixing this bug means that, instead of an IndexError, you hit the other bugs in your code. It's not causing those bugs.

Next, you need to return new right after doing the inner loop, rather than after the outer loop. Otherwise, you add all of the remaining words rather than just the first one, and you add them over and over, once per character, instead of just adding them once.

You may have been expecting that doing that i=i+1 would affect the loop variable and skip over the rest of the word, but (a) it won't; the next time through the for it just reassigns i to the next value, and (b) that wouldn't help anyway, because you're only advancing i to the next space, not to the end of the string.
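A small experiment makes point (a) concrete. This is an illustrative sketch with made-up variable names, unrelated to the word-finding logic itself.

```python
# Rebinding the loop variable inside a for loop does not skip anything:
# the for statement assigns the next value from range() on every pass.
visited = []
for i in range(5):
    visited.append(i)
    i = i + 10  # discarded at the start of the next iteration
# visited == [0, 1, 2, 3, 4]

# A while loop, by contrast, only changes i where the body says so.
stepped = []
i = 0
while i < 5:
    stepped.append(i)
    i += 2
# stepped == [0, 2, 4]
```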

Also, you're counting words at the space, but then you're iterating from that space until the next one. Which means (except for the first word) you're going to include that space as part of the word. So, you need to do an i += 1 before the while loop.

Although it would probably be a lot more readable to not try to reuse the same variable i, and also to use for instead of while.

Also, your inner loop should be checking s[i] != " ", not i!=" ". Obviously the index, being a number, will never equal a space character.

Without the previous fix, this would mean you output iacta est with an extra space before it—but with the previous fix, it means you output nothing instead of iacta.

Once you fix all of these problems, your code works:

def kth_word(s, k):
    word_count = 0
    for i in range(0, len(s) - 1):
        if s[i] == " " and s[i+1] != " ":
            word_count+=1
            #try to find how many characters to print until the space
        if word_count == k-1:
            new =""
            j = i+1
            while j < len(s) and s[j] != " ":
                new+=s[j]
                j = j+1
                print(new) #check how new is doing, normally works good
            return new

Well, you still have a problem with the first word, but I'll leave it to you to find and fix that one.
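If you get stuck on that last step, here is one possible way to finish it, offered as a sketch rather than the only answer. It counts a word where it starts (a non-space that is either the first character or follows a space) instead of at the space before it, which makes the first word behave like the others, and it returns the empty string when there are fewer than k words.

```python
def kth_word(s, k):
    word_count = 0
    for i in range(len(s)):
        # A word starts at a non-space that is first in s or follows a space.
        if s[i] != " " and (i == 0 or s[i - 1] == " "):
            word_count += 1
            if word_count == k:
                new = ""
                j = i
                # Copy characters until the next space or the end of s.
                while j < len(s) and s[j] != " ":
                    new += s[j]
                    j += 1
                return new
    return ""  # s has fewer than k words


print(kth_word('Alea iacta est', 1))  # Alea
```

Checking s[i - 1] instead of s[i + 1] also means the loop never looks past the end of the string.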

TheDigitalScorcerer · Accepted Answer · 2018-07-13 00:49:14Z

Your use of the variable 'i' in both the for loop and the while loop was causing problems. Using a new variable, 'n', for the while loop and changing the condition to n < len(s) fixes the IndexError. Some other parts of your code also needed changing, because they were either pointless or not compatible with more than 2 words. Here is the fully changed code. It is explained further down:

    def kth_word(s, k):
        new = ""
        word_count = 0
        n = 0
        for i in range(0, len(s) - 1):
            if s[i] == " " and s[i + 1] != " ":
                word_count += 1
                #try to find how many characters to print until the space
            if word_count < k:
                while n < len(s): #if it is changed to i<len(s), the output is strange and wrong
                    new+=s[n]
                    n += 1
                    print(new) #check how new is doing, normally works good
        return new

    print(kth_word('Alea iacta est', 2))

Explanation:

As said in Adam Smith's answer, 'i' is a number and will never be equal to ' '. That part of the code was removed because it is always true.

I have changed i = i + 1 to n += 1, using the shorter += form. It won't make much difference here, but this will help you later when you use longer variable names. It can also be used to append text to strings.

I have also declared 'n' for later use and changed for i in range(0, len(s)): to for i in range(0, len(s) - 1): so that the for loop can't go out of range either.

if word_count == k-1: was changed to if word_count < k: for compatibility for more words, because the former code only went to the while loop when it was up to the second-last word.

And finally, spaces were added for better readability (This will also help you later).
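One thing to watch when tracing the listing above with 'Alea iacta est' and k = 2: the while loop runs during the first pass of the for loop and copies every character, because nothing stops it at a space, so the whole string comes back. A possible variation that stops at the word boundary is sketched below; the in_word flag is an assumption introduced here and is not part of the original code.

```python
def kth_word(s, k):
    new = ""
    word_count = 0
    in_word = False  # True while the loop is inside a word
    for ch in s:
        if ch != " ":
            if not in_word:
                # First character of a new word.
                in_word = True
                word_count += 1
            if word_count == k:
                new += ch
        else:
            if word_count == k:
                return new  # the k-th word just ended
            in_word = False
    return new  # '' if fewer than k words, or the word that ends the string


print(kth_word('Alea iacta est', 2))  # iacta
```

Looping over the characters directly avoids index arithmetic altogether, so there is no index that could go out of range.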
