0

Hello, I have the following function:

def width(input,output,attr):
    import re
    input = input.strip()
    if re.search(attr, input):
        k = input.find(attr)
        for i in input:
            if i == attr[0]:
                j = k + len(attr)+1
                while ((j <= len(input)) |  (j != ' ') | (input[j+1] != "'")):
                    j = j + 1
                    #print j, output, input[j], len(input), k
                    output = output+input[j]
                break
            k = k + 1
    return output

print width('a=\'100px\'','','a')

I always get the following error:

Traceback (most recent call last):
  File "table.py", line 45, in <module>
    print width(split_attributes(w,'','<table.*?>'),'','width')
  File "table.py", line 24, in width
    while ((j <= len(input)) |  (j != ' ') | (input[j+1] != "'")):
IndexError: string index out of range

I have tried using or instead of |, but it didn't work!

2
  • Why are you using re to find a substring? If you are doing simple searches, use in like: if attr in input:. Commented Aug 24, 2011 at 1:15
  • 1
    This can't possibly be right: while ((j <= len(input)) | (j != ' ') | (input[j+1] != "'")):. j is an integer index into input but you are comparing it to a space. Commented Aug 24, 2011 at 1:22

6 Answers

1
while ((j <= len(input)) |  (j != ' ') | (input[j+1] != "'")):

0) You should be using or.

1) You should not use input as a variable name; it hides a built-in function.

2) j is an integer, so it can never be equal to ' ', so that test is useless.

3) j <= len(input) passes when j == len(input). The length of a string is not a valid index into the string; indices into a string of length N range from 0 to (N - 1) (you can also use negative numbers from -1 to -N, to count from the end). Certainly j+1 doesn't work either.

4) I can't tell what the heck you are actually trying to do. Could you explain it in words? As stated, this isn't a very good question; making the code stop throwing exceptions doesn't mean it's any closer to working correctly, and certainly doesn't mean it's any closer to being good code.
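Points 0 and 3 interact: | evaluates both operands before combining them, so the indexing expression runs even when the bounds check has already failed, whereas and/or stop as soon as the result is known. A minimal sketch of the difference (the names s, safe and raised are made up for illustration):

```python
s = "a='100px'"
j = len(s)  # one past the last valid index

# `and` short-circuits: s[j] is never evaluated once the bounds check fails.
safe = (j < len(s)) and (s[j] != "'")

# `&` evaluates both operands first, so s[j] raises IndexError.
try:
    unsafe = (j < len(s)) & (s[j] != "'")
    raised = False
except IndexError:
    raised = True
```

Here safe ends up False without any exception, while the & version raises.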


Comments

0

It looks like j+1 is a number greater than or equal to the length of the string you have (input). Make sure you structure your while loop so that j < (len(input) - 1) is always true and you won't end up with the string index out of range error.
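As a rough sketch of that loop structure (read_quoted is a hypothetical helper, not the asker's function, and it assumes j starts on the opening quote):

```python
def read_quoted(s, j):
    """Collect the characters after s[j] up to the next single quote or the end of s."""
    out = ""
    # s[j + 1] is only a valid index while j < len(s) - 1, so check that first.
    while j < len(s) - 1 and s[j + 1] != "'":
        j += 1
        out += s[j]
    return out
```

With this guard, read_quoted("a='100px'", 2) gives '100px', and a string with no closing quote simply stops at the end instead of raising IndexError.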

Comments

0

If j >= len(input) - 1, then j+1 will most certainly be out of bounds. Also, use or and not |.

Comments

0

You get the error IndexError: string index out of range. The only index expression on that line is input[j+1]. Any index equal to len(input) or larger causes this error, as the following session demonstrates:

>>> input = "test string"
>>> len(input)
11
>>> input[11]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> input[10]
'g'

So if you want to reference element number j+1, the condition j < (len(input) - 1) needs to be satisfied first.

Comments

0

When using != in if statements, make sure that or is actually what you need. Here's an example:

import random
a = random.randint(1, 10)
b = random.randint(1, 10)
c = random.randint(1, 10)
if a != 1 or b != 1 or c != 1:
    print "At least one of the values is not equal to 1"
    # The interpreter evaluates `a != 1` first.
    # If `a` is not equal to 1 the condition is already true, and this code gets executed.
    # This condition is true if ANY of the values is not equal to 1.
if a != 1 and b != 1 and c != 1:
    print "None of the values are equal to 1"
    # This condition is true only if ALL of the values are not equal to 1.

This is a hard thing to understand at first (I made this mistake all the time), but if you practise it a bit, it will make perfect sense.
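One way to keep the two forms apart is to wrap each in a small function and try a few values. This is only an illustrative sketch; the function names are made up:

```python
def none_equal_one(a, b, c):
    # True only when every value differs from 1.
    return a != 1 and b != 1 and c != 1

def not_all_equal_one(a, b, c):
    # True as soon as at least one value differs from 1.
    return a != 1 or b != 1 or c != 1
```

For the values (1, 2, 3), none_equal_one is False but not_all_equal_one is True; only (1, 1, 1) makes the or version False.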

So, to get your code working for the example input, replace those |s with and (and stick with the keywords or and and unless you specifically need the bitwise operators | and &):

while ((j <= len(input)) and  (j != ' ') and (input[j+1] != "'")):

and the output is:

100px

Comments

0

This is not a fix for your error, but code that probably does what you are aiming for.

Just use a single regular expression:

import re

def width(input, attr):
    """
    >>> width("a='100px'", 'a')
    '100px'
    """
    regex = re.compile(attr + r""".*?'(?P<quoted_string>[^']+)'""")
    m = regex.match(input.strip())
    return m.group("quoted_string") if m else ''

if __name__ == '__main__':
    import doctest
    doctest.testmod()

This code skips attr and searches for a quoted string that follows. (?P<quoted_string>[^']+) captures the quoted string. m.group("quoted_string") recovers the quoted string.

2 Comments

This only works properly if there is a single parameter. In the question, only one parameter is specified, but (assuming this is HTML parsing) HTML tags can contain multiple parameters. So width("onetag='100px' twotag='30px'", 'twotag') returns ''.
Yeah, I am doing my best to guess what he is really doing and point him in the right/a better direction. And if he is really doing HTML without using BeautifulSoup / lxml / some other HTML parser, then all bets are off. Obligatory link: stackoverflow.com/questions/1732348/…
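For what it's worth, the multi-attribute case from the first comment can be handled by searching instead of matching from the start. The following is a sketch only: attr_value is a hypothetical name, it assumes single-quoted values, and it is still not a substitute for a real HTML parser.

```python
import re

def attr_value(tag_text, attr):
    """Return the single-quoted value of attr in tag_text, or '' if not found."""
    # re.escape keeps attr literal; \b stops 'tag' from matching inside 'onetag'.
    pattern = r"\b" + re.escape(attr) + r"\s*=\s*'([^']*)'"
    m = re.search(pattern, tag_text)
    return m.group(1) if m else ''
```

With this version, attr_value("onetag='100px' twotag='30px'", 'twotag') returns '30px' rather than ''.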
