2

I'm new to Python, and I find the slice behaviour somewhat confusing.

If I do

test = 'abcdefgh'

for i in range(7):
    print test[-(8-i):-(6-i)]
    print i

the last iteration will misbehave. Since slicing [start:end] doesn't include end, it seems to me like I'd need to handle slices like this with a special case if the last character is in the range I want.

Did I miss something?

5
  • Note that "a"[0:100000] == "a". You do not get IndexError when using slices for out-of-range indexes. The out-of-range index is either replaced with the end/the beginning or, in other circumstances, the result is an empty string. Commented Jan 30, 2014 at 18:10
  • Yeah. So, for consistency, I'd be more comfortable with [-2:0] being the same as [-2:end], rather than just giving an empty slice. Commented Jan 30, 2014 at 18:23
  • Actually, that would reduce consistency: why should [-2:1] return an empty string while [-2:0] works? Omitting the stop parameter is one thing; providing an explicit index is another. Commented Jan 30, 2014 at 18:40
  • I wouldn't mind if [-2:2] gave the last two and first two characters of the string. Commented Jan 31, 2014 at 13:40
  • It just occurred to me that Python's slicing provides a very strong invariant. Given a string s, it always holds that s[x:y] in s is True, no matter what x and y are, or whether they are omitted. In other words, s[x:y] is always a substring of s (if you omit the step). Including the changes you suggest would break this invariant. I believe this is the reason why they didn't want to implicitly change the step in these circumstances. s[x:y] should always return a substring of s, not a substring of s or of the reverse of s depending on the indices. Commented Feb 13, 2014 at 8:47
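
The clamping and the substring invariant described in the comments above can be checked with a small sketch. The variable names and the range of bounds tried are arbitrary choices for illustration, and print() is used so the snippet also runs on Python 3:

```python
s = 'abcdefgh'

# Out-of-range bounds are clamped instead of raising IndexError.
clamped = s[0:100000]        # behaves like s[0:len(s)]

# A start that lands after the stop gives an empty string.
empty = s[-2:0]              # start is index 6, stop is index 0

# With the default step, every slice is a contiguous substring of s.
# None stands for an omitted bound; -10..10 is an arbitrary test range.
bounds = [None] + list(range(-10, 11))
always_substring = all(s[x:y] in s for x in bounds for y in bounds)

print(clamped)
print(repr(empty))
print(always_substring)
```

The empty string counts as a substring of every string, which is why the empty results do not break the invariant.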

5 Answers

1

If you add another couple prints, you can see what's happening:

test = 'abcdefgh'

for i in range(7):
    print -(8-i), -(6-i)
    print test[-(8-i):-(6-i)]
    print i

Outputs:

-8 -6
ab
0
-7 -5
bc
1
-6 -4
cd
2
-5 -3
de
3
-4 -2
ef
4
-3 -1
fg
5
-2 0

All your stop indices are negative until the last iteration, where the stop is 0.

Adding or None to the stop index avoids the 0 and makes the slice act as if no stop had been passed in the first place:

for i in range(7):
    print test[-(8-i):(-(6-i) or None)]
    print i

Which outputs:

ab
0
bc
1
cd
2
de
3
ef
4
fg
5
gh
6

Because of the way the or operator works, if its first operand is falsy (as 0 is), the second operand is used, in this case None.
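
As a minimal sketch of why this works, the same idea can be wrapped in a small function. The name window is hypothetical and only used for illustration, and print() is used so the snippet also runs on Python 3:

```python
test = 'abcdefgh'

def window(i):
    # Hypothetical helper: the two-character slice from the question.
    start = -(8 - i)
    # -(6 - 6) is -0, which equals 0 and is falsy, so "or" yields None.
    # A stop of None means "slice to the end of the string".
    stop = -(6 - i) or None
    return test[start:stop]

pairs = [window(i) for i in range(7)]
print(pairs)
```

Every non-zero stop is truthy and passes through unchanged, so only the i == 6 case is affected.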


2 Comments

Ok, cool, that's a reasonably direct way of making this work. :)
I admit it's a bit awkward, but it avoids needing to make a special case for it. Negative slicing is nice until you run into something like this. Any other workaround requires an if..else statement, which would suck
1

You can't start at -1 and go to +1: -1 is the last item and 1 is the second item. You can do

for i in range(7):
    print test[i:(2+i)]

Output:

ab
bc
cd
de
ef
fg
gh


1

The issue here is that -0 is just 0, so you're attempting to slice up to (but not including) the first character of the string.

So for the case i = 6 you get

test[-2:0] == ''

A better way of handling this is to look ahead:

for i in range(len(test)-1):
     print test[i:i+2]

For indexing from the end to work, the correct syntax leaves out the 0:

test[-2:] == 'gh'

5 Comments

Yeah, I get that. I just find it strange that they offer you this "indexing from the end"-shorthand, but make it not work for the last character.
This is because you are no longer indexing from the end when you insert a 0 into that position. The correct "indexing from the end" for that case would be test[-2:]
I know. To me, it would have made sense to allow something like [-2:0] meaning the same as [-2:], to make this easier. The "or None"-solution is reasonably short and to the point, but I don't see why 0 can't be allowed as an index from both directions.
Because 0 would be ambiguous if it meant both, and therefore unusable. If you think about implementing a parser for this kind of language, it easily differentiates between indexing from the end and indexing by position based on whether the number is negative or not. I also think that the more conventional/readable way is the way I stated above. I will be honest, it took me a minute to work out what the desired output of your statement was.
Normally, I would be using the method you stated above. I just ran across this issue when experimenting with this "new and strange" shorthand. Having just realised that in Python 'a' * 5 == 'aaaaa', I sort of expected the slicing to be "magic" as well, allowing things like [-2:5] and what not. Oh well, I guess I'll get used to it.
0

In the Python Tutorial (http://docs.python.org/2/tutorial/introduction.html), slice notation is defined as two indices separated by a colon.

In the last iteration of your example, the slice notation is [-2:0]. -2 is the index for the second to last character of the string, and 0 is the index for the first letter in the string. It does not make sense to take a slice from the second to last character to the first character.

If you want to go from the second to last character to the last character, simply eliminate the second index: [-2:]. That says, start at the second to last character and go to the end. Or be explicit and say [-2:len(test)].
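
A short sketch comparing the spellings mentioned above; the variable names are illustrative only:

```python
test = 'abcdefgh'

tail_open = test[-2:]               # stop omitted: slice to the end
tail_explicit = test[-2:len(test)]  # stop given as the string length
tail_none = test[-2:None]           # None behaves like an omitted stop
tail_zero = test[-2:0]              # stop 0 is the first character

print(tail_open)
print(tail_explicit)
print(tail_none)
print(repr(tail_zero))
```

The first three give the same two characters; only the explicit 0 produces an empty string.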

For this example, I would suggest something like the following:

test = 'abcdefgh'
for i in range(7):
    start = -(8-i)
    end = -(6-i)
    # test your end condition
    if end == 0:
        end = None
    print test[start:end]
    print i


0

This is indeed an unfortunate consequence of the slice semantics.

The problem is that to mean "count from the end" you need to pass a negative number, and therefore you cannot ask "count 0 from the end" because -0 == 0 is not a negative number.

For counting 0 chars from the end you need to special case the issue with an if or other conditional trickery, because passing 0 means "0 elements from the start".

For these cases to work, the semantics would have to be that -4 means counting 3 from the end (thus leaving room for -1 to mean "0 from the end"), but this would have been counterintuitive.

Being able to say x[-n:] to mean the last n chars of a string is a better compromise, even though it doesn't work for n == 0, where you get the full string instead of the empty string.

3 Comments

I thought the same thing until I realized the default argument None can be easily inserted in the place of 0 using or. Take a look at my answer for a fleshed out example. (I mean only to inform, not to promote my answer. I've hit rep cap and my answer was already accepted, so what would be the point)
@mhlester: The point that there is a problem with the asymmetry of "counting from the end" using negatives (that doesn't allow a count of 0) remains. If you want the last n chars for example you cannot use x[-n or None:] but you have to use x[-n or len(x):]
Ah good point. The or None can only be used after the colon
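
The n == 0 pitfall and the x[-n or len(x):] form from the comments can be sketched as follows. The helper name last_n is hypothetical, and the sketch assumes 0 <= n <= len(x):

```python
def last_n(x, n):
    # Hypothetical helper: the last n characters of x, for 0 <= n <= len(x).
    # When n == 0, -n is 0 (falsy), so the start becomes len(x) and the
    # slice is empty rather than the whole string.
    return x[-n or len(x):]

x = 'abcdefgh'

naive_zero = x[-0:]          # same as x[0:], the whole string
none_zero = x[-0 or None:]   # a None start also means "from the beginning"
fixed_zero = last_n(x, 0)
fixed_three = last_n(x, 3)

print(repr(naive_zero))
print(repr(none_zero))
print(repr(fixed_zero))
print(repr(fixed_three))
```

Before the colon, None means "from the beginning", which is why or None only helps on the stop side of the slice.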
