Python string slicing, special case if end of string included?

Question

I'm new to Python, and I find the slice behaviour somewhat confusing.

If I do

test = 'abcdefgh'

for i in range(7):
    print test[-(8-i):-(6-i)]
    print i

the last iteration (i = 6) prints an empty string instead of 'gh'. Since slicing [start:end] doesn't include end, it seems to me that I'd need a special case whenever the last character is in the range I want.

Did I miss something?

Note that "a"[0:100000] == "a". Slices never raise IndexError for out-of-range indexes: an out-of-range index is clamped to the beginning or the end of the string, and if nothing lies between the resulting bounds, the result is an empty string.
– Bakuriu, Commented Jan 30, 2014 at 18:10
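A small sketch of the clamping described in this comment may help. It uses Python 3 syntax (print as a function), and the variable names are just for illustration:

```python
# Out-of-range slice bounds are clamped rather than raising IndexError.
s = "a"

clamped_stop = s[0:100000]     # stop beyond the end is clamped to len(s)
clamped_start = s[-100000:1]   # start before the beginning is clamped to 0
past_the_end = s[5:10]         # both bounds beyond the end: empty string

print(repr(clamped_stop), repr(clamped_start), repr(past_the_end))

# Plain indexing, by contrast, does raise.
try:
    s[5]
    raised = False
except IndexError:
    raised = True
print("indexing raised IndexError:", raised)
```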
Yeah. So, for consistency, I'd be more comfortable with [-2:0] being the same as [-2:end], rather than just giving an empty slice.
– gibson, Commented Jan 30, 2014 at 18:23
Actually, that would reduce consistency: why should [-2:1] return an empty string while [-2:0] works? Omitting the stop parameter is one thing; providing an explicit index is another.
– Bakuriu, Commented Jan 30, 2014 at 18:40
I wouldn't mind if [-2:2] gave the last two and first two characters of the string.
– gibson, Commented Jan 31, 2014 at 13:40
It just occurred to me that Python's slicing provides a very strong invariant. Given a string s, it always holds that s[x:y] in s is True, no matter what x and y are or whether they are omitted. In other words, s[x:y] is always a substring of s (if you omit the step). Including the changes you suggest would break this invariant. I believe this is the reason why they didn't want to implicitly change the step in these circumstances. s[x:y] should always return a substring of s, not a substring of s or the inverse of s depending on the indices.
– Bakuriu, Commented Feb 13, 2014 at 8:47
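The invariant from this comment can be spot-checked by brute force. This is only a sketch in Python 3 over a deliberately oversized range of bounds for one sample string, not a proof:

```python
from itertools import product

s = "abcdefgh"
# None stands for an omitted bound; the integers overshoot both ends on purpose.
bounds = [None] + list(range(-12, 13))

# Every step-less slice is a (possibly empty) substring of s.
always_substring = all(s[x:y] in s for x, y in product(bounds, repeat=2))
print(always_substring)  # True

# With an explicit step the invariant no longer holds.
reversed_is_substring = s[::-1] in s
print(reversed_is_substring)  # False
```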

mhlester · Accepted Answer · 2014-01-30 18:08:41Z

1

If you add a couple more prints, you can see what's happening:

test = 'abcdefgh'

for i in range(7):
    print -(8-i), -(6-i)
    print test[-(8-i):-(6-i)]
    print i

Outputs:

-8 -6
ab
0
-7 -5
bc
1
-6 -4
cd
2
-5 -3
de
3
-4 -2
ef
4
-3 -1
fg
5
-2 0

6

All of your end indices are negative until the last iteration, where the end is 0, so the slice test[-2:0] comes back empty.

Adding "or None" to the end index replaces the 0 with None, so the slice behaves as if no end had been passed in the first place:

for i in range(7):
    print test[-(8-i):(-(6-i) or None)]
    print i

Which outputs:

ab
0
bc
1
cd
2
de
3
ef
4
fg
5
gh
6

The or operator returns its second operand when the first one is "falsy". Here 0 is falsy, so the end index becomes None.
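As a quick check of that behaviour, here is the same loop written as a list comprehension in Python 3 syntax. The name windows is only for illustration:

```python
test = 'abcdefgh'

# `a or b` evaluates to `a` when `a` is truthy, otherwise to `b`.
# 0 is the only falsy int, so only an end index of 0 gets replaced.
zero_case = 0 or None
negative_case = -1 or None

# test[start:None] is the same as test[start:], i.e. "up to the end".
windows = [test[-(8 - i):(-(6 - i) or None)] for i in range(7)]
print(windows)  # ['ab', 'bc', 'cd', 'de', 'ef', 'fg', 'gh']
```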

answered Jan 30, 2014 at 18:08

mhlester

2 Comments

gibson Over a year ago

Ok, cool, that's a reasonably direct way of making this work. :)

mhlester Over a year ago

I admit it's a bit awkward, but it avoids needing to make a special case for it. Negative slicing is nice until you run into something like this. Any other workaround requires an if..else statement, which would suck

MatthieuBizien · Answer · 2014-01-30 18:12:18Z

1

You can't start at -1 and go to +1. -1 is the last item, 1 is the second item. You can do

for i in range(7):
    print test[i:(2+i)]

ab
bc
cd
de
ef
fg
gh

edited Jan 30, 2014 at 18:12

answered Jan 30, 2014 at 18:06

MatthieuBizien

jbh · Answer · 2014-01-30 18:14:41Z

1

The issue here is that -0 is just 0, so you're asking for everything up to, but not including, the first character of the string.

So for the case of i = 6 you get

test[-2:0] == ''

A better way of handling this is to look ahead from the current index:

for i in range(len(test)-1):
    print test[i:i+2]

For indexing from the end to work, the correct syntax leaves out the 0:

test[-2:] == 'gh'

edited Jan 30, 2014 at 18:14

answered Jan 30, 2014 at 18:06

jbh

5 Comments

gibson Over a year ago

Yeah, I get that. I just find it strange that they offer you this "indexing from the end"-shorthand, but make it not work for the last character.

jbh Over a year ago

This is because you are no longer indexing from the end when you insert a 0 into that position. The correct "indexing from the end" for that case would be test[-2:]

gibson Over a year ago

I know. To me, it would have made sense to allow something like [-2:0] meaning the same as [-2:], to make this easier. The "or None"-solution is reasonably short and to the point, but I don't see why 0 can't be allowed as an index from both directions.

jbh Over a year ago

Because 0 would be ambiguous if it meant both, and therefore unusable. If you think about implementing a parser for this kind of language, it easily differentiates between indexing from the end and indexing by position based on whether the number is negative or not. I also think that the more conventional/readable way is the way I stated above. I will be honest, it took me a minute to work out what the desired output of your statement was.

gibson Over a year ago

Normally, I would be using the method you stated above. I just ran across this issue when experimenting with this "new and strange" shorthand. Having just realised that in Python 'a' * 5 == 'aaaaa', I sort of expected the slicing to be "magic" as well, allowing things like [-2:5] and what not. Oh well, I guess I'll get used to it.

Carlos Hanson · Answer · 2014-01-30 19:15:19Z

In the Python Tutorial (http://docs.python.org/2/tutorial/introduction.html), slice notation is defined as two indices separated by a colon.

In the last iteration of your example, the slice notation is [-2:0]. -2 is the index for the second to last character of the string, and 0 is the index for the first letter in the string. It does not make sense to take a slice from the second to last character to the first character.

If you want to go from the second to last character to the last character, simply eliminate the second index: [-2:]. That says, start at the second to last character and go to the end. Or be explicit and say [-2:len(test)].
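To make the equivalence concrete, the spellings mentioned above can be compared directly. This is a short sketch in Python 3 syntax with illustrative variable names:

```python
test = 'abcdefgh'

tail_omitted = test[-2:]            # end index left out
tail_none = test[-2:None]           # None is what an omitted index stands for
tail_explicit = test[-2:len(test)]  # explicit end position
tail_zero = test[-2:0]              # 0 means "before the first character"

print(tail_omitted, tail_none, tail_explicit, repr(tail_zero))
```

The first three all give 'gh', while the last one is the empty string from the question.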

For this example, I would suggest something like the following:

test = 'abcdefgh'
for i in range(7):
    start = -(8-i)
    end = -(6-i)
    # test your end condition
    if end == 0:
        end = None
    print test[start:end]
    print i

6502 · Answer · 2014-01-30 20:07:31Z

0

This is indeed an unfortunate consequence of the slice semantics.

The problem is that to mean "count from the end" you need to pass a negative number, and therefore you cannot ask "count 0 from the end", because -0 == 0, which is not a negative number.

For counting 0 chars from the end you need to special-case the issue with an if or other conditional trickery, because passing 0 means "0 elements from the start".

To have it working for these cases, the semantics would have to be that -4 means counting 3 from the end (thus leaving room for -1 to mean "0 from the end"), but this would have been counterintuitive.

Being able to say x[-n:] to mean the last n chars of a string is a better compromise even if this doesn't work for n == 0 where instead of the empty string you get the full string.

edited Jan 30, 2014 at 20:07

answered Jan 30, 2014 at 18:24

6502

3 Comments

mhlester Over a year ago

I thought the same thing until I realized the default argument None can be easily inserted in the place of 0 using or. Take a look at my answer for a fleshed out example. (I mean only to inform, not to promote my answer. I've hit rep cap and my answer was already accepted, so what would be the point)

6502 Over a year ago

@mhlester: The point that there is a problem with the asymmetry of "counting from the end" using negatives (that doesn't allow a count of 0) remains. If you want the last n chars for example you cannot use x[-n or None:] but you have to use x[-n or len(x):]

mhlester Over a year ago

Ah good point. The or None can only be used after the colon
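For reference, the "last n chars" case from these comments could be wrapped up roughly like this. It is a sketch in Python 3 syntax; last_chars is a hypothetical helper name, and it assumes n is never negative:

```python
def last_chars(x, n):
    """Return the last n characters of x (hypothetical helper; assumes n >= 0)."""
    # For n == 0, -n is 0, so fall back to len(x) to get an empty slice.
    return x[-n or len(x):]

x = 'abcdefgh'
print(repr(last_chars(x, 2)))  # 'gh'
print(repr(last_chars(x, 0)))  # ''
print(repr(x[-0:]))            # 'abcdefgh', the surprise described in the answer
print(repr(x[-0 or None:]))    # 'abcdefgh', so "or None" does not help before the colon
```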
