
Consider the following snippet:

l = [1, 2, 3, 4, 5]
print(l is l[:])
t = (1, 2, 3, 4, 5)
print(t is t[:])
print(t[1:] is t[1:])
print('aa'[1:] is 'aa'[1:])
print('aaa'[1:] is 'aaa'[1:])

The result is, somewhat surprisingly, False, True, False, True, False.

Additionally, if I create large list, tuple and str objects and then take many [1:] slices of each, only str appears to be efficient in time and memory. Tuples are also immutable, so a tuple slice could, just like a string slice, be represented without copying the specified range: it could simply index into the contiguous memory where the data is already stored.
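
For anyone who wants to reproduce the measurement, here is a minimal sketch of one way to do it. It is CPython-oriented and relies on tracemalloc, which only sees allocations made through Python's allocators. The helper name slice_footprint and the sizes n and count are illustrative choices, not part of the original experiment.

```python
import tracemalloc


def slice_footprint(obj, count):
    """Return the bytes allocated while `count` slices obj[1:] are kept alive."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    slices = [obj[1:] for _ in range(count)]  # hold every slice so none is freed
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


n, count = 10_000, 100  # arbitrary sizes, chosen only for illustration
samples = {
    "list": list(range(n)),
    "tuple": tuple(range(n)),
    "str": "a" * n,  # built at runtime
}

footprints = {name: slice_footprint(obj, count) for name, obj in samples.items()}
for name, used in footprints.items():
    print(f"{name:5} {used / count:10.0f} bytes per slice")
```

If a slice shared storage with the original object, its per-slice figure would stay near a small constant instead of growing with n, so rerunning with a larger n is a quick way to tell the two cases apart.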

Why does CPython behave this way? Is this an implementation detail, or are all implementations required to make the same choices?

  • Slicing is defined by each type. For immutable types, whether a slice returns a new object is an implementation detail, and the details of string interning are very arcane. For list objects, a slice always creates a copy. Indeed, mylist[:] was the old idiom for creating a shallow copy (nowadays, I prefer mylist.copy() for fluency reasons). I'm trying to find the actual documentation regarding list objects, but it is just a "well-known fact" in Python that the semantics of slicing a list imply a copy. Note that slicing a memoryview also creates a new object, but the new object shares the underlying buffer! Commented Apr 7 at 19:46
  • The extra efficiency you're seeing with strings probably has nothing to do with the slicing itself; it's due to Python automatically interning the new string if it looks like a valid identifier. Try this with strings made of a repeated '@' rather than a repeated 'a', and I believe you will see different results. Commented Apr 7 at 19:47
  • ugh, so it is implied here: "s.copy() creates a shallow copy of s (same as s[:])" but like I said, the assumption is that s[:] is known to create a copy! This idiom is probably as old as Python 1! Commented Apr 7 at 19:51
  • In general, it's only important to distinguish copies of mutable types, so you can modify one without affecting the other. For immutable types, it shouldn't generally make a difference whether a shallow copy is a distinct object. Only is and id() can tell the difference. Commented Apr 7 at 21:18
  • @Barmar, it does for time and memory complexity. @jasonharper, thanks, it's your comment that's taught me the most. So there is no optimization based on immutability, neither for str nor for tuple objects (except when slicing the whole thing, for both types), just string interning. Bummer. It seems such low-hanging fruit. Commented Apr 9 at 11:13
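
The copy-versus-view distinction raised in the comments can be illustrated with a short sketch. The variable names are made up for the example; the calls themselves (list slicing, memoryview slicing, the memoryview.obj attribute) are standard Python.

```python
# A list slice is an independent shallow copy.
items = [1, 2, 3]
clone = items[:]
clone[0] = 99            # modifies only the copy
print(items, clone)

# A memoryview slice is a new object, but it views the same buffer.
data = bytearray(b"hello world")
view = memoryview(data)
tail = view[6:]          # new memoryview, no bytes copied
tail[0] = ord("W")       # writes through to the original bytearray
print(data)
print(tail is view, tail.obj is data)
```

This is the kind of sharing the question asks about for tuples and strings: the slice is a small object that refers back to existing storage instead of duplicating it.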
