Since python does slice-by-copy, slicing strings can be very costly.
I have a recursive algorithm that is operating on strings. Specifically, if a function is passed a string a, the function calls itself on a[1:] of the passed string. The hangup is that the strings are so long, the slice-by-copy mechanism is becoming a very costly way to remove the first character.
Is there a way to get around this, or do I need to rewrite the algorithm entirely?
a[:1]is slicing out the first character (if any), which is incredibly cheap. Did you meana[1:](which would slice all but the first character)?lists; sure, the len 1strobjects wouldn't be copied, but the pointers to them would be, and the pointers are 4-8 bytes a piece, vs. 1-4 bytes a piece for each character in a string. The big-O cost of alistslice is identical to that of astrslice. For Python built-in types, about the only types withO(1)slicing arememoryviewand (on Py3)range.numpyadds whole slew of view-like sequences, but it's not a built-in package.strslices copy instead of creating views partially to keep the implementation simpler (the raw data can be allocated with the object header in a single block, with no need to have the data allocated separately with separate reference counts, and avoiding the need to store an offset into the data), and partially to avoid keepalive effects. If someone does something likesmallstr = mystr[1:11]wheremystris 1 GB long, it would be ridiculous to keepmystralive forever just becausesmallstrwas looking at 10 characters of it.