196

Is there some string class in Python like StringBuilder in C#?

2
  • 10
    This is a duplicate of Python equivalent of Java StringBuffer. CAUTION: The answers here are way out of date and have, in fact, become misleading. See that other question for answers that are more relevant to modern Python versions (certainly 2.7 and above). Commented Nov 20, 2017 at 8:52
  • It's February 8, 2025 as I write this response. Yes, with Python 3 (3.13.2 to be exact in my case), you can create a reasonable facsimile of the C# (very close to Java) StringBuilder class with the most often used capabilities. One such example can be found in my GitHub repo located at github.com/fmorriso/Python-StringBuilder Commented Feb 8 at 15:14

7 Answers

149

There is no one-to-one correlation. For a really good article please see Efficient String Concatenation in Python:

Building long strings in the Python programming language can sometimes result in very slow running code. In this article I investigate the computational performance of various string concatenation methods.

TLDR the fastest method is below. It's extremely compact, and also pretty understandable:

def method6():
  return ''.join([`num` for num in xrange(loop_count)])
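
The quoted snippet is Python 2 only: backticks were shorthand for repr(), and xrange no longer exists in Python 3. A minimal Python 3 sketch of the same idea might look like the following. The loop_count parameter is an assumption here, since the article's snippet reads it from elsewhere.

```python
def method6(loop_count):
    # Backticks (`num`) were Python 2 shorthand for repr(num);
    # for integers, str(num) produces the same text.
    # xrange() became range() in Python 3.
    return ''.join([str(num) for num in range(loop_count)])


print(method6(5))  # prints 01234
```

A generator expression also works in place of the list comprehension, as one of the comments below suggests.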

5 Comments

Note that this article was written based on Python 2.2. The tests would likely come out somewhat differently in a modern version of Python (CPython usually successfully optimizes concatenation, but you don't want to depend on this in important code) and a generator expression where he uses a list comprehension would be worthy of consideration.
It would be good to pull in some highlights in that article, at the least a couple of the implementations (to avoid link rot problems).
Method 1: resultString += appendString is the fastest according to tests by @Antoine-tran below
Your quote doesn't at all answer the question. Please include the relevant parts in your answer itself, to comply with new guidelines.
As I pointed out in comments to another answer, this does not simulate how people use StringBuilder in C#. StringBuilder is not just a string concatenation method. It also provides a way for you to store string values during a long, indeterminate process.
49

Relying on compiler optimizations is fragile. The benchmarks linked in the accepted answer and numbers given by Antoine-tran are not to be trusted. Andrew Hare makes the mistake of including a call to repr in his methods. That slows all the methods equally but obscures the real penalty in constructing the string.

Use join. It's very fast and more robust.

$ ipython3
Python 3.5.1 (default, Mar  2 2016, 03:38:02) 
IPython 4.1.2 -- An enhanced Interactive Python.

In [1]: values = [str(num) for num in range(int(1e3))]

In [2]: %%timeit
   ...: ''.join(values)
   ...: 
100000 loops, best of 3: 7.37 µs per loop

In [3]: %%timeit
   ...: result = ''
   ...: for value in values:
   ...:     result += value
   ...: 
10000 loops, best of 3: 82.8 µs per loop

In [4]: import io

In [5]: %%timeit
   ...: writer = io.StringIO()
   ...: for value in values:
   ...:     writer.write(value)
   ...: writer.getvalue()
   ...: 
10000 loops, best of 3: 81.8 µs per loop
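
For anyone without IPython, here is a rough sketch of the same three measurements using only the standard timeit module. The function names are made up for illustration, and the numbers will differ by machine and Python version, so none are claimed here.

```python
import io
import timeit

# Build the input outside the timed code, as in the session above.
values = [str(num) for num in range(1000)]


def with_join():
    return ''.join(values)


def with_concat():
    result = ''
    for value in values:
        result += value
    return result


def with_stringio():
    writer = io.StringIO()
    for value in values:
        writer.write(value)
    return writer.getvalue()


if __name__ == '__main__':
    calls = 1000
    for func in (with_join, with_concat, with_stringio):
        # Best of 3 repeats, each repeat making `calls` calls.
        seconds = min(timeit.repeat(func, number=calls, repeat=3))
        print('{}: {:.2f} microseconds per call'.format(
            func.__name__, seconds / calls * 1e6))
```

All three functions return the same string, so only the construction strategy differs between them.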

8 Comments

Yes, the repr call dominates the runtime, but there's no need to make the mistake personal.
@AlexReinking sorry, nothing personal meant. I'm not sure what made you think it was personal. But if it was the use of their names, I used those only to refer to the user's answers (matches usernames, not sure if there's a better way).
good timing example that separates data initialization and concatenation operations
This answer is misleading because it creates the whole list of values together, so that it can be joined in one step. So it doesn't have the for loop. In reality, before join, you need to append the elements one by one, just like the for loop with other methods.
@FengJiang that is incorrect. The “for loop” is inside the join method implemented in C-code with optimizations. That is what makes it so fast. Moving the list initialization above the benchmarking makes the three measurements more accurate.
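
To check the concern about append cost directly, one could also time the case where the pieces arrive one at a time and the list is built inside the timed code. This is only a sketch: COUNT and the function names are assumptions, and no timings are claimed.

```python
import io
import timeit

COUNT = 1000  # hypothetical number of pieces


def append_then_join():
    parts = []
    for num in range(COUNT):
        parts.append(str(num))  # one append per piece, inside the timed code
    return ''.join(parts)


def concat_in_loop():
    result = ''
    for num in range(COUNT):
        result += str(num)
    return result


def write_to_stringio():
    writer = io.StringIO()
    for num in range(COUNT):
        writer.write(str(num))
    return writer.getvalue()


if __name__ == '__main__':
    for func in (append_then_join, concat_in_loop, write_to_stringio):
        seconds = min(timeit.repeat(func, number=1000, repeat=3))
        print('{}: {:.4f} s per 1000 calls'.format(func.__name__, seconds))
```

Here str(num) runs in every variant, so it adds the same cost to each and the comparison stays fair.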
30

I took Oliver Crow's code (from the link given by Andrew Hare) and adapted it slightly for Python 2.7.3, using the timeit package. I ran it on my personal computer: a Lenovo T61 with 6 GB RAM running Debian GNU/Linux 6.0.6 (squeeze).

Here is the result for 10,000 iterations:

method1:  0.0538418292999 secs
process size 4800 kb
method2:  0.22602891922 secs
process size 4960 kb
method3:  0.0605459213257 secs
process size 4980 kb
method4:  0.0544030666351 secs
process size 5536 kb
method5:  0.0551080703735 secs
process size 5272 kb
method6:  0.0542731285095 secs
process size 5512 kb

and for 5,000,000 iterations (method 2 was skipped because it ran far too slowly to finish in a reasonable time):

method1:  5.88603997231 secs
process size 37976 kb
method3:  8.40748500824 secs
process size 38024 kb
method4:  7.96380496025 secs
process size 321968 kb
method5:  8.03666186333 secs
process size 71720 kb
method6:  6.68192911148 secs
process size 38240 kb

It is quite obvious that the Python developers have done a great job of optimizing string concatenation, and as Hoare said: "premature optimization is the root of all evil" :-)

3 Comments

Apparently Hoare does not accept that: hans.gerwitz.com/2004/08/12/…
It is not a premature optimization to avoid fragile, interpreter-dependent optimizations. If you ever want to port to PyPy or risk hitting one of the many subtle failure cases for the optimization, do things the right way.
Looks like Method 1 is easier for the compiler to optimize.
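
One way to probe the fragility mentioned in these comments is sketched below. It assumes CPython, where, as far as I know, the in-place += shortcut only applies while nothing else references the string being grown. Both function names are hypothetical, and no timings are claimed.

```python
import timeit


def concat_unaliased(count):
    result = ''
    for _ in range(count):
        result += 'x'  # CPython can often extend the string in place here
    return result


def concat_aliased(count):
    result = ''
    for _ in range(count):
        alias = result  # a second reference to the same string object
        result += 'x'   # with the alias alive, a fresh copy is expected
    return result


if __name__ == '__main__':
    for func in (concat_unaliased, concat_aliased):
        seconds = timeit.timeit(lambda: func(20000), number=5)
        print('{}: {:.4f} s'.format(func.__name__, seconds))
```

Both functions return the same string. Any gap in the timings comes from an interpreter detail rather than from the language, which is the reason for not depending on it.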
24

Python has several things that fulfill similar purposes:

  • One common way to build large strings from pieces is to grow a list of strings and join it when you are done. This is a frequently-used Python idiom.
    • To build strings incorporating data with formatting, you would do the formatting separately.
  • For insertion and deletion at a character level, you would keep a list of length-one strings. (To make this from a string, you'd call list(your_string).) You could also use a UserString.MutableString for this, though that class exists only in Python 2.
  • (c)StringIO.StringIO is useful for things that would otherwise take a file, but less so for general string building.
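
The options in this list can be sketched in a few lines each. This is a rough Python 3 illustration, not code from the answer: the sample data is made up, and io.StringIO stands in for the Python 2 (c)StringIO modules.

```python
import io

# 1. Grow a list of pieces, then join once at the end.
parts = []
for name, score in [('ann', 3), ('bob', 5)]:   # hypothetical sample data
    parts.append('{}={}'.format(name, score))  # format each piece separately
joined = ', '.join(parts)
print(joined)  # prints ann=3, bob=5

# 2. Character-level editing on a list of length-one strings.
chars = list('helo world')
chars.insert(3, 'l')  # insert a character: 'hello world'
del chars[5]          # delete the space:   'helloworld'
edited = ''.join(chars)
print(edited)  # prints helloworld

# 3. A file-like buffer for code that expects something with write().
buffer = io.StringIO()
print('line one', file=buffer)
buffer.write('line two\n')
text = buffer.getvalue()
```

The list-based forms stay cheap because each list operation touches only the list, and the final string is built once by join.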

23

Using method 5 from above (the pseudo file), we can get very good performance and flexibility:

from io import StringIO  # Python 3; on Python 2 this was cStringIO.StringIO


class StringBuilder:
    """Minimal append-only builder backed by an in-memory text buffer."""

    def __init__(self):
        self._file_str = StringIO()

    def Append(self, text):
        # Parameter renamed from `str` so it does not shadow the built-in.
        self._file_str.write(text)

    def __str__(self):
        return self._file_str.getvalue()

Now using it:

sb = StringBuilder()

sb.Append("Hello\n")
sb.Append("World")

print(sb)
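
The same wrapper idea also works on top of a plain list, which is the join idiom recommended in other answers. The sketch below is hypothetical: ListStringBuilder and its method names are invented for illustration and are not part of the standard library.

```python
class ListStringBuilder:
    """Hypothetical list-backed builder; not a standard-library class."""

    def __init__(self):
        self._parts = []

    def append(self, value):
        self._parts.append(str(value))
        return self  # returning self allows chained calls

    def append_line(self, value=''):
        return self.append(value).append('\n')

    def clear(self):
        self._parts.clear()
        return self

    def __len__(self):
        # Total number of characters collected so far.
        return sum(len(part) for part in self._parts)

    def __str__(self):
        return ''.join(self._parts)


sb = ListStringBuilder()
sb.append('Hello').append_line(',').append('World')
print(sb)
```

Returning self from the mutating methods roughly mirrors how C# StringBuilder calls can be chained. The string itself is only built when str() is called.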

7

You can try StringIO or cStringIO.

0

There is no explicit analogue. I think you are expected to use string concatenation (likely optimized, as said before) or a third-party class. I doubt such classes are much more efficient: lists in Python are dynamically typed, so I assume there is no fast char[] to use as a buffer.

StringBuilder-like classes are not a premature optimization. They exist because strings are immutable in many languages, and that immutability allows many optimizations (for example, slices and substrings can reference the same buffer). StringBuilder/StringBuffer/stringstream-like classes work a lot faster than concatenating strings, which produces many small temporary objects that still need allocation and garbage collection. They are even faster than printf-like string formatting tools, because they avoid the overhead of interpreting a format pattern, which adds up over many format calls.
