20

Is there a more idiomatic way to sum string lengths in Python than by using a loop?

length = 0
for string in strings:
    length += len(string)

I tried sum(), but it only works for integers:

>>> sum('abc', 'de')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]
  • What do you mean by "quicker"? Less typing or faster execution? Commented Sep 23, 2010 at 16:22
  • @Richard: Sorry, I was thinking "quicker" as in less typing, but what I actually mean is idiomatic. Commented Sep 23, 2010 at 17:07
  • No worries. I think that's what everybody else figured. I'm just a pedant! Commented Sep 23, 2010 at 17:09

8 Answers

49
length = sum(len(s) for s in strings)
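A minimal sketch of why this works, assuming a hypothetical sample list named strings. sum() takes an iterable of numbers plus an optional start value, so the call in the question failed because 'de' was passed as the start value. Summing the lengths instead of the strings avoids that.

```python
# Hypothetical sample data; any list of strings works.
strings = ['abc', 'de']

# The generator expression yields one integer length per string,
# and sum() adds them up without building an intermediate list.
length = sum(len(s) for s in strings)
print(length)  # 5
```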

3 Comments

This is definitely a more idiomatic way of expressing it but I don't think it's any more efficient computationally. Still, +1 for elegance and Pythonicness!
If you're really worried about computational efficiency, you probably shouldn't use Python, or should write the computation-intense part in C or C++ (or SciPy's weave library if you're brave). I like this style because it's more legible to other Python developers.
Thanks, this is much shorter and easier to understand than my code.
18

My first way to do it would be sum(map(len, strings)). Another way is to use a list comprehension or generator expression as the other answers have posted.
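For comparison, a small sketch of the three spellings side by side, assuming hypothetical sample data. All three compute the same total. In Python 3, the list comprehension is the only one that builds a full intermediate list first.

```python
strings = ['a', 'bc', 'def', 'ghij']  # hypothetical sample data

total_map = sum(map(len, strings))               # map applies len to each item
total_listcomp = sum([len(s) for s in strings])  # builds a list of lengths first
total_genexp = sum(len(s) for s in strings)      # yields lengths one at a time

print(total_map, total_listcomp, total_genexp)  # 10 10 10
```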

2 Comments

Good answer, but I've accepted liori's answer because I found it more idiomatic.
@Josh: Most people will indeed find the genexp more pythonic. I just wanted to add this for completeness.
7

One of the shortest and fastest ways is to apply a functional programming style with map() and sum():

>>> data = ['a', 'bc', 'def', 'ghij']
>>> sum(map(len, data))
10

In Python 2, use itertools.imap instead of map for better memory performance:

>>> from itertools import imap
>>> data = ['a', 'bc', 'def', 'ghij']
>>> sum(imap(len, data))
10

5

I know this is an old question, but I can't help noting that the Python error message tells you how to do this:

TypeError: sum() can't sum strings [use ''.join(seq) instead]

So:

>>> strings = ['abc', 'de']
>>> print(len(''.join(strings)))
5
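One caveat, sketched below with hypothetical sample data: ''.join() builds a temporary concatenated string and accepts only str items. It therefore does not carry over to lists of other sized objects, whereas summing len() does.

```python
strings = ['abc', 'de']        # hypothetical sample data
mixed = ['abc', [1, 2], (3,)]  # hypothetical list of non-string sized items

# Works for strings: the temporary joined string has length 5.
joined_length = len(''.join(strings))

# Works for any objects that support len().
mixed_length = sum(len(item) for item in mixed)

# join() rejects non-string items with a TypeError.
try:
    ''.join(mixed)
    join_failed = False
except TypeError:
    join_failed = True

print(joined_length, mixed_length, join_failed)  # 5 6 True
```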

8 Comments

It seems wasteful to concatenate the strings when you don't have to, but +1 for adding another way of solving the problem!
I don't know - I long since stopped wondering whether code was CPU wasteful for non-realtime systems. But since you mentioned "less typing" this looks pretty tight.
@Zaz Wasteful? This is by far the fastest of the three solutions, if the timeit module is to be believed. The answer you accepted, sum(len(s) for s in strings), is over three times as slow, and is also almost twice as slow as sum(map(len, strings)). (Speed of course doesn't matter much in Python -- if you wanted speed you'd be using Pypy, as the saying goes -- but the full generator expression is also IMO a bit of an eyesore compared to the others.)
The other answers are more generic and useful, as they also answer the question when the element type of the list is not a string.
@aggieNick02 The other answers answer a question that was not asked! The question was about lists of strings, with the answer provided by the error message. Why overcomplicate things?
2
print(sum(len(mystr) for mystr in strings))

2

TLDR

If you care about performance, use

len(''.join(strings))

Otherwise, map will suffice without sacrificing readability or much performance:

sum(map(len, strings))

Performance metrics

Although I agree with the general consensus that when using Python your first priority should not be writing efficient and scalable code, I think it would be beneficial for this post to have some timings for the proposed answers.

Using the words from the first paragraph of lorem ipsum (list of strings omitted for brevity):

In [3]: timeit("""
    ...: length = 0
    ...: for s in strings:
    ...:     length += len(s)
    ...: """, globals=globals())
Out[3]: 5.197531974001322

In [4]: timeit("sum(len(s) for s in strings)", globals=globals())
Out[4]: 4.925184353021905

In [5]: timeit("sum(map(len, strings))", globals=globals())
Out[5]: 1.9876644779578783

In [6]: timeit("len(''.join(strings))", globals=globals())
Out[6]: 0.6793132669990882

So for a large list of strings, @Auspex's approach is clearly to be preferred.
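The session above omits its setup. A self-contained sketch of the same comparison follows. It assumes a hypothetical word list (the opening words of lorem ipsum), and it uses a smaller number of runs than timeit's default of one million to keep it short. Absolute timings vary by machine and Python version, so none are shown here.

```python
from timeit import timeit

# Hypothetical stand-in for the word list used in the session above.
strings = ("Lorem ipsum dolor sit amet consectetur adipiscing elit "
           "sed do eiusmod tempor incididunt ut labore et dolore "
           "magna aliqua").split()

# Explicit namespace so the timed statements can see `strings`.
namespace = {"strings": strings}

candidates = {
    "loop": "length = 0\nfor s in strings:\n    length += len(s)",
    "genexp": "sum(len(s) for s in strings)",
    "map": "sum(map(len, strings))",
    "join": "len(''.join(strings))",
}

# Sanity check: the three expression forms agree on the total.
totals = {name: eval(stmt, namespace)
          for name, stmt in candidates.items() if name != "loop"}

for name, stmt in candidates.items():
    seconds = timeit(stmt, globals=namespace, number=100_000)
    print(f"{name:>6}: {seconds:.3f} s")
```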

1

Here's another way using operator. Not sure if this is easier to read than the accepted answer.

import operator
from functools import reduce  # reduce is no longer a builtin in Python 3

length = reduce(operator.add, map(len, strings))

print(length)

-1

Just to add to the above: to sum numbers that are stored as strings in a list, convert each one with int() first.

nos = ['1', '14', '34']

total = sum(int(s) for s in nos)
