I'm not sure whether this has been asked before, but I couldn't find an obvious answer. I'm trying to count the number of elements in a list that are equal to a certain value. The problem is that these elements are not of a built-in type. So if I have

class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

stuff = []
for i in range(1,10):
    stuff.append(A(i/2, i%2))

Now I would like a count of the list elements whose field b equals 1. I came up with two solutions:

print [e.b for e in stuff].count(1)

and

print len([e for e in stuff if e.b == 1])

Which is the best method? Is there a better alternative? It seems that the count() method does not accept keys (at least in Python version 2.5.1).

Many thanks!

2
  • It is not a good idea to name a list 'list'. Commented Nov 19, 2009 at 17:46
  • I totally agree, and changed the name of the list. Commented Nov 19, 2009 at 19:15

5 Answers

54
sum(x.b == 1 for x in L)

A boolean (as resulting from comparisons such as x.b == 1) is also an int, with a value of 0 for False, 1 for True, so arithmetic such as summation works just fine.
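
A minimal sketch of the point above, assuming Python 3 and a list built the same way as in the question (the name `matches` is only illustrative):

```python
class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

# Same construction as in the question; // keeps integer division on Python 3.
stuff = [A(i // 2, i % 2) for i in range(1, 10)]

# bool is a subclass of int, so True and False add up as 1 and 0.
assert issubclass(bool, int)
assert True + True == 2

# Each comparison yields True or False; sum() adds them as 1 or 0.
matches = sum(x.b == 1 for x in stuff)
print(matches)  # 5, one for each odd i in 1..9
```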

This is the simplest code, but perhaps not the speediest (only timeit can tell you for sure;-). Consider (simplified case to fit well on command lines, but equivalent):

$ py26 -mtimeit -s'L=[1,2,1,3,1]*100' 'len([x for x in L if x==1])'
10000 loops, best of 3: 56.6 usec per loop
$ py26 -mtimeit -s'L=[1,2,1,3,1]*100' 'sum(x==1 for x in L)'
10000 loops, best of 3: 87.7 usec per loop

So, for this case, the "memory wasteful" approach of generating an extra temporary list and checking its length is actually solidly faster than the simpler, shorter, memory-thrifty one I tend to prefer. Other mixes of list values, Python implementations, availability of memory to "invest" in this speedup, etc, can affect the exact performance, of course.
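
To repeat the comparison on your own interpreter, something like the following sketch may help. It assumes Python 3 and uses the timeit module from a script rather than the command line; the function names and the choice of number=1000 are arbitrary, and no particular timings are implied.

```python
import timeit

L = [1, 2, 1, 3, 1] * 100

def count_with_len():
    # Builds a temporary list, then takes its length.
    return len([x for x in L if x == 1])

def count_with_sum_bool():
    # Sums the booleans produced by the comparison.
    return sum(x == 1 for x in L)

def count_with_sum_ones():
    # Sums a literal 1 for every matching element.
    return sum(1 for x in L if x == 1)

# All three must agree before their timings are worth comparing.
assert count_with_len() == count_with_sum_bool() == count_with_sum_ones() == 300

for fn in (count_with_len, count_with_sum_bool, count_with_sum_ones):
    # Results vary by machine, Python version, and data mix.
    seconds = timeit.timeit(fn, number=1000)
    print("%-22s %.4f s per 1000 calls" % (fn.__name__, seconds))
```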


17 Comments

Might be worth explaining how this works. It won't be obvious to everyone that you can add up a list of booleans.
Also, why is this a better approach than: len([e for e in list if e.b == 1]) which does not have to sum up elements?
@nicolaum and @Dave, I've added detailed explanation and timing -- which actually shows the "useless list" approach being faster than the simple one (at least in one sample case). Simplest is not always fastest: sometimes if you have memory that's free and unused anyway "investing" it can be a tradeoff saving you some time.
@Dave, I'm the one who originally coded sum in the CPython interpreter, and I assure you it never even gets close to reduce (not sure where you got the idea).
While this is very clever and the shortest code of all the examples, I find it the least pythonic and obvious. len([x for x in L if cond]) is verbose and includes some redundancy, but it's immediately obvious what it means.
19
print sum(1 for e in L if e.b == 1)

3 Comments

Nice one, I think this is more readable version of Alex Martelli's answer, summing 1 is more obvious than knowing that True can be treated as 1.
It also lends itself well as a common pattern: sum(len(n) for n in L if n.b == 1) for example.
@TendayiMawushe: summing 1 instead of boolean values is also about 30% faster, at least using Python 2.7 (see my comment to Alex' answer).
3

I would prefer the second one as it's only looping over the list once.

If you use count() you're looping over the list once to get the b values, and then looping over it again to see how many of them equal 1.
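
A small sketch that spells out the two passes, assuming Python 3 and the class A from the question (the intermediate names `b_values`, `two_pass` and `one_pass` are only for illustration):

```python
class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

stuff = [A(i // 2, i % 2) for i in range(1, 10)]

# First solution: one pass to collect every b value,
# then a second pass inside list.count() to compare them with 1.
b_values = [e.b for e in stuff]
two_pass = b_values.count(1)

# Second solution: the filter runs during the single pass over stuff.
# It still builds a temporary list, but only of the matching elements.
one_pass = len([e for e in stuff if e.b == 1])

assert two_pass == one_pass
```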

A neat way may be to use reduce():

reduce(lambda x,y: x + (1 if y.b == 1 else 0), stuff, 0)

The documentation tells us that reduce() will:

Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value.

So we define a lambda that adds one to the accumulated value only if the list item's b attribute is 1.
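
On Python 3, reduce is no longer a builtin and has to be imported from functools. A sketch of the same idea under that assumption, with the lambda replaced by a named function (`add_if_match` is a hypothetical name) to make the accumulation step easier to read:

```python
from functools import reduce

class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

stuff = [A(i // 2, i % 2) for i in range(1, 10)]

def add_if_match(total, item):
    # total is the count so far; item is the next element of the list.
    return total + (1 if item.b == 1 else 0)

# 0 is the initial value of the accumulator.
n = reduce(add_if_match, stuff, 0)
print(n)
```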

3 Comments

I like this approach the best. Don't understand why it doesn't receive any upvotes.
@phunehehe: I suppose it didn't get upvotes since it's by far the slowest and most verbose alternative proposed here.
Funny, I don't remember anymore. Maybe this answer fits with what I was doing (which I don't remember either) :D
1

To hide reduce details, you may define a count function:

def count(condition, stuff):
    # Add 1 to the running total s for each item x that satisfies condition.
    return reduce(lambda s, x: s + (1 if condition(x) else 0),
                  stuff, 0)

Then you may use it by providing the condition for counting:

n = count(lambda i: i.b, stuff)
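
The same kind of helper can also be written without reduce. The sketch below swaps in a generator expression passed to sum(); it assumes Python 3, and `count_if` is a hypothetical name chosen to avoid confusion with list.count:

```python
def count_if(condition, items):
    """Return how many items satisfy condition."""
    return sum(1 for item in items if condition(item))

class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

stuff = [A(i // 2, i % 2) for i in range(1, 10)]

# An explicit comparison keeps the intent clear even if b can hold
# values other than 0 and 1.
n = count_if(lambda item: item.b == 1, stuff)

# The helper works for any iterable and any predicate.
evens = count_if(lambda k: k % 2 == 0, range(10))
```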


-1

Given the input

name = ['ball', 'jeans', 'ball', 'ball', 'ball', 'jeans']
price = [1, 4, 1, 1, 1, 4]
weight = [2, 2, 2, 3, 2, 2]

First create a defaultdict to record the occurrence

from collections import defaultdict
occurrences = defaultdict(int)

Increment the count

for n, p, w in zip(name, price, weight):
    occurrences[(n, p, w)] += 1

Finally count the ones that appear more than once (True will yield 1)

print(sum(cnt > 1 for cnt in occurrences.values()))
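
As a possible shortcut, collections.Counter (in the standard library since Python 2.7) can do the tallying in one step. A sketch with the same input, where `repeated` is only an illustrative name:

```python
from collections import Counter

name = ['ball', 'jeans', 'ball', 'ball', 'ball', 'jeans']
price = [1, 4, 1, 1, 1, 4]
weight = [2, 2, 2, 3, 2, 2]

# Counter tallies each (name, price, weight) tuple produced by zip.
occurrences = Counter(zip(name, price, weight))

# Count the distinct tuples that appear more than once.
repeated = sum(cnt > 1 for cnt in occurrences.values())
print(repeated)  # 2: ('ball', 1, 2) and ('jeans', 4, 2)
```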

