Conditional counting in Python

Question

I'm not sure whether this has been asked before, but I couldn't find an obvious answer. I'm trying to count the number of elements in a list that are equal to a certain value. The problem is that these elements are not of a built-in type. So if I have

class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

stuff = []
for i in range(1,10):
    stuff.append(A(i/2, i%2))

Now I would like a count of the list elements whose field b = 1. I came up with two solutions:

print [e.b for e in stuff].count(1)

and

print len([e for e in stuff if e.b == 1])

Which is the better method? Is there a better alternative? It seems that the count() method does not accept keys (at least in Python 2.5.1).

Many thanks!

It is not a good idea to name a list 'list'.

– MAK, Commented Nov 19, 2009 at 17:46
I totally agree, and changed the name of the list.

– nicolaum, Commented Nov 19, 2009 at 19:15

Alex Martelli · Accepted Answer · 2009-11-19 17:07:56Z

54

sum(x.b == 1 for x in L)

A boolean (as resulting from comparisons such as x.b == 1) is also an int, with a value of 0 for False, 1 for True, so arithmetic such as summation works just fine.
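To see this on the question's data, here is a minimal sketch. It assumes Python 3 (so `//` replaces the question's integer `/`) and rebuilds the question's `stuff` list; nothing else is new.

```python
class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

# Same data as in the question; // keeps the integer division on Python 3.
stuff = [A(i // 2, i % 2) for i in range(1, 10)]

# bool is a subclass of int, so True adds 1 and False adds 0.
assert isinstance(True, int)
assert True + True == 2

matches = sum(x.b == 1 for x in stuff)
print(matches)  # 5, because i % 2 == 1 for i in 1, 3, 5, 7, 9
```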

This is the simplest code, but perhaps not the speediest (only timeit can tell you for sure;-). Consider (simplified case to fit well on command lines, but equivalent):

$ py26 -mtimeit -s'L=[1,2,1,3,1]*100' 'len([x for x in L if x==1])'
10000 loops, best of 3: 56.6 usec per loop
$ py26 -mtimeit -s'L=[1,2,1,3,1]*100' 'sum(x==1 for x in L)'
10000 loops, best of 3: 87.7 usec per loop

So, for this case, the "memory wasteful" approach of generating an extra temporary list and checking its length is actually solidly faster than the simpler, shorter, memory-thrifty one I tend to prefer. Other mixes of list values, Python implementations, availability of memory to "invest" in this speedup, etc, can affect the exact performance, of course.
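If you want to repeat the measurement on your own interpreter, a sketch along these lines should do, using the standard timeit module from inside a script. The three function names are made up for this example, and the numbers you get will differ from the Python 2.6 figures above.

```python
import timeit

L = [1, 2, 1, 3, 1] * 100

def with_len():
    # Builds a temporary list, then takes its length.
    return len([x for x in L if x == 1])

def with_sum_bool():
    # Sums the booleans produced by the comparison.
    return sum(x == 1 for x in L)

def with_sum_ones():
    # Sums a 1 for every element that passes the filter.
    return sum(1 for x in L if x == 1)

for func in (with_len, with_sum_bool, with_sum_ones):
    # repeat() returns one total time per run; the minimum is the least noisy.
    best = min(timeit.repeat(func, number=1000, repeat=3))
    print("%-14s %.1f usec per loop" % (func.__name__, best / 1000 * 1e6))
```

All three return the same count, so only the timing differs between them.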

edited Nov 19, 2009 at 17:07

answered Nov 19, 2009 at 16:03

17 Comments

David Webb Over a year ago

Might be worth explaining how this works. It won't be obvious to everyone that you can add up a list of booleans.

nicolaum Over a year ago

Also, why is this a better approach than: len([e for e in list if e.b == 1]) which does not have to sum up elements?

Alex Martelli Over a year ago

@nicolaum and @Dave, I've added detailed explanation and timing -- which actually shows the "useless list" approach being faster than the simple one (at least in one sample case). Simplest is not always fastest: sometimes if you have memory that's free and unused anyway "investing" it can be a tradeoff saving you some time.

Alex Martelli Over a year ago

@Dave, I'm the one who originally coded sum in the CPython interpreter, and I assure you it never even gets close to reduce (not sure where you got the idea).

Ben James Over a year ago

While this is very clever and the shortest code of all the examples, I find it the least pythonic and obvious. len([x for x in L if cond]) is verbose and includes some redundancy, but it's immediately obvious what it means.

Roger Pate · Answer · 2009-11-19 15:57:23Z

19

print sum(1 for e in L if e.b == 1)

answered Nov 19, 2009 at 15:57

3 Comments

Tendayi Mawushe Over a year ago

Nice one, I think this is a more readable version of Alex Martelli's answer; summing 1 is more obvious than knowing that True can be treated as 1.

Roger Pate Over a year ago

It also lends itself well as a common pattern: sum(len(n) for n in L if n.b == 1) for example.

Frerich Raabe Over a year ago

@TendayiMawushe: summing 1 instead of boolean values is also about 30% faster, at least using Python 2.7 (see my comment to Alex' answer).
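A short sketch of the pattern discussed in these comments, assuming Python 3. The class mirrors the question's A, but the string values stored in `a` are invented here so that `len()` has something to measure.

```python
class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

# Hypothetical data: a is a string of length i, b is 1 for odd i.
L = [A("x" * i, i % 2) for i in range(1, 10)]

# Count the matching elements: one 1 per element that passes the filter.
n_matches = sum(1 for e in L if e.b == 1)

# Same shape, different summand: total length of the matching a values.
total_len = sum(len(e.a) for e in L if e.b == 1)

print(n_matches, total_len)  # 5 25
```

The filter stays the same in both lines; only the expression being summed changes.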

David Webb · Answer · 2009-11-19 16:11:03Z

3

I would prefer the second one as it's only looping over the list once.

If you use count() you're looping over the list once to get the b values, and then looping over it again to see how many of them equal 1.

A neat way may be to use reduce():

reduce(lambda x, y: x + (1 if y.b == 1 else 0), stuff, 0)

The documentation tells us that reduce() will:

Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value.

So we define a lambda that adds one to the accumulated value only if the list item's b attribute is 1.
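For anyone on Python 3, here is a hedged sketch of the same idea. reduce() has to be imported from functools there, and the lambda is spelled out as a named function (`step` is a name chosen for this example) to make the accumulator visible.

```python
from functools import reduce  # reduce is no longer a builtin on Python 3

class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

# Same data as in the question; // keeps the integer division on Python 3.
stuff = [A(i // 2, i % 2) for i in range(1, 10)]

def step(total, item):
    # total is the count so far; item is the next list element.
    return total + (1 if item.b == 1 else 0)

# Start from 0 and fold step over the list from left to right.
count = reduce(step, stuff, 0)
print(count)  # 5
```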

edited Nov 19, 2009 at 16:11

answered Nov 19, 2009 at 15:58

3 Comments

phunehehe Over a year ago

I like this approach the best. Don't understand why it doesn't receive any upvotes.

Frerich Raabe Over a year ago

@phunehehe: I suppose it didn't get upvotes since it's by far the slowest and most verbose alternative proposed here.

phunehehe Over a year ago

Funny, I don't remember anymore. Maybe this answer fits with what I was doing (which I don't remember either) :D

Calvin · Answer · 2014-12-27 08:57:30Z

1

To hide reduce details, you may define a count function:

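from functools import reduce  # needed on Python 3; reduce is a builtin on Python 2

def count(condition, stuff):
    # Add 1 to the running total for every item that satisfies the condition.
    return reduce(lambda s, x: s + (1 if condition(x) else 0), stuff, 0)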

Then you may use it by providing the condition for counting:

n = count(lambda i: i.b, stuff)

answered Dec 27, 2014 at 8:57

Alan Wu · Answer · 2021-02-26 07:39:38Z

-1

Given the input

name = ['ball', 'jeans', 'ball', 'ball', 'ball', 'jeans']
price = [1, 4, 1, 1, 1, 4]
weight = [2, 2, 2, 3, 2, 2]

First create a defaultdict to record the occurrence

from collections import defaultdict
occurrences = defaultdict(int)

Increment the count

for n, p, w in zip(name, price, weight):
    occurrences[(n, p, w)] += 1

Finally count the ones that appear more than once (True will yield 1)

print(sum(cnt > 1 for cnt in occurrences.values()))
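As an alternative sketch (not part of the original answer), collections.Counter can do the tallying in one step. It is a swapped-in replacement for the defaultdict loop; the variable `repeated` is a name made up for this example.

```python
from collections import Counter

name = ['ball', 'jeans', 'ball', 'ball', 'ball', 'jeans']
price = [1, 4, 1, 1, 1, 4]
weight = [2, 2, 2, 3, 2, 2]

# Counter tallies each (name, price, weight) tuple produced by zip.
occurrences = Counter(zip(name, price, weight))

# ('ball', 1, 2) appears 3 times, ('jeans', 4, 2) twice, ('ball', 1, 3) once.
repeated = sum(cnt > 1 for cnt in occurrences.values())
print(repeated)  # 2
```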

answered Feb 26, 2021 at 7:39
