How to count values across fields in Django?

Question

MODELS

class ModelA(models.Model):
    name = models.CharField()

class ModelB(models.Model):
    MY_CHOICES = (
        ('X', 'X'),
        ('Y', 'Y'),
        ('Z', 'Z'),
    )
    modela = models.ForeignKey(ModelA, on_delete=models.CASCADE)
    txt_1 = models.CharField(choices=MY_CHOICES)
    txt_2 = models.CharField(choices=MY_CHOICES)

Given the simplified example above, how could I count how many times each of the choices values has been recorded given that there are Two fields that need to be counted?

Ideally the result would be something along the lines of:

{'X': 15, 'Y': 27, 'Z': 89}

I have tried the following, however in my real model, I have around 20 fields to count across and this is not giving the results I was hoping for:

ModelA.objects.values('modelb__txt1', 'modelb__txt2').annotate(Count('modelb__txt1', 'modelb__txt2'))

I have previously created huge dictionaries and manually sorted/counted the values, but this is now unmanageable and ugly.

Say there is a row (X, Y), does this mean we count this as one for both X and Y? — willeM_ Van Onsem
– willeM_ Van Onsem, Commented Jul 27, 2018 at 12:06
What in case it is (X, X)? Does this count for 1 to X? Or twice? — willeM_ Van Onsem
– willeM_ Van Onsem, Commented Jul 27, 2018 at 12:06
@WillemVanOnsem - We would count each value so (X, Y) would be {'X': 1, 'Y':1} and (X, X) would be {'X': 2, 'Y': 0}. — Karl
– Karl, Commented Jul 27, 2018 at 12:11

willeM_ Van Onsem · Accepted Answer · 2018-07-27 12:25:01Z

1

With one query (for a limited number of columns)

With one query, we can do this like:

from django.db.models import Count

qs = ModelB.objects.values('txt_1', 'txt_2').annotate(
    cnt=Count('id')
).order_by('txt_1', 'txt_2')

But now we are still not there, since now we have for every combination of txt_1 and txt_2 the number of elements. We want to "flatten" this to every individual choice. We can do this for example by constructing a Counter [Python-doc]:

from collections import Counter

result = Counter()
for row in qs:
    result[row['txt_1']] += row['cnt']
    result[row['txt_2']] += row['cnt']

For every row of this QuerySet, we thus add the number (cnt) to the two keys. This thus means that we count a row where both txt_1, and txt_2 have value 'X' twice.

A Counter is a subclass of a dictionary, but if you want to cast it to a dictionary, you can later write:

result_dict = dict(result)

Options that are never selected, will not appear in the dictionary, since the queryset will not contain these, and thus we will never add these to the Counter. But we can of course post-process the dictionary, and add 0 for these.

With n queries (with n the number of columns)

The above will work usually quite well. If the number of choices is however quite large, the processing will be more on the Python side, which is typically slower. We can then make a linearization, and work with two queries:

from collections import Counter
from django.db.models import Count

result = Counter()
for col in ['txt_1', 'txt_2']:
    qs = ModelB.objects.values(col).annotate(cnt=Count('id')).order_by(col)
    result.update({q[col]: q['cnt'] for q in qs})

This will here reduce in two queries. But in this case each query will (at most) return three rows. Whereas the other approach will result in one query returning (at most) nine rows. For a small amount of rows, that is not a problem. But the number of cases can easily grow exponential in the number of columns.

edited Jul 27, 2018 at 12:25

answered Jul 27, 2018 at 12:10

willeM_ Van Onsem

482k33 gold badges483 silver badges624 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Karl Over a year ago

Thanks for the quick answer, Ill give this a try now!

Karl Over a year ago

Thanks again, the single query approach appears to be working great!

Collectives™ on Stack Overflow

How to count values across fields in Django?

1 Answer 1

With one query (for a limited number of columns)

With n queries (with n the number of columns)

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

With one query (for a limited number of columns)

With n queries (with n the number of columns)

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related