12

Is it possible to calculate the cumulative (running) sum using django's orm? Consider the following model:

class AModel(models.Model):
    a_number = models.IntegerField()

with a data set in which several ( >1 ) AModel instances exist in the database, all with a_number=1. I'd like to be able to return the following:

AModel.objects.annotate(cumsum=??).values('id', 'cumsum').order_by('id')
>>> ({id: 1, cumsum: 1}, {id: 2, cumsum: 2}, ... {id: N, cumsum: N})

Ideally, I'd like to be able to limit/filter the cumulative sum. So in the above case I'd like to limit the result to cumsum <= 2.

I believe that in PostgreSQL one can achieve a cumulative sum using window functions. How is this translated to the ORM?

7
  • I don't get it. What is cumsum? And there's only one record with id=1 Commented Apr 20, 2017 at 11:23
  • cumsum == cumulative sum, obviously this is for more than one record - edited to make clearer so the size of the set of data is greater than one. Commented Apr 20, 2017 at 11:32
  • I don't think you can do it with the ORM... use Python instead Commented Apr 20, 2017 at 11:41
  • The phrase you are looking for is the running total (or running sum). Which is a specialized case of moving aggregates. Which are a kind of window functions. Commented Apr 20, 2017 at 13:09
  • I think cumulative sum and running total are the same thing mathworld.wolfram.com/CumulativeSum.html . But yes it is a window function I'm after. Commented Apr 20, 2017 at 13:13

5 Answers

18

For reference, starting with Django 2.0 it is possible to use the Window function to achieve this result:

from django.db.models import F, Sum, Window

AModel.objects.annotate(cumsum=Window(Sum('a_number'), order_by=F('id').asc()))\
              .values('id', 'cumsum').order_by('id', 'cumsum')
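
To make the expected result concrete, here is a minimal plain-Python sketch of what a running sum ordered by id computes, including the cumsum <= 2 limit from the question. It does not use Django at all; the rows list and the with_cumsum helper are hypothetical stand-ins for the table and the annotation.

```python
from itertools import accumulate

# Hypothetical stand-in for the AModel table: five rows, all with a_number=1.
rows = [{"id": i, "a_number": 1} for i in range(1, 6)]


def with_cumsum(rows):
    """Return id/cumsum pairs, accumulating a_number in id order."""
    ordered = sorted(rows, key=lambda r: r["id"])
    totals = accumulate(r["a_number"] for r in ordered)
    return [{"id": r["id"], "cumsum": t} for r, t in zip(ordered, totals)]


result = with_cumsum(rows)

# The limit is applied after the running sum has been computed.
limited = [r for r in result if r["cumsum"] <= 2]
print(limited)
```

The limit is applied only after the running total exists, which is the same order of operations a database needs when filtering on a window function.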

1 Comment

Also worth knowing: window functions do not work with SQLite versions older than 3.25.
5

Starting from Dima Kudosh's answer and based on https://stackoverflow.com/a/5700744/2240489, I had to do the following: I removed the reference to PARTITION BY in the SQL and replaced it with ORDER BY, resulting in:

from django.db.models import Func, Sum

AModel.objects.annotate(
    cumsum=Func(
        Sum('a_number'),
        template='%(expressions)s OVER (ORDER BY %(order_by)s)',
        order_by="id"
    )
).values('id', 'cumsum').order_by('id', 'cumsum')

This gives the following sql:

SELECT "amodel"."id",
SUM("amodel"."a_number") 
OVER (ORDER BY id) AS "cumsum" 
FROM "amodel" 
GROUP BY "amodel"."id" 
ORDER BY "amodel"."id" ASC, "cumsum" ASC

Dima Kudosh's answer was not summing the results but the above does.
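
As a rough way to try the window-function SQL outside Django, the sketch below runs an equivalent query against an in-memory SQLite database using Python's standard sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (older versions lack window functions). The amodel table layout and the limited_cumsum helper are illustrative assumptions, not part of the original answer. The running sum is wrapped in a subquery so that it can be limited to cumsum <= 2, since a window function cannot be referenced directly in WHERE.

```python
import sqlite3

# Window functions are evaluated after WHERE, so the limit goes in an outer query.
WINDOW_SQL = """
SELECT id, cumsum
FROM (
    SELECT id, SUM(a_number) OVER (ORDER BY id) AS cumsum
    FROM amodel
) AS t
WHERE cumsum <= ?
ORDER BY id
"""


def limited_cumsum(conn, limit):
    """Return (id, cumsum) rows whose running sum does not exceed limit."""
    return conn.execute(WINDOW_SQL, (limit,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE amodel (id INTEGER PRIMARY KEY, a_number INTEGER NOT NULL)"
)
# Five rows, all with a_number = 1, as in the question.
conn.executemany("INSERT INTO amodel (a_number) VALUES (?)", [(1,)] * 5)

print(limited_cumsum(conn, 2))
```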

2

For posterity, I found this to be a good solution for me. I didn't need the result to be a QuerySet, so I could afford to do this, since I was just going to plot the data using D3.js:

import datetime

import numpy as np

today = datetime.date.today()

raw_data = MyModel.objects.filter(date=today).values_list('a_number', flat=True)

# Evaluate the queryset into a plain list before handing it to NumPy.
cumsum = np.cumsum(list(raw_data))

1

You can try to do this with a Func expression.

from django.db.models import Func, Sum

AModel.objects.annotate(
    cumsum=Func(
        Sum('a_number'),
        template='%(expressions)s OVER (PARTITION BY %(partition_by)s)',
        partition_by='id'
    )
).values('id', 'cumsum').order_by('id')

1 Comment

Thanks, really appreciate your answer. It didn't quite work for me and I've posted my amendment.
0

Check this

AModel.objects.order_by("id").extra(select={"cumsum":'SELECT SUM(m.a_number) FROM table_name m WHERE m.id <= table_name.id'}).values('id', 'cumsum')

where table_name should be the name of the table in the database.
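
For illustration, the correlated subquery used above can be tried on its own with Python's standard sqlite3 module. This is only a sketch: the in-memory table, its sample values, and the alias t are assumptions made for the example. Unlike a window function, this form does not depend on window-function support in the database, although it re-scans the table for every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (id INTEGER PRIMARY KEY, a_number INTEGER NOT NULL)"
)
# Hypothetical sample values; ids are assigned as 1, 2, 3.
conn.executemany("INSERT INTO table_name (a_number) VALUES (?)", [(3,), (1,), (4,)])

# For each row, sum a_number over all rows with an id up to and including its own.
SQL = """
SELECT t.id,
       (SELECT SUM(m.a_number) FROM table_name m WHERE m.id <= t.id) AS cumsum
FROM table_name t
ORDER BY t.id
"""

rows = conn.execute(SQL).fetchall()
print(rows)
```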
