
I have the following code, where a function-based view uses a ModelSerializer to serialize data. I am running this with Apache + mod_wsgi (configured with a single daemon process and a single thread, for the sake of simplicity).

With this, my memory usage shoots up significantly (200 MB to 1 GB, depending on how large the query is), stays there, and does not come down even after the request completes. On subsequent requests to the same view/URL the memory increases slightly every time, but never takes another significant jump. To rule out issues with django-filter, I have modified my view and written the filtering query myself.

The usual suspect, DEBUG=True, is ruled out, as I am not running in DEBUG mode. I have also tried using guppy to see what is happening, but was unable to get far with it. Could someone please explain why the memory usage does not go down after the request completes, and how to go about debugging it?
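For reference, a quick alternative to guppy is Python's built-in `tracemalloc` module (Python 3.4+), which can show how much memory a given call allocates. A minimal sketch (the `measure` helper below is my own, not part of any library):

```python
import tracemalloc


def measure(func):
    """Run func and return the peak memory (bytes) traced during the call."""
    tracemalloc.start()
    func()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


# Example: how much does building a 1M-element list cost?
peak = measure(lambda: [0] * 1_000_000)
print("peak traced allocation: %d bytes" % peak)
```

Wrapping the serializer call (`lambda: MeterDataSerializer(queryset, many=True).data`) in `measure` would show how much of the spike comes from serialization versus the queryset cache.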

Update: I am using the default CACHES setting, i.e. I have not defined it at all, in which case I presume Django falls back to local-memory caching, as mentioned in the docs:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
    }
}



from django.db import models


class MeterData(models.Model):
    meter = models.ForeignKey(Meter)
    datetime = models.DateTimeField()

    # Active Power Total
    w_total = models.DecimalField(max_digits=13, decimal_places=2,
                                  null=True)
    ...


from rest_framework import serializers


class MeterDataSerializer(serializers.ModelSerializer):
    class Meta:
        model = MeterData
        exclude = ('meter', )


import logging

from rest_framework.decorators import api_view, permission_classes
from rest_framework.permissions import AllowAny
from rest_framework.response import Response

logger = logging.getLogger(__name__)


@api_view(['GET', ])
@permission_classes((AllowAny,))
def test(request):
    startDate = request.GET.get('startDate', None)
    endDate = request.GET.get('endDate', None)
    meter_pk = request.GET.get('meter', None)
    # Writing the query ourselves instead of using django-filter
    # to keep things simple.
    queryset = MeterData.objects.filter(meter__pk=meter_pk,
                                        datetime__gte=startDate,
                                        datetime__lte=endDate)

    logger.info(queryset.query)
    kwargs = {}
    kwargs['context'] = {
        'request': request,
        'view': test,
        'format': 'format',
    }
    kwargs['many'] = True

    serializer = MeterDataSerializer(queryset, **kwargs)
    return Response(serializer.data)
  • What cache backend are you using? Commented Dec 1, 2016 at 11:42
  • @Sayse: I have left the CACHES setting at its default, i.e. not defined it, in which case I presume it will use local memory for the cache. Commented Dec 1, 2016 at 11:56
  • I don't believe it is a leak; I think what you're seeing is cached data (local-memory caching), but I don't have enough information to say for certain. Commented Dec 1, 2016 at 12:00
  • 1
    Ignoring the cache, the way the UNIX memory model works is that when a process allocates memory, even if it is freed, it is only freed back to the in process memory allocator in most cases, it doesn't get freed back to the operating system. Thus the process memory usage will not reduce, but the memory will still be reused for subsequent memory allocations within the same process. Commented Dec 1, 2016 at 23:46
  • 1
    So if you pull in huge amounts of data and work on it, you can expect memory usage of the process to be quite high. If you are processing the data, if possible don't pull it all into memory at the same time, but pull the data in batches and process it a part at a time. Commented Dec 1, 2016 at 23:47
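The batching advice in the last comment can be sketched generically (plain Python, no ORM; with Django you would typically slice the queryset, or use `queryset.iterator()` to avoid the queryset's internal result cache):

```python
def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`.

    Works on any iterable, including a lazy one, so only one batch
    is held in memory at a time.
    """
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly short, batch
        yield batch


for batch in chunked(range(5), 2):
    print(batch)
```

In the view above, that would mean serializing `chunked(queryset.iterator(), 1000)` batch by batch (batch size 1000 is an arbitrary example) rather than materializing the whole result set at once.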

1 Answer


Whilst I can't say for certain, I'll add this as an answer anyway to be judged on it...

As you know, Django's default cache is the LocMemCache.

In the docs for that backend you'll find:

Note that each process will have its own private cache instance

And I think this is all you're seeing. The jump in memory is just the storage of your query results. I'd only be concerned if that memory usage continued to grow beyond what is normal for the data you're caching.

The same doc also notes that this backend probably isn't a good choice for production, so it might be time to move beyond it, which would also let you confirm whether caching is the culprit.
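If you want to test that theory, one option (my suggestion, not something from the question) is to point the default cache at an out-of-process backend such as Memcached, so the cached data no longer lives in the mod_wsgi process:

```python
# settings.py -- hypothetical replacement for the implicit LocMemCache default.
# Assumes a Memcached server running locally on the standard port.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}
```

Alternatively, setting `'BACKEND'` to `'django.core.cache.backends.dummy.DummyCache'` disables caching entirely; if memory still balloons with the dummy backend, the cache isn't the culprit.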
