Python/Django loop optimization

Question

I am currently working on a Django website and at a certain point need to generate a dataset for a graph out of model data. Over the last few days I have been trying to optimize the code that generates that data, but it's still relatively slow, and the dataset will definitely get a lot larger once the website is up.

The model I am working with looks like this:

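    from datetime import date

    from django.db import models


    class Prices(models.Model):
        name = models.CharField(max_length=200, blank=False)
        # WEBSITES is the choices list defined elsewhere in the project.
        website = models.CharField(max_length=200, blank=False, choices=WEBSITES)
        # Pass the callable (date.today), not date.today(), so the default
        # is evaluated for each new row rather than once at import time.
        date = models.DateField('date', default=date.today, blank=False, null=False)
        price = models.IntegerField(default=0, blank=False, null=False)

        class Meta:
            indexes = [
                models.Index(fields=['website']),
                models.Index(fields=['date']),
            ]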

It stores prices for different items (identified by name) on different websites. For the graph representation of the model I need two arrays: one with all the dates as the y-axis, and one with the price of the item on each date (which might not be available, since not every item has an entry for every day on every website). I need this for every website.

My code for generating that data currently looks like this:

    for website in websites:
        # Assumes the rows come back in the same date order as labels.
        iterator = Prices.objects.filter(website=website).iterator()
        data = []
        # A default of None avoids StopIteration when a website has no rows.
        entry = next(iterator, None)
        for date in labels:
            if entry is not None and date == entry.date:
                data.append(entry.price * 0.01)
                entry = next(iterator, None)
            else:
                data.append(None)
        # ... do stuff with data (not relevant for performance)

I loop over each website and retrieve all of its price data from my model. Then I loop over all dates (held in the array labels), check whether an entry exists for that date, and if so add its price to the array; otherwise I add None. Does anybody have tips or ideas on how to optimize the inner for loop? It accounts for about 90% of my performance problem.

You can filter your website data over the dates list directly, and even multithread your function to make things go even faster.
– BATMAN, Commented Sep 16, 2020 at 12:09
The model query is not my performance problem; the for loop is.
– Tobias, Commented Sep 16, 2020 at 13:42
Just to be clear: you have a list of dates (labels) and you want to filter your database entries (iterator) so that they match the dates from the list. Right?
– BATMAN, Commented Sep 16, 2020 at 17:29
I have a list of dates and, for every website, want to generate a list of prices on those dates.
– Tobias, Commented Sep 16, 2020 at 18:38

BATMAN · Accepted Answer · 2020-09-20 16:31:52Z

1

This might not be as efficient as you want but you can try this. If this still does not meet your requirements, then all I can think of is doing this asynchronously.

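    from collections import defaultdict
    from functools import reduce  # reduce is not a builtin in Python 3

    iterator = Prices.objects.filter(website=website)
    # Group (price, website) tuples by date, keeping only dates found in labels.
    result = reduce(lambda acc, prices: acc[prices.date].append((prices.price * 0.01, prices.website)) or acc, filter(lambda x: x.date in labels, iterator), defaultdict(list))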
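If the goal is one price (or None) per entry in labels, a dictionary lookup may be simpler than either the merge loop or the reduce call. The following is only a sketch under stated assumptions: `Row` and `align_prices` are hypothetical names, `Row` stands in for a `Prices` instance so the example runs without Django, and it assumes at most one row per date for the website being processed.

```python
from collections import namedtuple
from datetime import date

# Hypothetical stand-in for a Prices row, so the sketch runs without Django.
Row = namedtuple("Row", ["date", "price"])


def align_prices(rows, labels):
    """Return one price per label date, or None where no row exists."""
    # Build the lookup once so each label costs a single dict access.
    # Assumes at most one row per date; a later duplicate would overwrite.
    price_by_date = {row.date: row.price * 0.01 for row in rows}
    return [price_by_date.get(day) for day in labels]


labels = [date(2020, 9, 14), date(2020, 9, 15), date(2020, 9, 16)]
rows = [Row(date(2020, 9, 14), 1999), Row(date(2020, 9, 16), 2100)]
data = align_prices(rows, labels)  # the middle date has no row, so it maps to None
```

Unlike the merge loop in the question, this does not depend on the rows arriving in date order, and it avoids the repeated `x.date in labels` list scan.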

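One way to do this asynchronously is with the concurrent.futures module.

ThreadPoolExecutor(max_workers=10) lets you cap the number of worker threads.

If you want multiple processes instead of threads, you can replace ThreadPoolExecutor with ProcessPoolExecutor. Note that worker processes do not share memory with the parent, so a function that mutates a module-level dict, as in the code that follows, would have to return its values instead.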

    import concurrent.futures
    from collections import defaultdict

    result = defaultdict(list)

    def reducer_function(price):
        # Keep only rows whose date appears in labels.
        if price.date in labels:
            result[price.date].append((price.price * 0.01, price.website))

    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.map(reducer_function, iterator)
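A variant that returns values from each task instead of mutating shared state might look like the sketch below. It is an illustration, not a measured improvement: `Row`, `build_series`, `build_all`, and the `rows_by_website` mapping are hypothetical names, and `Row` stands in for a `Prices` instance so the code runs without Django. In CPython with the GIL, threads mainly help when the work waits on I/O such as database queries, so a pure-Python loop like this one may see little or no speedup.

```python
from collections import namedtuple
from concurrent.futures import ThreadPoolExecutor
from datetime import date

# Hypothetical stand-in for a Prices row, so the sketch runs without Django.
Row = namedtuple("Row", ["website", "date", "price"])


def build_series(rows, labels):
    """Return one price per label date for a single website, None if missing."""
    price_by_date = {row.date: row.price * 0.01 for row in rows}
    return [price_by_date.get(day) for day in labels]


def build_all(rows_by_website, labels, max_workers=4):
    """Build the series for every website, one task per website."""
    websites = list(rows_by_website)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in input order, so they line up with websites.
        series = executor.map(
            lambda site: build_series(rows_by_website[site], labels), websites
        )
        return dict(zip(websites, series))


labels = [date(2020, 9, 14), date(2020, 9, 15)]
rows_by_website = {
    "shop-a": [Row("shop-a", date(2020, 9, 14), 1500)],
    "shop-b": [Row("shop-b", date(2020, 9, 15), 2500)],
}
result = build_all(rows_by_website, labels)
```

Because each task returns its own list, nothing is shared between workers. Switching to ProcessPoolExecutor would additionally require replacing the lambda with a picklable module-level function.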

edited Sep 20, 2020 at 16:31

answered Sep 17, 2020 at 6:18

4 Comments

Tobias Over a year ago

Yeah, I guess doing it asynchronously is the way to go. Thanks for your help though!

BATMAN Over a year ago

Btw, did this solution work for you? If it did, you can use the same approach when mapping this to a concurrent.futures executor. May the Force be with you!

Tobias Over a year ago

Yes, got it working with a bit of tweaking. It's actually a little bit faster than my version, so I'm going to try doing this asynchronously now.

BATMAN Over a year ago

I tried doing that my way. Take a look and see if it works out.
