I am currently working on a Django website and at a certain point need to generate a dataset for a graph out of model data. Over the last few days I have been trying to optimize the code that generates that data, but it is still relatively slow, and the dataset will definitely get a lot larger once the website is up.

The model I am working with looks like this:

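    from datetime import date

    from django.db import models

    class Prices(models.Model):
        name = models.CharField(max_length=200, blank=False)
        # WEBSITES is the choices list, defined elsewhere
        website = models.CharField(max_length=200, blank=False, choices=WEBSITES)
        # Pass the callable date.today, not date.today(), so the default is
        # evaluated when a row is created rather than once at import time
        date = models.DateField('date', default=date.today, blank=False, null=False)
        price = models.IntegerField(default=0, blank=False, null=False)

        class Meta:
            indexes = [
                models.Index(fields=['website']),
                models.Index(fields=['date']),
            ]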

It stores prices for different items (identified by name) on different websites. For the graph I need two arrays per website: one with all the dates as the y-axis, and one with the price of the item on each of those dates (which might be missing, since not every item has an entry for every day on every website).

My code for generating that data currently looks like this:

    for website in websites:
        # The walk below only works if rows arrive sorted by date, like labels
        iterator = Prices.objects.filter(website=website).order_by('date').iterator()
        data = []
        entry = next(iterator, None)  # None if this website has no rows at all
        for date in labels:
            if entry is not None and date == entry.date:
                data.append(entry.price * 0.01)
                entry = next(iterator, None)
            else:
                data.append(None)
        # ... do stuff with data (not relevant for performance)

I loop over each website and retrieve all of its price data from my model. Then I loop over all dates (held in the array labels), check whether an entry exists for that date, and if so append its price to the array, otherwise None. Does anybody have tips or ideas on how to optimize the inner for loop? That is what accounts for 90% of my performance problem.

  • You can filter your website data over the dates list directly, and even multithread your function to make things go faster. Commented Sep 16, 2020 at 12:09
  • The model query is not my performance problem, the for loop is. Commented Sep 16, 2020 at 13:42
  • Just to be clear: you have a list of dates (labels) and you want to keep only the database entries (iterator) whose date is in that list. Right? Commented Sep 16, 2020 at 17:29
  • I have a list of dates and, for every website, want to generate a list of prices on those dates. Commented Sep 16, 2020 at 18:38

1 Answer

This might not be as efficient as you want, but you can try this. If it still does not meet your requirements, the only other thing I can think of is doing it asynchronously.

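    from collections import defaultdict
    from functools import reduce  # reduce is not a builtin in Python 3

    iterator = Prices.objects.filter(website=website)
    wanted = set(labels)  # set membership is O(1); a list scan is O(len(labels))

    # Group (price, website) pairs by date, keeping only dates present in labels
    result = reduce(lambda acc, prices: acc[prices.date].append((prices.price * 0.01, prices.website)) or acc, filter(lambda x: x.date in wanted, iterator), defaultdict(list))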
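The same grouping idea can be written without reduce: build a date-to-price dictionary once, then look up each label. The following is only a minimal sketch on plain tuples, not the code the asker ended up with. The function name `align_prices` and the sample values are made up for illustration, and it assumes at most one row per date per website (with duplicates, the last row wins). It divides by 100 instead of multiplying by 0.01 so the sample values come out exact.

```python
from datetime import date


def align_prices(labels, rows):
    """Return one price per label, or None where no row exists for that date.

    rows: iterable of (date, price_in_cents) pairs, in any order.
    """
    # One pass over the rows builds the lookup table
    price_by_date = {day: cents / 100 for day, cents in rows}
    # One pass over the labels; dict.get returns None for missing dates
    return [price_by_date.get(day) for day in labels]


# Made-up sample data: no price recorded on 2020-09-15
labels = [date(2020, 9, 14), date(2020, 9, 15), date(2020, 9, 16)]
rows = [(date(2020, 9, 16), 1250), (date(2020, 9, 14), 1000)]
data = align_prices(labels, rows)
```

In a Django view the rows could plausibly come from `Prices.objects.filter(website=website).values_list('date', 'price')`, which yields tuples instead of full model instances. Whether that helps depends on where the time actually goes, so it is worth profiling first.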

One way to do this asynchronously is to use the concurrent.futures module.

You can cap the number of workers like this: ThreadPoolExecutor(max_workers=10).

If you want multiple processes instead of threads, you can replace ThreadPoolExecutor with ProcessPoolExecutor. Note that processes do not share memory, so in that case the worker function has to return its values instead of appending to a shared dictionary.

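    import concurrent.futures
    from collections import defaultdict

    result = defaultdict(list)

    def reducer_function(prices):
        # Keep only entries whose date appears in labels
        if prices.date in labels:
            result[prices.date].append((prices.price * 0.01, prices.website))

    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Consume the map so exceptions raised in workers are not silently dropped
        list(executor.map(reducer_function, iterator))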
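For anyone who wants to try the thread pool outside Django, here is a self-contained sketch of the same pattern on plain data. `Row` is a hypothetical stand-in for a Prices instance, `collect` is an illustrative name, and the lock is an extra precaution rather than something the snippet above uses. Keep in mind that in CPython the global interpreter lock prevents threads from running pure-Python code in parallel, so this is likely to help only if the work is I/O-bound.

```python
from collections import defaultdict, namedtuple
from concurrent.futures import ThreadPoolExecutor
from datetime import date
from threading import Lock

# Hypothetical stand-in for a Prices model instance
Row = namedtuple("Row", "date price website")


def collect(rows, labels, max_workers=4):
    """Group (price, website) pairs by date, keeping only dates in labels."""
    wanted = set(labels)
    result = defaultdict(list)
    lock = Lock()  # guards the shared dictionary across worker threads

    def reducer(row):
        if row.date in wanted:
            with lock:
                result[row.date].append((row.price / 100, row.website))

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # list() drains the iterator so worker exceptions are re-raised here
        list(executor.map(reducer, rows))
    return result


# Made-up sample data
rows = [
    Row(date(2020, 9, 14), 1000, "a"),
    Row(date(2020, 9, 14), 1200, "b"),
    Row(date(2020, 9, 20), 500, "a"),
]
grouped = collect(rows, [date(2020, 9, 14)])
```

The order of the pairs inside each list depends on thread scheduling, so sort them if a stable order matters.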

4 Comments

Yeah, I guess doing it asynchronously is the way to go. Thanks for your help though!
By the way, did this solution work for you? If it did, you can map the same function over a concurrent.futures executor. May the Force be with you!
Yes, got it working with a bit of tweaking. It's actually a little bit faster than my version, so I'm going to try doing this asynchronously now.
I tried to do that my way. Take a look and see if it works out.
