I am currently working on a Django website and at a certain point need to generate a dataset for a graph out of model data. Over the last few days I have been trying to optimize the code that generates that data, but it's still relatively slow, and the dataset will definitely get a lot larger once the website is up.
The model I am working with looks like this:
from datetime import date

from django.db import models


class Prices(models.Model):
    name = models.CharField(max_length=200, blank=False)
    website = models.CharField(max_length=200, blank=False, choices=WEBSITES)
    # Pass the callable (date.today), not its result, so the default is
    # evaluated when a row is created rather than once at import time.
    date = models.DateField('date', default=date.today, blank=False, null=False)
    price = models.IntegerField(default=0, blank=False, null=False)

    class Meta:
        indexes = [
            models.Index(fields=['website']),
            models.Index(fields=['date']),
        ]
It stores prices for different items (identified by name) on different websites. For the graph representation of the model I need two arrays: one with all the dates for the x-axis, and one with the price of the item on each date (which might not be available, since not every item has an entry for every day on every website). I need both of these for every website.
My code for generating that data currently looks like this:
for website in websites:
    # The walk below relies on the rows arriving in date order.
    iterator = Prices.objects.filter(website=website).order_by('date').iterator()
    data = []
    entry = next(iterator, None)  # None if the website has no rows at all
    for date in labels:
        if entry is not None and date == entry.date:
            data.append(entry.price * 0.01)
            entry = next(iterator, None)
        else:
            data.append(None)
    # ... do stuff with data (not relevant for performance)
I loop over each website and retrieve all of its price data from my model. Then I loop over all dates (which are in the array labels), check whether an entry exists for that date, and if so append its price to the array, otherwise None. Does anybody have tips or ideas on how to optimize the inner for-loop? That is what accounts for 90% of my performance problem.
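For reference, the walk described above can be reproduced without the ORM, which may help to check whether the loop itself or the row fetching is the slow part. This is only a minimal sketch: it assumes the rows arrive sorted by date with at most one entry per date, and `align_sorted` and the sample data are hypothetical names that are not part of the original code.

```python
from datetime import date


def align_sorted(rows, labels):
    """Walk sorted (date, price_in_cents) rows in step with sorted labels.

    Assumes both sequences are in ascending date order and that every row
    date also appears in labels, as the original loop does.
    """
    iterator = iter(rows)
    entry = next(iterator, None)  # None when there are no rows at all
    data = []
    for label in labels:
        if entry is not None and entry[0] == label:
            data.append(entry[1] * 0.01)  # cents to currency units
            entry = next(iterator, None)
        else:
            data.append(None)  # no price recorded for this date
    return data


# Made-up sample data: no price exists for the second date.
labels = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)]
rows = [(date(2024, 1, 1), 200), (date(2024, 1, 3), 400)]
result = align_sorted(rows, labels)
print(result)
```

Timing this function on plain tuples and comparing it with the full loop should show how much of the cost comes from the comparison logic and how much from fetching and building the model instances.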
If I understand correctly, you have a list of dates (labels) and you want to filter your database entries (iterator) if they match a date from the list. Right? If so, you can filter your website data over the dates list directly, and even multi-thread your function to make things go even faster.
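Filtering over the dates list could be sketched roughly as follows: restrict the rows to the wanted dates, index them by date once, and then build the array with one dictionary lookup per label. The query is shown only as a comment because it needs a configured Django project. `build_series` and the sample rows are hypothetical, and the sketch assumes at most one price per date per website.

```python
from datetime import date


def build_series(rows, labels):
    """Align (date, price_in_cents) rows to labels; missing dates become None."""
    price_by_date = dict(rows)  # built once, then one hash lookup per label
    return [
        price_by_date[label] * 0.01 if label in price_by_date else None
        for label in labels
    ]


# In the Django view the rows could come from a query such as the following
# (an assumption, not tested against the original project):
#   rows = (Prices.objects
#           .filter(website=website, date__in=labels)
#           .values_list('date', 'price'))
# values_list() yields plain tuples instead of model instances.
labels = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)]
rows = [(date(2024, 1, 3), 400), (date(2024, 1, 1), 200)]  # order is irrelevant
series = build_series(rows, labels)
print(series)
```

Because the lookup does not depend on row order, this version needs no ordering in the query, and fetching tuples avoids creating a model instance per row, which is often a noticeable part of the cost.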