
I'm trying to filter a queryset by checking that the object is in a list of those objects.

employee_list = [<Employee: A>, <Employee: B>, <Employee: C>]
qs = Employee.objects.filter(id__in=employee_list, [other_filters])

After running the above, qs is empty. I was thinking that I could build a new list of ids instead, such as

employee_ids = [emp.id for emp in employee_list]
qs = Employee.objects.filter(id__in=employee_ids, [other_filters])

I haven't done benchmarking on this method, but I imagine performance would probably take a hit. Alternatively I could intersect the lists afterwards like:

qs = Employee.objects.filter([other_filters])
filtered_qs = [emp for emp in employee_list if emp in qs]

However, I think the performance hit would be even worse this way.

What's the best/fastest way to do this? Thanks.

2 Answers

Rule of thumb is to filter as much as possible through SQL, so I would go for

qs = Employee.objects.filter(id__in=[emp.id for emp in employee_list], [other_filters])

I do not have any performance testing to back this up with though.
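To illustrate the rule of thumb outside Django, here is a minimal sketch that uses Python's built-in sqlite3 module rather than the ORM. The employee table, its columns, and the active filter are hypothetical stand-ins for the model and [other_filters]; the IN (...) clause is roughly what an id__in lookup on a list of ids turns into.

```python
import sqlite3

# Hypothetical stand-in for the Employee table; not the asker's real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, active INTEGER)"
)
conn.executemany(
    "INSERT INTO employee (id, name, active) VALUES (?, ?, ?)",
    [(1, "A", 1), (2, "B", 0), (3, "C", 1), (4, "D", 1)],
)

employee_ids = [1, 2, 3]  # ids collected from the in-memory objects

# Build one "?" placeholder per id so the values are passed as parameters.
placeholders = ", ".join("?" for _ in employee_ids)
sql = (
    "SELECT id, name FROM employee "
    f"WHERE id IN ({placeholders}) AND active = 1 ORDER BY id"
)

# The database applies the id list and the extra filter in a single query.
rows = conn.execute(sql, employee_ids).fetchall()
print(rows)
```

Only the matching rows leave the database, so nothing has to be intersected in Python afterwards.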

1 Comment

Note that this depends greatly on the size of the queryset and of the list, and on the use case.

As Martol1ni noted, you want to filter at the SQL level whenever possible, so I think your methods do get progressively slower. But there is another issue...

Based on the Django docs (https://docs.djangoproject.com/en/dev/ref/models/querysets/), I think your id__in should be given a list of integer ids, not a list of model instances.

Edit: Oh, I see he covered this in his answer, but he did not say explicitly that the usage in your question was incorrect.

Edit2: But yes, if you want to know for sure, what really matters is real-world performance, which you can measure with django-debug-toolbar. It seems to me, though, that the real issue was the id__in misuse, which led you to look for trickier ways to do what you wanted.
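As a rough way to compare the approaches without a Django project, here is a sketch that uses the standard library's timeit and sqlite3 instead of django-debug-toolbar. The table, the active column, and the data sizes are made-up assumptions, and timings from a toy in-memory database will not necessarily carry over to production.

```python
import sqlite3
import timeit

# Hypothetical data: 2000 employees, every odd id marked active.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, active INTEGER)")
conn.executemany(
    "INSERT INTO employee (id, active) VALUES (?, ?)",
    [(i, i % 2) for i in range(1, 2001)],
)
employee_ids = list(range(1, 501))  # ids of the objects already in memory


def filter_in_sql():
    """Let the database apply both the id list and the other filter."""
    marks = ", ".join("?" for _ in employee_ids)
    sql = (
        f"SELECT id FROM employee WHERE id IN ({marks}) "
        "AND active = 1 ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, employee_ids)]


def filter_in_python():
    """Fetch every row matching the other filter, then intersect in Python."""
    wanted = set(employee_ids)  # set lookup avoids rescanning a list per row
    rows = conn.execute("SELECT id FROM employee WHERE active = 1 ORDER BY id")
    return [row[0] for row in rows if row[0] in wanted]


# Both approaches should return the same ids before their timings are compared.
assert filter_in_sql() == filter_in_python()

for fn in (filter_in_sql, filter_in_python):
    print(fn.__name__, timeit.timeit(fn, number=50))
```

Checking that both functions return the same rows before timing them guards against comparing a fast but wrong variant.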

1 Comment

I ran a profiler on each of the different methods, but didn't see much of a difference between them. My test data set is much smaller than production, obviously, but constructing a list of ids from the list of objects and then filtering with id__in on that list seemed to be the fastest.
