I have a set of models in my database, and I have written a script that scrapes certain items from a website. Now I want to delete every item from my database that has been deleted on the website. In other words, if an item I scraped no longer exists on the website, I want to remove it from my database as well. How do I check for this using a loop?

I was thinking of doing this:

items = ItemModel.objects.all()
for item in items:
    if item.tite not in webscrape_item[0]:
        item.delete()  # call the method; a bare item.delete does nothing

I'm checking based on the title and deleting the item if its title does not exist in the web scrape array.

1 Answer

Solution

There is no need for a loop at all; this is more efficient:

ItemModel.objects.exclude(tite__in=webscrape_item[0]).delete()

Explanation

Let's break the above down a bit. To filter a queryset you would normally do:

ItemModel.objects.filter(some_property=some_value)

But we can just as well tell Django what to exclude rather than what to include, using .exclude(). Hence:

ItemModel.objects.exclude(some_property=some_value)

Now in your case, we don't want to exclude just one value of tite but any value that is in your list. For this Django provides the __in lookup, which is a way of saying "if the value is in this list". The following returns a QuerySet of all the items you want to delete:

ItemModel.objects.exclude(tite__in=webscrape_item[0])
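
To make the selection concrete, here is a rough plain-Python analogue of what that exclude picks out. The titles and the names db_titles, scraped_titles and titles_to_delete are made up for illustration; in the real query the database does this work, not a Python loop.

```python
# Plain-Python analogue of exclude(tite__in=...); all data here is hypothetical.
db_titles = ["Red chair", "Blue lamp", "Old desk"]  # titles currently stored
scraped_titles = ["Red chair", "Blue lamp"]         # titles still on the website


def titles_to_delete(stored, scraped):
    """Return the stored titles that are absent from the scraped titles."""
    still_listed = set(scraped)  # a set gives fast membership checks
    return [title for title in stored if title not in still_listed]


print(titles_to_delete(db_titles, scraped_titles))  # ['Old desk']

# An empty scrape result excludes nothing, so every stored title is selected.
print(titles_to_delete(db_titles, []))  # ['Red chair', 'Blue lamp', 'Old desk']
```

The second call suggests a precaution: if the scrape fails and returns an empty list, the same logic would select every row, so it is probably worth checking that the list is non-empty before deleting anything.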

Every QuerySet has a delete method, which deletes all of the rows in the queryset from the database. Hence the final expression:

ItemModel.objects.exclude(tite__in=webscrape_item[0]).delete()

Note though that if you have provided a custom delete method for your model, this will not necessarily be called. But otherwise this is a better way to go than a loop.

Why is it more efficient?

Why didn't we do this:

items_to_delete = ItemModel.objects.exclude(tite__in=webscrape_item[0])
for item in items_to_delete:
    item.delete()

Well, the above loop is one hit to the database to get items_to_delete, then an additional n hits, one for each call to item.delete(). The solution I've suggested, by contrast, will attempt to do this in a single SQL statement.
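
The difference in database hits can be sketched with the standard-library sqlite3 module. This is only an illustration under assumptions: the item table, its rows, and the hand-written SQL are hypothetical and are not the statements Django itself generates, which may differ (for example when cascades or signals are involved).

```python
import sqlite3


def make_db():
    """Create an in-memory table with four hypothetical items."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, tite TEXT)")
    conn.executemany(
        "INSERT INTO item (tite) VALUES (?)",
        [("Red chair",), ("Blue lamp",), ("Old desk",), ("Torn rug",)],
    )
    return conn


scraped = ["Red chair", "Blue lamp"]  # titles still on the website
placeholders = ", ".join("?" for _ in scraped)
where = f"tite NOT IN ({placeholders})"

# Loop approach: one SELECT, then one DELETE per stale row.
loop_conn = make_db()
stale_ids = [row[0] for row in loop_conn.execute(f"SELECT id FROM item WHERE {where}", scraped)]
loop_statements = 1  # the SELECT
for item_id in stale_ids:
    loop_conn.execute("DELETE FROM item WHERE id = ?", (item_id,))
    loop_statements += 1

# Bulk approach: a single DELETE covering every stale row.
bulk_conn = make_db()
cursor = bulk_conn.execute(f"DELETE FROM item WHERE {where}", scraped)
bulk_statements = 1
bulk_deleted = cursor.rowcount

remaining_loop = [row[0] for row in loop_conn.execute("SELECT tite FROM item ORDER BY id")]
remaining_bulk = [row[0] for row in bulk_conn.execute("SELECT tite FROM item ORDER BY id")]

print(loop_statements, bulk_statements)  # 3 1
```

Both approaches leave the same rows behind; the loop simply needs one extra statement per deleted row, which adds up when many items have disappeared from the website.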

3 Comments

Can you explain what the code is doing here?
@AdamBaser Sure :) I've updated the answer to include an explanation. I hope this helps.
Thank you so much, man! Really appreciate it! :D
