
I have been practicing Django for a while now. Currently I am using it in a project where I fetch Facebook data via GET requests and then save it to an SQLite database using Django models. I would like to know how I can improve the following code and save a list of Facebook posts and their metrics efficiently. At the moment I use a for loop to iterate over a list of Facebook posts; for each one, the post and its metrics are assigned to the corresponding Django model instance, which is then saved.


def save_post(post_id, page_id):

    facebook_post = Post(post_id=post_id,
                    access_token=fb_access_token)

    post_db = PostsModel(page_id=page_id, post_id=post_id)
    post_db.message = facebook_post.message
    post_db.story = facebook_post.story
    post_db.full_picture = facebook_post.full_picture
    post_db.reactions_count = facebook_post.reactions_count
    post_db.comments_count = facebook_post.comments_count
    post_db.shares_count = facebook_post.shares_count
    post_db.interactions_count = facebook_post.interactions_count
    post_db.created_time = facebook_post.created_time
    post_db.published = facebook_post.published
    post_db.attachment_title = facebook_post.attachment_title
    post_db.attachment_description = facebook_post.attachment_description
    post_db.attachment_target_url = facebook_post.attachment_target_url
    post_db.save()

post_db is a Django model object instantiated from PostsModel, while Post is a plain Python class that I wrote. The latter wraps a set of GET requests that fetch JSON data from Facebook's Graph API; I map the relevant values onto class attributes (such as message and shares_count).

I read about the bulk_create function in Django's documentation, but I don't know how to apply it to the code above. I also tried using multiprocessing and Pool, but the above function does not execute. Right now I am just iterating sequentially over the list, so saving takes longer as the list grows.


def create(self, request):
        page_id = request.data['page_id']

        page = get_object_or_404(PagesModel, pk=page_id)
        post_list = get_list_or_404(PostsModel, page_id=page_id)

        for post in post_list:
            save_post(post_id=post.post_id, page_id=page)

The above function gets the posts already saved in the database for a specific page, based on page_id. The for loop then iterates over the posts in the list, passing each post_id and the page instance to save_post, which fetches the post's data and saves it.

Huge thanks if anyone can suggest a more effective way to tackle this. Thank you.

  • I don't think bulk_create is useful here. In your case, it only hits the database server once (.save). So, just looping should be fine. Commented Sep 24, 2020 at 12:34

1 Answer


You are going in the right direction with bulk_create. Generate a list of PostsModel objects and then use bulk_create to insert them into the database. An important note here is that bulk_create only inserts new rows, so it won't work if the posts already exist in the database. For updating existing posts, try bulk_update.

def save_post(post_id, page_id):

    facebook_post = Post(post_id=post_id,
                access_token=fb_access_token)

    post_db = PostsModel(page_id=page_id, post_id=post_id)
    post_db.message = facebook_post.message
    post_db.story = facebook_post.story
    post_db.full_picture = facebook_post.full_picture
    post_db.reactions_count = facebook_post.reactions_count
    post_db.comments_count = facebook_post.comments_count
    post_db.shares_count = facebook_post.shares_count
    post_db.interactions_count = facebook_post.interactions_count
    post_db.created_time = facebook_post.created_time
    post_db.published = facebook_post.published
    post_db.attachment_title = facebook_post.attachment_title
    post_db.attachment_description = facebook_post.attachment_description
    post_db.attachment_target_url = facebook_post.attachment_target_url
    return post_db

def create(self, request):
    page_id = request.data['page_id']

    page = get_object_or_404(PagesModel, pk=page_id)
    post_list = get_list_or_404(PostsModel, page_id=page_id)
    
    # Each item in post_list is a PostsModel instance, so pass its post_id.
    post_model_list = [save_post(post_id=post.post_id, page_id=page)
                       for post in post_list]
    
    PostsModel.objects.bulk_create(post_model_list)
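
Because create reads its post list from rows that are already stored, the caveat about existing posts is likely to apply here. Below is a minimal sketch of how fetched posts could be split into new and existing ones before the bulk calls. The helper split_for_bulk and the plain dicts standing in for model instances are hypothetical, and the Django calls appear only as comments:

```python
def split_for_bulk(fetched_posts, existing_ids):
    """Split fetched posts into (to_create, to_update) lists.

    fetched_posts: iterable of dicts with a 'post_id' key (hypothetical
                   stand-ins for the PostsModel instances built by save_post).
    existing_ids:  post ids that are already stored in the database.
    """
    existing = set(existing_ids)
    to_create, to_update = [], []
    for post in fetched_posts:
        if post["post_id"] in existing:
            to_update.append(post)
        else:
            to_create.append(post)
    return to_create, to_update


# Made-up sample data for illustration only.
fetched = [
    {"post_id": "1_10", "shares_count": 4},
    {"post_id": "1_11", "shares_count": 0},
    {"post_id": "1_12", "shares_count": 9},
]
to_create, to_update = split_for_bulk(fetched, existing_ids=["1_10", "1_12"])

# With real model instances, the two lists would then go to something like:
#   PostsModel.objects.bulk_create(to_create)
#   PostsModel.objects.bulk_update(to_update, fields=["shares_count"])
```

Note that bulk_update needs instances whose primary key is set, so the objects to update would typically be the ones loaded from the database, with their attributes overwritten by the freshly fetched values.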

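Keep in mind that the bulk call only reduces the number of database queries; each Post(...) still makes its GET requests one after another. One option, sketched here under assumptions, is to run the fetches in a thread pool (instead of multiprocessing) and keep the database write as a single step afterwards. fetch_post is a hypothetical stand-in for constructing Post and reading its attributes, and the worker count is arbitrary:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_post(post_id):
    """Hypothetical stand-in for Post(post_id=..., access_token=...)."""
    time.sleep(0.05)  # simulates the latency of a GET request
    return {"post_id": post_id, "message": f"message for {post_id}"}


def fetch_all(post_ids, max_workers=8):
    """Fetch posts concurrently; results keep the order of post_ids."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_post, post_ids))


post_ids = [f"1_{n}" for n in range(16)]
fetched = fetch_all(post_ids)

# The fetched data could then be turned into PostsModel instances in the
# calling thread and written with a single bulk_create or bulk_update call.
```

Threads fit this case because the work is mostly waiting on the network. Whether it helps in practice depends on the API's rate limits, which this sketch does not handle.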
        