I'm trying to understand how long some basic queries against my MongoDB collection (around 200k documents) take, and I don't understand why the MongoDB profiler reports around 15 milliseconds for a query that takes 1 to 2 seconds when run from Python.

To time the query from Python, I did something very basic:

import time
from pymongo import MongoClient

client = MongoClient('')
db = client.my_db

start = time.time()
query = db['my_col'].find({'unix': {'$gte': 'some_timestamp' }})
data = list(query)
end = time.time()

print(end-start)

So here I'm just retrieving all the documents after a specific unix timestamp and then converting the cursor to a Python list. This code prints anywhere from about 1.01 to 1.80 seconds, while according to the profiler the query takes just a few milliseconds.

What am I missing here? Is it because what actually takes time is looping through the cursor?

  • This code measures the time to establish a connection, send the query to the database, execute it, send the results over the network from the server to the client, and build a list of all the result objects. The MongoDB profiler only includes the time to execute the query, so the two numbers measure different things. If you're concerned about performance, try increasing the cursor batch size and make sure pymongo is using its C extensions. Commented May 31, 2021 at 23:09
  • Thank you a lot @MichaelRuth! That was very clear Commented May 31, 2021 at 23:14
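
The two suggestions in the comment above could be tried roughly like this. It is only a sketch: `fetch_all` and `c_extensions_enabled` are hypothetical helper names, and the batch size of 1000 is an arbitrary starting value, not a recommendation. `Cursor.batch_size()`, `pymongo.has_c()` and `bson.has_c()` are existing PyMongo calls.

```python
def fetch_all(collection, query, batch_size=1000):
    """Materialize a query, asking the server for larger batches per round trip.

    `collection` is expected to be a pymongo Collection; 1000 is an arbitrary
    value chosen for illustration.
    """
    cursor = collection.find(query).batch_size(batch_size)
    return list(cursor)


def c_extensions_enabled():
    """Report whether both pymongo and bson are using their C extensions."""
    # Imported lazily so fetch_all can be used without triggering this check.
    import bson
    import pymongo

    return pymongo.has_c() and bson.has_c()
```

With the question's setup, this would be called as `fetch_all(db['my_col'], {'unix': {'$gte': 'some_timestamp'}})`. Whether a larger batch helps depends on document size and network latency, so it is worth timing a few values.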

1 Answer

pymongo connects lazily: the first operation that actually needs the server has to wait for the connection to be established. So your timings include all the connection setup etc.

This should give you more accurate timings:

import time
from pymongo import MongoClient

client = MongoClient('')
db = client.my_db

# Run a throwaway query first so the connection is established before timing starts
db['my_col'].find_one()

start = time.time()
query = db['my_col'].find({'unix': {'$gte': 'some_timestamp' }})
data = list(query)
end = time.time()

print(end-start)
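
To see where the remaining client-side time goes, one option is to record the time to the first document separately from the time to drain the whole cursor. This is a minimal sketch, not a definitive benchmark: `time_query_phases` is a hypothetical helper, and it assumes the connection has already been warmed up as above (for example with `find_one()` or `client.admin.command('ping')`).

```python
import time


def time_query_phases(collection, query):
    """Return (seconds_to_first_doc, seconds_total, doc_count) for a find().

    seconds_to_first_doc is None when the query matches nothing.
    """
    start = time.perf_counter()
    cursor = collection.find(query)  # no network traffic yet; the cursor is lazy

    first = None
    count = 0
    for _doc in cursor:
        if first is None:
            # The first batch has arrived and its first document was decoded.
            first = time.perf_counter() - start
        count += 1

    total = time.perf_counter() - start
    return first, total, count
```

Calling `time_query_phases(db['my_col'], {'unix': {'$gte': 'some_timestamp'}})` should show a small first value and a much larger total if most of the time is spent transferring and decoding the remaining batches rather than executing the query.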

1 Comment

Thank you a lot! But do you have any idea why the queries take so much less time according to the profiler?
