1

Should I be re-initializing the connection on every insert?

from pymongo import MongoClient

class TwitterStream:
    def __init__(self, timeout=False):
        while True:
            self.dump_data()

    def dump_data(self):
        # Dump my data into MongoDB.
        # Should I be doing this every time?
        client = MongoClient()
        mongo = MongoClient('localhost', 27017)
        db = mongo.test
        # 'tweets' is a placeholder collection name.
        db.tweets.insert_one({'some stuff': 'other stuff'})
        # Dump data and close connection.

Do I need to open the connection every time I write a record? Or can I leave a connection open assuming I'll be writing to the database 5 times per second with about 10kb each time?

If just one connection is enough, where should I define the variables which hold the connection (client, mongo, db)?

2 Answers

1

Open one MongoClient that lives for the duration of your program:

from pymongo import MongoClient

client = MongoClient()

class TwitterStream:
    def dump_data(self):
        db = client.test
        while True:
            # 'tweets' is a placeholder collection name.
            db.tweets.insert_one({'some stuff': 'other stuff'})

Opening a single MongoClient means you only pay its startup cost once, and its connection-pooling will minimize the cost of opening new connections.
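
On the question of where the client, mongo, and db variables should live: besides a module-level global, one option is to create the client once at startup and hand the class a collection handle. The following is a minimal sketch, not a definitive layout. The collection name tweets and the main() entry point are assumptions for illustration; neither appears in the question.

```python
class TwitterStream:
    """Writes records through a collection handle it is given; never opens its own client."""

    def __init__(self, collection):
        # `collection` is anything with an insert_one(document) method,
        # for example a pymongo Collection such as client.test.tweets.
        self.collection = collection

    def dump_data(self, record):
        # Reuses the shared client's pooled connection; no reconnect per write.
        return self.collection.insert_one(record)


def main():
    # Imported here so the class above can be exercised without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)    # created once per process
    stream = TwitterStream(client.test.tweets)  # "tweets" is a hypothetical name
    try:
        stream.dump_data({"some stuff": "other stuff"})
    finally:
        client.close()  # release pooled sockets on shutdown
```

Call main() from your program's entry point. Passing the collection in rather than reading a global also makes the class easy to test with a stand-in object.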

If you're concerned about surviving occasional network issues, wrap your operations in an exception block:

import pymongo

try:
    # 'tweets' is a placeholder collection name.
    db.tweets.insert_one(...)
except pymongo.errors.ConnectionFailure:
    # Handle error.
    ...

0

Opening connections is generally an expensive operation, so I recommend reusing them as much as possible.

In the case of MongoClient, you can leave the connection open and keep reusing it. However, the longer a connection lives, the more likely it is to hit a connectivity issue at some point. MongoClient attempts to reconnect automatically after a lost connection, so the recommended approach is to catch the AutoReconnect exception and retry the operation as part of your retry mechanism.

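Here's an example of this approach, adapted to Python 3 and current PyMongo from http://python.dzone.com/articles/save-monkey-reliably-writing:

import datetime
import random
import time

import pymongo

# mabel_db is assumed to be a Database obtained from a long-lived MongoClient.
while True:
    time.sleep(1)
    data = {
        'time': datetime.datetime.utcnow(),
        'oxygen': random.random()
    }

    # Try for five minutes to recover from a failed primary
    for i in range(60):
        try:
            # Writes are acknowledged by default, so safe=True is no longer needed.
            mabel_db.breaths.insert_one(data)
            print('wrote')
            break  # Exit the retry loop
        except pymongo.errors.AutoReconnect as e:
            print('Warning', e)
            time.sleep(5)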
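
The retry loop above can also be pulled into a small helper so it does not have to be repeated around every write. This is a rough sketch rather than part of PyMongo: insert_with_retry and its parameters are hypothetical names, and the exception type to retry on is passed in by the caller.

```python
import time


def insert_with_retry(collection, document, retry_on,
                      attempts=60, delay=5.0, sleep=time.sleep):
    """Call collection.insert_one(document), retrying when `retry_on` is raised.

    Returns the number of attempts used. Re-raises the last error
    if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            collection.insert_one(document)
            return attempt
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: let the caller decide what to do
            sleep(delay)  # wait before the next attempt
```

With PyMongo, a call might look like insert_with_retry(mabel_db.breaths, data, retry_on=pymongo.errors.AutoReconnect). The defaults of 60 attempts, 5 seconds apart, match the five-minute window in the loop above.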

1 Comment

Thank you very much for your great answer. Can you please point me to some examples? Although I understand what you are saying, I have no clue how to implement it.
