I have dumped a mongodb collection using the mongodump command. The output is a dump directory which has these files:

dump/
    |___coll.bson
    |___coll.metadata.json

How can I load the exported files into a list of dictionaries in Python? I tried the following, and neither attempt worked:

import json

with open('dump/coll.bson', 'rb') as f:
    coll_raw = f.read()

# Using the standard json module
coll = json.loads(coll_raw)

# Using pymongo
from bson.json_util import loads
coll = loads(coll_raw)

ValueError: No JSON object could be decoded

2 Answers

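A .bson file is binary BSON, not JSON text, so json.loads cannot parse it. Decode it with the bson package that is installed with pymongo:

import bson  # the bson package installed with pymongo

with open('dump/coll.bson', 'rb') as f:
    coll_raw = f.read()

# decode_all returns a list of dicts, one per document in the dump
coll = bson.decode_all(coll_raw)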

10 Comments

This probably means that your BSON is invalid. Can you send me a sample BSON object that you are trying to decode?
The bson file is the dump I got with mongodump. The file is huge. Let me see if I can replicate the error with a small database.
Did you try running BSON.is_valid(coll_raw)?
@YashMehrotra here's the file: dropbox.com/s/6yyssja0la0ctln/dump.zip?dl=0 Direct output of mongodump
@Quirk object 'BSON' has no attribute 'is_valid'

I know this was answered a long time ago, but you could try decoding each document separately and then you'd know which doc is causing the problem.

I use this library: https://github.com/bauman/python-bson-streaming

from bsonstream import KeyValueBSONInput

# Decode one document at a time, so a failure points at the offending document
with open("restaurants.bson", 'rb') as f:
    stream = KeyValueBSONInput(fh=f)
    for dict_data in stream:
        print(dict_data)

I see 25359 records which all seem to decode to something like:

{u'_id': ObjectId('5671bb2e111bb7b9a7ce4d9a'),
 u'address': {u'building': u'351',
              u'coord': [-73.98513559999999, 40.7676919],
              u'street': u'West   57 Street',
              u'zipcode': u'10019'},
 u'borough': u'Manhattan',
 u'cuisine': u'Irish',
 u'grades': [{u'date': datetime.datetime(2014, 9, 6, 0, 0),
              u'grade': u'A',
              u'score': 2},
             {u'date': datetime.datetime(2013, 7, 22, 0, 0),
              u'grade': u'A',
              u'score': 11},
             {u'date': datetime.datetime(2012, 7, 31, 0, 0),
              u'grade': u'A',
              u'score': 12},
             {u'date': datetime.datetime(2011, 12, 29, 0, 0),
              u'grade': u'A',
              u'score': 12}],
 u'name': u'Dj Reynolds Pub And Restaurant',
 u'restaurant_id': u'30191841'}
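
If you would rather not add a dependency just to find a bad document, the same per-document idea can be sketched with the standard library alone. A mongodump .bson file is a concatenation of BSON documents, and each document starts with a little-endian int32 holding its total length in bytes and ends with a zero byte. The helper below (iter_raw_bson_docs is a name I made up, not part of any library) only splits the file into raw documents and checks that framing; it does not decode field values. Treat it as a rough sketch rather than a full validator.

```python
import struct


def iter_raw_bson_docs(fh):
    """Yield (index, raw_bytes) for each BSON document in a binary file object.

    Hypothetical helper: it checks only the length prefix and the trailing
    zero byte of each document, not the fields inside it.
    """
    index = 0
    while True:
        header = fh.read(4)
        if not header:
            return  # clean end of file
        if len(header) < 4:
            raise ValueError("document %d: truncated length header" % index)
        # The first four bytes are the document's total size, header included.
        (length,) = struct.unpack("<i", header)
        if length < 5:
            raise ValueError("document %d: impossible length %d" % (index, length))
        body = fh.read(length - 4)
        if len(body) != length - 4 or body[-1:] != b"\x00":
            raise ValueError("document %d: truncated or missing terminator" % index)
        yield index, header + body
        index += 1


# Possible usage, assuming pymongo's bson package (bson.decode needs PyMongo 3.9+):
#
#     import bson
#     with open('dump/coll.bson', 'rb') as f:
#         for i, raw in iter_raw_bson_docs(f):
#             doc = bson.decode(raw)  # an exception here names document i
```

Because it reads one document at a time, this also avoids loading a huge dump into memory all at once.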
