I have a complex MongoDB database consisting of documents nested up to 7 levels deep. I need to use PyMongo to extract the data and then convert the extracted data to a .csv file.

  • What have you tried so far? Commented Jul 2, 2018 at 4:08
  • Can you use mongoexport? Commented Jul 2, 2018 at 4:11
  • So far I am able to extract the entire database and store it as a Python object. I am then able to convert this object to a .csv file. However, the .csv file has thousands of columns. I need to know how I can extract the data in a clean manner. Commented Jul 2, 2018 at 4:11
  • @Astro I can use mongoexport, but the .csv file has thousands of columns. I need to extract the data in an organized manner. I'm ok with extracting the data in multiple csv files and then combining all csv files into one. I'm not sure how to proceed with that though; I just know how to extract data as a whole. Commented Jul 2, 2018 at 4:13

1 Answer


You can try using json_normalize from pandas. It flattens nested JSON (for example, a list of dicts returned by a query) into a DataFrame, which you can then save as a .csv file.

For example:

# pandas >= 1.0 exposes this as pd.json_normalize;
# on older versions use: from pandas.io.json import json_normalize
import pandas as pd

# db is your PyMongo database handle
# mongo_value is your mongo query (for aggregate, a pipeline: a list of stages)
mongo_aggregate = db.events.aggregate(mongo_value)

# flatten the nested documents into one row per document
mongo_df = pd.json_normalize(list(mongo_aggregate))

# print(mongo_df)

# just pick the column_name instead of properties.something.something.column_name
# (note: columns that share the same last segment will end up with the same name)
mongo_df.columns = [col.split('.')[-1].lower() for col in mongo_df.columns]
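To see what this does without a live database, here is a minimal sketch using hand-made sample documents. The `docs` list and its field names are hypothetical stand-ins for what a PyMongo cursor would return; they are not from the question.

```python
import pandas as pd

# Hypothetical documents standing in for list(cursor) from PyMongo.
docs = [
    {"_id": 1, "user": {"name": "Ann", "address": {"city": "Oslo"}}},
    {"_id": 2, "user": {"name": "Bob", "address": {"city": "Lima"}}},
]

# Nested keys become dotted column names such as "user.address.city".
df = pd.json_normalize(docs)
original_columns = list(df.columns)

# Keep only the last path segment, lowercased.
df.columns = [col.split(".")[-1].lower() for col in df.columns]

# With no path argument, to_csv returns the CSV as a string;
# pass a file name instead to write it to disk.
csv_text = df.to_csv(index=False)
print(csv_text)
```

If two nested fields end in the same name, the renaming step will produce duplicate column names, so you may prefer to keep the dotted names in that case.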

For reference, see https://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.io.json.json_normalize.html
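If the flattened result still has too many columns, one possible approach (a sketch, not something tested against your data) is to split it into several smaller tables with the `max_level` and `record_path` parameters of json_normalize and write each to its own .csv file. The sample documents and the `orders` field below are hypothetical, and `max_level` needs pandas 0.25 or newer.

```python
import pandas as pd

# Hypothetical documents; "orders" is an assumed nested list field.
docs = [
    {"_id": 1, "user": {"name": "Ann", "address": {"city": "Oslo"}},
     "orders": [{"item": "pen", "qty": 2}, {"item": "ink", "qty": 1}]},
    {"_id": 2, "user": {"name": "Bob", "address": {"city": "Lima"}},
     "orders": [{"item": "cup", "qty": 5}]},
]

# Table 1: one row per document, flattening only one level deep.
# Anything nested deeper (here user.address) stays as a dict in a single column.
users = pd.json_normalize(docs, max_level=1).drop(columns=["orders"])

# Table 2: one row per element of the nested "orders" list,
# carrying the parent _id along so the tables can be joined later.
orders = pd.json_normalize(docs, record_path="orders", meta=["_id"])

# to_csv returns a string when no path is given; pass a file name to write to disk.
users_csv = users.to_csv(index=False)
orders_csv = orders.to_csv(index=False)
```

Because both tables keep `_id`, you can combine them again later with a merge on that column if you need a single file.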

1 Comment

Thank you for your help!
