I have imported a CSV dataset using Python and done some cleanup. Download the dataset here
# importing pandas
import pandas as pd
# reading csv and assigning to 'data'
data = pd.read_csv('co-emissions-per-capita.csv')
# dropping all rows before 2016 (only 2016 and 2017 remain)
data.drop(data[data.Year < 2016].index, inplace=True)
# dropping rows in which every value is null
data.dropna(how="all", inplace=True)
# dropping columns in which every value is null
data.dropna(axis="columns", how="all", inplace=True)
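# filling NA values (assigning the result back, since inplace=True on a
# column selection may not update 'data' in newer pandas versions)
data["Entity"] = data["Entity"].fillna("No Country")
data["Code"] = data["Code"].fillna("No Code")
data["Year"] = data["Year"].fillna("No Year")
data["Per capita CO2 emissions (tonnes per capita)"] = data["Per capita CO2 emissions (tonnes per capita)"].fillna(0)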
# sorting by Year, then Country
data.sort_values(["Year", "Entity"], inplace=True)
# renaming columns
data.rename(columns={"Entity": "Country",
                     "Per capita CO2 emissions (tonnes per capita)": "CO2 emissions (metric tons)"},
            inplace=True)
My current dataset has data for 2 years and 197 countries, which is 394 rows.
I want to insert the data into MongoDB in the following format:
[
  {
    "_id": ObjectId("5dfasdc2f7c4b0174c5d01bc"),
    "year": 2016,
    "countries": [
      {
        "name": "Afghanistan",
        "code": "AFG",
        "CO2 emissions (metric tons)": 0.366302
      },
      {
        "name": "Albania",
        "code": "ALB",
        "CO2 emissions (metric tons)": 0.366302
      }
    ]
  },
  {
    "_id": ObjectId("5dfasdc2f7c4b0174c5d01bc"),
    "year": 2017,
    "countries": [
      {
        "name": "Afghanistan",
        "code": "AFG",
        "CO2 emissions (metric tons)": 0.366302
      },
      {
        "name": "Albania",
        "code": "ALB",
        "CO2 emissions (metric tons)": 0.366302
      }
    ]
  }
]
I want one object per year, with all the countries and their related information nested inside it. To be precise, I want my database to have 2 (max) objects, with 197 nested objects inside each main object. So each year will be listed only once in the database, whereas each country will appear twice, once for each year.

Is there a better structure for storing this data? Please specify the steps to store this data in MongoDB. I'd also really appreciate it if you could suggest a good ODM for Python, something like Mongoose for Node.js.
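To make the target shape concrete, here is a minimal sketch of the grouping step that turns flat rows into one document per year. The function name `build_year_documents` and the sample rows are hypothetical, and the emission values are made up for illustration. It assumes the rows use the renamed columns from the cleanup code and arrive as a list of dicts, for example from `data.to_dict("records")`.

```python
from collections import defaultdict


def build_year_documents(records):
    """Group flat per-country rows into one document per year.

    `records` is a list of dicts with the keys "Year", "Country", "Code"
    and "CO2 emissions (metric tons)" (the renamed column names).
    """
    by_year = defaultdict(list)
    for row in records:
        by_year[row["Year"]].append({
            "name": row["Country"],
            "code": row["Code"],
            "CO2 emissions (metric tons)": row["CO2 emissions (metric tons)"],
        })
    # Years come out in the order they were first seen, so rows that are
    # already sorted by Year produce documents in that same order.
    return [{"year": year, "countries": countries}
            for year, countries in by_year.items()]


# Made-up sample rows standing in for data.to_dict("records").
sample_rows = [
    {"Year": 2016, "Country": "Afghanistan", "Code": "AFG",
     "CO2 emissions (metric tons)": 0.366302},
    {"Year": 2016, "Country": "Albania", "Code": "ALB",
     "CO2 emissions (metric tons)": 0.366302},
    {"Year": 2017, "Country": "Afghanistan", "Code": "AFG",
     "CO2 emissions (metric tons)": 0.366302},
    {"Year": 2017, "Country": "Albania", "Code": "ALB",
     "CO2 emissions (metric tons)": 0.366302},
]

docs = build_year_documents(sample_rows)
```

With PyMongo, a list like `docs` could presumably then be passed to `collection.insert_many(docs)`, which would let MongoDB generate a distinct `_id` for each year document. The connection details and collection name are left out here because they depend on the setup.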