I want to insert a pandas DataFrame into MongoDB. However, when I do so, the timestamp column (which is the index column of the DataFrame) does not get inserted into MongoDB.
Below is the code that reproduces the problem:
from datetime import datetime
import pandas as pd
from pymongo import MongoClient
client = MongoClient('localhost', 27017)
db = client.ticks
collection = db.STOCK
collection_ohlc = db.STOCK_ohlc
# Read per second ticks data from Mongo into a dataframe
results = collection.find(
{'timestamp': {'$gte': '2019-01-24T09:15:00', '$lte': '2019-01-24T09:19:59'}})
df = pd.DataFrame(list(results))
# Convert per second ticks data into 5 Minute OHLC Candle
df['timestamp'] = pd.to_datetime(df['timestamp'], errors='coerce')
df.set_index('timestamp', inplace=True)
ohlc_data = df['ltp'].resample('5min').ohlc()
# Print OHLC candle dataframe
print(ohlc_data)
# Write the OHLC candle back to Mongo into a new collection STOCK_ohlc
collection_ohlc.insert_many(ohlc_data.to_dict('records'))
Here is the output of the above print(ohlc_data) statement:
                       open   high    low   close
timestamp
2019-01-24 09:15:00  286.55  286.7  285.5  285.65
The code runs fine and the OHLC values are inserted into MongoDB. However, the timestamp column is missing.
Below is the Mongo shell output listing the inserted record:
> db.STOCK_ohlc.find()
{ "_id" : ObjectId("5c6abc6f4994a1bc8c3c08fd"), "open" : 286.55, "high" : 286.7, "low" : 285.5, "close" : 285.65 }
>
As we can see, the timestamp is missing from the inserted record, and the record is useless without it.
I tried the various orient values mentioned in the pandas.DataFrame.to_dict documentation, but none of them seem to insert into MongoDB. The only orient that inserts data is records, but it omits the timestamp.
Any pointers would be of great help.
UPDATE:
Here is the output of print(ohlc_data.to_dict('records')):
[{'open': 286.55, 'high': 286.7, 'low': 285.5, 'close': 285.65}]
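The output above matches how to_dict lays out each orient. Here is a minimal sketch of that, assuming a one-row frame built by hand to mirror the printed candle (the frame and the variable names by_records, by_index and by_list are illustrative, not part of my real code). Only records yields a list of row dicts, which is the shape insert_many expects, and that orient leaves the index out. The other orients shown keep the index or the values, but as a single dict rather than a list of documents.

```python
import pandas as pd

# Hypothetical one-row OHLC frame mirroring the printed candle
idx = pd.DatetimeIndex(['2019-01-24 09:15:00'], name='timestamp')
ohlc_data = pd.DataFrame(
    {'open': [286.55], 'high': [286.7], 'low': [285.5], 'close': [285.65]},
    index=idx,
)

by_records = ohlc_data.to_dict('records')  # list of row dicts, index dropped
by_index = ohlc_data.to_dict('index')      # one dict keyed by Timestamp
by_list = ohlc_data.to_dict('list')        # one dict of column -> list of values

print(type(by_records).__name__, type(by_index).__name__, type(by_list).__name__)
```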
What is the output of ohlc_data.to_dict('records')? It seems the problem is that your row key is of type timestamp and Mongo needs a string key. Somehow it's omitted. Try this solution: https://stackoverflow.com/a/36909509/3710490
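One approach that may work here, sketched under assumptions: since to_dict('records') only serialises columns, calling reset_index() first should turn the timestamp index back into an ordinary column so that it appears in every document. The tick values below are hypothetical stand-ins for the data read from db.STOCK, chosen to reproduce the candle shown in the question. The insert_many call is left commented out because it needs a running MongoDB instance.

```python
import pandas as pd

# Hypothetical tick data standing in for the documents read from db.STOCK
ticks = pd.DataFrame({
    'timestamp': ['2019-01-24T09:15:00', '2019-01-24T09:16:30',
                  '2019-01-24T09:18:10', '2019-01-24T09:19:59'],
    'ltp': [286.55, 286.7, 285.5, 285.65],
})
ticks['timestamp'] = pd.to_datetime(ticks['timestamp'], errors='coerce')
ohlc_data = ticks.set_index('timestamp')['ltp'].resample('5min').ohlc()

# reset_index() moves the 'timestamp' index back into a regular column,
# so to_dict('records') includes it in each row dict
records = ohlc_data.reset_index().to_dict('records')
print(records)

# collection_ohlc.insert_many(records)  # assumes the collection from the question
```

Each record then carries a pandas Timestamp under the timestamp key. Since Timestamp subclasses datetime.datetime, pymongo should be able to store it as a date, though rows where the timestamp is NaT would likely need to be dropped first.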