I'm currently attempting to load several text files into MongoDB (they're in JSON format).
I tried using os.walk, but I seem to be having trouble. My current method is:
>>> import pymongo
>>> import os
>>> import json
>>> from pymongo import Connection
>>> connection = Connection()
>>> db = connection.Austin
>>> collection = db.tweets
>>> collection = db.tweet_collection
>>> db.tweet_collection
Collection(Database(Connection('localhost', 27017), u'Austin'), u'tweet_collection')
>>> collection
Collection(Database(Connection('localhost', 27017), u'Austin'), u'tweet_collection')
>>> tweets = db.tweets
>>> tweet = open(os.path.expanduser('~/Tweets/10_7_2012_12:09-Tweets.txt'),'r')
>>> for line in tweet:
...     d = json.loads(line)
...     tweets.insert(d)
...
That works for inserting the tweets from a single file. I want to be able to open multiple files and run that same piece of code (the for loop that parses each line of JSON into a Python dictionary and inserts it into the collection) automatically.
Does anyone have a solid example of how to do this, complete with an explanation?
While we're on the topic: I'm attempting to use MongoDB with only a shaky understanding of databases (silly and stupid, I know). Am I right that a single MongoDB server can host multiple databases at the same time, that each database stores collections (which are groups of documents), and that you insert individual documents into a collection?
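To make sure I have the hierarchy straight, this is my mental model sketched as plain Python data (no pymongo involved; the names and values are made up):

```python
# One MongoDB server holds many databases; a database holds
# collections; a collection holds documents (JSON-like dicts).
server = {
    'Austin': {                               # a database
        'tweets': [                           # a collection
            {'user': 'alice', 'text': 'hi'},  # individual documents
            {'user': 'bob', 'text': 'howdy'},
        ],
    },
}
```

If that picture is wrong somewhere, please correct me.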
(Also, please ignore the inconsistency between the collections tweets and tweet_collection; I was just experimenting to get a better understanding.)