I have a large dataset spread across several JSON files. They contain tweets, and each tweet looks like this:
{"object": "Message",
 "action": "create",
 "data": {"id": 152376374, "body": "@PeakySellers right there with you brother. 368$", "created_at": "2019-01-31T23:59:56Z",
          "user": {"id": 971815, "username": "mtothe5thpower"}
 }
}
One file alone has 3 million rows and is more than 5 GB. I use pandas to read the file, and that part works well:

data2 = pd.read_table('file', sep="\n", header=None)
Now I have a DataFrame in which each row holds one element (a tweet like the one above) as a string. To work with the file and access each field, I convert every string to a dictionary, using the code below:
for i, row in data2.itertuples():
    data2["dic"][i] = json.loads(data2[0][i])
While this code successfully converts each string to a dictionary, it is very slow. I think there should be a faster way for this task. Thank you in advance for any help or suggestions.
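One option worth trying, sketched below on a hypothetical two-row sample in place of the real file: mapping `json.loads` over the whole column once, instead of assigning into the DataFrame row by row, avoids the per-row indexing overhead of the loop.

```python
import json
import pandas as pd

# Hypothetical two-row sample standing in for the 5 GB file:
# one JSON string per row, in column 0.
data2 = pd.DataFrame({0: [
    '{"object": "Message", "data": {"id": 1}}',
    '{"object": "Message", "data": {"id": 2}}',
]})

# Parse every string in one pass; the result is a Series of dicts.
data2["dic"] = data2[0].map(json.loads)

# Fields are then accessible per row as ordinary dict lookups.
first_id = data2["dic"][0]["data"]["id"]
```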
Try `for i, row in enumerate(data2.itertuples()):` instead. That will save you from having to manage `i` yourself.
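A minimal sketch of that suggestion, on a hypothetical two-row sample: `enumerate` supplies the counter `i`, so the loop body no longer has to unpack it from the `itertuples()` namedtuple.

```python
import json
import pandas as pd

# Hypothetical two-row sample in place of the real file.
data2 = pd.DataFrame({0: ['{"id": 1}', '{"id": 2}']})

dics = {}
# enumerate yields (i, namedtuple); row[0] is the DataFrame index
# and row[1] is the value of column 0 (the JSON string).
for i, row in enumerate(data2.itertuples()):
    dics[i] = json.loads(row[1])
```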