
For a database project, I want to convert .json files into .sqlite3 files using Python (currently Python 3.6.4 on Windows 10). Below is the code that reads a JSON file:

...

with open('C:/Documents/{}/Posts.json'.format(forum), encoding="utf8") as f:
            row = json.load(f)
            parent_id = row['_Id']
            body = format_data(row['_Body'])
            score = row['_Score']
            comment_id = row['_Id']
            comment_id_type = row['_PostTypeId']
            parent_id_type = row['_PostTypeId']
            accepted_answer_id = row['_AcceptedAnswerId']
            accepted_parent_id = row['_ParentId']

            ...

While running this code I encounter this error.

File "C:\Python\data base.py", line 85, in <module>
parent_id = row['_Id']
KeyError: '_Id'

I've read up on this error and found that, according to the official Python docs, KeyError is

Raised when a mapping (dictionary) key is not found in the set of existing keys.

I'm having trouble understanding this error, because '_Id' does exist in the JSON file (as seen below):

{
   "posts": {
      "row": [
         {
            "_Id": "1",
            "_PostTypeId": "1",
            "_AcceptedAnswerId": "3",
            "_CreationDate": "2016-08-02T15:39:14.947",
            "_Score": "5",
            "_ViewCount": "254",
            "_Body": "<p>What does \"backprop\" mean? I've Googled it, but it's showing backpropagation.</p>\n\n<p>Is the \"backprop\" term basically the same as \"backpropagation\" or does it have a different meaning?</p>\n",
            "_OwnerUserId": "8",
            "_LastEditorUserId": "7488",
            "_LastEditDate": "2017-05-28T13:48:02.003",
            "_LastActivityDate": "2017-05-28T13:48:02.003",
            "_Title": "What is \"backprop\"?",
            "_Tags": "<neural-networks><definitions><terminology>",
            "_AnswerCount": "3",
            "_CommentCount": "3"
         },

(This is JSON from the AI Stack Exchange data.)

Could someone help me resolve this KeyError? The other sources I have searched were no help.

Thank you in advance.

  • You should just use pandas to read the JSON and dump a DataFrame to a database... (commented Jan 7, 2018 at 0:08)

2 Answers


First you have to access "posts", then iterate over the list under "row":

with open('C:/Documents/{}/Posts.json'.format(forum), encoding="utf8") as f:
    j = json.load(f)
    for row in j['posts']['row']:
        parent_id = row['_Id']
        body = format_data(row['_Body'])
        # ...
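
One thing to watch for when looping: not every row has to carry every key. The sample row in the question, for instance, has no "_ParentId". Below is a minimal sketch, assuming a missing field should simply become None. The two-row `sample` and the `extract_fields` helper are made up for illustration and are not part of the original code.

```python
import json

# Hypothetical two-row sample that mirrors the structure shown in the question.
sample = json.loads("""
{"posts": {"row": [
    {"_Id": "1", "_PostTypeId": "1", "_AcceptedAnswerId": "3",
     "_Score": "5", "_Body": "<p>Q</p>"},
    {"_Id": "3", "_PostTypeId": "2", "_ParentId": "1",
     "_Score": "2", "_Body": "<p>A</p>"}
]}}
""")


def extract_fields(row):
    """Pick the fields of one post; optional keys fall back to None."""
    return {
        "id": row["_Id"],                                    # assumed always present
        "post_type_id": row["_PostTypeId"],
        "score": row["_Score"],
        "body": row["_Body"],
        "accepted_answer_id": row.get("_AcceptedAnswerId"),  # may be absent
        "parent_id": row.get("_ParentId"),                   # may be absent
    }


records = [extract_fields(row) for row in sample["posts"]["row"]]
```

Using `row.get(key)` rather than `row[key]` for the optional fields means a row without that key does not stop the loop with another KeyError.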



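KeyError is raised when you look up a key that does not exist in the dictionary. In your case, '_Id' is not a top-level key of the loaded JSON. Judging from the file, if the loaded object is stored in jsondict, you have to access the first post like so:

jsondict['posts']['row'][0]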

'posts' maps to a dict, and 'row' maps to a list of dicts. A list is ordered, which is why we can index into it with [0].
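
To see the nesting for yourself, it can help to inspect the loaded object step by step. This is a small sketch on a cut-down, made-up copy of the JSON from the question, not the real file:

```python
import json

# Cut-down stand-in for Posts.json (hypothetical data, same nesting).
data = json.loads('{"posts": {"row": [{"_Id": "1", "_Score": "5"}]}}')

top_keys = list(data.keys())   # only 'posts' lives at the top level
rows = data["posts"]["row"]    # a list of dicts, one per post
first = rows[0]                # the first post

# Looking up '_Id' at the top level reproduces the error from the question.
try:
    data["_Id"]
except KeyError as exc:
    missing = exc.args[0]      # the key that was not found
```

Printing `top_keys` before writing any lookups is a quick way to check which level of the structure you are actually at.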

Full code:

with open('C:/Documents/{}/Posts.json'.format(forum), encoding="utf8") as f:
    jsondict = json.load(f)

    # Remember, posts > row > first_index
    row = jsondict['posts']['row'][0]
    parent_id = row['_Id']
    body = format_data(row['_Body'])
    score = row['_Score']
    comment_id = row['_Id']
    comment_id_type = row['_PostTypeId']
    parent_id_type = row['_PostTypeId']
    # Optional keys: .get() returns None instead of raising KeyError
    # (the sample row in the question has no '_ParentId')
    accepted_answer_id = row.get('_AcceptedAnswerId')
    accepted_parent_id = row.get('_ParentId')
    ...
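
Since the goal is to end up with an SQLite file, here is a rough sketch of how the rows could be written out with the standard-library sqlite3 module. The table name, column names, `to_int` helper, and sample data are all assumptions made for illustration. The sketch also uses an in-memory database where real code would pass a file path.

```python
import json
import sqlite3

# Hypothetical sample with the same nesting as Posts.json.
data = json.loads("""
{"posts": {"row": [
    {"_Id": "1", "_PostTypeId": "1", "_AcceptedAnswerId": "3",
     "_Score": "5", "_Body": "<p>Q</p>"},
    {"_Id": "3", "_PostTypeId": "2", "_ParentId": "1",
     "_Score": "2", "_Body": "<p>A</p>"}
]}}
""")


def to_int(value):
    """Convert a string such as "5" to int; pass None through unchanged."""
    return None if value is None else int(value)


conn = sqlite3.connect(":memory:")  # pass a file path here to keep the database
conn.execute(
    "CREATE TABLE IF NOT EXISTS posts ("
    "id INTEGER PRIMARY KEY, post_type_id INTEGER, parent_id INTEGER, "
    "accepted_answer_id INTEGER, score INTEGER, body TEXT)"
)

# One tuple per post; optional keys are read with .get() so they may be None.
values = [
    (
        to_int(row["_Id"]),
        to_int(row["_PostTypeId"]),
        to_int(row.get("_ParentId")),
        to_int(row.get("_AcceptedAnswerId")),
        to_int(row["_Score"]),
        row["_Body"],
    )
    for row in data["posts"]["row"]
]
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?, ?, ?)", values)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
answer_parent = conn.execute(
    "SELECT parent_id FROM posts WHERE id = ?", (3,)
).fetchone()[0]
```

The `?` placeholders let sqlite3 handle quoting, which matters here because the post bodies contain HTML and quote characters.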
