I have a file which contains several JSON records. I have to parse this file and load each record into a particular SQL Server table. However, the table might not exist in the database, in which case I also have to create it before loading. So I have to parse the JSON file, figure out the fields/columns, and create the table. Then I have to deserialize the JSON records and insert them into that table. The caveat is that some fields are optional, i.e. a field might be absent from one record but present in another. Below is an example file with 3 records :-

{ id : 1001, 
  name : "John", 
  age : 30 
} , 

{ id : 1002,
  name : "Peter",
  age : 25
},

{ id : 1003,
  name : "Kevin",
  age : 35,
  salary : 5000
},

Notice that the field salary appears only in the 3rd record. The results should be :-

CREATE TABLE tab ( id int, name varchar(100), age int, salary int );

INSERT INTO tab (id, name, age, salary) values (1001, 'John', 30, NULL)
INSERT INTO tab (id, name, age, salary) values (1002, 'Peter', 25, NULL)
INSERT INTO tab (id, name, age, salary) values (1003, 'Kevin', 35, 5000)

Can anyone please help me with some pointers as I am new to Python. Thanks.

  • Perhaps, SQL databases are not the best choice. SQL needs a fixed schema. Commented Aug 22, 2018 at 21:52
  • Just a heads up that your example JSON file is not valid JSON. You would need to enclose the entire file in a set of square brackets, as well as enclose your keys in double quotes. Commented Aug 22, 2018 at 21:56

2 Answers

You could try this:

import json

TABLE_NAME = "tab"

# Load the whole file as a list of dicts (one dict per record).
with open('data.json', 'r') as f:
    jsondata = json.load(f)

sqlstatement = ''
for record in jsondata:  # not named 'json', which would shadow the module
    keylist = []
    valuelist = []
    for key, value in record.items():
        keylist.append(key)
        if isinstance(value, str):
            # Quote strings and double any embedded single quotes.
            valuelist.append("'" + value.replace("'", "''") + "'")
        elif value is None:
            # A JSON null becomes SQL NULL rather than the text 'None'.
            valuelist.append("NULL")
        else:
            valuelist.append(str(value))

    sqlstatement += ("INSERT INTO " + TABLE_NAME
                     + " (" + ", ".join(keylist) + ")"
                     + " VALUES (" + ", ".join(valuelist) + ")\n")

print(sqlstatement)

However for this to work, you'll need to change your JSON file to correct the syntax like this:

[{  
    "id" : 1001, 
    "name" : "John", 
    "age" : 30 
} , 

{   
    "id" : 1002,
    "name" : "Peter",
    "age" : 25
},

{
    "id" : 1003,
    "name" : "Kevin",
    "age" : 35,
    "salary" : 5000
}]

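Running this gives output like the following (the column order follows the key order of each record's dict, so it can differ between Python versions):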

INSERT INTO tab (age, id, name) VALUES (30, 1001, 'John')
INSERT INTO tab (age, id, name) VALUES (25, 1002, 'Peter')
INSERT INTO tab (salary, age, id, name) VALUES (5000, 35, 1003, 'Kevin')

Note that you don't need to specify NULLs. If a column is left out of the INSERT statement, the database fills it with that column's default, which is NULL for a nullable column with no default defined.
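The question also asks for the table to be created when it does not exist yet. The following is only a minimal sketch of how the columns could be worked out from the records: it takes the union of all keys and guesses a type from the Python value. The helper names `sql_type`, `infer_columns` and `create_table_sql` are made up for this illustration, and the type mapping (including `varchar(100)` for everything that is not a number) is an assumption that you would adjust to your data.

```python
def sql_type(value):
    # bool is checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "bit"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    return "varchar(100)"  # assumed fallback for strings and anything else


def infer_columns(records):
    """Return {column: type} for the union of keys, in first-seen order."""
    columns = {}
    for record in records:
        for key, value in record.items():
            if value is None:
                columns.setdefault(key, None)  # seen, type still unknown
                continue
            new_type = sql_type(value)
            old_type = columns.get(key)
            if old_type is None:
                columns[key] = new_type
            elif old_type != new_type:
                columns[key] = "varchar(100)"  # conflicting types: use text
    # Columns that only ever held null fall back to text as well.
    return {key: (typ or "varchar(100)") for key, typ in columns.items()}


def create_table_sql(table, columns):
    cols = ", ".join(f"{name} {typ}" for name, typ in columns.items())
    return f"CREATE TABLE {table} ( {cols} );"


records = [
    {"id": 1001, "name": "John", "age": 30},
    {"id": 1002, "name": "Peter", "age": 25},
    {"id": 1003, "name": "Kevin", "age": 35, "salary": 5000},
]
print(create_table_sql("tab", infer_columns(records)))
```

For the three sample records this prints the CREATE TABLE statement from the question. The table and column names are pasted into the SQL text as-is, so this assumes the JSON keys come from a trusted source. On SQL Server, one common way to run the statement only when the table is missing is to put `IF OBJECT_ID('tab', 'U') IS NULL` in front of it.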


2 Comments

Thanks a ton @Jon Warren . Worked like a charm... that's exactly what I needed. Also learned a lot of new things. Thanks again.
Hi @John Warren and Mathew, I was wondering if you know how to solve the same problem as the one above, but for nested JSON (with a more complex tree structure)? I have my question here: stackoverflow.com/questions/68777843/…

In Python, you can do something like this using sqlite3 and json, both from the standard library.

import json
import sqlite3

# The string representing the json.
# You will probably want to read this string in from
# a file rather than hardcoding it.
s = """[
    {
        "id": 1001, 
        "name": "John", 
        "age" : 30 
    }, 
    {
        "id" : 1002,
        "name" : "Peter",
        "age" : 25
    },
    {
        "id" : 1003,
        "name" : "Kevin",
        "age" : 35,
        "salary" : 5000
    }
]"""

# Parse the JSON string into a Python list of dicts.
data = json.loads(s)


# Open the file containing the SQL database.
with sqlite3.connect("filename.db") as conn:

    # Create the table if it doesn't exist.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tab(
                id int,
                name varchar(100),
                age int,
                salary int
            );"""
        )

    # Insert each entry from json into the table.
    keys = ["id", "name", "age", "salary"]
    for entry in data:

        # This will make sure that each key will default to None
        # if the key doesn't exist in the json entry.
        values = [entry.get(key, None) for key in keys]

        # Execute the command, letting sqlite3 substitute each '?' with
        # the matching value in 'values'. DO NOT build the SQL string by
        # hand: the placeholders make sqlite3 handle unsafe strings for you.
        cmd = """INSERT INTO tab VALUES(
                    ?,
                    ?,
                    ?,
                    ?
                );"""
        conn.execute(cmd, values)

    conn.commit()

This will create a file named 'filename.db' in the current directory with the entries inserted.

To test the tables:

# Testing the table.
with sqlite3.connect("filename.db") as conn:
    cmd = """SELECT * FROM tab WHERE salary IS NOT NULL;"""
    cur = conn.execute(cmd)
    res = cur.fetchall()
    for r in res:
        print(r)
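The hardcoded `keys` list above works when the fields are known in advance. If they are not, the list could be derived from the data instead. The following is a minimal sketch of that idea, assuming an in-memory SQLite database and a table created without column types purely for illustration; `column_list` and `placeholders` are names introduced here, not part of the answer above.

```python
import sqlite3

records = [
    {"id": 1001, "name": "John", "age": 30},
    {"id": 1002, "name": "Peter", "age": 25},
    {"id": 1003, "name": "Kevin", "age": 35, "salary": 5000},
]

# Union of all keys in first-seen order (dicts preserve insertion order).
keys = list(dict.fromkeys(key for record in records for key in record))

# Column names cannot be bound as '?' parameters, so they are quoted and
# interpolated; this assumes the JSON keys come from a trusted source.
column_list = ", ".join('"' + key.replace('"', '""') + '"' for key in keys)
placeholders = ", ".join("?" for _ in keys)

conn = sqlite3.connect(":memory:")  # in-memory database, illustration only
conn.execute(f"CREATE TABLE IF NOT EXISTS tab ({column_list})")

# Missing keys become None, which sqlite3 stores as NULL.
conn.executemany(
    f"INSERT INTO tab ({column_list}) VALUES ({placeholders})",
    [[record.get(key) for key in keys] for record in records],
)
conn.commit()

rows = conn.execute("SELECT * FROM tab ORDER BY id").fetchall()
for row in rows:
    print(row)
conn.close()
```

The values still go through `?` placeholders, so only the column names are placed into the SQL text directly.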
