I've been loading json data from a file like this:

import json

import psycopg2
from psycopg2.extras import Json

def SendToPostGres(incs):
    length = len(incs)
    processed = 0
    pgParams = {
            'database': 'mydb',
            'user': 'hi',
            'password': '2u',
            'host': 'somedb.com',
            'port': 1111
            }
    conn = psycopg2.connect(**pgParams)
    curs = conn.cursor()

    for i in incs:
        # One INSERT and one commit per row: this is the slow part
        curs.execute("insert into MY_TABLE (data) values (%s)", [Json(i)])
        processed += 1
        conn.commit()
        print("%s processed, %s remaining" % (processed, length - processed))

with open("data.json") as jd:
    print("loading json")
    j = json.load(jd)
    print("inserting")
    SendToPostGres(j)

This is highly inefficient. I've tried googling this and looking at other posts, but I can't seem to get the desired effect of: "For each item in my list of json, create a row in my database with the corresponding data stored as a json type in postgres."

Could someone explain to me the most efficient way to do this in bulk?

UPDATE:

Per an answer below, I've tried updating to use the execute_values function from extras. The error I'm receiving now is:

"string index out of range"

Note that I tried changing the page size, because I thought it might be related. That didn't help, but it might still be an issue.

def SendToPostGres(incs):
    values = []
    for i in incs:
        values.append(json.dumps(i))

    pgParams = {
            'database': 'MY_DB',
            'user': 'hi',
            'password': '2u',
            'host': 'somedb.com',
            'port': 5432
            }
    conn = psycopg2.connect(**pgParams)
    curs = conn.cursor()

    try:
        psycopg2.extras.execute_values(curs, "insert into incidents (data) values (%s)", values, page_size=len(values))
    except Exception as e:
        raise e
    rows = curs.fetchall()
    curs.close()
1 Answer

Use extras.execute_values from psycopg2.

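Put a single %s placeholder in your query where the whole VALUES list should go, for example: insert into MY_TABLE (data) values %s. execute_values expands that placeholder into one parenthesized group per row.

This is much faster than your current method, because it sends many rows in one statement instead of executing and committing one insert per row.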

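from psycopg2 import extras

def queryPostgresBulk(conn, query, values):
    # query must contain a single %s placeholder; values is a list of tuples, one per row
    cur = conn.cursor()
    try:
        # One page holding every row; max() guards against page_size=0 for an empty list
        extras.execute_values(cur, query, values, page_size=max(len(values), 1))
        # Only a query with a RETURNING clause produces rows to fetch
        rows = cur.fetchall() if cur.description else []
        # Commit so the rows persist; drop this if the caller manages the transaction
        conn.commit()
    finally:
        cur.close()

    return rows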
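To make the calling side concrete, here is a minimal sketch of the same idea, assuming a table MY_TABLE with a json column named data as in the question. The helper names build_rows and insert_json_rows and the default page_size of 1000 are placeholders of mine, not part of psycopg2.

```python
import json


def build_rows(dicts):
    """Turn a list of dicts into the list of 1-tuples execute_values expects."""
    return [(json.dumps(d),) for d in dicts]


def insert_json_rows(conn, dicts, page_size=1000):
    """Insert every dict as one row of MY_TABLE.data, page_size rows per statement."""
    # Imported here so build_rows stays usable without the driver installed.
    from psycopg2 import extras

    # Single %s placeholder: execute_values expands it into the VALUES list.
    query = "insert into MY_TABLE (data) values %s"
    # "with conn" commits on success and rolls back on error (it does not close conn).
    with conn:
        with conn.cursor() as cur:
            extras.execute_values(cur, query, build_rows(dicts), page_size=page_size)


# Building the rows needs no database connection.
print(build_rows([{"hi": "guy"}, {"you": "rock"}]))
```

With a real connection you would call insert_json_rows(conn, j), where j is the list returned by json.load. A fixed page size keeps each statement bounded; passing len(values) instead sends everything in one statement, which may be fine for small files.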

Update to OP comment:

Use json.dumps() to convert your list of dicts to a list of tuples of JSON strings, the format expected by the function. Pass it tuples of JSON strings, rather than dicts representing JSON objects.

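import json

# dicts is your list of dicts; avoid naming variables "dict" or "list",
# which would shadow the built-ins
_values = []
for d in dicts:
    _values.append((json.dumps(d),))

Or with list comprehension:

_values = [(json.dumps(d),) for d in dicts]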

Also worth pointing out that a JSON file does not need a single key at the top level: a top-level array, which json.load() returns as a list, is valid JSON.

Update to OP comment again:

You need to supply a list of tuples as values, with the json strings being within that tuple. If the only data you want to inject in values is the json string, then you need to update your for loop building values to:

for i in incs:
    values.append((json.dumps(i),))
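One likely reason for the "string index out of range" error, sketched without a database: a Python string is itself a sequence, so a bare JSON string looks like a row with one parameter per character. As far as I understand the default behavior of execute_values, it sizes the row template from the first row, so a later, shorter string runs out of characters. The sample dicts below are made up for illustration.

```python
import json

# Hypothetical sample data: the first item serializes to a longer string.
incs = [{"you": "rock", "n": 1}, {"hi": "guy"}]

bare_strings = [json.dumps(i) for i in incs]
one_tuples = [(json.dumps(i),) for i in incs]

# A bare string looks like a row with one parameter per character,
# and the rows have different lengths.
print([len(row) for row in bare_strings])

# A 1-tuple is a row with exactly one parameter: the whole JSON string.
print([len(row) for row in one_tuples])
```

Wrapping each string in a 1-tuple gives every row the same length, one, which is what a single-column insert needs.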

Not sure why I'm posting this since you downvoted my correct answers to your two earlier versions of your question...hopefully it will help someone else.

8 Comments

I tried that, but the result I got is "Dict does not support indexing." The type in '_values' in your code is a list of dicts.
Post the format of your data if you need help transforming it to fit the psycopg2 function's expectations. If it's not a list of dicts that you're iterating through, what is it?
It is a list of dicts, i.e. [{"hi": "guy"}, {"you": "rock"}]. I've tried passing the raw list as "values", mapping the Json function from extras over it, wrapping the whole list in Json, and using [Json(incs)].
Sorry but your updated answer still doesn't work. I'll update my question.
My answer does work. The rest of your code doesn’t. If you need more help, post the stack trace of your error in another question.