Python - creating string from dict items (for writing to postgresql db)

Question

I'm writing some code using psycopg2 to connect to a PostgreSQL database.

I have many different data types that I want to write to different tables in my PostgreSQL database. I am trying to write a single function that can write to any of the tables, selected by a variable passed to the function, and I want to insert more than one row at a time to optimize my query. Luckily, PostgreSQL allows that with a multi-row INSERT:

INSERT INTO films (code, title, did, date_prod, kind) VALUES
('B6717', 'Tampopo', 110, '1985-02-10', 'Comedy'),
('HG120', 'The Dinner Game', 140, DEFAULT, 'Comedy');

I have run into a problem that I was hoping someone could help me with.

I need to create a string:

string1 = (value11, value21, value31), (value12, value22, value32)

The string1 variable will be created by using a dictionary with values. So far I have been able to create a tuple that is close to the structure I want. I have a list of dictionaries. The list is called rows:

string1 = tuple([tuple([value for value in row.values()]) for row in rows])

To test it I have created the following small rows variable:

rows = [{'id': 1, 'test1': 'something', 'test2': 123},
        {'id': 2, 'test1': 'somethingelse', 'test2': 321}]

When rows is passed through the above piece of code string1 becomes as follows:

((1, 'something', 123), (2, 'somethingelse', 321))

As string1 shows, I just need to remove the outermost parentheses and turn the result into a string to get what I need. So far I don't know how to do this. So my question to you is: "How do I format string1 to have my required format?"

Clodoaldo Neto · Accepted Answer · 2017-05-23 13:46:24Z

2

execute_values makes it much easier. Pass the dict sequence in instead of a values sequence:

import psycopg2, psycopg2.extras

rows = [
    {'id': 1, 'test1': 'something', 'test2': 123},
    {'id': 2, 'test1': 'somethingelse', 'test2': 321}
]

conn = psycopg2.connect(database='cpn')
cursor = conn.cursor()

insert_query = 'insert into t (id, test1, test2) values %s'
psycopg2.extras.execute_values(
    cursor, insert_query, rows,
    template='(%(id)s, %(test1)s, %(test2)s)',
    page_size=100
)
conn.commit()  # psycopg2 opens a transaction implicitly; commit to persist the rows

And the values are inserted:

table t;
 id |     test1     | test2 
----+---------------+-------
  1 | something     |   123
  2 | somethingelse |   321

To get the number of affected rows, use a CTE:

insert_query = '''
    with i as (
        insert into t (id, test1, test2) values %s
        returning *
    )
    select count(*) from i
'''
psycopg2.extras.execute_values(
    cursor, insert_query, rows,
    template='(%(id)s, %(test1)s, %(test2)s)',
    page_size=100
)
row_count = cursor.fetchone()[0]
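
Since the question asks for one function that serves many tables, the query and template strings above could be derived from the dict keys instead of being written by hand. The following is only a sketch under stated assumptions: build_insert is a hypothetical helper (not part of psycopg2), every dict in rows is assumed to share the same keys, and the table and column names are assumed to come from trusted code rather than user input.

```python
def build_insert(table, rows):
    """Return (query, template) strings for psycopg2.extras.execute_values.

    Hypothetical helper. Assumes all dicts in rows share the same keys and
    that table/column names are trusted (they are interpolated unescaped).
    """
    columns = list(rows[0].keys())
    query = 'insert into {} ({}) values %s'.format(table, ', '.join(columns))
    # One named placeholder per column, e.g. %(id)s
    template = '(' + ', '.join('%({})s'.format(c) for c in columns) + ')'
    return query, template


rows = [
    {'id': 1, 'test1': 'something', 'test2': 123},
    {'id': 2, 'test1': 'somethingelse', 'test2': 321},
]
query, template = build_insert('t', rows)
print(query)     # insert into t (id, test1, test2) values %s
print(template)  # (%(id)s, %(test1)s, %(test2)s)
```

The two strings can then be passed to execute_values as insert_query and template. Column order follows dict insertion order, which is guaranteed from Python 3.7 on. If the names are not fully trusted, psycopg2.sql.Identifier is the safer way to compose them.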

edited May 23, 2017 at 13:46

answered May 23, 2017 at 12:46

Clodoaldo Neto


6 Comments

Zeliax Over a year ago

Does execute_values support Python3?

Clodoaldo Neto Over a year ago

@Zeliax I tested it in Python3. execute_values is new in Psycopg 2.7. Check your version: psycopg2.__version__

Zeliax Over a year ago

Thanks. I just updated it. I was on 2.6.2. Will check if this method does the trick. Do I need to execute and do a conn.commit() after the last line in your code snippet?

Zeliax Over a year ago

Lastly, what does the page_size=100 mean? Is that how many are inserted in each "pass-through"?

Clodoaldo Neto Over a year ago

@Zeliax Yes, if there are more than that another insert will be executed.
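
To make the page_size behaviour described in this thread concrete, here is a rough pure-Python sketch of the batching idea. It is not psycopg2's actual implementation; pages is a hypothetical helper used only for illustration.

```python
def pages(seq, page_size):
    """Yield consecutive slices of seq holding at most page_size items each."""
    for start in range(0, len(seq), page_size):
        yield seq[start:start + page_size]


rows = [{'id': i} for i in range(250)]
# With page_size=100, 250 rows are split into three batches,
# which corresponds to three INSERT statements.
sizes = [len(page) for page in pages(rows, 100)]
print(sizes)  # [100, 100, 50]
```

On the earlier question about conn.commit(): psycopg2 starts a transaction implicitly, so the inserted rows are only persisted once conn.commit() is called, unless autocommit is enabled on the connection.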


DexJ · Accepted Answer · 2017-05-23 10:07:27Z

2

With a small modification you can achieve this. Change your piece of code as follows:

','.join(repr(tuple(row.values())) for row in rows)

The current output is a tuple of tuples:

(('something', 123, 1), ('somethingelse', 321, 2))

After the change, the output is a string in the format you want:

"('something', 123, 1),('somethingelse', 321, 2)"

answered May 23, 2017 at 10:07

DexJ


2 Comments

Zeliax Over a year ago

Thanks. This is just what I needed. I just couldn't wrap my head around it.

DexJ Over a year ago

sorry my first language is not English! guessing answer is useful :)

Vsevolod Kulaga · Accepted Answer · 2017-05-23 10:45:55Z

1

The solution you describe is risky because it does not escape strings or other values, so SQL injection is possible and your database could be harmed. Fortunately, psycopg (and psycopg2) cursors provide the execute and mogrify methods, which do all of this work properly for you:

import contextlib

with contextlib.closing(db_connection.cursor()) as cursor:
    values = [cursor.mogrify('(%(id)s, %(test1)s, %(test2)s)', row) for row in rows]
    query = 'INSERT INTO films (id, test1, test2) VALUES {0};'.format(', '.join(values))

For Python 3:

import contextlib

with contextlib.closing(db_connection.cursor()) as cursor:
    values = [cursor.mogrify('(%(id)s, %(test1)s, %(test2)s)', row) for row in rows]
    query_bytes = b'INSERT INTO films (id, test1, test2) VALUES ' + b', '.join(values) + b';'
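
To make the escaping concern concrete, the small experiment below shows what happens when the VALUES string is built with repr instead of mogrify. The sample rows are made up for illustration and no database is needed.

```python
rows = [
    {'id': 1, 'name': "O'Brien"},
    {'id': 2, 'name': None},
]

# repr follows Python's quoting rules, not SQL's.
values = ','.join(repr(tuple(row.values())) for row in rows)
print(values)  # (1, "O'Brien"),(2, None)
```

Neither tuple is valid as a PostgreSQL literal list: "O'Brien" in double quotes is parsed as an identifier rather than a string, and None is not the SQL NULL keyword. mogrify and execute adapt both values correctly.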

edited May 23, 2017 at 10:45

answered May 23, 2017 at 10:12

Vsevolod Kulaga


11 Comments

Zeliax Over a year ago

Yeah. My plan was to use the cursor.execute(query, inputs) method to do this, but I ended up having some problems with that. However I can't specifically remember what those problems entailed right now

Zeliax Over a year ago

Sorry for the spam comments. But the method I have should be able to adapt to the kinds of data that I pass. I will be passing a lot of different data of different data types for a lot of different tables, which should all go through this one function.

Vsevolod Kulaga Over a year ago

@Zeliax mogrify can format any data types just with '%s'

Zeliax Over a year ago

I see. I just tried to test your piece of code out and it makes great sense. I couldn't get it to work however. I assume db_connection should be my connection element? I have just named it conn. I get an error when I try to execute it, and that is after I have renamed my connection element. TypeError: sequence item 0: expected str instance, bytes found

Vsevolod Kulaga Over a year ago

It's a psycopg2 connection initd.org/psycopg/docs/connection.html
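
The TypeError reported in the comments is what Python 3 raises when a str separator is used to join bytes objects, and mogrify returns bytes on Python 3. The sketch below reproduces it without a database, using hand-written byte strings as stand-ins for mogrify's return values. The UTF-8 decoding in the second variant is an assumption about the connection's client encoding.

```python
# Stand-ins for what cursor.mogrify returns on Python 3 (bytes, not str).
values = [b"(1, 'something', 123)", b"(2, 'somethingelse', 321)"]

try:
    ', '.join(values)  # str separator with bytes items fails
except TypeError as exc:
    error = str(exc)
    print(error)  # sequence item 0: expected str instance, bytes found

# Option 1: keep everything as bytes, as in the Python 3 variant of the answer.
query_bytes = (b'INSERT INTO films (id, test1, test2) VALUES '
               + b', '.join(values) + b';')

# Option 2: decode each fragment and build a str (assumes UTF-8 encoding).
query_str = 'INSERT INTO films (id, test1, test2) VALUES {0};'.format(
    ', '.join(v.decode('utf-8') for v in values))
```

Either form can be passed to cursor.execute, which accepts both str and bytes queries.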
