
I have the following table in Postgres:

   Column   |            Type             | Modifiers 
------------+-----------------------------+-----------
 customer   | text                        | 
 feature    | character varying(255)      | 
 values     | character varying[]         | 
 updated_ts | timestamp without time zone |

And I'm trying to write the following pandas DataFrame

  customer feature         values           updated_ts
0        A       B   [red, black]  2019-01-15 00:00:00
1        A       B  [blue, green]  2019-01-16 00:00:00

using the following code:

import psycopg2
...    
sio = BytesIO()
sio.write(df.to_csv(header=False, index=False, sep='\t', quoting=csv.QUOTE_NONE))
sio.seek(0)
with connection.cursor() as cursor: 
    cursor.copy_from(file=sio, table=table, columns=df.columns, sep='\t', null='')
    connection.commit()

But I'm getting the following error:

DataError('malformed array literal: "[\'red\', \'black\']"\nDETAIL: "[" must introduce explicitly-specified array dimensions.\nCONTEXT: COPY test_features_values, line 1, column values: "[\'red\', \'black\']"\n',)

How do I write it correctly?

I would say you are trying to load a list into a DB column; there are several reasons why that is not a good idea (look up fourth normal form). Quick fix: convert the array to a string, either by col = str(col) or [less bad] col = ','.join(col). Proper fix: revisit your data model and your DB implementation. Commented Jan 30, 2019 at 9:33

1 Answer

I think you need to convert the list to a set:

df['values'] = df['values'].apply(set)

for the insert to work. The reason is that PostgreSQL expects arrays to be written using brace ({}) notation rather than bracket ([]) notation. When you convert a list to a set, the to_csv method renders the set with braces, in exactly the form PostgreSQL expects (a pleasant surprise; other representations I've seen are much hackier to convert).
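
You can see the difference by comparing how Python stringifies a list versus a set; to_csv serializes each cell with str(), so it is the set's brace form that ends up in the COPY stream (a minimal illustration; note that a set's element order is arbitrary):

row = ['red', 'black']
print(str(row))       # ['red', 'black'] -- bracket notation, rejected by COPY
print(str(set(row)))  # e.g. {'black', 'red'} -- brace notation Postgres can parse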

The other thing I'll note is that I had to switch from BytesIO to StringIO, because df.to_csv(...) returns a str, not a bytes-like object.
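
A quick way to confirm that:

import pandas

# to_csv called without a path returns a str, not bytes, so BytesIO.write()
# rejects it under Python 3.
df = pandas.DataFrame({'a': [1]})
print(type(df.to_csv()))  # <class 'str'>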

When I made those changes, the insert was successful:

import csv
import pandas
import psycopg2
from io import StringIO 

# initialize connection
connection = psycopg2.connect('postgresql://scott:tiger@localhost:5432/mydatabase')

# create data
df = pandas.DataFrame({
    'customer': ['A', 'A'],
    'feature': ['B', 'B'],
    'values': [['red', 'black'], ['blue', 'green']],
    'updated_ts': ['2019-01-15 00:00:00', '2019-01-16 00:00:00']
})
# cast list to set
df['values'] = df['values'].apply(set)

# write data to postgres
sio = StringIO()
sio.write(df.to_csv(header=False, index=False, sep='\t', quoting=csv.QUOTE_NONE))
sio.seek(0)
with connection.cursor() as cursor: 
    cursor.copy_from(file=sio, table='test', columns=df.columns, sep='\t', null='')
    connection.commit()
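
As an optional sanity check you can read the rows back, reusing the connection from above (this assumes the same test table; since values is a reserved word in SQL, it is safest to double-quote it in the query):

# Read the inserted rows back; psycopg2 returns varchar[] columns as
# Python lists.
with connection.cursor() as cursor:
    cursor.execute('SELECT customer, feature, "values", updated_ts FROM test')
    for row in cursor.fetchall():
        print(row)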

3 Comments

But you should remember that you lose element repetitions and their order after converting a list to a set.
Anybody ever find a solution that preserves repeated elements and element order?
Haven't tested the postgres side of it, but something like df['values'].apply(lambda x: "{" + ", ".join(x) + "}") would probably do it. What you're doing is formatting the list to look like {...} instead of [...]. It's all written to string in the I/O stage anyway. Worth noting that this is a few years old now and the JSON support in Postgres has really come a long way so there might be an easier answer.
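
For reference, the order-preserving variant from that last comment would look like this (a sketch, untested against Postgres as the commenter says; elements containing commas, braces, or quotes would need extra escaping):

# Build the Postgres array literal by hand, preserving element order and
# duplicates -- unlike the set conversion in the accepted answer.
df['values'] = df['values'].apply(lambda x: "{" + ", ".join(x) + "}")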
