
I've been trying to use this piece of code:

# df is the dataframe
if len(df) > 0:
    df_columns = list(df)
    # create "col1,col2,..."
    columns = ",".join(df_columns)

    # create "VALUES(%s,%s,...)", one %s per column
    values = "VALUES({})".format(",".join(["%s" for _ in df_columns]))

    # create "INSERT INTO table (columns) VALUES(%s,...)"
    insert_stmt = "INSERT INTO {} ({}) {}".format(table, columns, values)

    cur = conn.cursor()
    psycopg2.extras.execute_batch(cur, insert_stmt, df.values)
    conn.commit()
    cur.close()

so that I can connect to a Postgres database and insert values from a DataFrame.

I get this error when running the code:

LINE 1: INSERT INTO mrr.shipments (mainFreight_freight_motherVesselD...

psycopg2.errors.UndefinedColumn: column "mainfreight_freight_mothervesseldepartdatetime" of relation "shipments" does not exist

For some reason, the column names aren't being matched properly.

What can I do to fix it?

1 Answer


You should not do your own string interpolation; let psycopg2 handle it. From the docs:

Warning Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.

Since you also have dynamic column names, you should use psycopg2.sql to compose the statement, and then pass the query parameters to psycopg2 in the standard way instead of using format. Your specific error comes from PostgreSQL folding unquoted identifiers to lowercase: the statement names the column mainFreight_freight_motherVesselDepartDateTime, the server looks up mainfreight_freight_mothervesseldepartdatetime, and no such column exists. sql.Identifier double-quotes the names, which preserves their case.


2 Comments

It's actually working as is; I just needed to add parentheses so it could read the columns properly in the execute_batch command.
It may be working for now, but it's bad practice and it will break the first time one of the values contains a special character like a quote.
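A quick plain-Python sanity check of the underlying behavior (no database needed): PostgreSQL folds unquoted identifiers to lowercase, which is exactly what the UndefinedColumn message shows, while a double-quoted identifier keeps its case.

```python
# Column name taken from the error message in the question.
df_column = "mainFreight_freight_motherVesselDepartDateTime"

# What the server actually looks up when the identifier is unquoted:
folded = df_column.lower()

# Quoting the identifier (what psycopg2.sql.Identifier does) preserves case;
# embedded double quotes are escaped by doubling them.
quoted = '"{}"'.format(df_column.replace('"', '""'))
```

Here `folded` is the all-lowercase name that appears in the UndefinedColumn error, and `quoted` is the form a composed statement would send to the server.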
