I know that an "insert, or update if the key exists" option for .to_sql() hasn't been implemented yet, so I'm looking for an alternative.


The first thing that comes to mind is to use the append option:

data.to_sql(
    "Dim_Objects",
    con=connection,
    if_exists="append",
    index=False
)

and then remove duplicates in the database separately, after inserting the data:

DELETE FROM "Dim_Objects" a
      USING "Dim_Objects" b
      WHERE a."Code" = b."Code" 
        AND a."TimeStampUpdate" < b."TimeStampUpdate"

In this case, if there's a duplicate, I only keep the latest entry.


This approach seems to work, but I was hoping to achieve the same thing using pandas directly.

Any ideas?

1 Answer

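Can you try this? Write the DataFrame to a temporary (staging) table, then update the matching rows of the target table with SQL:

    from sqlalchemy import text

    # Load the DataFrame into a staging table, not into "Dim_Objects" itself
    data.to_sql("your_table_name", con=engine, if_exists="replace", index=False)

    # col1 is a placeholder: list every column you want to overwrite
    sql = text("""
        UPDATE "Dim_Objects" AS f
        SET col1 = t.col1
        FROM your_table_name AS t
        WHERE f."Code" = t."Code"
    """)

    # engine.begin() commits on success and rolls back on error
    with engine.begin() as conn:
        conn.execute(sql)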
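To also append rows whose key is not in the table yet, the staging-table idea can be combined with an `INSERT ... ON CONFLICT ... DO UPDATE` statement. Below is a minimal sketch, not a definitive implementation. It runs against an in-memory SQLite database (version 3.24 or later) so that it is self-contained. The helper `upsert_dataframe`, the staging table name `staging_objects`, and the `Name` column are hypothetical, and it assumes `"Code"` is a primary key or has a unique constraint. PostgreSQL accepts a very similar statement, but check the syntax for your database before relying on it.

```python
import sqlite3

import pandas as pd


def upsert_dataframe(data, conn, table="Dim_Objects", key="Code",
                     staging="staging_objects"):
    """Append new rows and overwrite rows whose key already exists."""
    # 1. Load the DataFrame into a staging table (replaced on every call).
    data.to_sql(staging, con=conn, if_exists="replace", index=False)

    cols = list(data.columns)
    col_list = ", ".join(f'"{c}"' for c in cols)
    updates = ", ".join(f'"{c}" = excluded."{c}"' for c in cols if c != key)

    # 2. Insert from staging; on a key collision, overwrite the existing row.
    #    SQLite needs "WHERE true" to parse INSERT ... SELECT ... ON CONFLICT.
    conn.execute(
        f'INSERT INTO "{table}" ({col_list}) '
        f'SELECT {col_list} FROM "{staging}" WHERE true '
        f'ON CONFLICT("{key}") DO UPDATE SET {updates}'
    )

    # 3. Clean up the staging table and persist the changes.
    conn.execute(f'DROP TABLE "{staging}"')
    conn.commit()


# Demo: the target table already holds rows A and B.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Dim_Objects" '
    '("Code" TEXT PRIMARY KEY, "Name" TEXT, "TimeStampUpdate" TEXT)'
)
conn.execute(
    'INSERT INTO "Dim_Objects" VALUES '
    "('A', 'old A', '2024-01-01'), ('B', 'old B', '2024-01-01')"
)
conn.commit()

# The new batch updates B and adds C; A must stay untouched.
data = pd.DataFrame({
    "Code": ["B", "C"],
    "Name": ["new B", "new C"],
    "TimeStampUpdate": ["2024-02-01", "2024-02-01"],
})
upsert_dataframe(data, conn)

rows = conn.execute(
    'SELECT "Code", "Name" FROM "Dim_Objects" ORDER BY "Code"'
).fetchall()
print(rows)
```

Unlike `if_exists="replace"` on the target table, this keeps every existing record and only touches rows whose `"Code"` appears in the new batch. If the incoming batch may contain data older than what is already stored, a condition comparing `"TimeStampUpdate"` would have to be added to the update step.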
2 Comments

if_exists="replace" would drop my whole table and replace it with the data DataFrame, which is not what I want. I want to append new rows and replace rows whose key already exists, while keeping all records that are already in the table.
Sorry. You can try writing the DataFrame as a temp table in your database, then use SQL to update the matching columns. That is what this code was for. I updated some of the code; I had typed the same table name by mistake.