0

In the example below there is two dataframes, df and df1. I also have a MySQL database with two tables. Tables1 and Table2, both has the same columns as df.

What I want to do is:

  1. Use df1 to locate the correct row in table1,
  2. Move the row to table2 and update the columns 'REMOVED' (From False to True) and 'REMOVED_DATE (from None to '2022-01-10).
  3. Remove the row from table1

Below you see my effort. But it dosen't work, and I can imagine there is a better solution available. Any help would be much appreciated.

df=pd.DataFrame({'ID':[25, 10],
              'DATE':['2021-11-13', '2021-12-03'],
              'LOC': ['NY', 'ML'],
                'MAIL':['[email protected]', '[email protected]'],
                 'NR': ['100000', '200000'],
                'REMOVED': [False, False],
                 'REMOVED_DATE':[None,None]})

df1=pd.DataFrame({'ID':[25],
                  'REMOVED':[True],
                 'REMOVED_DATE':['2022-01-11']})

cols = "`,`".join([str(i) for i in df.columns.tolist()]) #use df to mirror the columns in the tables in the database.

for i, row in df1.iterrows():
    id=row['ID']
    sql=f"SELECT ID FROM table1 WHERE id={id}" # Check if ID is in table1.
    cursor.execute(sql)
    z=cursor.fetchall()
    
    if not z:
        print('insert') # if not, move on.
  
    else: #If ID in, update, move and delete
        sql=f"""UPDATE table1 SET (`" +cols + "`) VALUES (" + "%s,"*(len(row)-1) + "%s) WHERE id={id}""" # Update the whole row where ID match.
        cursor.execute(sql, tuple(row))
        mydb.commit()
        sql=f"""INSERT INTO table2 SELECT * FROM table1 WHERE id={id}""" #insert it into table2
        cursor.execute(str(sql), tuple(row))
        mydb.commit()
        sql=f"""DELETE from table1 WHERE id={id}""" # Delete it from table1
        cursor.execute(sql)
        mydb.commit()
3
  • 1
    You're missing parameters in the cursor.execute() for the UPDATE statement. Commented Jan 22, 2022 at 14:44
  • 1
    WHy are you updating table1 if you're going to delete the row? Commented Jan 22, 2022 at 14:47
  • A better solution is, perhaps, to move it first to table2 and then update it? Commented Jan 22, 2022 at 14:51

1 Answer 1

1

You don't need to update table1 -- there's nothing in the specification that indicates this is needed, and you're going to delete it anyway.

Your code doesn't change REMOVED and REMOVED_DATE when it's copying from table1 to table2. You can do this by replacing SELECT * with selecting all the columns other than those, and substituting the new values you want in the SELECT list.

You should also use parametrized queries rather than substituting values with f-strings.

cols = ",".join(f"`{col}`" for col in df.columns.tolist())
new_cols = ",".join('%s' if col in ('REMOVED', 'REMOVED_DATE') else f"`{col}`" for col in df.columns.tolist())
insert_sql=f"""INSERT INTO table2 ({cols}) SELECT {new_cols} FROM table1 WHERE id=%s""" #insert it into table2
delete_sql = """DELETE from table1 WHERE id=%s"""
select_sql = "SELECT ID FROM table1 WHERE id=%s"

for i, row in df1.iterrows():
    id=row['ID']
    cursor.execute(select_sql, (id,))
    row = cursor.fetchone()

    if not row:
        print('insert') # if not, move on.

    else: #If ID in, update, move and delete
        cursor.execute(insert_sql, (False, '2022-01-10', id))
        cursor.execute(delete_sql, (id,))
        mydb.commit()

You should only commit once for each set of operations when moving and deleting, so that they're an atomic set.

Sign up to request clarification or add additional context in comments.

9 Comments

Many thanks. I'm trying it out now. I think the first row of your code has one () to many (after columns. Should be: cols = ",".join(f"{col}" for col in df.columns.tolist())
And the same goes for the second row.
Shouldn't iterrows rows be used as the second argument in execute?
Why? You're copying from the table, not the dataframe. The dataframe is just being used to get the IDs.
You would only use tuple(row) in execute() if you need to substitute all the columns from the df into the query, and in the same order. In most cases you only need some columns, so you write (row['col1'], row['col2'])
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.