Update on SQL Server table from Python Pandas

Question

The following Python code updates records in the required database table. Is there a better way to do this?

I read on SO that scanning a dataframe row by row is time-consuming. What is a better way to handle this?

for index, row in outputData.iterrows():
    try:
        updatesql = " update table set [fieldname] = {0:f}   where dt = \'{1:s}\'" .format(fieldvalue , currentdt)
        updatecursor.execute(updatesql)
        sql_conn.commit()
    except IOError as e:
        print("({})".format(e))
    except (RuntimeError, TypeError, NameError) as e:
        print("({})".format(e))

Based on the discussion below, I made the following changes but am facing two problems.

 updatesql = " update table set [fieldname] = ? where dt = ?"  
 data = (outputData.reindex( ['fieldvalue'], currentDt,axis='columns').to_numpy())
 # EXECUTE QUERY AND BIND LIST OF TUPLES 
 updatecursor.executemany(updatesql, data.tolist()) 
 sql_conn.commit()

Problems: a) The date is a constant and is not part of the outputData dataframe. b) Float values are stored in scientific notation; I would prefer them to be stored with a fixed precision.

Before anything, stop using the modulo operator % for string formatting. This method has been de-emphasized in Python but not officially deprecated yet. Instead, use the preferred str.format (Python 2.6+) or the newer f-string (Python 3.6+). (And actually you should be using SQL parameterization anyway for this question.)
– Parfait, Commented Nov 27, 2020 at 21:37
Using string formatting to insert data values into an SQL statement is still a practice to be discouraged. Also, looping through the DataFrame row-by-row with .execute() is less efficient than .executemany() (or the SQLAlchemy equivalent in my answer).
– Gord Thompson, Commented Nov 30, 2020 at 19:45
@GordThompson, please look at the latest code. I no longer loop through the dataframe, but I am still looking for a way to format the float values.
– nsivakr, Commented Nov 30, 2020 at 19:52

Parfait · Accepted Answer · 2020-11-30 21:45:05Z

2

Consider executemany to avoid the for-loop, passing it the rows of a numpy array produced by DataFrame.to_numpy(). Both examples below use SQL parameterization rather than string formatting.

With iterrows + cursor.execute (to demonstrate parameterization)

# PREPARED STATEMENT (NO DATA); "table" IS THE PLACEHOLDER TABLE NAME FROM THE QUESTION
updatesql = "UPDATE table SET [fieldname] = ?  WHERE dt = ?"

for index, row in outputData.iterrows():
    try:
        # EXECUTE QUERY AND BIND TUPLE OF PARAMS
        updatecursor.execute(updatesql, (fieldvalue, currentdt))
    except:
        ...

sql_conn.commit()
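
To show what binding parameters does in practice, here is a minimal, self-contained sketch. It assumes the standard-library sqlite3 module as a stand-in for pyodbc (both accept `?` placeholders); the table name `readings` and the helper `update_value` are hypothetical and not part of the original code.

```python
import sqlite3


def update_value(conn, value, dt):
    """Update one row with qmark placeholders; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE readings SET fieldname = ? WHERE dt = ?",
        (value, dt),  # values travel separately from the SQL text
    )
    conn.commit()
    return cur.rowcount


# Hypothetical in-memory table standing in for the SQL Server table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (dt TEXT PRIMARY KEY, fieldname REAL)")
conn.execute("INSERT INTO readings VALUES ('2020-11-27', 0.0)")

changed = update_value(conn, 1.25, "2020-11-27")
# A value containing quotes is bound as plain data, so it matches no row
# and cannot change the meaning of the statement.
unchanged = update_value(conn, 9.9, "x' OR '1'='1")
```

With pyodbc the call shape should be the same, with `updatecursor.execute(updatesql, (value, dt))` taking the place of `conn.execute(...)`.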

With to_numpy + cursor.executemany

# PREPARED STATEMENT (NO DATA); "table" IS THE PLACEHOLDER TABLE NAME FROM THE QUESTION
updatesql = "UPDATE table SET [fieldname] = ?  WHERE dt = ?"

# ROUND TO SCALE OR HOW MANY DECIMAL POINTS OF COLUMN TYPE
outputData['my_field_col'] = outputData['my_field_col'].round(4)

# ADD A NEW COLUMN TO DATA FRAME EQUAL TO CONSTANT VALUE   
outputData['currentDt'] = currentDt
                        
# SUBSET DATA BY NEEDED COLUMNS CONVERT TO NUMPY ARRAY
data = (outputData.reindex(['my_field_col', 'currentDt'], axis='columns').to_numpy())

# EXECUTE QUERY AND BIND LIST OF TUPLES
updatecursor.executemany(updatesql, data.tolist())
sql_conn.commit()
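
As an alternative to adding a constant column, the parameter list can be built directly. The sketch below is an assumption-laden illustration, not part of the answer above: the helper `build_params` is hypothetical, and it converts each float to a `decimal.Decimal` with a fixed number of places so the value carries an explicit scale instead of a float representation. Whether the driver binds `Decimal` as a SQL numeric/decimal type should be verified against your pyodbc version and column type.

```python
from decimal import Decimal


def build_params(values, current_dt, places=4):
    """Pair each value, fixed to `places` decimals, with one constant date."""
    quantum = Decimal(10) ** -places  # Decimal('0.0001') when places=4
    return [
        # str() first, so the Decimal starts from the float's short repr
        (Decimal(str(v)).quantize(quantum), current_dt)
        for v in values
    ]


# Example values; with pandas these could come from
# outputData['my_field_col'].tolist() (column name taken from the answer).
params = build_params([0.000123456, 2.5, 1234.56789], "2020-11-30")
formatted = [format(value, "f") for value, _ in params]
```

The resulting list of tuples has the shape `executemany` expects, so it could be passed as `updatecursor.executemany(updatesql, params)`.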

edited Nov 30, 2020 at 21:45

answered Nov 27, 2020 at 21:50

Parfait


6 Comments

nsivakr Over a year ago

Thanks. This sounds very promising. Will experiment and get back.

Parfait Over a year ago

Great to hear. Did the solution work? If not, what issues did you face?

nsivakr Over a year ago

How can I use placeholder substitution but still format the float values? If I don't format them, the values are written to the database in scientific notation.

nsivakr Over a year ago

I added my updated code, but it doesn't work. Please let me know how to fix it.

Parfait Over a year ago

See the edits: round is used to match the decimal places of the column type, and the date constant is assigned as a new data frame column.


Gord Thompson · Accepted Answer · 2020-11-27 23:04:40Z

2

Here's another way you could do it that would also take advantage of pyodbc's fast_executemany=True:

import sqlalchemy as sa

# …

print(outputData)  # DataFrame containing updates
"""console output:
   my_field_col my_date_col
0             0  1940-01-01
1             1  1941-01-01
2             2  1942-01-01
…
"""

engine = sa.create_engine(connection_uri, fast_executemany=True)

update_stmt = sa.text(
    f"UPDATE [{table_name}] SET [fieldname] = :my_field_col WHERE dt = :my_date_col"
)
update_data = outputData.to_dict(orient="records")
with engine.begin() as conn:
    conn.execute(update_stmt, update_data)
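
The shape of the data is what makes this work: `to_dict(orient="records")` yields one dict per row, and the dict keys match the named parameters in the statement. The sketch below illustrates that pairing using the standard-library sqlite3 module as a stand-in (it also accepts `:name` placeholders); the table `readings` is hypothetical, and the records are written out by hand to mirror the console output above.

```python
import sqlite3

# One dict per DataFrame row, as produced by to_dict(orient="records").
records = [
    {"my_field_col": 0, "my_date_col": "1940-01-01"},
    {"my_field_col": 1, "my_date_col": "1941-01-01"},
    {"my_field_col": 2, "my_date_col": "1942-01-01"},
]

# Hypothetical in-memory table with one row per date to be updated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (dt TEXT PRIMARY KEY, fieldname REAL)")
conn.executemany(
    "INSERT INTO readings (dt, fieldname) VALUES (?, NULL)",
    [(r["my_date_col"],) for r in records],
)

# Each dict supplies the values for :my_field_col and :my_date_col.
with conn:  # commits on success, rolls back on error
    conn.executemany(
        "UPDATE readings SET fieldname = :my_field_col WHERE dt = :my_date_col",
        records,
    )

rows = conn.execute("SELECT dt, fieldname FROM readings ORDER BY dt").fetchall()
```

Any extra keys in the dicts would need to be dropped or matched by a parameter, so it can help to select only the needed DataFrame columns before calling `to_dict`.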

answered Nov 27, 2020 at 23:04

Gord Thompson
