2

Following is the code in python that updates the records in the required database tables. Is there a better way to handle the same?

Read in SO that scanning dataframe row by row is a time consuming process. What is the better way to handle the same?

for index, row in outputData.iterrows():
    try:
        updatesql = " update table set [fieldname] = {0:f}   where dt = \'{1:s}\'" .format(fieldvalue , currentdt)
        updatecursor.execute(updatesql)
        sql_conn.commit();
except IOError as e:
            print ("({})".format(e))
            pass
        except (RuntimeError, TypeError, NameError) as e:
            print ("({})".format(e))
            pass

Based on the discussion below, made the changes but facing two problems.

 updatesql = " update table set [fieldname] = ? where dt = ?"  
 data = (outputData.reindex( ['fieldvalue'], currentDt,axis='columns').to_numpy())
 # EXECUTE QUERY AND BIND LIST OF TUPLES 
 updatecursor.executemany(updatesql, data.tolist()) 
 sql_conn.commit()

Problems a) Date is constant and not part of the OutputData dataframe. b) Float values are stored in scientific format. Prefer float values to be stored with precisions.

4
  • 2
    Before anything, stop using the modulo operator % for string formatting. This method has been de-emphasized in Python but not officially deprecated yet. Instead, use the preferred str.format (Python 2.6+) or the newer F-string (Python 3.6+). (And actually you should be using SQL parametefization anyway for this question). Commented Nov 27, 2020 at 21:37
  • @Parfait, Updated my code. Thanks Commented Nov 30, 2020 at 18:30
  • 2
    Using string formatting to insert data values into an SQL statement is still a practice to be discouraged. Also, looping through the DataFrame row-by-row with .execute() is less efficient than .executemany() (or the SQLAlchemy equivalent in my answer). Commented Nov 30, 2020 at 19:45
  • @GordThompson, Please look into the latest statement. I no longer loop through it. But still looking for a way to format float values. Commented Nov 30, 2020 at 19:52

2 Answers 2

2

Consider executemany to avoid the for-loop by using a numpy array output via DataFrame.to_numpy(). Below uses SQL parameterization and not any string formatting.

With iterrows + cursor.execute (to demonstrate parameterization)

# PREPARED STATEMENT (NO DATA)
updatesql = "UPDATE SET [fieldname] = ?  WHERE dt = ?"

for index, row in outputData.iterrows():
    try:
        # EXECUTE QUERY AND BIND TUPLE OF PARAMS
        updatecursor.execute(updatesql, (fieldvalue, currentdt))
    except:
        ...

sql_conn.commit()

With to_numpy + cursor.executemany

# PREPARED STATEMENT (NO DATA)
updatesql = "UPDATE SET [fieldname] = ?  WHERE dt = ?"

# ROUND TO SCALE OR HOW MANY DECIMAL POINTS OF COLUMN TYPE
outputData['my_field_col'] = outputData['my_field_col'].round(4)

# ADD A NEW COLUMN TO DATA FRAME EQUAL TO CONSTANT VALUE   
outputData['currentDt'] = currentDt
                        
# SUBSET DATA BY NEEDED COLUMNS CONVERT TO NUMPY ARRAY
data = (outputData.reindex(['my_field_col', 'currentDt'], axis='columns').to_numpy())

# EXECUTE QUERY AND BIND LIST OF TUPLES
updatecursor.executemany(updatesql, data.tolist())
sql_conn.commit()
Sign up to request clarification or add additional context in comments.

6 Comments

Thanks. This sounds very promising. Will experiment and get back.
Great to hear. Did solution work? If not, what issues did you face?
how to use in place substitution but format the float values? In other words, if I don't format float values, all kinds of scientific notation is updated in the database.
added my updated code but it doesn't work. Let me know, how to fix the same.
See edits using round to match decimal points of column type and assigning date constant as a new data frame column.
|
2

Here's another way you could do it that would also take advantage of pyodbc's fast_executemany=True:

import sqlalchemy as sa

# …

print(outputData)  # DataFrame containing updates
"""console output:
   my_field_col my_date_col
0             0  1940-01-01
1             1  1941-01-01
2             2  1942-01-01
…
"""

engine = sa.create_engine(connection_uri, fast_executemany=True)

update_stmt = sa.text(
    f"UPDATE [{table_name}] SET [fieldname] = :my_field_col WHERE dt = :my_date_col"
)
update_data = outputData.to_dict(orient="records")
with engine.begin() as conn:
    conn.execute(update_stmt, update_data)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.