
I am trying to push a data frame back to SQL Server, but I am having a hard time doing so. With the following code, I receive this error:

pyodbc.DataError: ('22008', '[22008] [Microsoft][ODBC SQL Server Driver]Exceeding the capacity of the field datetime (0) (SQLExecDirectW)')

Here's my code until now:

import pandas as pd
import pyodbc
import numpy as np

df = pd.read_excel(r'path.xlsx')

new_names = {"Calendar Date": "CALENDAR_DATE",
             "Origin ID": "ORIGIN_ID",
             "Dest ID": "DEST_ID",
             "Destination Name": "DESTINATION_NAME",
             "Destination City": "DESTINATION_CITY",
             "Destination State": "DESTINATION_STATE",
             "Carrier Name": "CARRIER_NAME",
             "Stop Number": "STOP_NUMBER",
             "Planned Arrival Time Start": "PLANNED_ARRIVAL_TIME_START",
             "Planned Arrival Time End": "PLANNED_ARRIVAL_TIME_END",
             "Delivery App't Time Start": "DELIVERY_APPT_TIME_START",
             "Delivery App't Time End": "DELIVERY_APPT_TIME_END",
             "Actual Delivery Departure Time": "ACTUAL_DELIVERY_DEPARTURE_TIME",
             "Reason Code and Description": "REASON_CODE_AND_DESCRIPTION",
             "Days Late Vs Plan": "DAYS_LATE_VS_PLAN",
             "Hrs Late Vs Plan": "HRS_LATE_VS_PLAN",
             "Days Late Vs Appt": "DAYS_LATE_VS_APPT",
             "Hrs Late Vs Appt": "HRS_LATE_VS_APPT"}

df.rename(columns=new_names, inplace=True)

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=xxx;'
                      'Database=Business_Planning;'
                      'UID="xxx";'
                      'PWD="xxx";'
                      'Trusted_Connection=yes;')
cursor = conn.cursor()

SQL_Query = pd.read_sql_query('SELECT * FROM Business_Planning.dbo.OTD_1_DELIVERY_TRACKING_F_IMPORT', conn)

df2 = pd.DataFrame(SQL_Query, columns=["CALENDAR_DATE", "ORIGIN_ID", "DEST_ID", "DESTINATION_NAME", "DESTINATION_CITY",
                                       "DESTINATION_STATE", "SCAC", "CARRIER_NAME", "SID", "STOP_NUMBER",
                                       "PLANNED_ARRIVAL_TIME_START", "PLANNED_ARRIVAL_TIME_END",
                                       "DELIVERY_APPT_TIME_START", "DELIVERY_APPT_TIME_END",
                                       "ACTUAL_DELIVERY_DEPARTURE_TIME", "REASON_CODE_AND_DESCRIPTION",
                                       "DAYS_LATE_VS_PLAN", "HRS_LATE_VS_PLAN", "DAYS_LATE_VS_APPT",
                                       "HRS_LATE_VS_APPT"])

df3 = pd.concat([df2, df]).drop_duplicates(["SID", "STOP_NUMBER", "PLANNED_ARRIVAL_TIME_START"],
                                           keep='last').sort_values(
                                           ["SID", "STOP_NUMBER", "PLANNED_ARRIVAL_TIME_START"])
df3['SID'].replace('', np.nan, inplace=True)
df3.dropna(subset=['SID'], inplace=True)

conn.execute('TRUNCATE TABLE Business_Planning.dbo.OTD_1_DELIVERY_TRACKING_F_IMPORT')

for index, row in df3.iterrows():
    conn.execute(
        "INSERT INTO OTD_1_DELIVERY_TRACKING_F_IMPORT([CALENDAR_DATE], [ORIGIN_ID], [DEST_ID], [DESTINATION_NAME], "
        "[DESTINATION_CITY], [DESTINATION_STATE], [SCAC], [CARRIER_NAME], [SID], [STOP_NUMBER], "
        "[PLANNED_ARRIVAL_TIME_START], [PLANNED_ARRIVAL_TIME_END], [DELIVERY_APPT_TIME_START], "
        "[DELIVERY_APPT_TIME_END], [ACTUAL_DELIVERY_DEPARTURE_TIME], [REASON_CODE_AND_DESCRIPTION], "
        "[DAYS_LATE_VS_PLAN], [HRS_LATE_VS_PLAN], [DAYS_LATE_VS_APPT], [HRS_LATE_VS_APPT]) "
        "values (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)",
        row['CALENDAR_DATE'],
        row['ORIGIN_ID'],
        row['DEST_ID'],
        row['DESTINATION_NAME'],
        row['DESTINATION_CITY'],
        row['DESTINATION_STATE'],
        row['SCAC'],
        row['CARRIER_NAME'],
        row['SID'],
        row['STOP_NUMBER'],
        row['PLANNED_ARRIVAL_TIME_START'],
        row['PLANNED_ARRIVAL_TIME_END'],
        row['DELIVERY_APPT_TIME_START'],
        row['DELIVERY_APPT_TIME_END'],
        row['ACTUAL_DELIVERY_DEPARTURE_TIME'],
        row['REASON_CODE_AND_DESCRIPTION'],
        row['DAYS_LATE_VS_PLAN'],
        row['HRS_LATE_VS_PLAN'],
        row['DAYS_LATE_VS_APPT'],
        row['HRS_LATE_VS_APPT'])
    conn.commit()
conn.commit()
conn.close()

The error is coming from the conn.execute(...) INSERT statement inside the for index, row in df3.iterrows() loop at the end.
The fields listed represent every column of df3. I can't seem to get it right; does anybody have a clue?
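For reference, SQLSTATE 22008 usually means a value outside what the target column's datetime type can store reached the driver: SQL Server's datetime only covers 1753-01-01 through 9999-12-31, and a pandas NaT (or a stray out-of-range Excel artifact) will also trip it. One way to guard against this before a row-by-row insert is to coerce such values to None, which pyodbc binds as SQL NULL. A sketch, with clean_datetimes as a hypothetical helper (column names and bounds are assumptions, not taken from the real table):

```python
import pandas as pd

SQL_DATETIME_MIN = pd.Timestamp("1753-01-01")  # lower bound of SQL Server's datetime type

def clean_datetimes(df, cols):
    """Coerce NaT and out-of-range timestamps to None so pyodbc binds them as NULL."""
    out = df.copy()
    for col in cols:
        # Unparseable values become NaT rather than raising
        s = pd.to_datetime(out[col], errors="coerce")
        # Anything before 1753-01-01 would overflow SQL Server datetime -> NaT
        s = s.where(s >= SQL_DATETIME_MIN)
        # pyodbc understands None, not NaT; keep object dtype so None survives
        out[col] = pd.Series(
            [ts.to_pydatetime() if pd.notna(ts) else None for ts in s],
            index=out.index, dtype=object)
    return out
```

Applied to df3 on the datetime columns (e.g. PLANNED_ARRIVAL_TIME_START) before the insert loop, this would turn every unbindable value into NULL instead of a driver error.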

1 Answer

I strongly advise you to use the pandas DataFrame.to_sql() method, and to connect to your database through SQLAlchemy rather than raw pyodbc. For example:

import pyodbc
import sqlalchemy

engine = sqlalchemy.create_engine(
    'mssql+pyodbc://{0}:{1}@{2}:1433/{3}?driver=ODBC+Driver+{4}+for+SQL+Server'
    .format(username, password, server, dbName, driverVersion))
df.to_sql("TableName", con=engine, if_exists="append")
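As a runnable sanity check of the to_sql call itself (an in-memory SQLite engine stands in for the real mssql+pyodbc one here; demo_table and its columns are made up for illustration):

```python
import pandas as pd
import sqlalchemy

# In-memory SQLite as a stand-in for the mssql+pyodbc engine above
engine = sqlalchemy.create_engine("sqlite://")
demo = pd.DataFrame({"SID": ["A1", "B2"], "STOP_NUMBER": [1, 2]})

# index=False keeps pandas from writing the DataFrame index as an extra column
demo.to_sql("demo_table", con=engine, if_exists="append", index=False)

cols = pd.read_sql("SELECT * FROM demo_table", engine).columns.tolist()
```

The same call works unchanged against any SQLAlchemy engine; only the create_engine URL differs.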

4 Comments

Good suggestion, but (1) don't hard-code port 1433, and (2) they probably want index=False too.
Yes! Good observation @GordThompson. Sorry for my inattention.
Thanks for the help. It seems to work, but I am getting an extremely long timeout error message (too long to post in a reply). Here's my code: engine = sqlalchemy.create_engine('mssql+pyodbc://{0}:{1}@{2}:1433/{3}?driver=ODBC+Driver+{4}+for+SQL+Server'.format("xxx", "xxx", "xxx", "Business_Planning", "13")) df3.to_sql("OTD_1_DELIVERY_TRACKING_F_IMPORT", con=engine, if_exists="append") I feel like this is because of the port. I'm not too familiar with my company's port setup; is there a way to use a default port, or no port at all, in that string? @GordThompson
1433 is the default port for a "default (unnamed) instance" of SQL Server. If the Server= value in your original connection string (for pyodbc) was of the form servername\instancename then you should not include a port number. The answer also assumes that you have "ODBC Driver xx for SQL Server" installed whereas your original connection string used the old "SQL Server" driver.
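One way to sidestep the port question entirely: the SQLAlchemy mssql+pyodbc dialect accepts a raw, URL-encoded ODBC connection string via the odbc_connect query parameter, so you can reuse whatever string already works with plain pyodbc. A sketch, with build_mssql_url as a hypothetical helper (shown for a trusted connection; adapt for UID/PWD):

```python
from urllib.parse import quote_plus

def build_mssql_url(server, database,
                    driver="ODBC Driver 17 for SQL Server"):
    """Build a SQLAlchemy URL by URL-encoding a raw ODBC connection string.

    Because the whole connection string is passed through via odbc_connect,
    a named instance (server\\instancename) needs no port number at all.
    """
    conn = (f"DRIVER={{{driver}}};SERVER={server};"
            f"DATABASE={database};Trusted_Connection=yes")
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(conn)
```

For example, build_mssql_url(r"servername\instancename", "Business_Planning") yields a URL you can hand straight to sqlalchemy.create_engine().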
