Problem:
When importing dataframe to SQL table, getting a truncation error:
sqlalchemy.exc.DataError: (pyodbc.DataError) ('22001', '[22001] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]String or binary data would be truncated. (8152) (SQLExecDirectW); [22001] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]The statement has been terminated. (3621)')
The column in question is ProjectDescription (see column list below). I've quoted the line that produces the error. I'm not entirely sure this is the source of the issue, but I think the column was originally created as VARCHAR(300). The value causing trouble is 307 characters long; 9 truncated as below. That leaves 298. Original size limit 300, 2 for 's, that leaves 298 I guess? Not sure but it's a strange length to be truncating to.
Code causing the issue:
dataframe.to_sql('tblImport', eng, schema='dbo', if_exists='append', index=False, dtype=dict_SQL_dtypes)
Attempted solutions:
- Setting TEXTSIZE in SQL
- Changing column type to VARCHAR / NVARCHAR, changing length to 1000 or MAX
- Setting datatypes (dtype) to
sqlalchemy.types.VARCHAR(idea from here) - Before import, sorting dataframe by descending length of ProjectDescription column to put longest value in top row
- Dropping tables and recreating the column as VARCHAR(MAX) as the data type in case the original length of the column was somehow still limiting the import
- Altering column type through SQL query directly
Versions:
- OS: Windows 10
- python: 3.6
- pandas: 0.24.1
- sqlalchemy: 1.2.17
- pyodbc: 4.0.25
- DB management tool: MSSMS
- SQL server 12
DB connection:
pyodbc.connect("DRIVER={{ODBC Driver 13 for SQL Server}};SERVER={0};DATABASE={1};UID={2};PWD={3}".format(serv, db, usr, pwd))
Engine:
create_engine('mssql+pyodbc://{}@sqlserver:{}@{}:1433/{}?driver=ODBC+Driver+13+for+SQL+Server'.format(usr, pwd, serv, db), echo=True)
ProjectDescription column character_maximum_length from DB schema: -1
Columns:
Column Name | Data Type | Allow Nulls
ProjectUID | varchar(60) | Unchecked
Framework | char(5) | Checked
Partner | char(3) | Checked
SCUID | char(9) | Checked
ClientName | varchar(255) | Checked
PartnerProjectNumber | varchar(40) | Checked
ProjectName | varchar(255) | Checked
ProjectDescription | varchar(MAX) | Checked
Traceback:
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\engine\base.py", line 1216, in _execute_context
cursor, statement, parameters, context
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\engine\default.py", line 533, in do_executemany
cursor.executemany(statement, parameters)
pyodbc.DataError: ('22001', '[22001] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]String or binary data would be truncated. (8152) (SQLExecDirectW); [22001] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]The statement has been terminated. (3621)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "run_for_tests.py", line 258, in <module>
outmsg = run()
File "run_for_tests.py", line 252, in run
process_result = process_data(file_type, file_name, relative_path, dict_clients)
File "run_for_tests.py", line 204, in process_data
update_sql_tables(df_base)
File "server_interaction.py", line 74, in update_sql_tables
dataframe.to_sql('tblImport', eng, schema='dbo', if_exists='append', index=False, dtype=dict_SQL_dtypes)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\core\generic.py", line 2532, in to_sql
dtype=dtype, method=method)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\io\sql.py", line 460, in to_sql
chunksize=chunksize, dtype=dtype, method=method)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\io\sql.py", line 1174, in to_sql
table.insert(chunksize, method=method)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\io\sql.py", line 686, in insert
exec_insert(conn, keys, chunk_iter)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\io\sql.py", line 599, in _execute_insert
conn.execute(self.table.insert(), data)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\engine\base.py", line 980, in execute
return meth(self, multiparams, params)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\sql\elements.py", line 273, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\engine\base.py", line 1099, in _execute_clauseelement
distilled_params,
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\engine\base.py", line 1240, in _execute_context
e, statement, parameters, cursor, context
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\engine\base.py", line 1458, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\util\compat.py", line 296, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\util\compat.py", line 276, in reraise
raise value.with_traceback(tb)
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\engine\base.py", line 1216, in _execute_context
cursor, statement, parameters, context
File "AppData\Local\Programs\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\sqlalchemy\engine\default.py", line 533, in do_executemany
cursor.executemany(statement, parameters)
sqlalchemy.exc.DataError: (pyodbc.DataError) ('22001', '[22001] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]String or binary data would be truncated. (8152) (SQLExecDirectW); [22001] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]The statement has been terminated. (3621)') [SQL: 'INSERT INTO dbo.[tblImport] ([ProjectUID], [Framework], [Partner], [SCUID], [ClientName], [PartnerProjectNumber], [ProjectName], [ProjectDescription]) VALUES (?, ?, ?, ?, ?, ?, ?, ?)'] [parameters: (('xxx', 'xxxxx', 'xxx', 'xxxxxxx', 'ABC', 123, 'ABC', 'Provide competition facilities. Build this facility with a capacity of 9000 spectators ... (9 characters truncated) ... m pool and diving facilities and athletes changing facilities to a competion standard and then, post games, convert it to a council operated facility'), ....)] (Background on this error at: http://sqlalche.me/e/9h9h)