I am unable to assign an unsigned int type when using .to_sql() to write my dataframe to a MySQL database. I can use the other int types, but I cannot get an unsigned one. A small, representative sample of what I am trying looks like this:
import pandas as pd
from sqlalchemy import create_engine
import sqlalchemy.types as sql_types
db_engine = create_engine('mysql://db_user:db_pass@db_host:db_port/db_schema')
d = {'id': [100,101,102], 'items': [6,10,20000], 'problems': [50,72,2147483649]} # Representative sample dictionary
df = pd.DataFrame(d).set_index('id')
This gives:
>>> df
     items    problems
id
100      6          50
101     10          72
102  20000  2147483649
I write to the database as follows:
df.to_sql('my_table',
          db_engine,
          flavor='mysql',
          if_exists='replace',
          index_label=['id'],
          dtype={'id': sql_types.SMALLINT,
                 'items': sql_types.INT,
                 'problems': sql_types.INT})
But what happens is that the value of problems in the last row (id == 102) gets truncated to 2147483647 (which is 2^31 - 1, the signed INT maximum) when written to the db.
There are no other issues with the connection or with writing other standard data types, including int. I could get away with using the sql_types.BIGINT option instead (raising the maximum to 2^63 - 1), but that would be unnecessary since I know my values will stay below 4294967296 (2^32), i.e. within the unsigned int maximum of 4294967295.
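For reference, the BIGINT workaround I would rather avoid is just a swap of the type in the dtype dict, something like this sketch:

# Workaround sketch: widen 'problems' to BIGINT (signed, max 2^63 - 1).
# This avoids the truncation but wastes range I know I will never need.
df.to_sql('my_table',
          db_engine,
          flavor='mysql',
          if_exists='replace',
          index_label=['id'],
          dtype={'id': sql_types.SMALLINT,
                 'items': sql_types.INT,
                 'problems': sql_types.BIGINT})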
So the question is: how can I assign an unsigned int field using the .to_sql() approach above?
I have used the sqlalchemy types from here.
The MySQL types I see are here.
I have seen the question here, which does get an unsigned int for MySQL, but it does not use the .to_sql() approach I would like. If I can create the table with a single .to_sql() statement, that would be ideal.
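One idea I have not been able to verify: since the dtype argument accepts SQLAlchemy type objects, perhaps the MySQL dialect-specific INTEGER (which takes an unsigned flag) can be passed in place of the generic sql_types.INT, along these lines (untested sketch, I do not know whether to_sql() honours dialect-specific types when it emits the CREATE TABLE):

from sqlalchemy.dialects import mysql

# Sketch only: pass the MySQL dialect-specific INTEGER with unsigned=True
# through the dtype mapping, hoping the column is created as INT UNSIGNED.
df.to_sql('my_table',
          db_engine,
          flavor='mysql',
          if_exists='replace',
          index_label=['id'],
          dtype={'id': sql_types.SMALLINT,
                 'items': sql_types.INT,
                 'problems': mysql.INTEGER(unsigned=True)})

If that works, it would keep everything in the single .to_sql() call I am after.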