Unsigned int for dataframe to_sql using sqlalchemy types

Question

I am unable to assign an unsigned int type when using .to_sql() to write my dataframe to a MySQL database. The other int types work, but I cannot get an unsigned one. A small representative sample of what I am trying looks like this:

import pandas as pd
from sqlalchemy import create_engine
import sqlalchemy.types as sql_types

db_engine = create_engine('mysql://db_user:db_pass@db_host:db_port/db_schema')

d = {'id': [100,101,102], 'items': [6,10,20000], 'problems': [50,72,2147483649]} # Representative sample dictionary
df = pd.DataFrame(d).set_index('id')

This gives:

>>> df
         items    problems
id
100          6          50
101         10          72
102      20000  2147483649

I write to the database as follows:

df.to_sql('my_table',
          db_engine,
          flavor='mysql',
          if_exists='replace',
          index_label=['id'],
          dtype={'id': sql_types.SMALLINT,
                 'items': sql_types.INT,
                 'problems': sql_types.INT})

But what happens is the value of problems in the last row (id==102) gets truncated to 2147483647 (which is 2^31-1) when written to the db.

There are no other issues with the connection or when writing other standard data types, including int. I could get away with using the sql_types.BIGINT option instead (making the maximum 2^63-1), but that would be unnecessary, as I know my values will stay below 4294967296 (2^32), i.e. within the unsigned int maximum of 4294967295 (2^32-1).
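To make the ranges concrete, here is a quick check in plain Python (no database involved). The constant names are only illustrative; the limits are those of a 4-byte INT and an 8-byte BIGINT column.

```python
# Upper limits for a 4-byte INT (signed and unsigned) and an 8-byte signed BIGINT.
SIGNED_INT_MAX = 2**31 - 1      # 2147483647
UNSIGNED_INT_MAX = 2**32 - 1    # 4294967295
SIGNED_BIGINT_MAX = 2**63 - 1   # 9223372036854775807

value = 2147483649  # the 'problems' value for id == 102

fits_signed = value <= SIGNED_INT_MAX           # too large for a signed INT
fits_unsigned = 0 <= value <= UNSIGNED_INT_MAX  # fits in an unsigned INT
print(fits_signed, fits_unsigned)  # False True
```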

So the question is: How can I assign an unsigned int field using the .to_sql() approach above?

I have used the sqlalchemy types from here. The MySQL types I see are here. I have seen the question here which does get the unsigned int for MySQL, but it is not using the .to_sql() approach I would like to use. If I can simply create the table from the single .to_sql() statement, that would be ideal.

joris · Accepted Answer · 2015-07-02 00:26:09Z


To get an unsigned int, pass unsigned=True to the constructor of the MySQL-specific INTEGER type in sqlalchemy (see the docs on the mysql types of sqlalchemy):

In [23]: from sqlalchemy.dialects import mysql

In [24]: mysql.INTEGER(unsigned=True)
Out[24]: INTEGER(unsigned=True)

So you can provide this to the dtype argument in to_sql instead of the more general sql_types.INT:

dtype={'problems': mysql.INTEGER(unsigned=True), ...}

Note: you need at least pandas 0.16.0 for this to work.
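As a rough sketch of how this fits the table from the question, the snippet below builds the full dtype dictionary and renders each type with the MySQL dialect, which lets you inspect the column types without a database connection. The choice of mysql.SMALLINT() and mysql.INTEGER() for the other two columns is an assumption for illustration, and the to_sql call is left commented out because it needs a live engine.

```python
from sqlalchemy.dialects import mysql

# Column types for the question's table; only 'problems' needs to be unsigned.
dtype = {
    'id': mysql.SMALLINT(),
    'items': mysql.INTEGER(),
    'problems': mysql.INTEGER(unsigned=True),
}

# Render each type as the MySQL dialect would emit it in CREATE TABLE.
rendered = {col: t.compile(dialect=mysql.dialect()) for col, t in dtype.items()}
for col, sql in rendered.items():
    print(col, '->', sql)

# With a live engine (not created here), the same dict goes to to_sql:
# df.to_sql('my_table', db_engine, if_exists='replace',
#           index_label='id', dtype=dtype)
```

The 'problems' entry should render as INTEGER UNSIGNED, which is the column type the question asks for.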

edited Jul 2, 2015 at 0:26

answered Jul 1, 2015 at 22:15


3 Comments

vk1011 Over a year ago

Thanks joris, this looks exactly like what I need! I am trying to run it though, but I run in to a type error: TypeError: issubclass() arg 1 must be a class. I have done pretty much exactly what you suggested. The import works fine. I have pasted the traceback here: pastebin.com/KHeSYAdr

joris Over a year ago

Ah, yes, that was a bug in 0.15 that prevented you from using such an instantiated class (which is what you get when providing a keyword), see github.com/pydata/pandas/pull/9138. You need 0.16 for that fix (or you can apply it yourself, as it is only a two-line fix; see the linked PR).

vk1011 Over a year ago

Perfect, it works now! I upgraded to Pandas 0.16 (that took a while), but now the dtype gets me the MySQL types I want! Thanks!
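The TypeError discussed in these comments can be reproduced without pandas or sqlalchemy: issubclass() only accepts classes, so passing an instance (which is what INTEGER(unsigned=True) creates) fails. The sketch below uses hypothetical stand-in classes and a hypothetical is_sql_type helper; it is not the code from the linked PR, only an illustration of a check that tolerates both classes and instances.

```python
class TypeEngine:
    """Hypothetical stand-in for a SQL type base class."""


class INTEGER(TypeEngine):
    """Hypothetical stand-in for a type that takes keyword arguments."""

    def __init__(self, unsigned=False):
        self.unsigned = unsigned


def is_sql_type(obj):
    # Accept either the class itself or an instance of it.
    if isinstance(obj, type):
        return issubclass(obj, TypeEngine)
    return isinstance(obj, TypeEngine)


# Calling issubclass() on an instance raises TypeError.
try:
    issubclass(INTEGER(unsigned=True), TypeEngine)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(is_sql_type(INTEGER), is_sql_type(INTEGER(unsigned=True)))
```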
