I am unable to assign an unsigned int type when using .to_sql() to write my DataFrame to a MySQL database. The other int types work, but I cannot get an unsigned one. A small representative sample of what I am trying looks like this:

import pandas as pd
from sqlalchemy import create_engine
import sqlalchemy.types as sql_types

db_engine = create_engine('mysql://db_user:db_pass@db_host:db_port/db_schema')

d = {'id': [100,101,102], 'items': [6,10,20000], 'problems': [50,72,2147483649]} # Representative sample dictionary
df = pd.DataFrame(d).set_index('id')

This gives:

>>> df
         items    problems
id
100          6          50
101         10          72
102      20000  2147483649

I write to the database as follows:

df.to_sql('my_table',
          db_engine,
          flavor='mysql',
          if_exists='replace',
          index_label=['id'],
          dtype={'id': sql_types.SMALLINT,
                 'items': sql_types.INT,
                 'problems': sql_types.INT})

But what happens is the value of problems in the last row (id==102) gets truncated to 2147483647 (which is 2^31-1) when written to the db.

There are no other issues with the connection or with writing other standard data types, including int. I could get away with using the sql_types.BIGINT option instead (making the maximum 2^63-1), but that would be unnecessary, as I know my values will stay below 4294967296 (2^32), i.e. within the unsigned int maximum of 4294967295 (2^32-1).
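The range arithmetic above can be checked with a few lines of plain Python. This is only an illustrative sketch: the helper names `int_range` and `fits` are made up for this example and are not part of pandas or SQLAlchemy.

```python
def int_range(bits, unsigned=False):
    """Return the (min, max) values representable by an integer of `bits` bits."""
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def fits(value, bits, unsigned=False):
    """Check whether `value` is within the range of the given integer width."""
    low, high = int_range(bits, unsigned)
    return low <= value <= high


value = 2147483649  # the 'problems' value for id == 102

print(int_range(32))                   # signed 32-bit range
print(int_range(32, unsigned=True))    # unsigned 32-bit range
print(fits(value, 32))                 # does it fit a signed INT?
print(fits(value, 32, unsigned=True))  # does it fit an unsigned INT?
```

The value exceeds the signed 32-bit maximum by 2 but sits well inside the unsigned 32-bit range, so an unsigned INT is sufficient and BIGINT is not needed.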

So the question is: How can I assign an unsigned int field using the .to_sql() approach above?

I have used the sqlalchemy types from here. The MySQL types I see are here. I have seen the question here which does get the unsigned int for MySQL, but it is not using the .to_sql() approach I would like to use. If I can simply create the table from the single .to_sql() statement, that would be ideal.

1 Answer

To get an unsigned int, you can specify this in the sqlalchemy constructor of the INTEGER type for mysql (see the docs on the mysql types of sqlalchemy):

In [23]: from sqlalchemy.dialects import mysql

In [24]: mysql.INTEGER(unsigned=True)
Out[24]: INTEGER(unsigned=True)

So you can provide this to the dtype argument in to_sql instead of the more general sql_types.INT:

dtype={'problems': mysql.INTEGER(unsigned=True), ...}

Note: you need at least pandas 0.16.0 to have this working.
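To see what such a type turns into without connecting to a database, one option is to compile a table definition against the MySQL dialect and inspect the DDL. The sketch below assumes only that SQLAlchemy is installed; the table and column names are reused from the question for illustration, and this is not the code path pandas itself runs.

```python
from sqlalchemy import Column, MetaData, Table
from sqlalchemy.dialects import mysql
from sqlalchemy.schema import CreateTable

metadata = MetaData()

# Hypothetical table mirroring the question's columns, defined only to
# render the CREATE TABLE statement; nothing is sent to a database.
my_table = Table(
    "my_table",
    metadata,
    Column("id", mysql.SMALLINT(unsigned=True)),
    Column("items", mysql.INTEGER(unsigned=True)),
    Column("problems", mysql.INTEGER(unsigned=True)),
)

# Compile the DDL for the MySQL dialect and print it for inspection.
ddl = str(CreateTable(my_table).compile(dialect=mysql.dialect()))
print(ddl)
```

The printed statement should declare the columns with the UNSIGNED modifier, which is the same column type that to_sql requests when the instantiated type is passed through dtype.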

3 Comments

Thanks joris, this looks exactly like what I need! When I try to run it, though, I run into a type error: TypeError: issubclass() arg 1 must be a class. I have done pretty much exactly what you suggested. The import works fine. I have pasted the traceback here: pastebin.com/KHeSYAdr
Ah, yes, that was a bug in 0.15 that prevented you from using such an instantiated class (which is what you get when providing a keyword), see github.com/pydata/pandas/pull/9138. You need 0.16 for that fix (or you can apply it yourself, as it is only a 2-line fix, see the linked PR).
Perfect, it works now! I upgraded to Pandas 0.16 (that took a while), but now the dtype gets me the MySQL types I want! Thanks!
