
In short: I want to dynamically convert all object columns to strings, in all my Pandas dataframes. I saw similar posts about converting a single column, but none of them deals with a dynamic conversion to strings.

I'm writing multiple JSON files to our SQL Server database, using Python 3.x. When I import the JSON files and store them in a Pandas DF, all strings are stored as object, since their length is unknown in advance. Hence, when I write the data to SQL Server, the chosen datatype is text rather than varchar(255).

Is there a way to dynamically convert all columns with dtype object to dtype string? The strings may be truncated to their first 255 characters.

I tried the following, but it made Python crash:

import pandas
import requests
import sqlalchemy

url = 'some-url-to-json-file'
connectionString = 'driver://user:pw@server/database'
engine = sqlalchemy.create_engine(connectionString)
response = requests.get(url)
pandasDF = pandas.DataFrame(response.json()['value'])

# Convert Objects to Strings
for cName in list(pandasDF.columns.values):
    if pandasDF[cName].dtype == 'object':
        # Note: this assigns the bound to_string method itself, not string values
        pandasDF[cName] = pandasDF[cName].to_string
pandasDF.to_sql(tableName, engine, if_exists='append')
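For comparison, a loop that actually converts and truncates the values on the pandas side might look like this (a sketch; the sample frame here is made up, and note that `.astype(str)` turns missing values into the literal string `'nan'`):

```python
import pandas as pd

# Hypothetical sample frame standing in for pandasDF from the question.
df = pd.DataFrame({"text": ["x" * 300, "short"], "num": [1, 2]})

# Convert every object column to real strings, truncated to 255 characters.
for col in df.select_dtypes(include=["object"]).columns:
    df[col] = df[col].astype(str).str[:255]
```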

2 Answers


You can use sqlalchemy String type instead of the default Text type after identifying the object columns present in the dataframe.

Use the dtype argument in to_sql and supply a dictionary mapping of those columns with the sqlalchemy.sql.sqltypes.String as shown:

from sqlalchemy.types import String

obj_cols = pandasDF.select_dtypes(include=[object]).columns.values.tolist()
pandasDF.to_sql(tableName, engine, if_exists='append', dtype={c: String for c in obj_cols})

This maps your object columns to the String SQL data type.
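Since the question asks for varchar(255) specifically, you can pass a length to `String`. A minimal sketch (the table and column names are made up, and an in-memory SQLite engine stands in here for the SQL Server engine):

```python
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.types import String

# Stand-ins for the question's engine and dataframe.
engine = create_engine("sqlite://")
pandasDF = pd.DataFrame({"name": ["alice", "bob"], "age": [30, 40]})

# Build a dtype mapping: every object column becomes VARCHAR(255).
obj_cols = pandasDF.select_dtypes(include=[object]).columns
dtype_map = {c: String(255) for c in obj_cols}

pandasDF.to_sql("people", engine, if_exists="append", index=False, dtype=dtype_map)
```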



There is no string type for pandas dataframes. The 'object' dtype is a catch-all for a variety of different types, such as strings, lists, or dictionaries. You need to first create your table in SQL Server with the specific types that you want, using the CREATE TABLE statement that SQL Server provides. You can do this all within Python through sqlalchemy. Pandas should then be able to append rows that match the types in the table that you created.
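A minimal sketch of that create-first approach with sqlalchemy Core (the table and column names are made up, and an in-memory SQLite engine stands in for SQL Server):

```python
import pandas as pd
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

# Stand-in engine; with SQL Server you would use your MSSQL connection string.
engine = create_engine("sqlite://")
metadata = MetaData()

# Define the table up front with exactly the column types you want.
people = Table(
    "people",
    metadata,
    Column("name", String(255)),
    Column("age", Integer),
)
metadata.create_all(engine)

# pandas can now append rows that match the predefined schema.
df = pd.DataFrame({"name": ["alice"], "age": [30]})
df.to_sql("people", engine, if_exists="append", index=False)
```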

Here is the section in the documentation about changing the mapping of pandas dtypes.

You can always override the default type by specifying the desired SQL type of any of the columns by using the dtype argument. This argument needs a dictionary mapping column names to SQLAlchemy types (or strings for the sqlite3 fallback mode). For example, specifying to use the sqlalchemy String type instead of the default Text type for string columns:

from sqlalchemy.types import String

data.to_sql('data_dtype', engine, dtype={'Col_1': String})

And here is the list of all the types in sqlalchemy.types:

'ARRAY', 'BIGINT', 'BINARY', 'BLOB', 'BOOLEAN', 'BigInteger', 'Binary', 'Boolean', 'CHAR', 'CLOB', 'Concatenable', 'DATE', 'DATETIME', 'DECIMAL', 'Date', 'DateTime', 'Enum', 'FLOAT', 'Float', 'INT', 'INTEGER', 'Indexable', 'Integer', 'Interval', 'JSON', 'LargeBinary', 'MatchType', 'NCHAR', 'NULLTYPE', 'NUMERIC', 'NVARCHAR', 'NullType', 'Numeric', 'PickleType', 'REAL', 'SMALLINT', 'STRINGTYPE', 'SchemaType', 'SmallInteger', 'String', 'TEXT', 'TIME', 'TIMESTAMP', 'Text', 'Time', 'TypeDecorator', 'TypeEngine', 'Unicode', 'UnicodeText', 'UserDefinedType', 'VARBINARY', 'VARCHAR', 'Variant', '_Binary'

4 Comments

You don't have to create the table, I guess. If it doesn't exist, pandas will create it. If it already exists, you can append.
Thanks for your quick reply. I'm struggling with the question of how I can create the table dynamically, given the structure of the underlying JSON file. Is there an easy way to create tables in MSSQL, using sqlalchemy, with the variables/data types given in the JSON file?
It indeed creates one, but not with the desired structure (text rather than varchars). Is there an easy way to choose varchars rather than text in SQL, using pandas' to_sql function?
@Dendrobates There is a schema argument in df.to_sql. Check that.
