Python Pandas Read SQL from IBM DB2 with non ASCII characters

Question

I am using pandas.read_sql in Python to execute SQL statements against IBM DB2 and store the results in pandas DataFrames. Now I am trying to execute an SQL statement containing a non-ASCII character, a letter from the Swedish alphabet, 'Å' (the others are Ä and Ö), but I am getting this error:

"DatabaseError: Execution failed on sql 'SELECT * FROM DATA_CONFIG WHERE TAG_NAME='Å'': ibm_db_dbi::Error: Error occure during processing of statement"

This is my code:

import ibm_db
import ibm_db_dbi
import pandas as pd

def sqlToFrame(sql): # Function for creating pandas dataframes from SQL-statements
    con = ibm_db.connect(connection_string, "", "")
    conn = ibm_db_dbi.Connection(con)
    return pd.read_sql(sql, conn)

df = sqlToFrame("SELECT * FROM DATA_CONFIG WHERE TAG_NAME='Å'")

I've tried executing the statement in the IDE in the IBM DB2 interface, which works perfectly fine. So I figure the problem is connected to how I establish the connection, or to the DB2 driver pandas uses. I have tried to find a way to set an encoding but can't find anything. How can I solve this? I also know this is possible, because another package that builds upon ibm_db accepts these characters. The characters are from the ISO-8859-1 character set.

Which version of Python? Which bitness? Which version of ibm_db and ibm_db_dbi? Also, what is the encoding of your target Db2 database?
– mao, Commented Feb 24, 2018 at 18:39
Python 2.7 with the latest version of those packages. The target DB2 database uses the default encoding @mao
– danielo, Commented Feb 24, 2018 at 18:47
Is it 32-bit Python or 64-bit Python? What operating system runs the Db2 server?
– mao, Commented Feb 24, 2018 at 18:49
Suggest you try with 64-bit Python 3.5 or higher. Seems ibm_db.c barfs when PyUnicode_FromObject() returns null or Py_None.
– mao, Commented Feb 24, 2018 at 19:13
Can you carefully check your attempted SQL? In the posted error (NOT the posted query) there is no closing single quote after the character. I passed your special character to a DB2 connection with Python 3.5 and no error was raised.
– Parfait, Commented Feb 24, 2018 at 21:38

Parfait · Accepted Answer · 2018-02-25 01:45:41Z

Consider parameterizing your query using the params argument of pandas.read_sql, and pass the accented character 'Å' with the u'' prefix so that the value is bound to the unquoted ? placeholder in the SQL query. Do note: params requires a sequence, so the code below passes a one-item tuple.

Unlike Python 2.x, all strings in Python 3.x are Unicode strings, so non-ASCII literals do not need the explicit u'...' prefix. That is why I cannot reproduce your issue in Python 3.5 when running a DB2 SQL query with accented characters.
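To illustrate the point about byte strings versus Unicode strings, here is a small Python 3 sketch. It is not taken from the asker's setup and does not claim to show what ibm_db does internally; it only shows that the same character has different byte representations, none of which is valid ASCII, which is the kind of data a plain Python 2 string literal carries. The helper name is_ascii is made up for this illustration.

```python
# Python 3 sketch: one character, several byte representations.
ch = "\u00c5"  # the letter 'Å'

utf8_bytes = ch.encode("utf-8")      # two bytes in UTF-8
latin1_bytes = ch.encode("latin-1")  # one byte in ISO-8859-1


def is_ascii(raw):
    """Return True if the bytes can be decoded with the ASCII codec."""
    try:
        raw.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_bytes, latin1_bytes)
print(is_ascii(utf8_bytes), is_ascii(latin1_bytes))
```

In Python 2, a plain 'Å' literal is one of these byte sequences (depending on the source file encoding), whereas u'Å' is the Unicode character itself.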

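# -*- coding: utf-8 -*-
# (Python 2 needs the encoding declaration above for non-ASCII literals in the source file)
import ibm_db
import ibm_db_dbi
import pandas as pd

# Function for creating pandas dataframes from SQL-statements
def sqlToFrame(sql, params=None):
    # connection_string is assumed to be defined elsewhere, as in the question
    db = ibm_db.connect(connection_string, "", "")
    con = ibm_db_dbi.Connection(db)

    # Values in params are bound to the ? placeholders in the SQL text
    return pd.read_sql(sql, con, params=params)


# params must be a sequence, hence the one-item tuple
df = sqlToFrame("SELECT * FROM DATA_CONFIG WHERE TAG_NAME = ?", params=(u'Å',))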
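For readers without a DB2 server at hand, the binding technique can be tried out with the standard library alone. This is a minimal sketch, assuming an in-memory SQLite database as a stand-in for DB2; the TAG_VALUE column and the fetch_by_tag helper are hypothetical names, not part of the original schema. SQLite's driver uses the same ? placeholder style.

```python
import sqlite3

# In-memory SQLite database as a stand-in for DB2 (assumption: no DB2 server here).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE DATA_CONFIG (TAG_NAME TEXT, TAG_VALUE INTEGER)")
con.executemany(
    "INSERT INTO DATA_CONFIG VALUES (?, ?)",
    [("\u00c5", 1), ("\u00c4", 2), ("\u00d6", 3)],  # Å, Ä, Ö
)


def fetch_by_tag(connection, tag):
    """Bind tag to the ? placeholder instead of embedding it in the SQL text."""
    cur = connection.execute(
        "SELECT TAG_NAME, TAG_VALUE FROM DATA_CONFIG WHERE TAG_NAME = ?",
        (tag,),  # params must be a sequence, so pass a one-item tuple
    )
    return cur.fetchall()


rows = fetch_by_tag(con, "\u00c5")
print(rows)
```

Because the value travels separately from the SQL text, the non-ASCII character never has to be quoted or escaped inside the statement itself.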


1 Comment

danielo Over a year ago

Thank you for the answer. I actually didn’t try your solution for 2.7. I thought ibm_db wasn’t supported in Python 3, so I upgraded my Python version instead which solved the problem. Thanks a lot for the help! @Parfait
