Python and IBM DB2: UnicodeDecodeError

Question

I'm getting this error message

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 38: ordinal not in range(128)

when I try to execute any sql query in Python, like this one:

>>> import ibm_db
>>> conn = ibm_db.connect("sample","root","root")
>>> ibm_db.exec_immediate(conn, "select * from act")

I checked default encoding and it seems to be 'utf8':

>>> import sys
>>> sys.getdefaultencoding()
'utf-8'
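
A caveat on this check: on Python 3, sys.getdefaultencoding() always reports 'utf-8', whatever the Windows code page is, so it says little about the encoding of bytes handed back by a native driver. A minimal sketch for comparing it with the locale's preferred encoding follows (the variable names are illustrative, and the second value varies by machine):

```python
import locale
import sys

# Always 'utf-8' on Python 3: this is the interpreter's own default,
# not the code page used by the operating system or by a C extension.
default_encoding = sys.getdefaultencoding()

# Derived from the OS locale. On a Windows system with a Russian locale
# this is typically 'cp1251', but the value depends on the machine.
preferred_encoding = locale.getpreferredencoding(False)

print("default:  ", default_encoding)
print("preferred:", preferred_encoding)
```

If the two values differ, text produced under the locale encoding cannot be assumed to be valid UTF-8 or ASCII.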

I also know about this thread, where people discuss quite a similar problem. One piece of advice given there is:

Have you applied the required database PTFs (SI57014 and SI57015 for 7.1 and SI57146 and SI57147 for 7.2)? They are included as a distreq, so they should have been in the order with your PTFs, but won't be automatically applied.

However, I do not know what a database PTF is or how to apply one. I need help.

PS. I'm using Windows 10.

EDIT

This is how I get my error message:

>>> print(ibm_db.stmt_errormsg())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 38: ordinal not in range(128)

But when I run the same query "select * from act" in the DB2 CLP, it works fine. Here is the driver information, which I got by running this code in Python:

# client_info() returns an object describing the driver, or False on failure.
# `conn` is the connection opened earlier with ibm_db.connect().
client = ibm_db.client_info(conn)
if client:
    print("DRIVER_NAME: string(%d) \"%s\"" % (len(client.DRIVER_NAME), client.DRIVER_NAME))
    print("DRIVER_VER: string(%d) \"%s\"" % (len(client.DRIVER_VER), client.DRIVER_VER))
    print("DATA_SOURCE_NAME: string(%d) \"%s\"" % (len(client.DATA_SOURCE_NAME), client.DATA_SOURCE_NAME))
    print("DRIVER_ODBC_VER: string(%d) \"%s\"" % (len(client.DRIVER_ODBC_VER), client.DRIVER_ODBC_VER))
    print("ODBC_VER: string(%d) \"%s\"" % (len(client.ODBC_VER), client.ODBC_VER))
    print("ODBC_SQL_CONFORMANCE: string(%d) \"%s\"" % (len(client.ODBC_SQL_CONFORMANCE), client.ODBC_SQL_CONFORMANCE))
    print("APPL_CODEPAGE: int(%s)" % client.APPL_CODEPAGE)
    print("CONN_CODEPAGE: int(%s)" % client.CONN_CODEPAGE)
    ibm_db.close(conn)
else:
    print("Error.")

it prints:

DRIVER_NAME: string(10) "DB2CLI.DLL"
DRIVER_VER: string(10) "10.05.0007"
DATA_SOURCE_NAME: string(6) "SAMPLE"
DRIVER_ODBC_VER: string(5) "03.51"
ODBC_VER: string(10) "03.01.0000"
ODBC_SQL_CONFORMANCE: string(8) "EXTENDED"
APPL_CODEPAGE: int(1251)
CONN_CODEPAGE: int(1208)
True

EDIT

I also tried this:

>>> cnx = ibm_db.connect("sample","root","root")
>>> query = "select * from act"
>>> query.encode('ascii')
b'select * from act'
>>> ibm_db.exec_immediate(cnx, query)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
Exception
>>> print(ibm_db.stmt_errormsg())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 38: ordinal not in range(128)

As you can see, in this case I also get the very same error message.

SUMMARY

Below are all my attempts:

C:\Windows\system32>chcp
Active code page: 65001

C:\Windows\system32>python
Python 3.4.4 (v3.4.4:737efcadf5a6, Dec 20 2015, 20:20:57) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import ibm_db
>>> cnx = ibm_db.connect("sample","root","root")
>>> ibm_db.exec_immediate(cnx, "select * from act")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception
>>> print(ibm_db.stmt_errormsg())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 38: ordinal not in range(128)
>>> ibm_db.exec_immediate(cnx, b"select * from act")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception: statement must be a string or unicode
>>> query = "select * from act"
>>> query = query.encode()
>>> ibm_db.exec_immediate(cnx, query)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception: statement must be a string or unicode
>>> ibm_db.exec_immediate(cnx, "select * from act").decode('cp-1251')
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
Exception

What platform and version of DB2? The PTFs and versions (7.1 & 7.2) apply to DB2 for IBM i.
– Charles, Commented May 23, 2016 at 17:20
What is your database configuration? Try a get db cfg when connected to the db to obtain that information.
– data_henrik, Commented May 24, 2016 at 8:24
When I do get db cfg, I get a very long list of information. In this list, for example, I see that the default database encoding is UTF-8. By the way, I should add that I can work with the database in the console: I can connect to a database instance and execute simple queries. The whole problem is with the Python driver.
– Jacobian, Commented May 24, 2016 at 10:19
I should add that when I run this query in the DB2 prompt, everything is ok.
– Jacobian, Commented May 26, 2016 at 20:03

Peter Brittain · Accepted Answer · 2016-06-01 09:14:21Z

2

What you have here is an incompatibility between your client code (ibm_db) and the DB2 server. As you can see in the client code, the logic for your query is basically:

1. Extract and check the parameters passed in (lines 4873 to 4918).
2. Allocate native objects for the query (up to 4954).
3. Do the query and decode the results (the rest of the function).

Based on our investigations so far, you know that the data you're passing in for the query is well-formed (and so it is not step 1). Looking at the error paths in step 2, you'd see simple error messages explaining these failures. You're therefore failing in step 3.

You are getting an empty Exception raised on the query and when you try to get the details of the error you get another Unicode decoding Exception. This looks like either a bug in ibm_db or a configuration error that means your DB2 installation is not compatible. So how can we find out which...?

As flagged elsewhere, the issue is fundamentally to do with codepages. The ibm_db code basically interprets all strings as ASCII: it converts them with StringOBJ_FromASCII, which maps down to calls into Python APIs that insist on receiving ASCII characters and will throw Unicode exceptions otherwise.
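
The failure mode can be reproduced in pure Python without ibm_db. The sketch below uses a made-up 39-byte buffer (not the real driver message) whose last byte is 0xc8, and assumes, based on APPL_CODEPAGE 1251, that the real text is Windows-1251:

```python
# Hypothetical stand-in for the driver's message: 38 ASCII bytes followed
# by 0xc8, so the offending byte sits at index 38 as in the traceback.
raw = b"x" * 38 + b"\xc8"

try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    # Same wording as the error in the question.
    ascii_error = str(exc)

# Decoding with the assumed code page succeeds; 0xc8 is the Cyrillic
# capital letter I (U+0418) in Windows-1251.
decoded = raw.decode("cp1251")

# A lossy fallback that never raises: bad bytes become U+FFFD.
lossy = raw.decode("ascii", errors="replace")

print(ascii_error)
print(repr(decoded[-1]), repr(lossy[-1]))
```

This suggests the bytes are most likely legitimate localized text that is simply being decoded with the wrong codec.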

Based on your diagnostics, you could try to prove or disprove this by installing or configuring both your systems (client and DB2 server) to use US English. This should get you past the codepage incompatibility so you can find the real error here.

If the query is really going out over the network, you might just get a network trace that shows the response coming back from the server. However, based on the fact that you saw nothing in the logs, I'm not convinced this will bear any fruit.

Failing that you need to patch the ibm_db code to handle non-ASCII content - either by raising a bug report with the maintainer or trying it yourself (if you know how to build and debug C extensions).
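
Short of patching the C extension, a defensive wrapper may at least recover some diagnostic information. The sketch below is an assumption-laden illustration: describe_failure is a hypothetical helper, and it assumes that the SQLSTATE returned by ibm_db.stmt_error() is plain ASCII and therefore survives where the full message does not, which has not been verified for this setup.

```python
def describe_failure(get_message, get_sqlstate):
    """Return the driver's error text, or a fallback if it cannot be decoded.

    Both arguments are zero-argument callables (hypothetical interface),
    so the helper can be exercised without a database connection.
    """
    try:
        return get_message()
    except UnicodeDecodeError as exc:
        # The message itself is undecodable; report the SQLSTATE (assumed
        # to be ASCII) together with the byte that triggered the failure.
        bad_bytes = exc.object[exc.start:exc.end]
        return "SQLSTATE=%s (message undecodable, bad bytes: %r)" % (
            get_sqlstate(),
            bad_bytes,
        )
```

With ibm_db installed, this would be called as describe_failure(ibm_db.stmt_errormsg, ibm_db.stmt_error) right after the failing exec_immediate call.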

edited Jun 1, 2016 at 9:14

answered Jun 1, 2016 at 9:08

Peter Brittain


4 Comments

Jacobian Over a year ago

Thanks for trying to help. But now it seems like ibm_db library is simply not compatible with Python 3. Have you ever tried to connect to DB2 from Python 3 code?

Peter Brittain Over a year ago

I've not personally used it, but the ibm_db library is compatible with Python 3. This is clearly documented at pypi.python.org/pypi/ibm_db and tallies with the fact that the library is written for Python 2 and 3. Your issue is not Python 3. I suspect, however, that the maintainers have only ever used it on a US English system.

Peter Brittain Over a year ago

One final thought... You might be better off raising an issue against the github project.

Jacobian Over a year ago

Thanks, Peter! I will probably do it.

score 1 · Accepted Answer · 2016-06-01 08:19:15Z

1

The problem is that the DB2 server is returning CP-1251 (also known as Windows-1251) text, as evidenced by APPL_CODEPAGE: int(1251) in your config output. Python (specifically, the interactive Python REPL) is expecting either UTF-8 or ASCII output, so this causes issues.

The solution is to do:

ibm_db.exec_immediate(conn, "select * from act").decode('cp-1251')

Additionally, you need to make sure that your terminal's text encoding is set to UTF-8. Details on changing that setting will depend on the specific terminal that you are using. Since you have said you are using cmd, the appropriate command is chcp 65001.

edited Jun 1, 2016 at 8:19

answered Jun 1, 2016 at 7:52

user2508324

10 Comments

Jacobian Over a year ago

Thanks! I will check it in a minute!

Jacobian Over a year ago

I wish it worked, but it is not working. When I run ibm_db.exec_immediate(cnx, b"select * from act") I get this error message: Exception: statement must be a string or unicode

Jacobian Over a year ago

Doing query = query.encode() and then "passing the query string to ibm_db.exec_immediate" also does not work. And the error message once again is Exception: statement must be a string or unicode

user2508324 Over a year ago

Try ibm_db.exec_immediate(cnx, "select * from act").decode('cp-1251').

Jacobian Over a year ago

In this case I once again get my very first error message: 'ascii' codec can't decode byte 0xc8 in position 38: ordinal not in range(128)


mquantin · Accepted Answer · 2016-05-31 08:19:37Z

0

In cases like this, where a UTF-8 environment meets something that requires ASCII, I use the decode method.

'ascii' codec can't decode byte 0xc8

All right, that is expected: this is not an ASCII string but a UTF-8 one, so you should decode it with the UTF-8 encoding.

...  
query.decode('utf8')  
ibm_db.exec_immediate(cnx, query)

After that you may need to re-encode the results to write or print them.
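
Note that this applies to Python 2 byte strings. On Python 3, which the question uses, str is already Unicode text and has no decode method; only bytes objects do. A minimal sketch of the distinction, using the query from the question:

```python
query = "select * from act"

# On Python 3, str is Unicode text and offers encode() but not decode().
has_decode = hasattr(query, "decode")

# encode() turns text into bytes; decode() turns bytes back into text.
encoded = query.encode("utf-8")
round_trip = encoded.decode("utf-8")

print(has_decode, type(encoded).__name__, round_trip == query)
```

Since ibm_db rejects bytes ("statement must be a string or unicode"), the statement should be passed as a plain str on Python 3.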

answered May 31, 2016 at 8:19

mquantin


2 Comments

Jacobian Over a year ago

I will check it in a minute.

Jacobian Over a year ago

It's not working. query.decode('utf8') results in AttributeError: 'str' object has no attribute 'decode'
