I'm using SQLAlchemy Core to execute string-based queries. I have set the charset to utf8mb4 in the connection string like this:
"mysql+mysqldb://{user}:{password}@{host}:{port}/{db}?charset=utf8mb4"
For some simple select queries (e.g., select name from users where id=XXX limit 1), when the result set contains certain Unicode characters (e.g., ', ì), the query fails with the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9a in position 11: invalid start byte
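For reference, the same message can be produced without a database. This is only a sketch with made-up bytes, not the actual column value. 0x9a is a UTF-8 continuation byte, so it can never start a character, which suggests that whatever is being decoded is not UTF-8 at that point. The cp1252 decode is just one guess at what the bytes might be instead.

```python
# Made-up sample: 11 ASCII bytes followed by the offending byte 0x9a,
# so the bad byte sits at position 11 as in the traceback.
raw = b"hello world" + b"\x9a"

try:
    raw.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Assumption for illustration only: if the bytes were really cp1252,
# 0x9a would decode to a valid character instead of failing.
as_cp1252 = raw.decode("cp1252")
print(as_cp1252)
```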
But the error is not reliably reproducible. When I run the same query from a Python shell, it works without errors, but it fails during a web request or background job.
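To compare the two environments, a check like the following could be run both in the shell and inside a web request or job. It is a sketch under assumptions: session_charsets is a hypothetical helper, and conn is assumed to be a SQLAlchemy 1.3 Connection, or anything else with an execute() method that accepts a raw SQL string and yields (name, value) rows.

```python
# The three session variables that control how MySQL encodes and decodes
# text exchanged with this particular connection.
CHARSET_QUERY = (
    "SHOW SESSION VARIABLES WHERE Variable_name IN "
    "('character_set_client', 'character_set_connection', "
    "'character_set_results')"
)


def session_charsets(conn):
    """Return the connection's session charset variables as a dict."""
    rows = conn.execute(CHARSET_QUERY)
    return {name: value for name, value in rows}


# Usage (hypothetical engine object):
#   with engine.connect() as conn:
#       print(session_charsets(conn))
```

If the values printed in the shell differ from those printed in the web process, that would point to the two environments not negotiating the same connection charset.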
I'm using Python 3.8 and SQLAlchemy 1.3.24.
I have also tried explicitly specifying charset: utf8mb4 in the connect_args passed to create_engine().
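Roughly, the two ways of setting the charset look like this. It is a sketch only: engine_settings is a hypothetical helper, and the credentials in the usage comment are placeholders. The create_engine() call is shown as a comment because it needs SQLAlchemy and the MySQLdb driver installed.

```python
def engine_settings(user, password, host, port, db):
    """Build the URL and connect_args; both carry the charset setting."""
    # Charset set in the URL query string.
    url = (
        f"mysql+mysqldb://{user}:{password}@{host}:{port}/{db}"
        "?charset=utf8mb4"
    )
    # Charset also passed straight to the driver's connect() call.
    connect_args = {"charset": "utf8mb4"}
    return url, connect_args


# Usage (placeholder credentials):
#   from sqlalchemy import create_engine
#   url, connect_args = engine_settings("app", "secret", "db-host", 3306, "mydb")
#   engine = create_engine(url, connect_args=connect_args)
```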
The underlying database is MySQL 5.7, and all the Unicode columns have utf8mb4 explicitly set as the character set in the schema.
Update: The database is actually AWS RDS Aurora MySQL.
I'd appreciate any insight into the error or how to reproduce it reliably.