3

I am using MySql and Python 3.6/3.8 with mysqlclient to maintain a database with a binary field. The code was constructed before my time in Python 2.7 to insert directly into field with binary data. For example, the table looks like

+-------------------------+--------------------------------------------------------------------+------+-----+------------+-------+
| Field                   | Type                                                               | Null | Key | Default    | Extra |
+-------------------------+--------------------------------------------------------------------+------+-----+------------+-------+
| license_id              | varchar(36)                                                        | NO   | PRI | NULL       |       |
| data                    | binary(32)                                                         | NO   | UNI | NULL       |       |
+-------------------------+--------------------------------------------------------------------+------+-----+------------+-------+

and the insert command looked like:

 dbconn.execute("""INSERT into datatable (
                            license_id,
                            data)
                        VALUES ("{:s}", "{:s}")
                    """.format(
                        str(self._license_id),                        
                        escape_string(self._data)))

and this would result in Python 2.7 inserting the binary data directly in the command like

 INSERT into datatable (license_id, data) VALUES ("XXX-XXX-52", "������C�...\r�x�b24�B")

but in Python 3 the data is a byte array type and cannot be inserted into a string and attempting to decode it results in:

escape_string(self._auth_hash).decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 5: invalid continuation byte

Other questions seem to indicate a viable path to fix this is to convert the binary to hex (here, for example) but is there any way to insert into a Python 3 unicode string without errors as the original code did?

It appears by removing the quotes around "{:s}" and using binascii.b2a_hex(self._data) would have the same intent, but it's not clear this will have the exact same behavior or if there will be problem. I expect this to be a recurring problem and am looking for the solution with the least side effects possible.

1 Answer 1

2

This can be done by letting the database connector handle the escaping automatically instead of using a combination of manual escaping and string formatting.

Given these bytes (Python 2.7 will accept the b"" form as str):

bs = b"\xec\xa6\x8fH~\x83\\\xfeoc\x92\xf9\xad\xe6'\xc6\x1a\x00j\x99$x\xd9\xda\xef\x0c\x080rm\xe6\xcd"

The code in the question and this Python 3 code will insert the same value into the data column.

conn = MySQLdb.connect(db='test')
cur = conn.cursor()
cur.execute(
        """INSERT INTO datatable (license_id, data) VALUES (%s, %s)""",
        ('python3', bs))
conn.commit()
conn.close()
Sign up to request clarification or add additional context in comments.

2 Comments

How do we go about the same for update query? I have a update query like this: query = """UPDATE files SET file = '%s' WHERE id = %d""" % (self.file, self.id) Where the self.file is a bytes data type
Probably by not using string formatting - try query = """UPDATE files SET file = '%s' WHERE id = %s""", (self.file, self.id)) . In general, don't uses string formatting to set values in SQL queries, it's error prone and dangerous.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.