MySQL export to CSV file as UTF-8 via Python script

Question

I can export a MySQL table to a CSV file with Python's csv module, but non-ASCII characters are lost (for example, I get ???? instead of ąöę).

The table data is stored as UTF-8 (phpMyAdmin displays it correctly).

I read that in Python all data should be decoded from UTF-8 and then encoded back to UTF-8 when writing the CSV, for example with UnicodeWriter (because the native csv module in Python 2 doesn't handle Unicode correctly).

I tried a lot, but with no success.

Question: Is there an example Python script that exports a UTF-8 MySQL database to a UTF-8 CSV file?

I use Ubuntu 14.04, where mysql.connector gives me problems, so I use MySQLdb with Gord Thompson's code:

# -*- coding: utf-8 -*-
import csv
import MySQLdb
from UnicodeSupportForCsv import UnicodeWriter
import sys
reload(sys)  
sys.setdefaultencoding('utf8')
#sys.setdefaultencoding('Cp1252')

conn = MySQLdb.Connection(db='sampledb', host='localhost',
                          user='sampleuser', passwd='samplepass')

crsr = conn.cursor()
crsr.execute("SELECT * FROM rfid")
with open(r'test.csv', 'wb') as csvfile:
    uw = UnicodeWriter(
        csvfile, delimiter=',',
        quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for row in crsr.fetchall():
        uw.writerow([unicode(col) for col in row])

The error still occurs: UnicodeDecodeError: 'utf8' codec can't decode byte 0xf3 in position 2: invalid continuation byte

Have you tried using the UnicodeWriter class shown at the very bottom of the documentation page for the csv module? I've used it with Python 2.7 and it worked fine for me.
– Gord Thompson, Commented Jan 4, 2016 at 21:49
Thanks for the quick reply, Gord Thompson. I tried UnicodeWriter, but with no success with MySQL. It seems it needs some function to decode the UTF-8 SQL query result before writing to CSV. Could you tell me if you are using this class with a MySQL UTF-8 DB?
– AvS, Commented Jan 4, 2016 at 21:56

Remi Guan · Accepted Answer · 2016-01-06 12:40:14Z

3

MySQL is good at converting between character sets, but you need to tell it which character set to use for the connection.

By default, the connection returns data in the client's default character set (often latin1), which is not necessarily the one the table uses. Add the required charset to the connection:

conn = MySQLdb.Connection(db='sampledb', host='localhost',
                          user='sampleuser', passwd='samplepass', charset='utf8')
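
To illustrate why a wrong connection charset leads to exactly the error in the question: byte 0xf3 is "ó" in Latin-1, but in UTF-8 it can only start a multi-byte sequence. The following is a minimal sketch, assuming the connection fell back to latin1. It is written for Python 3 (not the Python 2.7 used in this thread), and the sample word is made up for illustration.

```python
# Hypothetical sample value containing "ó" (U+00F3); not taken from the rfid table.
text = u"kr\u00f3l"

latin1_bytes = text.encode("latin-1")  # "ó" becomes the single byte 0xf3
utf8_bytes = text.encode("utf-8")      # "ó" becomes the two bytes 0xc3 0xb3

# Decoding Latin-1 bytes as UTF-8 fails, because 0xf3 announces a
# multi-byte sequence and the following byte is not a continuation byte.
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError as exc:
    decode_failed = True
    failing_position = exc.start
    message = str(exc)

# Decoding with the encoding the bytes were actually written in works.
round_trip = utf8_bytes.decode("utf-8")
```

With charset set on the connection, the bytes that arrive in Python are already UTF-8, so the decode step no longer fails.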

Is this helpful?

edited Jan 6, 2016 at 12:40

Remi Guan


answered Jan 6, 2016 at 9:20

Prikkeldraad


Gord Thompson · Accepted Answer · 2016-01-04 23:30:59Z

This works for me with Python 2.7.5 and MySQL Connector/Python 2.0.4:

# -*- coding: utf-8 -*-
import csv
import mysql.connector
from UnicodeSupportForCsv import UnicodeWriter

conn = mysql.connector.connect(
    host='localhost', port=3307,
    user='root', password='whatever',
    database='mydb')
crsr = conn.cursor()
crsr.execute("SELECT * FROM vocabulary")
with open(r'C:\Users\Gord\Desktop\test.csv', 'wb') as csvfile:
    uw = UnicodeWriter(
        csvfile, delimiter=',',
        quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for row in crsr.fetchall():
        uw.writerow([unicode(col) for col in row])

The UnicodeWriter class is taken directly from the last example on the documentation page for the csv module, which I stored in a file named "UnicodeSupportForCsv.py":

import csv, codecs, cStringIO

class UTF8Recoder:
    """
    Iterator that reads an encoded stream and reencodes the input to UTF-8
    """
    def __init__(self, f, encoding):
        self.reader = codecs.getreader(encoding)(f)

    def __iter__(self):
        return self

    def next(self):
        return self.reader.next().encode("utf-8")

class UnicodeReader:
    """
    A CSV reader which will iterate over lines in the CSV file "f",
    which is encoded in the given encoding.
    """

    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        f = UTF8Recoder(f, encoding)
        self.reader = csv.reader(f, dialect=dialect, **kwds)

    def next(self):
        row = self.reader.next()
        return [unicode(s, "utf-8") for s in row]

    def __iter__(self):
        return self

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """

    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)
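
For readers on Python 3 (an assumption; this answer targets Python 2.7), the UnicodeWriter helper should not be needed, because the csv module there works on text and the file object handles the encoding. A minimal sketch, where write_rows_utf8 and sample_rows are hypothetical names standing in for the export step and for crsr.fetchall():

```python
import csv
import os
import tempfile


def write_rows_utf8(path, rows):
    """Write an iterable of row tuples to `path` as UTF-8 encoded CSV."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as csvfile:
        writer = csv.writer(
            csvfile, delimiter=",",
            quotechar='"', quoting=csv.QUOTE_MINIMAL)
        writer.writerows(rows)


# Hypothetical rows standing in for crsr.fetchall()
sample_rows = [(1, "ąöę"), (2, "zażółć")]

# Write to a temporary file and read the raw bytes back for inspection.
fd, tmp_path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    write_rows_utf8(tmp_path, sample_rows)
    with open(tmp_path, "rb") as f:
        written_bytes = f.read()
finally:
    os.remove(tmp_path)
```

The database connection still has to deliver correctly decoded text for this to help; the sketch only covers the CSV side.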


AvS · Accepted Answer · 2016-01-06 09:44:16Z

0

Finally it works! Thanks to Gord Thompson and Prikkeldraad. Thanks, guys!

# -*- coding: utf-8 -*-
import csv
import MySQLdb
from UnicodeSupportForCsv import UnicodeWriter
import sys
reload(sys)  
sys.setdefaultencoding('utf8')
#sys.setdefaultencoding('Cp1252')

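# charset='utf8' makes MySQL send UTF-8; use_unicode=0 returns raw byte strings,
# which unicode(col) below decodes using the default encoding set above.
conn = MySQLdb.Connection(db='testdb', host='localhost', user='testuser',
                          passwd='testpasswd', use_unicode=0, charset='utf8')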

crsr = conn.cursor()
crsr.execute("SELECT * FROM rfid")

with open(r'test.csv', 'wb') as csvfile:
    uw = UnicodeWriter(
        csvfile, delimiter=',',quotechar='"', quoting=csv.QUOTE_MINIMAL)

    for row in crsr.fetchall():
        uw.writerow([unicode(col) for col in row])
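
A possible alternative to the reload(sys) / sys.setdefaultencoding('utf8') lines is to decode byte strings explicitly before writing them. The helper below is only a sketch: to_text is a hypothetical name, and it assumes the connection delivers UTF-8 bytes (as with charset='utf8' and use_unicode=0).

```python
# Unicode text type: `unicode` on Python 2, `str` on Python 3.
text_type = type(u"")


def to_text(value, encoding="utf-8"):
    """Convert a database value to Unicode text without relying on
    the interpreter's default encoding."""
    if isinstance(value, bytes):
        # Byte strings from the driver: decode with the connection charset.
        return value.decode(encoding)
    if isinstance(value, text_type):
        return value
    # Numbers, dates, None, etc.: use their text representation.
    return text_type(value)


# Hypothetical row: an integer id and UTF-8 encoded bytes for "ąöę".
sample_row = (7, b"\xc4\x85\xc3\xb6\xc4\x99")
converted = [to_text(col) for col in sample_row]
```

In the loop above, this would replace unicode(col) with to_text(col).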

answered Jan 6, 2016 at 9:44

AvS


2 Comments

Louis Barranqueiro Over a year ago

A code block alone does not provide a good answer. Please add explanations (why it solves the issue, where the mistake was, etc.).

Remi Guan Over a year ago

@LouisBarranqueiro: Nah, this is a Thanks comment posted by OP. Please check Gord's answer.

Murali Mopuru · Accepted Answer · 2016-01-06 12:10:03Z

0

Try this one; it should make things easier for you:

https://github.com/jdunck/python-unicodecsv

unicodecsv is a drop-in replacement for Python 2.7's csv module that supports Unicode strings without hassle. Supported versions are Python 2.6, 2.7, 3.3, 3.4, 3.5, and PyPy 2.4.0.

>>> import unicodecsv as csv
>>> from io import BytesIO
>>> f = BytesIO()
>>> w = csv.writer(f, encoding='utf-8')
>>> _ = w.writerow((u'é', u'ñ'))
>>> _ = f.seek(0)
>>> r = csv.reader(f, encoding='utf-8')
>>> next(r) == [u'é', u'ñ']
True

edited Jan 6, 2016 at 12:10

answered Jan 6, 2016 at 9:11

Murali Mopuru


6 Comments

Remi Guan Over a year ago

While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review

Murali Mopuru Over a year ago

@Kevin, Now you are happy!?

Kyll Over a year ago

Hey, thanks for the edit. It definitely improved your answer! Now people can actually start coding right away when reading your post rather than having to go to another website. Thanks!

Remi Guan Over a year ago

@MuraliMopuru: Probably, yes. And I didn't downvote. However, this is now a real answer. Thanks for the edit.

Remi Guan Over a year ago

@MuraliMopuru: However, your edit only added an example of the module's basic usage. What about adding working code based on OP's question and program? I think that would be more helpful.
