I get an error while exporting a UTF-8 CSV file in Python. The error says:

AttributeError: 'int' object has no attribute 'encode'

First, I use pyodbc to connect to a Microsoft Access database and fetch the data:

import pyodbc

MDB = "E:/Research/2000-01.mdb"
DRV = '{Microsoft Access Driver (*.mdb)}'
PWD = 'pw'
con = pyodbc.connect('DRIVER={};DBQ={};PWD={}'.format(DRV,MDB,PWD))
cur = con.cursor()
SQL = 'SELECT * FROM 200001;'
rows = cur.execute(SQL).fetchall()
cur.close()
con.close()

Then I use this class:

import codecs
import csv
import cStringIO

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """

    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)

Then I write the UTF-8 CSV file:

with open("E:/Research/200001.txt", 'wb') as f:
    writer = UnicodeWriter(f)
    writer.writerows(rows)

An example row in rows is:

(577540, u'1', datetime.datetime(2000, 1, 1, 0, 0), u'85411000', u'53', u'4403944851', u'44039', u'10', u'116', u'110', u'4', u'01', 89956, 0.15575717389583588, u'\u5916\u5546\u72ec\u8d44\u4f01\u4e1a', u'\u5c71\u7279\u7535\u5b50 (\u6df1\u5733) \u6709\u9650\u516c\u53f8', u'\u6df1\u5733\u5b9d\u5b8972\u533a\u5b9d\u77f3\u8def\u53f7', u'755 27757943', u'', u'518101', u'', u'\u90d1\u66fc\u5a1c', u'\u4e8c\u6781\u7ba1\uff0c\u4f46\u5149\u654f\u4e8c\u6781\u7ba1\u6216\u53d1\u5149\u4e8c\u6781\u7ba1\u9664\u5916', u'\u5e7f\u4e1c\u7701\u6df1\u5733', u'\u65e5\u672c', u'\u6df1\u5733\u6d77\u5173', u'\u4e00\u822c\u8d38\u6613', u'\u6c7d\u8f66\u8fd0\u8f93', u'\u4e2a/\u5957', u'\u9999\u6e2f', u'\u8fdb\u53e3') 

It looks like each row contains some integers, floats and datetime values in addition to the unicode strings. Any idea how to solve this problem? Thanks a lot!
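The error can be reproduced without the database, because the list comprehension in writerow calls .encode on every cell. A minimal sketch, using a shortened two-cell row; the function name encode_cells is made up for illustration:

```python
# Shortened stand-in for one database row: an int followed by a text cell.
row = (577540, u'1')

def encode_cells(row):
    # Same expression as in UnicodeWriter.writerow: assumes every cell is text.
    return [s.encode("utf-8") for s in row]

try:
    encode_cells(row)
except AttributeError as exc:
    # The int in the first column has no .encode method.
    message = str(exc)
    print(message)
```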

1 Answer

You probably need to do something like this first, just before writer.writerows:

rows = [[unicode(x) for x in row] for row in rows]

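Or, my guess is that it's failing when it tries to write the database row ID, which is an int. You could also try slicing that off, although the sample row contains other non-string values (a datetime, another int and a float) that would still fail, so the conversion above is the safer fix: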

rows = [row[1:] for row in rows]
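The first suggestion can also be written as a small helper, which makes it easier to decide how special values are handled. This is only a sketch: the names text_type, to_text and normalize_rows are made up here, mapping None (a database NULL) to an empty string is an assumption about what the file should contain, and byte strings are assumed to be UTF-8.

```python
import datetime
import sys

# The text type is `unicode` on Python 2 and `str` on Python 3.
text_type = unicode if sys.version_info[0] == 2 else str  # noqa: F821

def to_text(value):
    """Convert one database cell to text (hypothetical helper)."""
    if value is None:
        # Assumption: write NULLs as empty cells rather than the text "None".
        return text_type("")
    if isinstance(value, text_type):
        return value
    if isinstance(value, bytes):
        # Assumption: byte strings are UTF-8 encoded.
        return value.decode("utf-8")
    # ints, floats, datetimes and anything else: use their default text form.
    return text_type(value)

def normalize_rows(rows):
    """Return the rows with every cell converted to text."""
    return [[to_text(cell) for cell in row] for row in rows]

sample = [(577540, u'1', datetime.datetime(2000, 1, 1, 0, 0), None, u'\u65e5\u672c')]
clean = normalize_rows(sample)
```

With this in place, writer.writerows(normalize_rows(rows)) should only ever hand text to s.encode("utf-8") in writerow.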
