Reading csv with unicodecsv: UnicodeDecodeError

Question

I have these lines of code:

zf = zipfile.ZipFile(self.temp_file, 'r')
data = zf.open('myfile.csv', mode='r')
result = [link for link in unicodecsv.DictReader(data)]

And here's the exception:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 13: invalid continuation byte

Input string is:

CAFÉ RESTAURANT

So what am I doing wrong, and why can't unicodecsv handle UTF-8?

Because the content is not UTF-8.

– Antti Haapala, Commented Apr 6, 2015 at 19:05

Antti Haapala · Accepted Answer · 2015-04-06 19:09:06Z

3

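This happens because your input is not UTF-8 but Latin-1 (or a similar single-byte encoding). In UTF-8, É is encoded as two bytes: '\xc3\x89'. The error reports that the byte \xc9 was encountered in the input; that is how É is encoded in the Latin-1 and Windows-1252 code pages. In UTF-8, \xc9 would start a two-byte sequence, but the byte that follows it here is not a valid continuation byte, hence the message.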
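As a rough illustration of the point above, here is a minimal Python 3 sketch that uses only the standard library (csv, io, zipfile) rather than unicodecsv. It assumes the file really is Latin-1; the helper name read_rows, the column names, and the in-memory zip are made up for the demonstration. With unicodecsv itself, the equivalent change should be passing encoding='latin-1' to DictReader instead of relying on its UTF-8 default.

```python
import csv
import io
import zipfile

# The same character in the two encodings discussed above.
assert "É".encode("utf-8") == b"\xc3\x89"
assert "É".encode("latin-1") == b"\xc9"


def read_rows(zip_source, member="myfile.csv", encoding="latin-1"):
    """Return the rows of a CSV member of a zip archive as dicts.

    `read_rows` is a hypothetical helper; the `encoding` default is an
    assumption about the data, not something the archive tells us.
    """
    with zipfile.ZipFile(zip_source, "r") as zf:
        with zf.open(member, mode="r") as raw:
            # Decode the raw bytes explicitly instead of assuming UTF-8.
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            return list(csv.DictReader(text))


# Build a small in-memory archive whose CSV is Latin-1 encoded
# (the header names are invented for this example).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(
        "myfile.csv",
        "name,city\nCAFÉ RESTAURANT,Paris\n".encode("latin-1"),
    )

buf.seek(0)
rows = read_rows(buf)  # decodes cleanly as Latin-1
```

Calling read_rows(buf, encoding="utf-8") on the same archive raises UnicodeDecodeError, which mirrors the failure in the question. If the file came from a Windows tool, "cp1252" may be a better guess than "latin-1".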
