Pandas read_csv() raises a UnicodeDecodeError on some specific rows.
If I use nrows=n1 it works without any error, but when I use nrows=n2 (> n1) it raises
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 12: invalid start byte
It worked fine before, but at some point it started raising this error consistently. Sometimes it works again after I reboot the computer, but only on the first call.
I tried read_csv both with and without the encoding option. I also tried error_bad_lines=False.
This is driving me crazy. Any ideas? Even if this turns out to be a system issue, I would at least like to know how to find the row number of the problematic row.
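One way to find the problematic row without pandas is to read the file in binary mode and try decoding each line yourself. This is a minimal sketch; find_bad_rows is a hypothetical helper name, and the path you pass in would be your exported CSV:

```python
def find_bad_rows(path, encoding="utf-8"):
    """Return (row_number, error) pairs for lines that fail to decode."""
    bad = []
    with open(path, "rb") as f:  # binary mode: no decoding happens on read
        for i, line in enumerate(f, start=1):
            try:
                line.decode(encoding)
            except UnicodeDecodeError as e:
                bad.append((i, e))
    return bad
```

Calling find_bad_rows("table.csv") would then list every row whose bytes are not valid UTF-8, along with the exact error for each.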
(I exported the table from MATLAB with the encoding specified as utf-8; I also tried CP949, which is my system's default encoding. Importing from SAS was successful.)
You can pass an explicit encoding to read_csv. To find out which one your file actually uses, try chardet.detect, or any text editor able to read your file and tell you what encoding it uses, or one of the many online tools that let you detect your encoding... As a last resort, encoding='latin1' in read_csv will accept any byte ;)
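The suggestion above can be sketched as a small helper. This assumes the third-party chardet package is installed (pip install chardet); read_csv_detecting_encoding is a hypothetical name, and sampling only the first chunk of the file keeps detection fast on large exports:

```python
import chardet
import pandas as pd

def read_csv_detecting_encoding(path, sample_bytes=100_000):
    """Guess the file's encoding with chardet, then load it with pandas.

    Falls back to latin1, which maps every possible byte to a character,
    so read_csv never raises a UnicodeDecodeError (though characters may
    be wrong if the guess is bad).
    """
    with open(path, "rb") as f:
        guess = chardet.detect(f.read(sample_bytes))
    enc = guess["encoding"] or "latin1"
    return pd.read_csv(path, encoding=enc), enc
```

Since the error mentions byte 0xb0, which is a valid lead byte in CP949/EUC-KR, it is plausible the file was actually written in the system's default Korean encoding rather than UTF-8, and detection may report exactly that.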