I am getting a parser error from the pandas library and am not sure what the issue could be.

Traceback (most recent call last):
  File "C:/2020/python-nifi/test.py", line 4, in <module>
    df = pd.read_csv("C:\\2020\\test\\sum.csv", '\t')
  File "C:\2020\python-nifi\venv\lib\site-packages\pandas\io\parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\2020\python-nifi\venv\lib\site-packages\pandas\io\parsers.py", line 454, in _read
    data = parser.read(nrows)
  File "C:\2020\python-nifi\venv\lib\site-packages\pandas\io\parsers.py", line 1133, in read
    ret = self._engine.read(nrows)
  File "C:\2020\python-nifi\venv\lib\site-packages\pandas\io\parsers.py", line 2037, in read
    data = self._reader.read(nrows)
  File "pandas\_libs\parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read
  File "pandas\_libs\parsers.pyx", line 875, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas\_libs\parsers.pyx", line 929, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 916, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas\_libs\parsers.pyx", line 2071, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 5, saw 4



import pandas as pd

df = pd.read_csv("C:\\2020\\test\\sum.csv", sep='\t')
print(df)

The file I am trying to read is:

[screenshot of the file; image not available]

  • The error is printed right there: "Expected 1 fields in line 5, saw 4". We could assist better if you shared a sample of your data. Commented Apr 29, 2020 at 8:56
  • If you want, you can skip erroneous lines using: pandas.read_csv(fileName, sep='delimiter', error_bad_lines=False) Commented Apr 29, 2020 at 8:57
  • @Cavin Dsouza, add the screenshot Commented Apr 29, 2020 at 9:01
  • @narendra-choudhary, do you mean remove or escape the pipe '|' in the cell? Commented Apr 29, 2020 at 9:02
  • After adding error_bad_lines=False, I got a slightly different error. Commented Apr 29, 2020 at 9:05
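
Before skipping lines, it can help to see which lines actually have an unexpected number of fields. The following is only a rough sketch using the standard-library csv module; the function name find_inconsistent_rows and the sample text are made up for illustration and are not taken from the asker's file.

```python
import csv
import io


def find_inconsistent_rows(text, delimiter="\t"):
    """Return (line_number, field_count) for rows whose field count
    differs from that of the first row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    expected = None
    mismatches = []
    for line_number, row in enumerate(reader, start=1):
        if expected is None:
            # The first row defines the expected number of fields
            expected = len(row)
        elif len(row) != expected:
            mismatches.append((line_number, len(row)))
    return mismatches


# Hypothetical sample: four single-field lines, then one line
# with four tab-separated fields
sample = "header\na\nb\nc\nw\tx\ty\tz\n"
print(find_inconsistent_rows(sample))
```

For a real file, the same loop can be run over an open file handle instead of io.StringIO. Note also that in pandas 1.3 and later, error_bad_lines is deprecated and on_bad_lines='skip' is its replacement.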

2 Answers

What if you use df = pd.read_csv("filename", sep='[:,|_]', engine='python')? You can specify multiple separators on import this way.
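
As a small illustration of the idea, here is a sketch that reads in-memory text where two different delimiters are mixed. The sample data is invented for the example; only the sep and engine arguments come from the suggestion above.

```python
import io

import pandas as pd

# Hypothetical sample data that mixes '|' and ',' as delimiters
sample = "a|b,c\n1|2,3\n4,5|6\n"

# A regular-expression separator needs the Python parsing engine;
# the character class matches any one of ':', ',', '|' or '_'
df = pd.read_csv(io.StringIO(sample), sep="[:,|_]", engine="python")
print(df)
```

Keep in mind that every character in the class is treated as a delimiter, so a value that legitimately contains an underscore or a colon would be split as well.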

1 Comment

Thanks Karel, the issue was a corrupted CSV file.

This error can be caused by an encoding problem.

Try this:

df = pd.read_csv('filename', encoding='utf-8')
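
If the encoding really is in doubt, one possible approach is to try a short list of candidate encodings in turn. This is only a sketch: the helper name read_csv_with_fallback, the candidate list, and the sample bytes are assumptions for illustration. A wrong encoding usually shows up as a UnicodeDecodeError rather than as the field-count ParserError in the question, so this may not be the cause here.

```python
import io

import pandas as pd


def read_csv_with_fallback(raw_bytes, encodings=("utf-8", "latin-1"), **kwargs):
    """Try each candidate encoding in order and return the first
    DataFrame that parses without a decoding error."""
    for enc in encodings:
        try:
            return pd.read_csv(io.BytesIO(raw_bytes), encoding=enc, **kwargs)
        except UnicodeDecodeError:
            # This encoding cannot decode the bytes; try the next one
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical sample: Latin-1 encoded bytes that are not valid UTF-8
raw = "name,city\nJosé,Málaga\n".encode("latin-1")
df = read_csv_with_fallback(raw)
print(df)
```

Because latin-1 can decode any byte sequence, it works as a last resort but may silently produce wrong characters if the file was written in some other encoding.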
