1

I have a CSV file where columns are separated using a non-standard symbol (||/).

df = pd.read_csv('data_analyst_assignment.csv',sep='||/', engine='python')

This throws an error:

ParserError: Expected 61 fields in line 3, saw 68. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

How can I read this file?

5
  • Without even knowing the data structure, how can one diagnose the problem? Commented Aug 16, 2020 at 20:44
  • 3
    Try \|\|/ (escape the pipes) and tell us how it goes. Commented Aug 16, 2020 at 20:45
  • 2
    It looks like you have 61 columns, but line 3 has 68 values. If you could share a sample of the dataset, that would help diagnose the problem. Commented Aug 16, 2020 at 20:47
  • Thank you azro, it worked! So, @ipj, you do not need to know the data structure, huh? Commented Aug 16, 2020 at 21:01
  • I've posted it as an answer, then ;) Commented Aug 17, 2020 at 6:15

1 Answer

4

From the pandas.read_csv() documentation:

sep:str, default ‘,’ : Delimiter to use. ... In addition, separators longer than 1 character and different from '\s+' will be interpreted as regular expressions and will also force the use of the Python parsing engine.

And | is a special character in regex syntax (it means OR), so you need to escape it:

df = pd.read_csv('data_analyst_assignment.csv', sep=r'\|\|/', engine='python')
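
If you would rather not escape the delimiter by hand, a minimal sketch along these lines may help. It uses only the standard re module to show why the unescaped pattern misbehaves and how re.escape builds a literal-matching pattern. The sample row and the helper name split_fields are made up for illustration; they are not from the original file.

```python
import re

DELIMITER = "||/"  # the literal separator from the question

# re.escape backslash-escapes regex metacharacters such as "|",
# so the resulting pattern matches the delimiter text literally.
SEP_PATTERN = re.escape(DELIMITER)


def split_fields(line, pattern=SEP_PATTERN):
    """Split one line of text on a regex pattern (hypothetical helper)."""
    return re.split(pattern, line)


sample = "id||/name||/city"  # made-up sample row

# Escaped pattern: splits on the literal "||/" only.
print(split_fields(sample))

# Unescaped pattern: "|" means OR, so "||/" reads as
# "empty OR empty OR /" and the fields are not split as intended.
print(split_fields(sample, "||/"))
```

The same escaped string should be usable as the separator, e.g. sep=re.escape('||/') together with engine='python', since read_csv treats multi-character separators as regular expressions.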