
I have been working for hours on loading a CSV file into Python using the well-known pd.read_csv('..').

However, there is a problem:

Error message: Error tokenizing data. C error: Expected 3991 fields in line 14, saw 4572

And yes, my code itself contains no mistakes.

The CSV looks like this:

{"_id":{"$oid":"5cf683d88eb9ad12c84f6469"},"ID":"22991137","name":"M. Lundströ 

Maybe the problem occurs because MongoDB uses strict BSON formats, but honestly, I do not know anything about that.

Does anyone have a solution?

  • Sorry, the error code is this: Error tokenizing data. C error: Expected 3991 fields in line 14, saw 4572 Commented Jul 5, 2019 at 19:07
  • Have you checked what is wrong in line 14? Did you use mongoexport? Also, give this a try: pd.read_csv(filename, delimiter=",", encoding='utf-8') Commented Jul 5, 2019 at 19:18
  • I think you shouldn't try to load a JSON file using read_csv. Have you already tried pandas.io.json.json_normalize? See pandas.pydata.org/pandas-docs/version/0.17.0/generated/… (a sketch of this idea follows after these comments). Commented Jul 5, 2019 at 20:51
  • The point is that in a JSON file, fields that are not filled are normally omitted, while in a CSV file they have to be present; also, the order of fields plays no role in JSON, while in CSV it is important. Commented Jul 5, 2019 at 20:53
  • Are you able to use something like pastebin to provide a link to the whole file? Commented Jul 7, 2019 at 12:56
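
A minimal sketch of the json_normalize idea from the comments above, assuming the export (here a hypothetical export.json) contains one MongoDB document per line; json_normalize flattens nested fields such as "_id": {"$oid": ...} into columns like _id.$oid:

import json
import pandas as pd

records = []
with open('export.json', 'r', encoding='utf-8') as f:  # hypothetical filename
    for line in f:
        line = line.strip()
        if line:
            records.append(json.loads(line))

# Recent pandas exposes this as pd.json_normalize; older versions as pandas.io.json.json_normalize.
df = pd.json_normalize(records)
print(df.head())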

1 Answer


You can use pd.read_csv() only on an actual CSV file. However, the format looks like invalid JSON to me (the braces are not closed).

You need to export it like this for MongoDB:

mongoexport --db dbname --collection col --type=csv --fields _id,field1,field2 --out outfile.csv
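
Once the export succeeds, the resulting file should load cleanly; a minimal sketch, assuming the outfile.csv produced by the command above:

import pandas as pd

# outfile.csv is the file written by the mongoexport command above.
df = pd.read_csv('outfile.csv')
print(df.head())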

EDIT:

If you want to read the JSON file only, you can read it like this:

import json

# Load the whole file as a single JSON document.
with open('filepath', 'r', encoding='utf-8') as f:
    data = json.load(f)
    print(data)
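
If json.load fails because the export actually contains one JSON document per line (which mongoexport --type=json typically produces), pandas can read that directly; a minimal sketch, assuming the same 'filepath' as above:

import pandas as pd

# lines=True treats each line of the file as a separate JSON document;
# nested fields such as {"$oid": ...} stay as Python dicts in the resulting columns.
df = pd.read_json('filepath', lines=True)
print(df.head())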