I have downloaded a sample dataset from here that is a series of JSON objects.

{...}
{...}

I need to load them into a pandas DataFrame. I tried the code below:

import pandas as pd
import json

filename = "sample-S2-records"

df = pd.DataFrame.from_records(map(json.loads, "sample-S2-records"))

But it fails with a parsing error:

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

What am I missing?

  • Rather than from_records, have you tried the built-in pd.read_json() function? It may handle the formatting more easily. Commented Oct 2, 2018 at 21:35

2 Answers

You can try the pandas.read_json method with lines=True, which parses one JSON object per line:

import pandas as pd
data = pd.read_json('/path/to/file.json', lines=True)
print(data)

I tested it with this file and it works fine.
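
To see what lines=True does without needing the file, here is a minimal sketch that reads from an in-memory buffer instead. The two records and their field names are made up for illustration and are not taken from the sample dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for the file: one JSON object per line
sample = io.StringIO(
    '{"id": "a1", "year": 2015}\n'
    '{"id": "b2", "year": 2017}\n'
)

# lines=True tells pandas to parse each line as a separate record
df = pd.read_json(sample, lines=True)
print(df.shape)
```

Each line becomes one row and each key becomes a column, so the real file should load the same way as long as every line holds one complete JSON object.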

The function needs a list of JSON objects, for example data = [json_obj_1, json_obj_2, ...].

The file has no list syntax; it is just a series of JSON objects, one per line. The following solves the issue:

import io
import json

import pandas as pd

# Load content to a variable
with open('../sample-S2-records/sample-S2-records', 'r') as content_file:
    content = content_file.read().strip()

# Split content by new line
content = content.split('\n')

# Parse each line, which holds one JSON object, and collect the results
json_list = []
for each_line in content:
    json_list.append(json.loads(each_line))

# Serialize the list as a JSON array and load it; wrapping the string in
# StringIO avoids passing a literal JSON string, which newer pandas deprecates
df = pd.read_json(io.StringIO(json.dumps(json_list)))
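
As for why the original attempt raised the error: map(json.loads, "sample-S2-records") iterates over the characters of the filename string, not over the lines of the file, so json.loads is first called on "s". The sketch below reproduces that and shows a variant that skips the json.dumps round trip by passing the parsed objects straight to from_records. The helper name load_json_lines and the two sample records are hypothetical.

```python
import json

import pandas as pd


def load_json_lines(lines):
    """Parse an iterable of JSON strings (one object each) into a DataFrame."""
    records = [json.loads(line) for line in lines if line.strip()]
    return pd.DataFrame.from_records(records)


# Iterating over a str yields single characters, so json.loads sees "s" first
try:
    list(map(json.loads, "sample-S2-records"))
except json.JSONDecodeError as err:
    error_message = str(err)

# Hypothetical two-line stand-in for the file contents
df = load_json_lines(['{"id": "a1", "year": 2015}', '{"id": "b2", "year": 2017}'])
```

With the real file, the open file handle can be passed directly, since iterating over it yields one line at a time: open the file in a with block and call load_json_lines(content_file).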
