I have a large JSON file that contains two lists of JSON objects, one directly after the other.

example data:

data.json

[{"a":1}][{"b":2}]

parser.py

import json

message = json.load(open("data.json"))

for m in message:
    print m

As expected, I get a ValueError:

File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 369, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 10 - line 1 column 19 (char 9 - 18)

I thought of splitting the file by tracking the character count. What would be the pythonic way to handle this issue?

1 Answer

You could use json.JSONDecoder.raw_decode(), which parses one complete JSON value starting at a given index and returns it together with the index at which it ended. Calling it repeatedly with the returned index lets you iterate through each value in turn:

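# JSONDecodeError exists in Python 3.5+; on Python 2.7 catch ValueError instead
from json import JSONDecoder, JSONDecodeError

decoder = JSONDecoder()
data = '[{"a":1}][{"b":2}]'

pos = 0
while True:
    try:
        # raw_decode returns (parsed value, index just past that value)
        o, pos = decoder.raw_decode(data, pos)
        print(o)
    except JSONDecodeError:
        # Raised once pos reaches the end of data (and also on malformed input)
        break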

Result:

[{'a': 1}]
[{'b': 2}]
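
The loop above stops on any decode error, so malformed input looks the same as the normal end of the data, and raw_decode() does not skip whitespace before a value, so a newline between the two lists would also end the loop early. A minimal sketch of a variant that handles both, assuming the whole text fits in memory (the helper name iter_json_values is made up for this illustration and is not part of the json module):

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found in text, in order.

    Hypothetical helper for illustration; not part of the json module.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # Malformed input raises json.JSONDecodeError (a ValueError subclass)
        # instead of silently ending the iteration.
        value, pos = decoder.raw_decode(text, pos)
        yield value


for value in iter_json_values('[{"a":1}]\n[{"b":2}]'):
    print(value)
```

For the file in the question, you would pass the result of reading data.json (for example f.read() inside a with open("data.json") as f: block). For a file too large to hold in memory, a chunked or streaming approach would be needed instead.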