
I have this json file

{"tstp":1383173780727,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14},{"nb":903,"state":"open","freebk":2,"freebs":18}]}{"tstp":1383173852184,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14}]}

I want to take all the values of nb inside the first tstp only, and stop when reaching the next tstp.

What I am trying to do is to create a file for each tstp, and that file should have nb, state, freebk, freebs as its columns.

First time asking a question here...

  • That's multiple JSON objects in a single file so afaik it's not valid JSON in aggregate. Is each object guaranteed to be on a separate line of the input file? Commented Feb 20, 2022 at 16:52
  • Yes each object is on a new line Commented Feb 20, 2022 at 17:01
  • OK, then you can read the file using readlines to get a list of lines of text, and then parse each line from JSON using json.loads. Commented Feb 20, 2022 at 17:07
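As the comments suggest, when each object sits on its own line the file can be parsed line by line. A minimal sketch, using the sample data from the question in place of the actual file:

```python
import json

# Sample data from the question, one JSON object per line.
text = (
    '{"tstp":1383173780727,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14},'
    '{"nb":903,"state":"open","freebk":2,"freebs":18}]}\n'
    '{"tstp":1383173852184,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14}]}\n'
)

# With a real file this would be: lines = open('some_file.json').readlines()
objects = [json.loads(line) for line in text.splitlines() if line.strip()]

for obj in objects:
    # All nb values belonging to this tstp only.
    print(obj['tstp'], [s['nb'] for s in obj['ststates']])
```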

1 Answer


This is an interesting problem. It appears that you have multiple JSON documents concatenated together in a single file and want to split them apart. Since Python's json library only parses one complete JSON document at a time and rejects trailing data, it is not sufficient on its own.

Fortunately, Python has really great error handling that you can take advantage of here.

import json

def json_split(text):
    '''Yield each JSON object from a string of concatenated JSON documents.'''
    while text:
        try:
            # If the whole remaining text is one valid document, we're done.
            yield json.loads(text)
            return
        except json.JSONDecodeError as error:
            # error.pos marks where the extra data begins: parse up to it,
            # then continue with the remainder.
            yield json.loads(text[:error.pos])
            text = text[error.pos:].strip()

Then:

with open('some_file.json') as file:
    text = file.read()
for json_object in json_split(text):
    print('json_object = ' + repr(json_object))

The function json_split works by trying to parse the entire string as one JSON object, which fails if the string contains more than one. Fortunately, the resulting JSONDecodeError tells us exactly which position the next JSON object starts at (pos), so we split the string there, parse the first part, and then rinse and repeat.

There are more efficient ways to do this (e.g. using virtual files or views from the io module to cut down on memory copying), but the above will work well for short collections of files.

Edit: Also, to write each object back out as its own file, just use json.dump; example below.

for v in json_split(text):
    with open(str(v['tstp']) + '.json', 'w') as file:
        json.dump(v, file)
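Since the question ultimately wants nb, state, freebk, freebs as columns, the csv module can be used in place of json.dump when writing each file. A sketch for a single object, as yielded by json_split, using sample data from the question (the .csv filename pattern is an assumption):

```python
import csv
import json

# One parsed object, as json_split would yield it (sample data from the question).
v = json.loads(
    '{"tstp":1383173780727,"ststates":'
    '[{"nb":901,"state":"open","freebk":6,"freebs":14},'
    '{"nb":903,"state":"open","freebk":2,"freebs":18}]}'
)

# One file per tstp, with the ststates entries as rows.
with open(str(v['tstp']) + '.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=['nb', 'state', 'freebk', 'freebs'])
    writer.writeheader()
    writer.writerows(v['ststates'])
```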
