I have a directory with a large number of JSON files. I want Python to read them all in and create a single JSONL output file.

There is a post that does something similar (Python conversion from JSON to JSONL). Unlike that post, my question starts from reading the JSON files into Python objects first and only then converting them to JSONL.

1 Answer

Here's how to read the JSON files from a directory (including its subdirectories) in Python and write the loaded data to a single JSONL file:

import json
import os

directory = '/Path/To/Your/Json/Directory'  # Specify your JSON directory path here

json_list = []  # Start with an empty list to collect the parsed JSON data
for dirpath, subdirs, files in os.walk(directory):
    print(dirpath)  # Show which directory is currently being scanned
    for file in files:
        if file.endswith(".json"):
            with open(os.path.join(dirpath, file)) as json_file: 
                data = json.load(json_file) 
                json_list.append(data)

#Now, output the list of json data into a single jsonl file
with open('output.jsonl', 'w') as outfile:
    for entry in json_list:
        json.dump(entry, outfile)
        outfile.write('\n')
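
Since the question mentions a large number of files, a streaming variant may be worth considering: it writes each record as soon as it is read instead of collecting everything in a list first. The following is only a sketch under a few assumptions: the function name merge_json_to_jsonl is hypothetical, the files are assumed to be UTF-8 encoded, and pathlib's rglob is swapped in for os.walk.

```python
import json
from pathlib import Path


def merge_json_to_jsonl(directory, output_path):
    """Write each .json file found under `directory` as one line of `output_path`.

    Hypothetical helper for illustration; returns the number of records written.
    """
    count = 0
    with open(output_path, "w", encoding="utf-8") as outfile:
        # rglob searches subdirectories too; sorted() keeps the order stable between runs
        for path in sorted(Path(directory).rglob("*.json")):
            with open(path, encoding="utf-8") as json_file:
                data = json.load(json_file)
            # json.dumps emits no raw newlines by default, so each record stays on one line
            outfile.write(json.dumps(data) + "\n")
            count += 1
    return count
```

Calling merge_json_to_jsonl('/Path/To/Your/Json/Directory', 'output.jsonl') should produce the same kind of file as the code above, while holding only one parsed file in memory at a time.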