I have a directory full of JSON files that I need to extract information from and convert into a Pandas dataframe. My current solution works, but I have a feeling that there is a more elegant way of doing this:

import io
import json
import os

import pandas as pd

output = []
for entry in os.scandir(directory):
    if entry.path.endswith(".json"):
        with open(entry.path) as f:
            data = json.load(f)
            ...
            # Build one CSV row per file from the extracted fields.
            newline = field1 + ',' + field2 + ',' + ... + ',' + fieldn
            output.append(newline)
...
df = pd.read_csv(io.StringIO('\n'.join(output)))
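For comparison, a common way to drop the CSV round trip entirely is to collect each file's fields into a dict and build the dataframe directly, since pd.DataFrame infers the columns from a list of dicts. A minimal sketch, assuming each file holds a single flat JSON object (directory is the same variable as above):

import json
import os

import pandas as pd

records = []
for entry in os.scandir(directory):
    if entry.path.endswith(".json"):
        with open(entry.path) as f:
            # Assumes a flat object; nested fields would need flattening.
            records.append(json.load(f))

df = pd.DataFrame(records)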
  • Where is data being used? Commented Jun 16, 2021 at 5:41
  • Why not read each JSON file into a dataframe and combine all of those dataframes into one big dataframe? Commented Jun 16, 2021 at 5:43
  • Yeah, I'd check out pd.read_json and pd.concat. Commented Jun 16, 2021 at 5:49
  • data is the source for all the values (field1, field2, ...) that I need to store in a df. I basically convert data into a comma-separated value string to be used later. Commented Jun 16, 2021 at 5:51

1 Answer


Yes, this can be done better.

import os
from glob import glob

import pandas as pd

# path is the directory containing the .json files.
all_files = glob(os.path.join(path, "*.json"))

# Read each file into its own dataframe, then stack them into one.
ind_df = (pd.read_json(f) for f in all_files)
df = pd.concat(ind_df, ignore_index=True)

Using a generator expression instead of a list means you never hold a separate list of per-file dataframes before the concatenation.
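One caveat as an aside: pd.read_json expects each file to already be table-shaped (a list of records or a dict of columns). If the files instead hold single nested objects, a variation is to flatten each one with pandas.json_normalize; this is a sketch under that assumption, with "data" as a placeholder directory:

import json
import os
from glob import glob

import pandas as pd

def load_one(path):
    # json_normalize flattens nested keys into dotted column names,
    # e.g. {"a": {"b": 1}} becomes a one-row frame with column "a.b".
    with open(path) as f:
        return pd.json_normalize(json.load(f))

all_files = glob(os.path.join("data", "*.json"))  # placeholder directory
df = pd.concat((load_one(p) for p in all_files), ignore_index=True)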
