
I have a large dataset with one column, where each row contains text. I would like to transform each row into a JSON object and dump each one to a folder path, so the folder will contain as many JSON files as there are rows in the dataset, with every JSON file containing the id and the text of one row.

Is this possible? For similar cases I have only seen how to create one huge JSON object, which is not what I want here. Here is my code so far:

SOLVED

import pandas as pd
import os
import sys
from os.path import expanduser as ospath
import simplejson as json
import numpy as np



sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..')))
data_folder = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "data", "model", 'Final.xlsx'))
single_response = pd.read_excel(ospath(data_folder), sheet_name='Sheet 1')
answers_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "processes"))


class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.int64):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        elif isinstance(obj, dict):
            return dict(obj)
        else:
            return super(MyEncoder, self).default(obj)
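The custom encoder is needed because the standard serializers reject NumPy scalar types such as np.int64. A quick self-contained illustration (using stdlib json and throwaway values; simplejson exposes the same JSONEncoder interface):

```python
import json

import numpy as np


class NumpyEncoder(json.JSONEncoder):
    # Convert NumPy scalars/arrays to built-in types that json can serialize.
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)


print(json.dumps({"id": np.int64(3), "score": np.float64(0.5)}, cls=NumpyEncoder))
# → {"id": 3, "score": 0.5}
```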

#TODO: function will return idx and text and dump json files for each answer (idx, value) to "answers" path

def create_answer_process(Answer, idx):
    #answers = []
    for idx, value in single_response.iterrows():
        answer = {
            "id": idx,
            "pattern": value['Answer']
        }
        #answers.append(answer)

        #process = json.dumps(answers, cls=MyEncoder, indent=2)

        with open(os.path.join(answers_path, str(idx) + '.json'), 'w') as f:
            json.dump(answer, f, cls=MyEncoder, indent=2)

    return idx

Thanks @keredson !
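For anyone skimming, the solved flow can be reduced to a self-contained sketch. An in-memory DataFrame and a temporary directory stand in for the Excel file and the real answers_path, and int(idx) replaces the custom encoder for this simple case:

```python
import json
import os
import tempfile

import pandas as pd

# Stand-in for the Excel sheet: one 'Answer' column, three rows.
single_response = pd.DataFrame({"Answer": ["yes", "no", "maybe"]})
answers_path = tempfile.mkdtemp()  # stand-in for the real output folder

# One JSON file per row, named after the row index.
for idx, value in single_response.iterrows():
    answer = {"id": int(idx), "pattern": value["Answer"]}  # int() avoids np.int64
    with open(os.path.join(answers_path, str(idx) + ".json"), "w") as f:
        json.dump(answer, f, indent=2)

print(sorted(os.listdir(answers_path)))  # → ['0.json', '1.json', '2.json']
```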

1 Answer


You look pretty close. The problem is this line:

process = json.dumps(answers, cls=MyEncoder, indent=2)

You should dump answer, not answers. You likely don't need answers at all. So something like:

def create_answer_process(Answer, idx):
    for idx, value in single_response.iterrows():
        answer = {
            "id": idx,
            "pattern": value['Answer']
        }
        with open(os.path.join(answers_path, idx), 'w') as f:
            json.dump(answer, f, cls=MyEncoder, indent=2)
    return idx

3 Comments

What you suggest makes sense. I changed my code, but now I get TypeError: join() argument must be str or bytes, not 'int64' here: with open(os.path.join(answers_path, idx), 'w') as f:. If I change this line to with open(os.path.join(answers_path), 'w') as f: I get a Permission denied error, and if I change it to with open((answers_path, idx), 'w') as f: I get TypeError: expected str, bytes or os.PathLike object, not tuple
In this case: with open(os.path.join(answers_path), 'w') as f: I don't know why I get a Permission denied error. I wonder if it has to do with the fact that each JSON object being dumped should be given a name / extension. My answers_path is a path to the folder where I want to store all the JSON objects; if I were creating one huge JSON object, I would use something like answers_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "processes", "processes.json")). But I want each JSON to be dumped separately. How should I reform my path?
You get Permission denied with just os.path.join(answers_path) because you're trying to open a directory for writing, not a file in the directory. The first error is because idx is not a string. What filename would you like in the dir? os.path.join(answers_path, str(idx)) should do it, as would os.path.join(answers_path, '%s.json' % idx), etc.
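The fix the comments converge on is just about building a valid file path: convert the index to a string before joining, and include a filename with an extension so you open a file rather than the directory itself. A short sketch (folder name and index value are hypothetical):

```python
import os

answers_path = "processes"   # hypothetical output folder
idx = 7                      # e.g. an int64 row index from iterrows()

# os.path.join requires string components, so convert the index first:
filename = os.path.join(answers_path, str(idx) + ".json")
print(filename)  # e.g. processes/7.json on POSIX

# Passing the raw integer raises TypeError, as seen in the comments:
try:
    os.path.join(answers_path, idx)
except TypeError as e:
    print("join rejects non-strings:", e)
```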
