I have code that takes a nested object and removes all nesting (makes the object flat):
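import json

import pandas as pd
import requests


def flatten_json(y):
    """
    @param y: Unflattened JSON (nested dict)
    @return: Flattened JSON (dict whose keys are joined with '_')
    """
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            # Recurse into nested dicts, extending the key prefix.
            for a in x:
                flatten(x[a], name + a + '_')
        else:
            # Lists and scalar values are stored as-is; drop the trailing '_'.
            out[name[:-1]] = x

    flatten(y)
    return out
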
def generatejson(response):
    sample_object = pd.DataFrame(response.json())['results'].to_dict()
    flat = {k: flatten_json(v) for k, v in sample_object.items()}
    return json.dumps(flat, sort_keys=True)


respons = requests.get(urlApi, data=data, headers=hed, verify=False)
flat1 = generatejson(respons)
....
storage.Bucket(BUCKET_NAME).item(path).write_to(flat1, 'application/json')
This does the following:
- Calls the API
- Removes nested objects
- Generates JSON
- Uploads the JSON to Google Storage.
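To illustrate the "remove nested objects" step, here is a small sketch on a made-up record. The `nested` dict is hypothetical and not real API output; the function repeats the same logic as above only so that the snippet runs on its own:

```python
def flatten_json(y):
    """Same flattening logic as in the question, repeated so this runs alone."""
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for key in x:
                flatten(x[key], name + key + '_')
        else:
            out[name[:-1]] = x  # drop the trailing '_'

    flatten(y)
    return out


# Hypothetical nested record, not taken from the real API response.
nested = {"id": 77, "language": {"code": "en-GB", "name": "English"}, "tags": ["a", "b"]}
print(flatten_json(nested))
# {'id': 77, 'language_code': 'en-GB', 'language_name': 'English', 'tags': ['a', 'b']}
```

Nested keys are joined with `_`, and lists are kept as they are rather than expanded.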
This works great. The problem is that BigQuery does not load regular JSON files; it expects newline-delimited JSON (one object per line), so I need to convert the output to that format before the upload.
Is there a way to change return json.dumps(flat, sort_keys=True) so that it returns newline-delimited JSON instead of regular JSON?
Sample of my JSON:
{"0": {"code": "en-GB", "id": 77, "languageName": "English", "name": "English"},
"1": {"code": "de-DE", "id": 78, "languageName": "Deutsch", "name": "German"}}
Edit:
The expected newline-delimited JSON result is:
{"languageName":"English","code":"en-GB","id":2,"name":"English"}
{"languageName":"Deutsch","code":"de-DE","id":5,"name":"German"}
For example, if I take the API response and do:
df['results'].to_json(orient="records",lines=True)
This gives the desired output, but I can't do the same with json.dumps(flat, sort_keys=True), because no DataFrame is involved there.
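One possible approach, sketched under the assumption that `flat` has the shape shown in the sample above (numeric row indices mapping to already-flat records): serialize each record on its own and join the results with newlines. The helper name `to_ndjson` is hypothetical and not part of any library.

```python
import json


def to_ndjson(flat):
    """Serialize each value of `flat` as one JSON object per line.

    Hypothetical helper. Assumes the keys of `flat` are row indices
    (ints or numeric strings such as "0", "1") and the values are dicts.
    """
    # Sort by numeric index so the rows keep their original order.
    rows = (flat[k] for k in sorted(flat, key=int))
    return '\n'.join(json.dumps(row, sort_keys=True) for row in rows)


flat = {
    "0": {"code": "en-GB", "id": 77, "languageName": "English", "name": "English"},
    "1": {"code": "de-DE", "id": 78, "languageName": "Deutsch", "name": "German"},
}
print(to_ndjson(flat))
```

In generatejson, this would mean returning to_ndjson(flat) instead of json.dumps(flat, sort_keys=True). If the keys are not numeric, the `key=int` sort would need to be changed or dropped.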
[1] is still a valid list, despite not containing any commas.
json.dumps(flat, sort_keys=True).replace('\n', ''). You might need to add back a newline on the end.
{"languageName":"English","code":"en-GB","id":2,"name":"English"} {"languageName":"Deutsch","code":"de-DE","id":5,"name":"German"}
For example, if you take the sample of my JSON from the question and do df['results'].to_json(orient="records",lines=True) on it (pandas DataFrame), this is the output...