
I am trying to read bulk data from a server that I have no control over.

Error:

json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 1794)

This .json() call raises json.decoder.JSONDecodeError:

import requests
Data = requests.get(Data_Url, headers=session["headers"]).json()
print(Data)

Using .text instead returns the data as a single string, which I can't work with as structured data:

import requests
Data = requests.get(Data_Url, headers=session["headers"]).text
print(Data)

As shown below, the data seems to consist of one object per line:

{}
{}
{}

How can I manipulate the requests.get response so that I end up with valid JSON, with the objects separated by commas ({},{}), like this?

{ "Data" : [{},{}]}

Sample of the actual response text:

{"resourceType":"Person","id":"cg3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"3g3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"pg3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"GA3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"zQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"qQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"Fw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"Nw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"hw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"DSQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
3 Comments
  • Looks like JSONL format... Commented Jan 28, 2022 at 15:51
  • If it's not actually JSON, and you have no control over it, then don't try to parse it as JSON. Figure out what it is and read that. It doesn't even close the brackets? Commented Jan 28, 2022 at 16:03
  • The format is not JSONL, nor is it valid JSON (as already noted), which makes "fixing" things a little trickier… Commented Jan 28, 2022 at 16:47
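
One way to check what the comments above describe is to try parsing each line on its own and record which lines fail. The following is only a minimal sketch: sample_text is a shortened stand-in for the response text, and check_lines is a hypothetical helper name, not part of requests or json.

```python
import json

# Shortened stand-in for the response text (same shape as the lines in the question).
sample_text = (
    '{"resourceType":"Person","id":"cg3","extension":[{"extension":[{"valueCoding":{"code":"UNK"}\n'
    '{"resourceType":"Person","id":"3g3","extension":[{"extension":[{"valueCoding":{"code":"UNK"}\n'
)


def check_lines(text):
    """Return (line_number, error_message_or_None) for every non-blank line."""
    report = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            json.loads(line)
            report.append((number, None))
        except json.JSONDecodeError as exc:
            # exc.msg describes what the parser expected at the failure point.
            report.append((number, exc.msg))
    return report


report = check_lines(sample_text)
for number, problem in report:
    print(number, problem or "valid JSON")
```

If every line parsed cleanly, the data would be JSONL and a plain per-line json.loads() would be enough. Here each line fails on its own, which is consistent with the truncated brackets.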

1 Answer

The results you're getting from .text aren't in a valid JSON (or Python literal) format. After studying the results, I determined that each line in the string returned is missing the characters "}]}]}" at the end that would correct that problem.

The code below appends them to each line and then evaluates the result with the ast.literal_eval() function to turn it into a Python dictionary. A list comprehension collects the dictionaries into a list. In other words, you don't have to nest them inside a dictionary like the {"Data": [{}, {}, ...]} you proposed (unless you really want to for some reason).

from ast import literal_eval
from pprint import pprint

requests_get_text = """\
{"resourceType":"Person","id":"cg3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"3g3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"pg3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"GA3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"zQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"qQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"Fw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"Nw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"hw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"DSQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
"""

# Append the missing closing characters to each line of the
# requests.get().text result, then evaluate it as a Python literal.
data = [literal_eval(line + '}]}]}')
        for line in requests_get_text.splitlines()]
pprint(data, sort_dicts=False)

Output:

[{'resourceType': 'Person',
  'id': 'cg3',
  'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                'code': 'UNK',
                                                'display': 'Unknown'}}]}]},
 {'resourceType': 'Person',
  'id': '3g3',
  'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                'code': 'UNK',
                                                'display': 'Unknown'}}]}]},
 {'resourceType': 'Person',
  'id': 'pg3',
  'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                'code': 'UNK',
                                                'display': 'Unknown'}}]}]},
 {'resourceType': 'Person',
  'id': 'GA3',
  'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                'code': 'UNK',
                                                'display': 'Unknown'}}]}]},
 {'resourceType': 'Person',
  'id': 'zQ3',
  'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                'code': 'UNK',
                                                'display': 'Unknown'}}]}]},
 {'resourceType': 'Person',
  'id': 'qQ3',
  'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                'code': 'UNK',
                                                'display': 'Unknown'}}]}]},
 {'resourceType': 'Person',
  'id': 'Fw3',
  'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                'code': 'UNK',
                                                'display': 'Unknown'}}]}]},
 {'resourceType': 'Person',
  'id': 'Nw3',
  'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                'code': 'UNK',
                                                'display': 'Unknown'}}]}]},
 {'resourceType': 'Person',
  'id': 'hw3',
  'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                'code': 'UNK',
                                                'display': 'Unknown'}}]}]},
 {'resourceType': 'Person',
  'id': 'DSQ3',
  'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                'code': 'UNK',
                                                'display': 'Unknown'}}]}]}]
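
As a variation on the code above, the repaired lines can also be parsed with json.loads(). ast.literal_eval() only understands Python literals, so it would reject a record containing JSON's true, false, or null. The sketch below assumes the same five missing characters on every line. The "active" field is a made-up addition to the sample to show the difference, and parse_truncated_lines is a hypothetical helper name.

```python
import json

SUFFIX = "}]}]}"  # closing characters found to be missing from each line


def parse_truncated_lines(text, suffix=SUFFIX):
    """Append the missing closers to each non-blank line and parse it as JSON."""
    return [json.loads(line + suffix)
            for line in text.splitlines() if line.strip()]


# Shortened sample; "active": true is hypothetical and not in the original data.
sample_text = (
    '{"resourceType":"Person","id":"cg3","active":true,'
    '"extension":[{"extension":[{"valueCoding":{"code":"UNK"}\n'
    '{"resourceType":"Person","id":"3g3","active":false,'
    '"extension":[{"extension":[{"valueCoding":{"code":"UNK"}\n'
)

records = parse_truncated_lines(sample_text)

# Optional: wrap the list in the {"Data": [...]} shape asked for in the question.
wrapped = {"Data": records}
print(json.dumps(wrapped, indent=2))
```

With the real response, requests.get(...).text would take the place of sample_text.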

1 Comment

I'm glad to hear that...you're welcome. Note that the real problem is that the producer of the data isn't doing it correctly — so this is just a workaround.
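
Since this is a workaround, a slightly more defensive version could work out the missing closers per line instead of hard-coding "}]}]}". This is a rough sketch, not a general JSON repair tool: it assumes each line is cut off right after a complete value, so that only closing brackets are missing. missing_closers is a hypothetical helper name.

```python
import json

CLOSERS = {"{": "}", "[": "]"}


def missing_closers(fragment):
    """Return the closing brackets needed to balance a truncated JSON fragment."""
    stack = []          # closers still owed, innermost last
    in_string = False   # True while scanning inside a JSON string
    escaped = False     # True right after a backslash inside a string
    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in CLOSERS:
            stack.append(CLOSERS[ch])
        elif ch in "}]" and stack:
            stack.pop()
    return "".join(reversed(stack))


line = ('{"resourceType":"Person","id":"cg3","extension":[{"extension":'
        '[{"valueCoding":{"system":"http://terminology.hl7.org/",'
        '"code":"UNK","display":"Unknown"}')

suffix = missing_closers(line)
record = json.loads(line + suffix)
print(suffix)  # → }]}]}
```

Brackets inside string values are ignored, so a value such as "[{" does not affect the count.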
