I am trying to get the contents of a file from GitHub using Python Requests and the GitHub API. Even though the contents look like JSON to me, when I check programmatically, the response is reported as not being in JSON format, and I am unable to get the JSON data out of it. Here is what I have so far (for a small subset of my URLs):

These are the two URLs I have tried and both fail:

import requests

from requests.exceptions import HTTPError
from json.decoder import JSONDecodeError

# Each URL was tried in turn; as written, the second assignment overrides the first.
myurl = 'https://github.com/bitpod-io/arsenal/blob/a5af2f9bff13d8b6c6592437a19a712edb49ecb0/tests/utils/samplePolicies.json'
myurl = 'https://github.com/yasuhisa1984/achieve/blob/fa6334c484ee9e7c08e2d280fa16ec5bba5f2369/vendor/bundle/gems/fog-aws-1.2.1/lib/fog/aws/iam/default_policy_versions.json'

token = 'mylongtoken'

try:
    #response = requests.get(myurl, headers={'Authorization': 'token {}'.format(token), 'Accept': 'application/vnd.github.v3.raw'})
    response = requests.get(myurl, headers={'Authorization': 'token {}'.format(token)})

    # Raise HTTPError for 4xx/5xx status codes
    response.raise_for_status()
except HTTPError as http_err:
    print(f'HTTP error occurred: {http_err}')
except Exception as err:
    print(f'Some other error occurred: {err}')
else:
    print('Success')
    print(response.headers)
    # Default to '' so a missing Content-Type header does not raise a TypeError
    if 'json' in response.headers.get('Content-Type', ''):
        try:
            resp_json = response.json()
        except JSONDecodeError:
            print('Unable to get data in JSON format.')
        else:
            print(resp_json)
    else:
        print('Response content is not in JSON format.')

Can anyone please let me know where I am going wrong? I want to get the file contents as either text or JSON (preferably) using the API.

1 Answer

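Make sure you are downloading from the "raw" content URL and not the GitHub HTML page. The URLs in the question point at the HTML file viewer, so the response is an HTML document rather than the file itself. I believe this is the URL for the raw JSON file you want: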

https://raw.githubusercontent.com/yasuhisa1984/achieve/master/vendor/bundle/gems/fog-aws-1.2.1/lib/fog/aws/iam/default_policy_versions.json

The following code works for me with that URL:

import requests
url = "https://raw.githubusercontent.com/yasuhisa1984/achieve/master/vendor/bundle/gems/fog-aws-1.2.1/lib/fog/aws/iam/default_policy_versions.json"
resp = requests.get(url)
data = resp.json()
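
If you have many URLs of the `github.com/.../blob/...` form, as in the question, they can be rewritten into raw URLs instead of being looked up by hand. The following is only a minimal sketch: `blob_to_raw_url` is a hypothetical helper name, and it assumes the ref in the URL is a commit SHA or a branch name without slashes.

```python
from urllib.parse import urlsplit


def blob_to_raw_url(blob_url):
    """Rewrite a github.com 'blob' page URL into a raw.githubusercontent.com URL.

    Hypothetical helper; assumes the ref (commit SHA or branch name)
    contains no '/' characters.
    """
    parts = urlsplit(blob_url)
    if parts.netloc != "github.com":
        raise ValueError("not a github.com URL: {}".format(blob_url))

    # Expected path layout: /<owner>/<repo>/blob/<ref>/<path to file>
    segments = parts.path.strip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a blob URL: {}".format(blob_url))

    owner, repo, _, ref, *file_path = segments
    return "https://raw.githubusercontent.com/{}/{}/{}/{}".format(
        owner, repo, ref, "/".join(file_path)
    )
```

The returned string can then be passed to `requests.get()` in place of `url` in the code above.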

3 Comments

I still get the same error: 'Response content is not in JSON format.' Is my requests.get() properly formulated?
@user1717931 This is probably too late to help you now, but if anyone else faces the same problem, response.content is where to look.
@AndrewCalder This code still works for me as-is. You shouldn't need to use response.content if you are downloading the raw JSON file.
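
A possible explanation for the 'Response content is not in JSON format.' message persisting with the raw URL: as far as I know, raw.githubusercontent.com usually serves files with a `text/plain` Content-Type, so the `'json' in Content-Type` check from the question fails even though the body is valid JSON. One way around that is to skip the header check and simply try to parse the body. A minimal sketch, where `parse_json_text` is a hypothetical helper name:

```python
import json


def parse_json_text(text):
    """Parse text as JSON and return the result, or None if it is not valid JSON.

    Hypothetical helper; it deliberately ignores the Content-Type header
    and looks only at the body.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

With the code from the answer, this would be called as `parse_json_text(resp.text)`. Note that a `None` result is ambiguous if the file itself contains the JSON value `null`.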
