I'm trying to get JSON from a Google Trends URL, but I can't parse the response as JSON: the content comes back as a bytes object (b'...') with extra characters in front. How can I get this result as JSON?

My simple code:

import requests
r = requests.get('https://trends.google.ru/trends/api/stories/latest?hl=ru&tz=-180&cat=all&fi=15&fs=15&geo=RU&ri=300&rs=15&sort=0')
print(r.content)

r.content starts with:

b')]}\'\n{"featuredStoryIds":[],"trendingStoryIds":["RU_lnk_iJ8H1AAwAACP-M_ru","RU_lnk_7H7L0wAwAAAnHM_ru","RU_lnk_Q-IB1AAwAABChM_ru","RU_lnk_EErj0wAwAADzKM_ru","RU_lnk_VY2s0wAwAAD57M_ru","RU_lnk_sdUP1AAwAAC-sM_ru","RU_lnk_ILv60wAwAADa2M_ru","RU_lnk_O6j70wAwAADAyM_ru","RU_lnk_fVQS1AAwAABvMM_ru","RU_lnk_TJ8D1AAwAABP-M_ru","RU_lnk_I97F0wAwAADmvM_ru","RU_lnk_tCrq0wAwAABeSM_ru","RU_lnk_W8EA1AAwAABbpM_ru","RU_lnk_IYX90wAwAADc5M_ru","RU_lnk_bz4M1AAwAABjWM_ru","RU_lnk_EJ-...

Decoding this with the r.json() method fails:

simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
  • r.content is indeed the raw binary data. Have you looked at the response.json() method yet? What happens when you call that? Commented Jun 14, 2017 at 10:02
  • yep, simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0) Commented Jun 14, 2017 at 10:03

2 Answers

You are contacting a Google service, and Google is prefixing JSON with some extra data to prevent JSON hijacking:

>>> import requests
>>> r = requests.get('https://trends.google.ru/trends/api/stories/latest?hl=ru&tz=-180&cat=all&fi=15&fs=15&geo=RU&ri=300&rs=15&sort=0')
>>> r.content[:10]
b')]}\'\n{"fea'

Note the )]}' and newline characters at the start.

You need to remove this extra data first and manually decode; there are no other newlines in the payload so we can just split on the newline:

import json

json_body = r.text.splitlines()[-1]
json_data = json.loads(json_body)

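I used Response.text here to get decoded string data (the server declares the correct character encoding in its Content-Type header). Passing the raw bytes from r.content to json.loads() only works on Python 3.6 and newer.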

This gives you a decoded dictionary:

>>> json_body = r.text.splitlines()[-1]
>>> json_data = json.loads(json_body)
>>> type(json_data)
<class 'dict'>
>>> sorted(json_data)
['date', 'featuredStoryIds', 'hideAllImages', 'storySummaries', 'trendingStoryIds']
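
As a minimal sketch of the same idea that does not rely on the payload containing no other newlines: strip the )]}' guard only when it is present, then parse whatever remains. The helper name parse_prefixed_json and the sample data are hypothetical, and the UTF-8 fallback for bytes input is an assumption rather than something the service guarantees.

```python
import json

# The guard prefix seen at the start of the response above.
GUARD_PREFIX = ")]}'"


def parse_prefixed_json(payload):
    """Parse JSON that may start with the )]}' guard (hypothetical helper)."""
    if isinstance(payload, bytes):
        # Assumption: the body is UTF-8; with requests, r.text already decodes it.
        payload = payload.decode("utf-8")
    payload = payload.lstrip()
    if payload.startswith(GUARD_PREFIX):
        payload = payload[len(GUARD_PREFIX):]
    # json.loads ignores the leading newline left behind after the prefix.
    return json.loads(payload)


# Shortened, made-up sample shaped like the response in the question.
sample = b')]}\'\n{"featuredStoryIds": [], "trendingStoryIds": ["RU_lnk_iJ8H1AAwAACP-M_ru"]}'
data = parse_prefixed_json(sample)
print(sorted(data))
```

With a live response, this would be called as parse_prefixed_json(r.text).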

3 Comments

TypeError: the JSON object must be str, not 'bytes'
@KonstantinRusanov: ah, older Python 3 version, 3.6 accepts bytes. Will update.
Perfect! Thanks a lot

Maybe try this; it might help:

import requests
r = requests.get('https://trends.google.ru/trends/api/stories/latest?hl=ru&tz=-180&cat=all&fi=15&fs=15&geo=RU&ri=300&rs=15&sort=0')
# Check the HTTP status code of the response
page = r.status_code
print(page)
