I am using this code to call the NY Times API and get the json data about the articles on a chosen search query.
import urllib
import re
import json
htmltext = urllib.urlopen('******http://call******')
data = json.load(htmltext)
print data
It prints out a result structured like this:
{u'status': u'OK', u'response': {u'docs': [{u'type_of_material': u'Article', u'blog': [], u'news_desk': None, u'lead_paragraph': u'Hon. PRESTON KING, one of the ablest, most upright, and most influential members of the Democratic party of this State, thus expression his opinion in regard to political prospects in a letter to a friend: OGDENSBURG, Saturday, Sept. 16, 1854.', u'headline': {u'main': u'POLITICAL.; New-York Politics--Letter from Preston King.'}, u'abstract': u'Letter to Jerry Rescue Celebration', u'print_page': u'8', u'word_count': 1526, u'_id': u'4fbfd3e945c1498b0d00ddca', u'snippet': u'Hon. PRESTON KING, one of the ablest, most upright, and most influential members of the Democratic party of this State, thus expression his opinion in regard to political prospects in a letter to a friend: OGDENSBURG, Saturday, Sept. 16, 1854.', u'source': u'The New York Times', u'web_url': u'http://query.nytimes.com/gst/abstract.html?res=950CE6DE1238EE3BBC4B53DFB667838F649FDE', u'multimedia': [], u'subsection_name': None, u'keywords': [{u'name': u'persons', u'value': u'KING, PRESTON'}, {u'name': u'persons', u'value': u'SUMNER CHARLES'}, {u'name': u'persons', u'value': u'BEECHER, HENRY WARD'}], u'byline': None, u'document_type': u'article', u'pub_date': u'1854-10-03T00:03:58Z', u'section_name': None}, {u'type_of_material': u'Article', u'blog': [], u'news_desk': None, u'lead_paragraph': u'MISSISSIPPI LAWS IN WANT OF REMODELING. GOVERNOR McWILLIE, of Mississippl, has summened an extra session of the State Legislature, to assemble on the first Monday in November next, In this State, as in others which have adopted the system of biennial sessions of their Legislatures. the plan has not been found to be the best for the interests of the people.', u'headline': {u'main': u'Article 1 -- No Title', u'kicker': u'1'}, u'abstract': None, u'print_page': u'3', u'word_count': 334, u'_id': u'4fbfe29945c1498b0d04bed8', u'snippet': u'MISSISSIPPI LAWS IN WANT OF REMODELING. GOVERNOR McWILLIE, of Mississippl, has summened an extra session of the State Legislature, to assemble on the first Monday in November next, In this State, as in others which have adopted the system of biennial...', u'source': u'The New York Times', u'web_url': u'http://query.nytimes.com/gst/abstract.html?res=9F06E7D61331EE34BC4952DFBE668383649FDE', u'multimedia': [], u'subsection_name': None, u'keywords': [], u'byline': None, u'document_type': u'article', u'pub_date': u'1858-08-11T00:03:58Z', u'section_name': None}, ... u'meta': {u'hits': 150, u'offset': 0, u'time': 38}}, u'copyright': u'Copyright (c) 2013 The New York Times Company. All Rights Reserved.'}
For the purposes of example, I pasted here data for just 2 articles (it actually gives you data for 10 articles).
Now I want to parse that data and extract all the 'web_url' attributes. How can I do that?
I tried that code:
import urllib
import re
import json
htmltext = urllib.urlopen('******http://call******')
data = json.load(htmltext)
print data['web_url']
But it gives me this error:
Traceback (most recent call last):
File "json_trying.py", line 10, in <module>
print data["web_url"]
KeyError: 'web_url'