1

I am calling this URL:

http://code.google.com/feeds/issues/p/chromium/issues/full/291?alt=json

using urllib2 and decoding the response with the json module:

import json
import urllib2

url = "http://code.google.com/feeds/issues/p/chromium/issues/full/291?alt=json"
request = urllib2.Request(url)
response = urllib2.urlopen(request)
issue_report = json.loads(response.read())

I run into the following error:

ValueError: Invalid control character at: line 1 column 1120 (char 1120)

I checked the response headers and got the following:

Content-Type: application/json; charset=UTF-8
Access-Control-Allow-Origin: *
Expires: Sun, 03 Jul 2011 17:38:38 GMT
Date: Sun, 03 Jul 2011 17:38:38 GMT
Cache-Control: private, max-age=0, must-revalidate, no-transform
Vary: Accept, X-GData-Authorization, GData-Version
GData-Version: 1.0
ETag: W/"CUEGQX47eCl7ImA9WxJaFEw."
Last-Modified: Tue, 04 Aug 2009 19:20:20 GMT
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Server: GSE
Connection: close

I also tried adding an encoding parameter as follows:

issue_report = json.loads(response.read(), encoding='UTF-8')

I still run into the same error.

1
  • It looks like what you get is not a valid JSON-encoded string. Commented Jul 3, 2011 at 17:48

2 Answers

4

The feed has raw data from a JPEG in it at that point; the JSON is malformed, so it's not your fault. Report a bug to Google.
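To illustrate the kind of failure described here, the following is a minimal sketch rather than a fix for the actual feed. The `sample` string and the `find_control_chars` helper are made up for the illustration; the only library feature relied on is the `strict` parameter of `json.loads`, which, when set to `False`, lets the decoder accept raw control characters inside strings. Whether that would be enough for the real response is an assumption: embedded binary data can also contain unescaped quotes, backslashes, or invalid UTF-8, which would still break parsing.

```python
import json

# Hypothetical stand-in for the feed body: a JSON string value that contains
# raw (unescaped) control characters, which strict JSON does not allow.
sample = '{"title": "screenshot", "content": "JFIF\x00\x10 binary"}'


def find_control_chars(text):
    """Return (offset, repr) pairs for every character below U+0020.

    Note: this also flags tabs and newlines used as ordinary whitespace
    between tokens, which are legal JSON outside of strings.
    """
    return [(i, repr(ch)) for i, ch in enumerate(text) if ord(ch) < 0x20]


# Default (strict) decoding rejects the raw control characters.
try:
    json.loads(sample)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

# strict=False allows control characters inside string values.
lenient = json.loads(sample, strict=False)

# Offsets of the offending characters, useful for inspecting the payload.
offsets = find_control_chars(sample)
```

Printing `offsets` (or a slice of the response around the column reported in the `ValueError`) is a quick way to confirm that binary data, not the encoding, is what trips the parser.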

1 Comment

Oh! I suspect people have run into such issues earlier too. code.google.com/p/gdata-issues/issues/detail?id=942
2

Since the JSON is malformed, you could consider parsing the XML form of the feed with lxml instead. Its XPath support makes working with XML pretty straightforward:

import lxml.etree
url = 'http://code.google.com/feeds/issues/p/chromium/issues/full/291'
doc = lxml.etree.parse(url)
ns = {'issues': 'http://schemas.google.com/projecthosting/issues/2009'}
issues = doc.xpath('//issues:*', namespaces=ns)

It is fairly easy to manipulate the elements, for instance to strip the namespace from the tags and convert the result to a dict:

>>> dict((x.tag[len(ns['issues'])+2:], x.text) for x in issues)
{'closedDate': '2009-08-04T19:20:20.000Z',
 'id': '291',
 'label': 'Area-BrowserUI',
 'stars': '13',
 'state': 'closed',
 'status': 'Verified'}
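For readers without lxml installed, the same namespace-stripping idea can be sketched with the standard library's `xml.etree.ElementTree`. This is only an illustration: the `sample` document below is a made-up fragment (including the `title` element) shaped like an Atom entry, reusing a few values from the output above, and it is not the real feed.

```python
import xml.etree.ElementTree as ET

ISSUES_NS = 'http://schemas.google.com/projecthosting/issues/2009'

# Made-up fragment for illustration only; the real feed has more elements.
sample = (
    '<entry xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:issues="%s">'
    '<title>example</title>'
    '<issues:id>291</issues:id>'
    '<issues:state>closed</issues:state>'
    '<issues:status>Verified</issues:status>'
    '</entry>' % ISSUES_NS
)

root = ET.fromstring(sample)

# ElementTree stores qualified tags as "{namespace-uri}localname", so
# dropping the "{...}" prefix leaves the bare element name.
prefix = '{%s}' % ISSUES_NS
fields = dict(
    (el.tag[len(prefix):], el.text)
    for el in root.iter()
    if el.tag.startswith(prefix)
)
```

The `+2` in the lxml one-liner above plays the same role as `len(prefix)` here: it accounts for the two braces wrapped around the namespace URI.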

2 Comments

Thanks but I always prefer JSON objects since they are very easily converted into dictionaries.
I prefer JSON too, but sometimes you don't have a choice.
