I am trying to open a JSON response from an API whose URL includes characters of the Polish alphabet. I have tried to encode the URL as UTF-8, but all kinds of problems still pop up. I include the code I wrote and the error that appears.

import urllib.request as request
import json
url='https://api.um.warszawa.pl/api/action/dbtimetable_get?id=myapiID&busstopId=wartość&busstopNr=wartość&line=wartość&apikey=wartość'
url=url.encode('utf-8')
with request.urlopen(url) as response:
    source = response.read()
    data = json.loads(source)

Then this error appears: AttributeError: 'bytes' object has no attribute 'timeout'.

  • Post the full traceback. Are you on Python 2 or 3? Commented Sep 17, 2020 at 16:15
  • Could you also try posting the result of printing the object that it says has no attribute 'timeout' ? Commented Sep 17, 2020 at 16:47
  • That's interesting... somebody seems to have solved that here: stackoverflow.com/questions/1916684/…. But I tried your version and I also get the timeout error: AttributeError: 'bytes' object has no attribute 'timeout'. I tried to tweak it with a custom class:

        class StringWithTimeout(str):
            def __new__(cls, string, timeout):
                obj = str.__new__(cls, string)
                setattr(obj, 'timeout', timeout)
                return obj

    But then I get URLError: <urlopen error unknown url type: b'https> Commented Sep 17, 2020 at 17:17
  • Which Python version are you using? Commented Sep 17, 2020 at 17:21
  • Yet another potential solution: stackoverflow.com/questions/36395705/… Commented Sep 17, 2020 at 17:32

1 Answer


There are two problems here, probably both stemming from the need to access a URL whose query components include non-ASCII characters.

  • Firstly, urlopen expects a str or a Request object. Passing a bytes instance leads to unexpected behaviour, as described here: the bytes object is treated as if it were a Request, and the attempt to set its timeout attribute raises the AttributeError from the question.
  • Secondly, non-ASCII characters are not permitted in a URL's query parameters, so the query parameters must be percent-encoded (urlencoded).
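As a small illustration of both points, the sketch below percent-encodes one of the placeholder values from the question's URL with urllib.parse.quote and, for contrast, shows that str.encode() produces a bytes object rather than a usable URL string. The variable names are only illustrative.

```python
from urllib.parse import quote, unquote

# Placeholder value taken from the question's URL.
value = "wartość"

# quote() encodes the text as UTF-8 and percent-escapes each non-ASCII byte,
# so the result contains only ASCII characters.
encoded = quote(value)
print(encoded)           # warto%C5%9B%C4%87
print(unquote(encoded))  # wartość

# By contrast, str.encode() returns bytes, which urlopen does not accept as a URL.
as_bytes = value.encode("utf-8")
print(type(as_bytes).__name__)  # bytes
```

The percent-encoded form is still a str, which is the type urlopen expects.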

So, given the invalid URL, you need to do something like this:

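import json
from urllib import parse
from urllib import request

# url must stay a str (do not call .encode() on it); this is the URL from the question.
url = 'https://api.um.warszawa.pl/api/action/dbtimetable_get?id=myapiID&busstopId=wartość&busstopNr=wartość&line=wartość&apikey=wartość'

parts = parse.urlsplit(url)
# parse_qs returns a dict that maps each key to a list of values.
query_dict = parse.parse_qs(parts.query)
# doseq=True encodes every list element as its own key=value pair;
# without it, the whole list would be stringified and encoded.
encoded_query = parse.urlencode(query_dict, doseq=True)
fixed_url = parse.urlunsplit((parts.scheme, parts.netloc, parts.path, encoded_query, parts.fragment))

with request.urlopen(fixed_url) as response:
    print(json.load(response))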
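If the parameter values are available separately rather than already embedded in a URL string, a possible alternative is to build the query from a dict with urlencode, which percent-encodes each value as UTF-8 by default. This is only a sketch: build_url and BASE_URL are hypothetical names, the endpoint and parameter names are copied from the question, and the values are the question's placeholders. It makes no network request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint copied from the question; the name BASE_URL is an illustrative assumption.
BASE_URL = "https://api.um.warszawa.pl/api/action/dbtimetable_get"


def build_url(base, params):
    """Return base plus a percent-encoded query string (hypothetical helper)."""
    return f"{base}?{urlencode(params)}"


# Placeholder values from the question, not real identifiers or keys.
params = {
    "id": "myapiID",
    "busstopId": "wartość",
    "busstopNr": "wartość",
    "line": "wartość",
    "apikey": "wartość",
}

url = build_url(BASE_URL, params)
print(url)

# Round trip: decoding the query string recovers the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["busstopId"])  # ['wartość']
```

The resulting url is a plain ASCII str, so it can be passed directly to request.urlopen.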

2 Comments

Thank you for the suggestion @snakecharmerb. I tried it, and there seems to be an issue with a non-string sequence or mapping object. I include the exact error for detail:

        # non-empty strings will fail this
    890 if len(query) and not isinstance(query[0], tuple):
--> 891     raise TypeError
    892 # Zero-length sequences of all types will get here and succeed,
    893 # but that's a minor nit. Since the original implementation

TypeError: not a valid non-string sequence or mapping object
Sorry, I missed a step in the answer. It should work now.
