Some HTTP response headers that contain non-English text (encoded as UTF-8) come back with an encoding error. I uploaded a file to the server with this original text: 2. 핵심정보를 담은 발표 형성평가 5월 19일

I then requested it from the server with requests.get(), but the HTTP response header contained this text instead: 2. íµì¬ì ë³´ë¥¼ ë´ì ë°í íì±íê° 5ì 19ì¼

I pasted the garbled text into an ASCII-to-UTF-8 converter and it converted back successfully, so maybe the requests package decodes HTTP response headers with an ASCII encoding.
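The garbling can be reproduced without any server. The sketch below assumes the header bytes were UTF-8 and were decoded as ISO-8859-1 (Latin-1) rather than ASCII, which is what the converter result suggests; the variable names are only for illustration.

```python
# The text as it was uploaded.
original = "2. 핵심정보를 담은 발표 형성평가 5월 19일"

# What the server puts on the wire: the UTF-8 bytes of that text.
raw_bytes = original.encode("utf-8")

# Decoding those bytes as ISO-8859-1 never fails, because every byte
# value maps to some character, but the result is unreadable.
garbled = raw_bytes.decode("iso-8859-1")

print(garbled == original)  # the decoded text no longer matches
# Reversing the wrong decoding step recovers the original text.
print(garbled.encode("iso-8859-1").decode("utf-8") == original)
```

Because the ISO-8859-1 step is lossless, no information is destroyed; the bytes are only interpreted with the wrong table.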

EDIT

I tried setting req.encoding = 'utf-8', but it didn't help.

Code:

import requests

headers = {
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'ko-KR,ko;q=0.9,en-US;q=0.8,en;q=0.7',
    'Content-type': 'text/plain; charset=utf-8'
}

# link is the download URL obtained earlier in the program
req = requests.get(link, headers=headers, allow_redirects=True)
req.encoding = 'utf-8'  # only changes how the body (req.text) is decoded, not the headers
print(req.headers['Content-Disposition'])  # prints the text with the encoding error

You can also see the issue I opened on the python requests GitHub repository: https://github.com/psf/requests/issues/5463

  • What's the actual URL so we can try it out? Commented May 19, 2020 at 16:54
  • @Tarique You would need to know send-anywhere, because I'm making an API for send-anywhere. To get a URL, you have to send a file from send-anywhere, enter the number, and get the link using this program (go to my GitHub repo and run test-for-requests.py). Sorry, tell me if you don't know how to use send-anywhere and I will explain how to send a file with it. Commented May 19, 2020 at 17:57
  • @Tarique Thank you, but I have fixed it now. You can see how I fixed it. Commented May 20, 2020 at 5:24

1 Answer

ANSWER

text.encode("ISO-8859-1").decode("utf-8")

I fixed it: I had to encode the text as ISO-8859-1 and then decode it as UTF-8.
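A likely reason this works: in Python 3 the standard library's http.client, which requests uses underneath through urllib3, decodes header lines as ISO-8859-1, so UTF-8 bytes sent by the server arrive as mis-decoded text. Below is a minimal sketch of the fix wrapped in a helper that leaves a value unchanged when the round trip does not apply. The name fix_header_text is hypothetical and not part of requests, and the garbled value is simulated rather than fetched from a real server.

```python
def fix_header_text(value: str) -> str:
    """Undo an ISO-8859-1 mis-decoding of UTF-8 header bytes, if possible.

    Hypothetical helper for illustration; not part of the requests API.
    """
    try:
        # Recover the raw bytes, then decode them the way the server meant.
        return value.encode("iso-8859-1").decode("utf-8")
    except UnicodeError:
        # Either the value has characters outside ISO-8859-1 (already
        # correct text) or its bytes are not valid UTF-8: leave it alone.
        return value


original = "2. 핵심정보를 담은 발표 형성평가 5월 19일"
# Simulate what arrives in the response headers.
garbled = original.encode("utf-8").decode("iso-8859-1")

print(fix_header_text(garbled))
print(fix_header_text("report.txt"))
```

In the question's code this would be applied as fix_header_text(req.headers['Content-Disposition']). One caveat: a header that is genuinely ISO-8859-1 text and happens to form valid UTF-8 bytes would also be rewritten, so the helper is best used only where the server is known to send UTF-8.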

1 Comment

Useful. This worked with Python 3.6.4 and requests 2.21.0 on 64-bit Windows (default code page 936). I was using requests to download a file, and when I tried to get the file name from the response headers I only got unreadable text. I don't know why this happens, but re-encoding the text with ISO-8859-1 fixed the problem. Maybe the requests library doesn't handle the response headers properly and mis-decodes them as ISO-8859-1?
