Can't use a string pattern on a bytes-like object - python's re error [duplicate]

Question

I'm working through the Python Challenge to familiarize myself with Python, so, without looking at the answers, I tried using Python's URL reader to read the HTML and then find the letters needed. However, the code below raises an error. Originally it was an error about Python 3's urllib.request, but after resolving that I get a new one:

<module>
    print ("".join(re.findall("[A-Za-z]", data)))
  File "C:\Python34\lib\re.py", line 210, in findall
    return _compile(pattern, flags).findall(string)
TypeError: can't use a string pattern on a bytes-like object

I tried looking this error up on Google, but all I found was about JSON, which I shouldn't need. My Python isn't that strong, so maybe I am doing this incorrectly?

#Question 2 - find rare characters

import re
import urllib.request

data = urllib.request.urlopen("http://www.pythonchallenge.com/pc/def/ocr.html")
mess = data.read()
messarr = mess.split("--")

print ("".join(re.findall("[A-Za-z]", data)))

#Question 3 - Find characters in list

page = urllib.request.urlopen("http://www.pythonchallenge.com/pc/def/equality.html")
mess = page.read()
messarr = mess.split("--")
print ("".join(re.findall("[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+", page)))

wouter bolsterlee · Accepted Answer · 2015-05-27 09:53:44Z

The problem is that you're mixing bytes and text strings: in Python 3, read() on the object returned by urlopen() gives you bytes, while your pattern is a text string (str). You should either decode your data into a text string (unicode), e.g. data.decode('utf-8'), or use a bytes object for the pattern, e.g. re.findall(b"[A-Za-z]", data) (note the leading b before the string literal).
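
As a minimal sketch of both options: the bytes literal here is a made-up stand-in for what read() returns (so the example runs without network access), and UTF-8 is an assumed encoding, not something the page is known to declare.

```python
import re

# Made-up stand-in for the bytes that urlopen(...).read() returns;
# this is NOT the real page content.
raw = b"%%$e@_q$^u__#a)^l)&i!t_y+"

# Mixing a str pattern with bytes input reproduces the error from the question.
try:
    re.findall("[A-Za-z]", raw)
    mismatch_raised = False
except TypeError:
    mismatch_raised = True

# Option 1: decode the bytes to str (assuming UTF-8), then use a str pattern.
text = raw.decode("utf-8")
letters_from_text = "".join(re.findall("[A-Za-z]", text))

# Option 2: keep the data as bytes and use a bytes pattern.
# The matches are bytes too, so they must be joined with b"".
letters_from_bytes = b"".join(re.findall(b"[A-Za-z]", raw))

print(letters_from_text)   # equality
print(letters_from_bytes)  # b'equality'
```

Decoding is usually the more convenient choice if you want to print or further process the result as text; the bytes pattern avoids having to guess the encoding.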

edited May 27, 2015 at 9:53

answered May 27, 2015 at 9:47
