import re
import urllib
p = urllib.urlopen("http://sprunge.us/QZhU")
page = p.read()
pos = page.find("<h2><span>")
print page[pos:pos+48]
c = re.compile(r'<h2><span>(.*)</span>')
print c.match(page).group(1)

When I run it:

shadyabhi@archlinux $ python2 temp.py 
<h2><span>House.S08E02.HDTV.XviD-LOL.avi</span> 
Traceback (most recent call last):
  File "temp.py", line 8, in <module>
    print c.match(page).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
shadyabhi@archlinux $ 

If find can locate the string, why does the regex fail to match it? I have read http://docs.python.org/howto/regex.html#regex-howto, but it did not help.

1 Answer


match only matches at the beginning of the string, so it returns None here: the page does not start with <h2><span>. Use search, finditer or findall instead.
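To make the difference concrete, here is a small sketch. The page string is a hypothetical stand-in for the downloaded document, not its real content, and the variable names are made up for illustration. The print calls take a single argument, so the snippet should run under both Python 2 and Python 3.

```python
import re

# Hypothetical stand-in for the downloaded page; the real page is not reproduced here.
page = ("<html><body>\n"
        "<h2><span>House.S08E02.HDTV.XviD-LOL.avi</span></h2>\n"
        "</body></html>")

pattern = re.compile(r'<h2><span>(.*)</span>')

# match() only tries position 0, where the text is "<html>", so it finds nothing.
at_start = pattern.match(page)
print(at_start)

# search() scans forward until the pattern fits somewhere in the string.
anywhere = pattern.search(page)
print(anywhere.group(1))

# findall() returns the captured group of every match as a list of strings.
print(pattern.findall(page))
```

Because match returns None rather than raising, the AttributeError only appears later, when .group(1) is called on that None.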

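Also note that * is greedy: .* consumes as much of the line as it can, so if a line contains more than one </span>, the group runs up to the last one. You might want to change your regex to r'<h2><span>(.*?)</span>'.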
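A quick way to see the effect of greediness is to run both patterns against a line with two closing tags. The input below is invented for illustration and does not come from the page in the question.

```python
import re

# Hypothetical one-line snippet with two <span> elements, made up for illustration.
line = "<h2><span>first</span> and <span>second</span></h2>"

# Greedy: .* grabs as much as possible, then backs off to the LAST </span>.
greedy = re.search(r'<h2><span>(.*)</span>', line).group(1)

# Non-greedy: .*? grabs as little as possible, stopping at the FIRST </span>.
lazy = re.search(r'<h2><span>(.*?)</span>', line).group(1)

print(greedy)  # first</span> and <span>second
print(lazy)    # first
```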

In summary, the following works for me:

import re
import urllib
p = urllib.urlopen("http://sprunge.us/QZhU")
page = p.read()
pos = page.find("<h2><span>")
print page[pos:pos+48]
c = re.compile(r'<h2><span>(.*?)</span>')
print c.search(page).group(1)
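Since search also returns None when nothing matches, it can be worth checking the result before calling group. A minimal sketch follows. The helper name first_span_text and the sample strings are hypothetical, not part of the original script.

```python
import re

SPAN_RE = re.compile(r'<h2><span>(.*?)</span>')

def first_span_text(html):
    """Return the text of the first <h2><span> element, or None if there is none."""
    m = SPAN_RE.search(html)
    # A match object is truthy; None means the pattern was not found anywhere.
    return m.group(1) if m else None

print(first_span_text("<p>x</p><h2><span>House.S08E02</span></h2>"))
print(first_span_text("<p>no heading here</p>"))
```

This avoids the AttributeError from the question if the page layout ever changes and the heading disappears.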

4 Comments

Does "beginning" mean "until the first newline comes"? Also, please tell me what happens when I add "?". It matches zero or one repetition of the previous RE. What is the previous RE? Isn't the whole string an RE?
@shadyabhi: No, it means that the first character of the string must be the first character of the match. Matches beginning at the second character and thereafter are not considered. In other words, for match to work, the HTML document must begin with <h2><span>..., not contain it somewhere in the middle.
Also, please tell me what happens when I add "?". It matches zero or one repetition of the previous RE. What is the previous RE? Isn't the whole string an RE?
@shadyabhi: *? is a single operator. It matches as few characters as possible, whereas * matches as many as possible. See docs.python.org/library/re.html#regular-expression-syntax
