Parsing an HTML tag in Python with regex not working

Question

I'm having a bit of trouble with this code, as it isn't working the way I intend. I know regular expressions aren't the best way to do this, but I couldn't figure out how to do it with the HTML parser, and Beautiful Soup isn't an option. Here's what I'm trying to do: I have an HTML file and I need to extract the value between

<div class="e_mail"> and </div>

When I use the code below, however, it returns the email address like this:

['[email protected]']

How can I get the email address without the brackets and quotes? I'd rather use something cleaner than regex, but as I said, I couldn't figure out the HTML parser.

import re
import urllib  # Python 2: urllib.urlopen also accepts a local file path

f = urllib.urlopen('results.html')
s = str(f.read())
return re.compile('<div class="e_mail">(.*?)</div>', re.DOTALL).findall(s)

That worked great. I was trying to do that but was going about it all wrong. I know regex isn't the right way to do this, but I don't really need anything better. Thanks again.
– Bobbin Threadbare, commented Nov 15, 2012 at 22:36

Joel Cornett · Accepted Answer · 2012-11-15 22:32:14Z

1

findall returns a list, so take its first element (expr is your pattern):

return re.compile(expr, re.DOTALL).findall(s)[0]

Alternatively:

return re.findall(r'<div class="e_mail">(.*?)</div>', s, re.DOTALL)[0]

Note that if there are no results, you'll get an IndexError because re.findall will simply return an empty list.
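To avoid that IndexError, one option is to use re.search and check for a missing match before reading the group. The following is only a sketch in Python 3: the helper name extract_email, the default parameter, and the strip() call are assumptions added for illustration, while the pattern is the one from the question.

```python
import re

# Pattern taken from the question; the helper name is hypothetical.
EMAIL_DIV = re.compile(r'<div class="e_mail">(.*?)</div>', re.DOTALL)

def extract_email(html, default=None):
    """Return the text of the first e_mail div, or `default` if none is found."""
    match = EMAIL_DIV.search(html)
    if match is None:
        # No matching div: return the fallback instead of raising.
        return default
    # group(1) is the captured text between the tags; strip() trims
    # surrounding whitespace (an assumption about the desired output).
    return match.group(1).strip()
```

Calling extract_email(s) on the page contents then yields a plain string rather than a one-element list, and None when the div is absent.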



Sibi · Accepted Answer · 2012-11-15 22:31:20Z

0

This may work for you:

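import re
import urllib  # Python 2: urllib.urlopen also accepts a local file path

f = urllib.urlopen('results.html')
s = str(f.read())
# findall returns a list of captured groups, one per matching div
email = re.compile('<div class="e_mail">(.*?)</div>', re.DOTALL).findall(s)
return email[0]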

Also make sure it is not an empty list before returning it.
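Since the question mentions preferring the standard HTML parser over a regex, here is a rough sketch of that route using html.parser from the Python 3 standard library (in Python 2 the module is named HTMLParser). The class and function names are hypothetical, and the sketch assumes only the first div with class e_mail is wanted. It returns None when nothing is found, which also covers the empty-result case.

```python
from html.parser import HTMLParser

class EmailDivParser(HTMLParser):
    """Collects the text inside the first <div class="e_mail"> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # > 0 while inside the target div
        self.done = False   # True once the target div has been closed
        self.parts = []     # text fragments found inside the target div

    def handle_starttag(self, tag, attrs):
        if self.done or tag != "div":
            return
        if self.depth:
            # A nested div inside the target: track it so we know when the target closes.
            self.depth += 1
        elif "e_mail" in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

def extract_email_with_parser(html):
    """Return the stripped text of the first e_mail div, or None if absent."""
    parser = EmailDivParser()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts).strip()
    return text or None
```

Unlike the regex, this keeps working if the div carries extra attributes or additional class names, at the cost of more code.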
