Update: I've tested my regular expression with this code:
import re
pattern = r'^data-id="*/d"$'
html='data-id="89897907"'
m=re.search(pattern,html)
print m.group()
And I still got None for m.
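A minimal sketch of one likely cause, assuming the intent is to match one or more digits between the quotes: `/d` matches a literal slash followed by the letter `d`, whereas the digit class is written `\d`, and `"*` repeats the quote character rather than the digits. The names `original` and `fixed` are illustrative only.

```python
import re

html = 'data-id="89897907"'

# Pattern as posted: "* repeats the quote, and /d is a literal "/d".
original = r'^data-id="*/d"$'
# Assumed intent: one or more digits (\d+) between the quotes.
fixed = r'^data-id="\d+"$'

m_original = re.search(original, html)
m_fixed = re.search(fixed, html)

# re.search returns None when nothing matches, so check before calling .group().
print(m_original is None)
if m_fixed is not None:
    print(m_fixed.group())
```

Checking for None first also avoids the AttributeError that `m.group()` would raise when there is no match.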
I'm writing a web spider in Python, but when I try to use a regular expression to get all the strings like data-id="798789", I run into a problem. My code is below:
import sys
import urllib
import urllib2
import cookielib
import re
from urllib2 import Request, urlopen, URLError, HTTPError
url="https://www.secure.pixiv.net/login.php"
#Process the cookie
cookie = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
#POST data to Pixiv
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0'}
values={'mode':'login','pixiv_id':'username','pass':'password','skip':'1'}
data=urllib.urlencode(values)
req=urllib2.Request(url,data)
#ERRORS
try:
    response = opener.open(req,timeout=10)
except URLError, e:
    if hasattr(e, 'code'):
        print 'The server couldn\'t fulfill the request.'
        print 'Error code: ', e.code
    elif hasattr(e, 'reason'):
        print 'We failed to reach a server.'
        print 'Reason: ', e.reason
else:
    print 'No exception was raised.'
res=opener.open('http://www.pixiv.net/ranking.php?mode=daily')
html = res.read()
pattern = r'^data-id="*/d"$'
m=re.search(pattern,html)
print m.group()
I ran the code and m was None. Is there anything wrong?
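A sketch of collecting every id from a whole page, assuming the ids are digits only. Without `re.MULTILINE`, the `^` and `$` anchors only match at the start and end of the whole string, so they are dropped here, and `re.findall` is used in place of `re.search` to return all matches instead of the first one. The `sample_html` string is made up for illustration and is not Pixiv's actual markup.

```python
import re

# Hypothetical page fragment; the real markup may differ.
sample_html = (
    '<section data-rank="1" data-id="798789"></section>\n'
    '<section data-rank="2" data-id="89897907"></section>\n'
)

# No ^/$ anchors, so the pattern can match anywhere in the page.
# The parentheses capture only the digits.
pattern = r'data-id="(\d+)"'

ids = re.findall(pattern, sample_html)
print(ids)
```

Because the pattern contains one capturing group, `re.findall` returns a list of the captured digit strings rather than the full `data-id="..."` text.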