I have the following code(gdaten[n][2] gives an URL, n is the index):
try:
p=urlparse(gdaten[n][2])
while p.scheme == "javascript" or p.scheme == "mailto":
p=urlparse(gdaten[n][2])
print(p," was skipped (", gdaten[n][2],")")
n += 1
print ("check:", gdaten[n][2])
f = urllib.request.urlopen(gdaten[n][2])
htmlcode = str(f.read())
parser = MyHTMLParser(strict=False)
parser.feed(htmlcode)
except urllib.error.URLError:
#do some stuff
except IndexError:
#do some stuff
except ValueError:
#do some stuff
Now I have the following error:
urllib.error.URLError: <urlopen error unknown url type: javascript>
in line 8. How is that possible? I thought with the while-loop I skip all those links with the scheme javascript? Why does the except not work? Where's my fault?
MyHTMLParserappends the links found on the website to gdaten like that [[stuff,stuff, link][stuff,stuff, link]