I wrote a Python daemon that parses web pages, but it sometimes throws errors because some pages are incompatible with the parser.
My question: how do I make the script keep running instead of stopping when an error occurs? And, if possible, how do I record all the errors in a log file?
Thanks.
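For the logging part, I assume the standard logging module can write straight to a file with a minimal setup like this (the filename and format are just my guesses, not from my real code):

import logging

logging.basicConfig(
    filename='parser_errors.log',   # example filename, pick any path
    level=logging.ERROR,            # only record errors and worse
    format='%(asctime)s %(levelname)s %(message)s',
)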
Part of my code:
from time import sleep
from random import uniform

# result is a list of rows; each row holds a page 'url' and 'id'
for row in result:
    page_html = getPage(row['url'])
    self.page_data = row
    if page_html is False:
        # the page could not be fetched, so drop it from the index
        self.deletePageFromIndex(row['id'])
        continue
    parser.mainlink = row['url']
    parser.feed(page_html)
    links = parser.links              # links extracted from the page
    words = wordParser(page_html)     # words extracted from the page
    # insert the data into the DB
    self.insertWords(words)
    self.insertLinks(links)
    # print row['url'] + ' parsed. sleep... '
    self.markAsIndexed(row['id'])
    sleep(uniform(1, 3))              # pause between pages
Should I wrap each iteration of the loop in try/except?
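In other words, is something like the sketch below the right direction? It only wraps the body of my loop from above; logging.exception and the logging setup shown earlier are my assumptions, everything else is my own code:

for row in result:
    try:
        page_html = getPage(row['url'])
        self.page_data = row
        if page_html is False:
            self.deletePageFromIndex(row['id'])
            continue
        parser.mainlink = row['url']
        parser.feed(page_html)
        self.insertWords(wordParser(page_html))
        self.insertLinks(parser.links)
        self.markAsIndexed(row['id'])
    except Exception:
        # record the full traceback in the log file, then move on
        logging.exception('failed to parse %s', row['url'])
    sleep(uniform(1, 3))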