I am reading the content of a web page and checking whether it contains a word with an umlaut. The word is present in the page content, but Python's find('ü') does not find it.
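import urllib2

# url is the address of the page being checked (defined elsewhere)
opener = urllib2.build_opener()
page_content = opener.open(url).read()  # raw bytes (a Python 2 str)
page_content.find('ü')  # returns -1 even though the word is on the page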
I also tried making the search string a unicode literal, u'ü'. Then I get this error:
'SyntaxError: (unicode error) 'utf8' codec can't decode byte 0xfc in position 0'
I have # -*- coding: utf-8 -*- at the top of my .py file.
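For context, byte 0xfc is how ü is stored in Latin-1 (ISO-8859-1), and on its own it is not a valid UTF-8 sequence. So the error may mean the .py file is saved as Latin-1 while its header declares UTF-8. A minimal sketch of that mismatch follows; the variable names are only for illustration and are not from my script:

```python
# -*- coding: utf-8 -*-
# Sketch: the same character has different byte representations.
latin1_bytes = u"\u00fc".encode("latin-1")  # one byte: 0xfc
utf8_bytes = u"\u00fc".encode("utf-8")      # two bytes: 0xc3 0xbc

# Decoding the Latin-1 byte as UTF-8 fails, which mirrors the
# "can't decode byte 0xfc" message in the SyntaxError.
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

If that is the cause, saving the file as UTF-8 in the editor (or changing the declared coding to match the file) would be the thing to check.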
I printed page_content, and there the umlaut ü shows up as 'ü'. If I search with page_content.find('ü') instead, it works fine. Is there a better solution for this?
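One approach that might be cleaner is to decode the downloaded bytes into a unicode string once and then search with a unicode string. This is only a sketch: it assumes the page is UTF-8 encoded (the real encoding would have to be taken from the response headers or the page itself), and the sample bytes below are a made-up stand-in for what opener.open(url).read() returns.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for opener.open(url).read(); assumed to be UTF-8 bytes.
page_content = u"<p>Gr\u00fc\u00dfe aus M\u00fcnchen</p>".encode("utf-8")

# Searching the raw bytes for the Latin-1 byte of "ü" finds nothing,
# because UTF-8 stores "ü" as the two bytes 0xc3 0xbc.
raw_position = page_content.find(b"\xfc")

# Decode once, then search with a unicode string.
page_text = page_content.decode("utf-8")
position = page_text.find(u"\u00fc")  # -1 would mean "not found"
found = position != -1
```

Writing the character as the escape u"\u00fc" keeps the search string independent of the encoding the .py file is saved in.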
I would greatly appreciate any suggestions.