I have a python script as a subversion pre-commit hook, and I encounter some problems with UTF-8 encoded text in the submit messages. For example, if the input character is "å" the output is "?\195?\165". What would be the easiest way to replace those character parts with the corresponding byte values? Regexp doesn't work as I need to do processing on each element and merge them back together.
code sample:
infoCmd = ["/usr/bin/svnlook", "info", sys.argv[1], "-t", sys.argv[2]]
info = subprocess.Popen(infoCmd, stdout=subprocess.PIPE).communicate()[0]
info = info.replace("?\\195?\\166", "æ")