I have to process an input text file, which may be in ANSI (Windows-1252), and convert it to UTF-8 while doing some processing of the lines read. In Python, that amounts to:
with open(input_file_location, 'r', newline='\r\n', encoding='cp1252') as old, open(output_file_location, 'w', encoding='utf_8') as new:
    for line in old:
        modified = ... do processing here ....
        new.write(modified)
However, this works as expected only if the input file really is ANSI (Windows). If the input file was already UTF-8, the code above typically still runs without raising an error: it decodes the UTF-8 bytes as cp1252, so non-ASCII characters in the output are garbled.
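To illustrate the symptom, here is a small sketch. The sample string is made up for the example and is not from my actual file:

```python
# Illustration only: "café" is a made-up sample, not data from the real input file.
sample = "café"
utf8_bytes = sample.encode("utf_8")   # the bytes a UTF-8 input file would hold on disk

# cp1252 assigns a character to both bytes of the encoded "é" (0xC3 and 0xA9),
# so decoding succeeds and no exception is raised.
misread = utf8_bytes.decode("cp1252")
print(misread)                        # cafÃ©

# Writing the misread text back out as UTF-8 bakes the damage into the output.
double_encoded = misread.encode("utf_8")
print(double_encoded == utf8_bytes)   # False
```

So the output file is valid UTF-8, but its content no longer matches the input.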
So the question is: how do I handle the case where the existing file is already UTF-8? I would like to either read it as UTF-8 or, better, skip the above processing entirely.
Thanks