Judging from the comments, the text is in the IBM extended 8-bit ASCII codepage, better known as code page 437 (CP437). To load files in that codepage, use Encoding.GetEncoding(437), e.g.:
    var cp437 = Encoding.GetEncoding(437);
    var input = File.ReadAllText(filePath, cp437);
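One caveat when running this on .NET Core or .NET 5+: legacy code pages such as 437 are not available until the code pages provider is registered. The sketch below shows that registration; the class and method names (Cp437Example, DecodeCp437) are made up for illustration. On the .NET Framework the RegisterProvider line is not needed, and CodePagesEncodingProvider is only available there through the System.Text.Encoding.CodePages package.

```csharp
using System;
using System.Text;

static class Cp437Example
{
    // Hypothetical helper: decodes a byte buffer as code page 437.
    public static string DecodeCp437(byte[] bytes)
    {
        // Required on .NET Core / .NET 5+, where only a handful of
        // encodings (UTF-8, UTF-16, ASCII, Latin-1, ...) are built in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        var cp437 = Encoding.GetEncoding(437);
        return cp437.GetString(bytes);
    }
}
```

For example, the bytes 63 61 66 82 decode to "café", because 0x82 is é in code page 437.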
The ? or � characters are the conversion error replacement characters returned when trying to read text using the wrong codepage. It's not possible to recover the original text from them.
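To see why the original text can't be recovered, here is a small sketch (the names ReplacementDemo and RoundTripThroughUtf8 are illustrative only) that decodes bytes with the wrong encoding, UTF-8 in this case, and then encodes the result again:

```csharp
using System;
using System.Text;

static class ReplacementDemo
{
    // Decodes the bytes as UTF-8 (the wrong encoding for CP437 data)
    // and encodes the resulting string back to bytes.
    public static byte[] RoundTripThroughUtf8(byte[] original)
    {
        // Any byte sequence that is invalid in UTF-8 becomes U+FFFD here.
        string wrong = Encoding.UTF8.GetString(original);

        // Encoding U+FFFD yields EF BF BD, not the byte that was replaced.
        return Encoding.UTF8.GetBytes(wrong);
    }
}
```

Passing in 63 61 66 82 ("café" in code page 437) gives back 63 61 66 EF BF BD: the 0x82 byte is gone, and every other invalid byte would have ended up as the same EF BF BD sequence.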
Encoding.Default is the system's default codepage, not some .NET-wide default. As the docs say:
The Default property in the .NET Framework
In the .NET Framework on the Windows desktop, the Default property always gets the system's active code page and creates a Encoding object that corresponds to it. The active code page may be an ANSI code page, which includes the ASCII character set along with additional characters that vary by code page. Because all Default encodings based on ANSI code pages lose data, consider using the Encoding.UTF8 encoding instead. UTF-8 is often identical in the U+00 to U+7F range, but can encode characters outside the ASCII range without loss.
Finally, both File.ReadAllText and the StreamReader class it uses will try to detect the encoding from the file's BOM (byte order mark) and fall back to UTF-8 if no BOM is found.
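A rough way to check what StreamReader decides for a given file is to read from it and then look at CurrentEncoding. This is only a sketch; BomDemo and DetectFromBom are names invented for the example, and it works on a byte array rather than a file so it is easy to try out.

```csharp
using System.IO;
using System.Text;

static class BomDemo
{
    // Returns the encoding StreamReader settles on after looking for a BOM,
    // falling back to UTF-8 when there is none.
    public static Encoding DetectFromBom(byte[] fileBytes)
    {
        using (var stream = new MemoryStream(fileBytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8,
                                             detectEncodingFromByteOrderMarks: true))
        {
            // CurrentEncoding is only updated once the reader has read data.
            reader.Peek();
            return reader.CurrentEncoding;
        }
    }
}
```

Bytes starting with FF FE (the UTF-16 little-endian BOM) should report code page 1200, while a buffer without a BOM reports 65001 (UTF-8). Note that this says nothing about whether a BOM-less file really is UTF-8.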
Detecting codepages
There's no reliable way to detect the encoding, because many codepages map the same bytes to different characters. Only bad matches can be identified reliably, because the resulting text will contain �.
What one can do is load the file's bytes once and try multiple encodings, eliminating those that contain �. Another step would be to check for expected non-English words or characters and eliminate the encodings that don't produce them.
Encoding.GetEncodings() returns an EncodingInfo for every registered encoding; calling GetEncoding() on it gives the actual Encoding. A rough method that finds probable encodings could be:
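    IEnumerable<Encoding> DetectEncodings(byte[] buffer)
    {
        // GetEncodings() returns EncodingInfo objects, not Encoding instances
        var candidates = from info in Encoding.GetEncodings()
                         let enc = info.GetEncoding()
                         let text = enc.GetString(buffer)
                         // '\uFFFD' is the � replacement character
                         where !text.Contains('\uFFFD')
                         select enc;
        return candidates;
    }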
Or, using value tuples to return the decoded text along with each encoding:
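    IEnumerable<(Encoding, string)> DetectEncodings(byte[] buffer)
    {
        // GetEncodings() returns EncodingInfo objects, not Encoding instances
        var candidates = from info in Encoding.GetEncodings()
                         let enc = info.GetEncoding()
                         let text = enc.GetString(buffer)
                         // '\uFFFD' is the � replacement character
                         where !text.Contains('\uFFFD')
                         select (enc, text);
        return candidates;
    }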
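The second elimination step mentioned earlier, checking for characters or words you expect to find, could look roughly like this. It is a sketch under assumptions: EncodingGuesser and FilterByExpectedText are hypothetical names, and the caller supplies both the candidate encodings and the expected strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

static class EncodingGuesser
{
    // Keeps only the candidates whose decoded text contains no U+FFFD
    // and contains every one of the expected strings.
    public static IEnumerable<Encoding> FilterByExpectedText(
        byte[] buffer, IEnumerable<Encoding> candidates, params string[] expected)
    {
        return from enc in candidates
               let text = enc.GetString(buffer)
               where !text.Contains("\uFFFD")
               where expected.All(word =>
                         text.IndexOf(word, StringComparison.Ordinal) >= 0)
               select enc;
    }
}
```

For the bytes 63 61 66 82 with UTF-8, Latin-1 (28591) and code page 437 as candidates and "é" as the expected text, only 437 survives: UTF-8 produces �, and Latin-1 maps 0x82 to a control character. More than one encoding can still pass, so treat the result as a shortlist rather than an answer.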