I've searched everywhere for this answer so hopefully it's not a duplicate. I decided I'm just finally going to ask it here.

I have a file named Program1.exe. When I drag that file into Notepad or Notepad++, I get all kinds of random symbols and then some readable text. However, when I try to read this file in C#, I either get inaccurate results or just a big MZ. I've tried all supported encodings in C#. How can notepad programs read a file like this when I simply can't? I tried converting the bytes to a string and it doesn't work. I tried reading it line by line and it doesn't work. I've even tried reading it as binary and it doesn't work.

Thanks for the help! :)

  • What class are you using to read it? Have any sample code to look at? Commented Mar 13, 2014 at 20:37

2 Answers

Reading a binary file as text is a peculiar thing to do, but it is possible. Any of the 8-bit encodings will do it just fine. For example, the code below opens and reads an executable and outputs it to the console.

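using System;
using System.IO;
using System.Text;

const string fname = @"C:\mystuff\program.exe";
// Windows-1252 is an 8-bit encoding, so every byte decodes to exactly one character.
// On .NET Core / .NET 5+, register CodePagesEncodingProvider before calling GetEncoding.
using (var reader = new StreamReader(fname, Encoding.GetEncoding("windows-1252")))
{
    var s = reader.ReadToEnd();
    s = s.Replace('\0', ' '); // replace NUL characters with spaces
    Console.WriteLine(s);
}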

The result is very similar to what you'll see in Notepad or Notepad++. The "funny symbols" will differ based on how your console is configured, but you get the idea.

By the way, if you examine the string in the debugger, you're going to see something quite different, because the debugger displays those funny symbols as C# character escapes. For example, NUL bytes (value 0) will display as \0 in the debugger, as NUL in Notepad++, and as spaces on the console or in Notepad. Carriage returns and line feeds show up as \r and \n in the debugger, and so on.

As I said, reading a binary file as text is pretty peculiar. Unless you're just looking to see if there's human-readable data in the file, I can't imagine why you'd want to do this.
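If the goal really is just to see whether the file contains human-readable data, a minimal sketch along the following lines may be easier to work with than dumping the whole file as text. It collects runs of printable ASCII bytes, roughly in the spirit of the Unix `strings` tool. The class name `PrintableRuns`, its method names, and the default minimum length of 4 are assumptions made for illustration; none of them come from the answer above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of the original answer.
public static class PrintableRuns
{
    // Returns every run of at least minLength printable ASCII bytes (0x20..0x7E).
    public static List<string> Find(byte[] data, int minLength = 4)
    {
        var runs = new List<string>();
        var current = new StringBuilder();

        foreach (byte b in data)
        {
            if (b >= 0x20 && b <= 0x7E)
            {
                current.Append((char)b);
                continue;
            }

            // A non-printable byte ends the current run.
            if (current.Length >= minLength)
                runs.Add(current.ToString());
            current.Clear();
        }

        // Keep a run that reaches the end of the data.
        if (current.Length >= minLength)
            runs.Add(current.ToString());

        return runs;
    }

    // Convenience wrapper that loads the whole file into memory first.
    public static List<string> FindInFile(string path, int minLength = 4)
    {
        return Find(File.ReadAllBytes(path), minLength);
    }
}
```

Calling `PrintableRuns.FindInFile(@"C:\mystuff\program.exe")` would then return only the readable fragments. Note that this sketch only looks for single-byte ASCII text, so strings stored as UTF-16 inside the executable will not be found.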

Update

I suspect the reason all you see in the Windows Forms TextBox is "MZ" is that the Windows textbox control (which is what the TextBox ultimately uses) treats the NUL character as a string terminator, so it won't display anything after the first NUL. And the first thing after the "MZ" is a NUL (it shows as '\0' in the debugger). You'll have to replace the NULs in the string with spaces. I edited the code example above to show how you'd do that.
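For the TextBox case, the NUL replacement can be wrapped in a small helper that is applied before the string is assigned to the control. The sketch below is one possible shape for it. The class name `DisplayText` is hypothetical, and replacing the remaining control characters with '.' goes beyond what the answer suggests; it is an assumption added here purely to make the output easier to read.

```csharp
using System.Text;

// Hypothetical helper for illustration; not part of the original answer.
public static class DisplayText
{
    // Replaces NUL with a space so the text is not cut off at the first NUL.
    // Tab, CR and LF are kept; every other control character becomes '.'
    // (that last step is an assumption, not something the answer requires).
    public static string Sanitize(string raw)
    {
        var sb = new StringBuilder(raw.Length);
        foreach (char c in raw)
        {
            if (c == '\0')
                sb.Append(' ');
            else if (char.IsControl(c) && c != '\r' && c != '\n' && c != '\t')
                sb.Append('.');
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

In a form, usage would look something like `textBox1.Text = DisplayText.Sanitize(s);`, where `textBox1` stands for whatever TextBox the form contains and `s` is the string read from the file.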

2 Comments

Thanks! However, how do I get the same results in a windows form textbox? I try the same but I still get MZ.
Dude...!!! I can finally do what I've been meaning! - Funny thing: If I open an executable(*.exe), near the middle-ish / bottom-ish of the RichTextBox, I am finding bits of the application's source code... I loaded MozillaFirefox.exe into it, and I found this.21TG20.Reloa1621-, which looks like the code for the Reload button! I am sooo happy right now! Thumbs up, man!

The exe is a binary file, and if you try to read it as a text file you'll get the effect that you are describing. Try using something like a FileStream instead, which does not care about the structure of the file and treats it as just a series of bytes.
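As a rough sketch of the byte-oriented approach, the code below reads the first few bytes of a file with a FileStream and formats them as hex, with no text decoding involved. The class name `ByteDump` and its method names are hypothetical. The check for "MZ" relies only on the fact that 0x4D and 0x5A are the ASCII codes for 'M' and 'Z', the two characters the question reports seeing at the start of the file.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of the original answer.
public static class ByteDump
{
    // Reads up to count bytes from the start of the file as raw bytes.
    public static byte[] ReadHead(string path, int count)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            var buffer = new byte[count];
            int total = 0;

            // Read may return fewer bytes than requested, so loop until full or end of file.
            while (total < count)
            {
                int n = fs.Read(buffer, total, count - total);
                if (n == 0)
                    break;
                total += n;
            }

            // Trim the buffer when the file is shorter than count.
            if (total < count)
                Array.Resize(ref buffer, total);
            return buffer;
        }
    }

    // Formats bytes as space-separated two-digit hex values, e.g. "4D 5A".
    public static string ToHex(byte[] data)
    {
        var sb = new StringBuilder(data.Length * 3);
        foreach (byte b in data)
        {
            if (sb.Length > 0)
                sb.Append(' ');
            sb.Append(b.ToString("X2"));
        }
        return sb.ToString();
    }

    // True when the data begins with the ASCII bytes for "MZ".
    public static bool StartsWithMz(byte[] data)
    {
        return data.Length >= 2 && data[0] == 0x4D && data[1] == 0x5A;
    }
}
```

Something like `Console.WriteLine(ByteDump.ToHex(ByteDump.ReadHead(path, 16)));` would then show the raw values, including the bytes that a text control cannot display. This will not look like Notepad's output, since Notepad decodes the bytes as characters, but it shows exactly what is in the file.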

1 Comment

I've used StreamReader, FileStream, and a BufferedStream. None return what I see when I open the file in Notepad or Notepad++
