37

I'm trying to parse a binary file, and I need some help on where to go. I've been looking online for "parsing binary files", "reading binary files", "reading text inside binaries", etc., and I haven't had any luck.

For example, how would I read this text out of this binary file? Any help would be MUCH appreciated. I am using PowerShell.

[Screenshot of the binary file's contents; image not available]

1
  • What are those numbers you get when running [System.IO.File]::ReadAllBytes? I tried to create an empty text file and then read it via [System.IO.File]::ReadAllBytes. The output was: 255 254 13 0 10 0. Commented Oct 12, 2012 at 10:12
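
On the question in the comment above: the numbers are the file's bytes printed in decimal. A minimal sketch, assuming the "empty" file was saved as UTF-16 LE with a byte order mark and a single line break (the comment does not say how the file was created, so that part is a guess):

```powershell
# The six values reported in the comment, as a byte array.
[byte[]]$bytes = @(255, 254, 13, 0, 10, 0)

# Show them in hexadecimal, the way a hex editor would.
$hex = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$hex                  # FF FE 0D 00 0A 00

# FF FE is the UTF-16 LE byte order mark. The remaining four bytes
# decode to a carriage return followed by a line feed.
$text = [System.Text.Encoding]::Unicode.GetString($bytes, 2, 4)
$text -eq "`r`n"      # True
```

So the file was not really empty: it held a byte order mark and one CR LF pair.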

4 Answers

49

It seems that you have a binary file with text at a fixed or otherwise deducible position. Get-Content might help, but it will try to parse the entire file into an array of strings, producing an array of "garbage". Also, you wouldn't know which file position a particular run of characters came from.

You can use the .NET classes File to read and Encoding to decode. Each is a one-line call:

# Read the entire file to an array of bytes.
$bytes = [System.IO.File]::ReadAllBytes("path_to_the_file")
# Decode first 12 bytes to a text assuming ASCII encoding.
$text = [System.Text.Encoding]::ASCII.GetString($bytes, 0, 12)

In your real case you'd probably loop through the byte array to find the start and end of a particular string, then pass those indices to GetString as the range of bytes to decode.
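
The loop described above might look roughly like this. It is only a sketch under a few assumptions: the text is plain ASCII, a "string" means a run of at least four printable bytes, and PowerShell 3.0 or later is available (for [pscustomobject]). Find-AsciiRun is a made-up helper name, and the sample bytes stand in for a real file.

```powershell
# Hypothetical helper: returns every run of printable ASCII bytes that is at
# least $MinLength long, together with the offset where the run starts.
function Find-AsciiRun {
    param(
        [byte[]]$Bytes,
        [int]$MinLength = 4
    )

    $start = -1   # index where the current run began, or -1 when not in a run
    for ($i = 0; $i -le $Bytes.Length; $i++) {
        # The position just past the end counts as non-printable, so a run
        # that reaches the end of the array is still reported.
        $printable = ($i -lt $Bytes.Length) -and ($Bytes[$i] -ge 0x20) -and ($Bytes[$i] -le 0x7E)

        if ($printable) {
            if ($start -lt 0) { $start = $i }
        }
        elseif ($start -ge 0) {
            $length = $i - $start
            if ($length -ge $MinLength) {
                [pscustomobject]@{
                    Offset = $start
                    Text   = [System.Text.Encoding]::ASCII.GetString($Bytes, $start, $length)
                }
            }
            $start = -1
        }
    }
}

# Sample data standing in for a real file. With a real file you would use
# $bytes = [System.IO.File]::ReadAllBytes("path_to_the_file")
[byte[]]$bytes = @(0, 1, 2) + [System.Text.Encoding]::ASCII.GetBytes('HELLO') +
                 @(0, 255) + [System.Text.Encoding]::ASCII.GetBytes('World!')

$runs = @(Find-AsciiRun -Bytes $bytes)
$runs   # two objects: Offset 3 / HELLO and Offset 10 / World!
```

Because each result carries its offset, you can go back to the byte array and inspect what surrounds a given string.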

The .NET methods I mentioned are available in .NET Framework 2.0 or higher. If you have PowerShell 2.0 installed, you already have them.


1 Comment

As an aside, you can convert bytes back to a file with: [System.IO.File]::WriteAllBytes($outputPath, $bytes)
6

You can read in the file via Get-Content -Encoding Byte. I'm not sure how to parse it, though.

8 Comments

technet.microsoft.com/en-us/library/hh849787.aspx doesn't list Encoding as a valid flag anywhere.
@AlwaysLearning, have you tried it? It converts every byte into its base-10 string representation when output. (In fact, exactly the same as [System.IO.File]::ReadAllBytes). Not exactly what's asked for, you have to convert the output first.
@SilverbackNet: Also try Get-Content ... -Encoding Byte -Raw which returns [System.Byte[]] instead of [System.Object[]] but otherwise is handled exactly the same.
byte isn’t supported in PowerShell 6.
RE comment by Franklin Yu: byte isn't supported as of PS6 since, strictly, it is not a character encoding. The functionality is still available by specifying -AsByteStream instead of -Encoding. Reading the entire file into a variable results in an Object[] of the bytes (or an actual Byte[] if -Raw is used), which can then be processed as desired. In the example, the strings (note the embedded '\0') could be obtained by -join [char[]]$bytearray[0xe00..0xe11] and -join [char[]]$bytearray[0xe13..0xe54]. The method used rather depends upon what is meant by "parse a binary file".
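
Putting the comments above together, a sketch that reads the bytes on either PowerShell generation and then slices out a range with -join [char[]]. The sample file, its contents, and the 2..5 range are invented for illustration; they are not the offsets from the question. -Raw needs PowerShell 3.0 or later.

```powershell
# Hypothetical sample file so the sketch is self-contained.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'parse-binary-sample.bin'
[System.IO.File]::WriteAllBytes($path, [byte[]]@(0, 1, 72, 105, 0, 33, 255))

# PowerShell 6+ uses -AsByteStream; Windows PowerShell uses -Encoding Byte.
if ($PSVersionTable.PSVersion.Major -ge 6) {
    $bytearray = Get-Content -Path $path -AsByteStream -Raw
}
else {
    $bytearray = Get-Content -Path $path -Encoding Byte -Raw
}

# Convert bytes 2 through 5 to characters and join them into one string.
# Every byte in the range is kept, including the embedded NUL.
$text = -join [char[]]$bytearray[2..5]
$text.Length   # 4 characters: 'H', 'i', NUL, '!'

Remove-Item -Path $path
```

Casting to [char[]] treats each byte as one character, so this suits single-byte text; for other encodings, [System.Text.Encoding] with GetString is the safer route.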
6

If you're just looking for strings, check out the strings.exe utility from SysInternals.
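
If installing a separate tool is not an option, something loosely similar can be sketched in PowerShell itself. This is not a reimplementation of strings.exe: it only finds single-byte ASCII text, and the minimum length of 4 is an arbitrary choice made here. The sample bytes are invented, and [pscustomobject] assumes PowerShell 3.0 or later.

```powershell
# Hypothetical sample bytes; for a real file use
# [System.IO.File]::ReadAllBytes("path_to_the_file") instead.
[byte[]]$bytes = @(0, 200, 7) + [System.Text.Encoding]::ASCII.GetBytes('config.ini') +
                 @(0, 0) + [System.Text.Encoding]::ASCII.GetBytes('ab') + @(255)

# ISO-8859-1 (code page 28591) maps every byte value to exactly one character,
# so a character index in the decoded string is also a byte offset in the file.
$latin1 = [System.Text.Encoding]::GetEncoding(28591)
$decoded = $latin1.GetString($bytes)

# Find runs of four or more printable ASCII characters.
$found = @([regex]::Matches($decoded, '[\x20-\x7E]{4,}') | ForEach-Object {
    [pscustomobject]@{ Offset = $_.Index; Text = $_.Value }
})

$found   # one object: Offset 3 / config.ini ('ab' is too short to be reported)
```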


0

There is an example on microsoft.com here:

$byteArray = Get-Content -Path C:\temp\test.txt -AsByteStream -Raw
Get-Member -InputObject $byteArray

After this we can do whatever we want with the data. I prefer to omit -Raw and deal with the array. So I could for example write:

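# Read the first 50 bytes and keep only values 33..127 (this drops spaces and control characters).
$byteArray = Get-Content -Path C:\temp\test.txt -AsByteStream -TotalCount 50 | where { $_ -gt 32 -and $_ -lt 128 }
# Decode the remaining bytes as text.
[System.Text.Encoding]::UTF8.GetString($byteArray)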
