
I'm trying to read in a binary file, store the data as a string of hex values, use regular expressions to modify some hex values, and spit the data out into a new binary file (that is a slightly modified version of the original).

I can do all of those steps except for the last one. When I open the new binary file in the hex editor, it all seems to be wrong... I want it to be exactly how my hex string is, but in the hex editor.

Here is what I'm attempting to do when creating the new file:

# Format data string into an array for file processing
@data_ary = unpack('H*', $data_str);

generate_new_file($filname, \@data_ary);


sub generate_new_file
{
     my $fname = "mod -" . shift(@_);
     my $aref = shift(@_);

     open(BIN, ">", $fname) or die;
     binmode(BIN);

     for my $nybble(@$aref)
     {
         print (BIN $nybble)
     }
     close(BIN);
}

I'm guessing my problem has to do with my use of unpack. But I'm not really sure how else to get a huge hex string into a form where it will actually be read as hex and not ASCII characters. Any suggestions are greatly appreciated!

The code shown above is only for trying to output the data into a new file. I already have all the hex I want to output in $data_str. So the unpack is an attempt to get the string of hex into a list of hex values.

I'm getting closer. I removed the unpack from the beginning since my data is already a single string of hex. So I just split it and put it into the array. This at least gets the size of the file correct now. However, my new problem is that it's cropping off the second part of every byte and replacing it with a 0 (when viewed in the hex editor)... But when I print, the elements of the array get the correct data. Any ideas? New code below:

# Format data string into an array for file processing
@data_ary = split //, $data_str;
generate_new_file($filname, \@data_ary);


sub generate_new_file
{
     my $fname = "mod -" . shift(@_);
     my $aref = shift(@_);

     open(BIN, ">", $fname) or die;
     binmode(BIN);

     for (my $i = 0; $i < @$aref; $i += 2)
     {
         my ($hi, $lo) = @$aref[$i, $i+1];
         print BIN pack "H", $hi.$lo;
     }

     close(BIN);
}

I figured it out! I forgot the "*" when calling pack, so it only packed the first hex character of each pair instead of both. The finished code is below. Thanks Amon!

# Format data string into an array for file processing
@data_ary = split //, $data_str;
generate_new_file($filname, \@data_ary);


sub generate_new_file
{
     my $fname = "mod -" . shift(@_);
     my $aref = shift(@_);

     open(BIN, ">", $fname) or die;
     binmode(BIN);

     for (my $i = 0; $i < @$aref; $i += 2)
     {
         my ($hi, $lo) = @$aref[$i, $i+1];
         print BIN pack "H*", $hi.$lo;
     }
     close(BIN);
}
  • If you unpack, you usually have to pack back to get the same format. Commented Jul 25, 2013 at 14:36

1 Answer

Here, the unpack returns a single string, not an array of values. If you want to have an array of hex characters (each denoting 4 bits), then you have to split the resulting string:

my @data = split //, unpack "H*", $data;

(Use split /..\K/ instead of split // to break the hex string into byte-equivalents, i.e. two hex digits per element.)
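To make the point about context concrete, here is a minimal sketch using three made-up bytes in place of real file contents. The variable names are illustrative only, and unpack "(A2)*" is swapped in as one alternative way of getting byte-sized pairs:

```perl
use strict;
use warnings;

# Three raw bytes standing in for file contents (made-up sample data).
my $raw = "\x00\xA0\x97";

# Even in list context, the H* template yields exactly one element.
my @fields = unpack "H*", $raw;
print scalar(@fields), " element: $fields[0]\n";   # 1 element: 00a097

# One hex digit (nybble) per element.
my @nybbles = split //, $fields[0];

# Two hex digits (one byte) per element.
my @bytes = unpack "(A2)*", $fields[0];

print "@nybbles\n";   # 0 0 a 0 9 7
print "@bytes\n";     # 00 a0 97
```

Note that unpack "H*" produces lowercase hex digits, so any regular expression applied to the string afterwards should either expect lowercase or use the /i modifier.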

Before printing this data to a filehandle, you also have to pack it to get the original binary data back. I recommend packing at least one whole byte (two hex digits) at a time:

for (my $i = 0; $i < @$aref; $i += 2) {
   my ($hi, $lo) = @$aref[$i, $i+1];
   print OUT pack "H*", $hi.$lo;
}
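Since pack "H*" accepts a hex string of any length, the loop is not strictly required. The sketch below packs the whole string in one call and writes it to a temporary file. The helper name write_hex_as_binary, the sample bytes, and the example substitution are all assumptions made for illustration, not part of the original code:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: turn a hex string back into raw bytes and write them.
sub write_hex_as_binary {
    my ($fname, $hex) = @_;
    die "odd number of hex digits\n" if length($hex) % 2;
    open(my $out, '>:raw', $fname) or die "Cannot open $fname: $!";
    print {$out} pack("H*", $hex);    # the whole string in a single call
    close($out) or die "Cannot close $fname: $!";
}

# Made-up sample data in place of a real input file.
my $data_str = unpack "H*", "\x00\xA0\x97";   # "00a097"
$data_str =~ s/a0/b1/;                         # example edit on the hex text

# Write to a temporary file so the sketch is self-contained.
my ($tmp_fh, $fname) = tempfile();
close($tmp_fh);
write_hex_as_binary($fname, $data_str);

print "wrote ", -s $fname, " bytes\n";   # wrote 3 bytes
```

One caveat with editing hex text: a pattern such as a0 can also match across a byte boundary (the last digit of one byte plus the first digit of the next), so patterns may need anchoring to even offsets depending on the data.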

5 Comments

Just for clarification, how is it that it returns a single string? That unpack should not be in scalar context because of the assignment to an array, right? I'm really confused now lol. I just tried to implement this and I'm getting a different result, but it's still not correct. The resulting file is somehow twice as large as the original when it should be the same size. It's also completely messed up in the hex editor.
Because the H pattern creates a hex string, only a single value is returned, regardless of context. It does not create a list like qw/00 A0 97/ but the string "00A097". When we split this string into nybbles (qw/0 0 A 0 9 7/), we have to join them into pairs before packing. This is what my loop is about. If we don't do that, we'd create the equivalent of "\x00\x00\xA0\x00\x90\x70".
OK, that makes sense. I'm getting closer. I removed the unpack from the beginning since my data is already a single string of hex. So I just split it and put it into the array. This at least gets the size of the file correct now. However, my new problem is that it's cropping off the second part of every byte and replacing it with a 0... Any ideas?
Sorry about that. I've made an edit to the question with the code I'm now using. It's basically what you suggested but without the first unpack.
My fault, just use the pattern H* and it should work fine. (Sorry, I didn't see your edit.)
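The two symptoms discussed in these comments (a file twice the expected size, and every second nybble replaced by 0) can be reproduced with a short sketch. The sample values are made up for illustration:

```perl
use strict;
use warnings;

my $pair = "a9";

# Without a count, H consumes a single hex digit; the low nybble is zero-filled.
my $one = pack "H",  $pair;
# H* consumes every hex digit in the string.
my $all = pack "H*", $pair;

print unpack("H*", $one), "\n";   # a0
print unpack("H*", $all), "\n";   # a9

# Packing each nybble separately emits one byte per hex digit,
# which doubles the size of the output.
my $doubled = join '', map { pack "H*", $_ } split //, "00a097";
print unpack("H*", $doubled), "\n";   # 0000a0009070
```

Using an explicit count such as pack "H2" would also work for a single byte, since the count for H is measured in hex digits.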
