5

I am working on a PHP application that has to parse strings sent by another program. The problem is that some strings contain octal escapes and other escape sequences in the middle.

So instead of script>XYZ, I am getting:

\103RI\120T>XYZ%6En \151\156 d%6Fcu\155%65n..

I need to print this string back decoded. I tried octdec(), urldecode(), and so on, but octdec() only converts one number at a time and urldecode() doesn't decode the octal escapes. Does anyone have suggestions?

  • It's difficult to tell from such a small snippet: have you tried base64? Commented Jun 24, 2010 at 13:20
  • @Mark: No chance that could ever be base64. Commented Jun 24, 2010 at 13:22

4 Answers

1

Use preg_replace_callback() with a pattern that matches both the octal escapes and the percent escapes, making sure the match also includes the leading \ or % character. Based on that first character, the callback can tell whether it has to convert an octal number or a hexadecimal escape sequence.

The callback can convert the digits from octal or hexadecimal to a decimal character code using base_convert() (base_convert($match, 8, 10) in the first case; base_convert($match, 16, 10) in the second case, where $match holds only the digits), and then pass that code to chr() to get the character.
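As a minimal sketch of the approach described above, assuming octal escapes have one to three digits and percent escapes have exactly two hex digits (the function name decodeEscapes is hypothetical, not from the answer):

```php
<?php
// Hypothetical helper: decodes \NNN (octal) and %XX (hex) escapes in one pass.
function decodeEscapes(string $input): string
{
    // Group 1: one to three octal digits after a backslash.
    // Group 2: exactly two hex digits after a percent sign.
    $pattern = '/\\\\([0-7]{1,3})|%([0-9A-Fa-f]{2})/';

    $result = preg_replace_callback($pattern, function (array $m): string {
        // The first character of the full match tells us which escape this is.
        if ($m[0][0] === '\\') {
            return chr((int) base_convert($m[1], 8, 10));
        }
        return chr((int) base_convert($m[2], 16, 10));
    }, $input);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $input;
}

echo decodeEscapes('\103RI\120T>XYZ%6En \151\156 d%6Fcu\155%65n..'), "\n";
```

Because both escape types are handled in a single pass, a character produced by one replacement is never reinterpreted as the start of another escape.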


3 Comments

How should the latter be handled?
@Alix Axel: The difference is that the first is an octal number and the other is a hexadecimal number. If the callback receives the character before the number, it should be able to tell whether it received an octal number (it starts with \) or a hexadecimal number (it starts with %).
That's what I thought; hexdec() was throwing an error, but I've solved it now. There is no need to use a callback, preg_replace() will do just fine, check my answer.
1

Try this:

$str = '\103RI\120T>XYZ%6En \151\156 d%6Fcu\155%65n..';

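// Prints: CRIPT>XYZnn in documen..
// Note: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so this only runs on older versions.
echo preg_replace(array('~\\\(\d+)~e', '~%([0-9A-F]{2})~e'), array('chr(octdec("$1"))', 'chr(hexdec("$1"))'), $str);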

Regarding the %AD parts, I'm not sure what they are meant to represent; could you explain?
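On PHP 7.0 and later, where the /e modifier is no longer available, the same two-pattern idea could be sketched with preg_replace_callback_array(). This is an assumed modern equivalent rather than the code from the answer, and it restricts the octal pattern to one to three octal digits:

```php
<?php
// Assumes PHP 7.0 or later, where preg_replace_callback_array() exists.
$str = '\103RI\120T>XYZ%6En \151\156 d%6Fcu\155%65n..';

$decoded = preg_replace_callback_array(
    [
        // Backslash followed by one to three octal digits.
        '~\\\\([0-7]{1,3})~' => function (array $m): string {
            return chr(octdec($m[1]));
        },
        // Percent sign followed by exactly two hex digits.
        '~%([0-9A-Fa-f]{2})~' => function (array $m): string {
            return chr(hexdec($m[1]));
        },
    ],
    $str
);

echo $decoded, "\n";
```

The patterns are applied one after the other, so a % character produced by the octal pass would be seen by the hex pass; if that matters for the input, a single combined pattern avoids it.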

1
// Single quotes keep the backslashes literal, as they would be in data received at runtime.
echo urldecode(stripcslashes('\103RI\120T>XYZ%6En \151\156 d%6Fcu\155%65n..'));

1 Comment

stripcslashes() doesn't handle %AE; it handles \xAE.
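A small sketch of the division of labour mentioned in this comment, using made-up sample strings: stripcslashes() decodes backslash escapes (octal and \x hex), while urldecode() decodes percent escapes, and neither touches the other form.

```php
<?php
// Illustrative inputs only; single quotes keep the backslashes literal.
$backslashForm = '\101\x42';
$percentForm   = '%41%42';

$a = stripcslashes($backslashForm); // backslash escapes are decoded
$b = stripcslashes($percentForm);   // percent escapes are left alone
$c = urldecode($percentForm);       // percent escapes are decoded
$d = urldecode($backslashForm);     // backslash escapes are left alone

var_dump($a, $b, $c, $d);
```

Note that urldecode() also turns + into a space; rawurldecode() does not, which may be preferable if the input can contain literal plus signs.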
0
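$octstr = '\103RI\120T>XYZ%6En \151\156 d%6Fcu\155%65n';

// Find every backslash followed by three octal digits.
preg_match_all('/\\\\[0-7]{3}/', $octstr, $matches);

$oct = $matches[0];

foreach ($oct as $o) {
    // Drop the leading backslash so octdec() receives only octal digits.
    $octstr = str_replace($o, chr(octdec(substr($o, 1))), $octstr);
}

// urldecode() handles the remaining %XX escapes.
echo urldecode($octstr);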

outputs:

CRIPT>XYZnn in documen
