7

I want only the unencoded characters to get converted to html entities, without affecting the entities which are already present. I have a string that has previously encoded entities, e.g.:

gaIUSHIUGhj>&hyphen; hjb&times;jkn.jhuh>hh> &hellip;

When I use htmlentities(), the & at the beginning of each entity gets encoded again. This means &hyphen; and other entities have their & encoded to &amp;:

&amp;times;

I tried decoding the complete string, then encoding it again, but it does not seem to work properly. This is the code I tried:

header('Content-Type: text/html; charset=iso-8859-1');
...

$b = 'gaIUSHIUGhj>&hyphen; hjb&times;jkn.jhuh>hh> &hellip;';
$b = html_entity_decode($b, ENT_QUOTES, 'UTF-8');
$b = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $b);
$b = htmlentities($b, ENT_QUOTES, 'UTF-8'); 

But it does not seem to work the right way. Is there a way to prevent or stop this from happening?

2 Answers

6

Set the optional $double_encode parameter (the fourth argument of htmlentities()) to false. See the documentation for more information.

Your resulting code should look like:

$b = htmlentities($b, ENT_QUOTES, 'UTF-8', false);
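
To illustrate what that call does to a mixed string, here is a minimal sketch. The sample input is made up for this example (it is not the asker's data): it contains one raw non-ASCII character, one raw `>`, and two entities that are already encoded.

```php
<?php
// Hypothetical sample input: "é" written as its UTF-8 bytes, a raw ">",
// and two entities (&amp; and &times;) that are already encoded.
$input = "caf\xC3\xA9 > bar &amp; 2 &times; 2";

// Fourth argument false: existing entities are left alone,
// everything else that has an entity is converted.
$encoded = htmlentities($input, ENT_QUOTES, 'UTF-8', false);

echo $encoded, "\n";
// prints: caf&eacute; &gt; bar &amp; 2 &times; 2
```

The raw `é` and `>` are converted, while `&amp;` and `&times;` come through unchanged.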

5

You did well to look at the documentation, but you missed the best part. The signature can be hard to decipher sometimes:

//     >    >    >    >    >    >    Scroll    >>>    >    >    >    >    >     Keep going.    >    >    >    >>>>>>  See below.  <<<<<<
string htmlentities ( string $string [, int $flags = ENT_COMPAT | ENT_HTML401 [, string $encoding = 'UTF-8' [, bool $double_encode = true ]]] )

Look at the very end.

I know. Confusing. I usually ignore the signature line and go straight down to the next block (Parameters) for the blurbs on each argument.

So you want to use the $double_encode argument at the end to tell htmlentities() not to re-encode existing entities (and you probably want to stick with UTF-8 unless you have a specific reason not to):

$str = "gaIUSHIUGhj>&hyphen; hjb&times;jkn.jhuh>hh> &hellip;";

// Double-encoded!
echo htmlentities($str, ENT_COMPAT, 'utf-8', true) . "\n";

// Not double-encoded!
echo htmlentities($str, ENT_COMPAT, 'utf-8', false);

https://ignite.io/code/513ab23bec221e4837000000
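
One practical consequence of passing false is that the call can be applied repeatedly without the output changing. The sketch below assumes PHP 7 or later (for the type declarations); `encode_once` is a hypothetical helper name that only pins down the arguments discussed above.

```php
<?php
// Hypothetical helper: wraps htmlentities() with double encoding disabled.
function encode_once(string $text): string
{
    return htmlentities($text, ENT_QUOTES, 'UTF-8', false);
}

// Made-up sample: raw ">" and quotes plus an already-encoded &times;.
$raw    = 'a > b &times; c "quoted"';
$first  = encode_once($raw);    // encodes ">" and the quotes, keeps &times;
$second = encode_once($first);  // nothing left to encode

echo $first, "\n";
// prints: a &gt; b &times; c &quot;quoted&quot;
var_dump($first === $second);
// prints: bool(true)
```

Which names count as existing entities appears to depend on the document-type flag (ENT_HTML401 unless another one is passed), so it is worth testing HTML5-only names such as `&hyphen;` before relying on this.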