3

Does PHP have any standard function(s) to convert Unicode strings to plain, good old-fashioned ANSI strings (or whatever format PHP's htmlentities understands)?

Is there any function that converts UTF-8 strings to HTML that can be understood by the most popular browsers?

5
  • "ANSI strings"? joelonsoftware.com/articles/Unicode.html Commented Jan 14, 2011 at 13:25
  • It isn't like my software will be used by some random guy in Japan. We know our market. Commented Jan 14, 2011 at 13:27
  • 2
    Why not just keep everything (web page, database tables, connection and collation, etc.) in UTF-8? Even if you don't take advantage of any non-ASCII characters you'd at least have a consistent approach. Commented Jan 14, 2011 at 13:33
  • Okay, let me rephrase my question. Commented Jan 14, 2011 at 13:34
  • 2
    Sometimes I wish I could downvote my own questions... Commented Jan 26, 2017 at 8:32

3 Answers

7

This can't work properly. Unicode contains many more characters than ANSI, so if you "convert" to ANSI, you will lose a lot of characters.

http://php.net/manual/en/function.htmlentities.php

You can use Unicode (UTF-8) charset with htmlentities:

string htmlentities ( string $string [, int $flags = ENT_COMPAT [, string $charset [, bool $double_encode = true ]]] )

htmlentities($myString, ENT_COMPAT, "UTF-8"); should work.
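As a minimal sketch of the call above: the sample string is made up for illustration, and the accented characters are written as `\u{...}` escapes (PHP 7+) so the result does not depend on how the source file is saved.

```php
<?php
// Illustrative input only: "á & é <b>" built from Unicode escapes.
$myString = "\u{00E1} & \u{00E9} <b>";

// Tell htmlentities the input is UTF-8 so multi-byte characters
// are mapped to their named entities instead of being mangled.
$encoded = htmlentities($myString, ENT_COMPAT, 'UTF-8');

echo $encoded, PHP_EOL; // &aacute; &amp; &eacute; &lt;b&gt;
```

Characters with a named HTML entity become that entity; the markup-significant characters `&`, `<` and `>` are escaped as well.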


2 Comments

Wow. I didn't know that was possible. I thought htmlentities only accepted one parameter.
Is there any function that converts á to &aacute;, for example?
7

Whilst I'd really recommend keeping everything in UTF-8 (as per my comment on the question), you can use the mb_convert_encoding function to convert any known UTF-8 string to US-ASCII as such:

$asciiString = mb_convert_encoding ($sourceString, 'US-ASCII', 'UTF-8');

However, this may not be a lossless conversion depending on the source character string. (Characters such as "é" have no US-ASCII equivalent; by default mbstring replaces each one with a substitute character, "?", so the original character is lost.)
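To make the lossy behaviour concrete, here is a small sketch that assumes the mbstring extension is loaded. The helper name `toAsciiLossy` is hypothetical; it pins the substitute character to "?" explicitly so the result does not depend on the server's configuration.

```php
<?php
// Hypothetical helper: converts UTF-8 to US-ASCII, replacing every
// character that has no ASCII equivalent with "?".
function toAsciiLossy(string $utf8): string
{
    $previous = mb_substitute_character(); // remember the current setting
    mb_substitute_character(0x3F);         // 0x3F is the code point of "?"
    try {
        return mb_convert_encoding($utf8, 'US-ASCII', 'UTF-8');
    } finally {
        mb_substitute_character($previous); // restore the global setting
    }
}

echo toAsciiLossy("caf\u{00E9}"), PHP_EOL; // caf?
```

Plain ASCII input passes through unchanged, so the conversion is only destructive for strings that actually contain non-ASCII characters.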


1

Browsers already understand UTF-8. If you want them to know that you're sending them UTF-8 then you need to tell them.
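One common way to tell the browser is to send a Content-Type header that names the charset before any output. The sketch below assumes an HTML response; `utf8ContentType` is a made-up helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper: builds the header line that declares UTF-8.
function utf8ContentType(string $mime = 'text/html'): string
{
    return "Content-Type: {$mime}; charset=UTF-8";
}

// Headers must be sent before any body output; skip when run from the CLI.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header(utf8ContentType());
}
```

A `<meta charset="utf-8">` tag in the page's head can serve as a fallback when the document is opened without HTTP headers, for example from disk.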

2 Comments

Does IE6 understand UTF-8? Some of the viewers of my Web site might be using it.
Yes, it does. However, it doesn't auto-detect UTF-8, so make sure to declare the Content-Type appropriately.
