How can I know what encoding will be used by PHP when sending data to the browser? I.e. with the Content-Type header, for instance: iso-8859-1.
6 Answers
Usually the Apache + PHP servers of web hosts are configured to send out NO charset header.
The quickest ways to test how your server is configured are these:
- Use this tool to see the server headers by fetching any one of the pages on your website. If you see a charset in the server headers, your server is sending one; usually they won't contain a charset.
- Another way is to run this simple script on your server:
<?php echo ini_get('default_charset'); ?>
As said above, this usually prints an empty string; if not, it shows you the charset PHP is configured to use.
The second approach assumes Apache is not configured with AddDefaultCharset some_charset. It usually isn't, but if it is, I'm afraid the Apache setting might override PHP's default_charset ini directive.
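If you would rather check the headers from PHP than from an external tool, a minimal sketch could look like the following. The helper names charsetFromContentType and charsetOfUrl are made up for illustration; get_headers() is a standard PHP function, and it performs a real HTTP request.

```php
<?php
// Hypothetical helper: extract the charset parameter from a Content-Type value.
function charsetFromContentType(string $contentType): ?string
{
    if (preg_match('/;\s*charset\s*=\s*"?([^";\s]+)"?/i', $contentType, $m)) {
        return strtolower($m[1]);
    }
    return null; // no charset parameter present
}

// Hypothetical helper: fetch a URL's response headers and report the charset.
// Returns null both when the request fails and when no charset is sent.
// If redirects occur, the last Content-Type header seen wins.
function charsetOfUrl(string $url): ?string
{
    $headers = get_headers($url);
    if ($headers === false) {
        return null;
    }
    $charset = null;
    foreach ($headers as $line) {
        if (stripos($line, 'Content-Type:') === 0) {
            $charset = charsetFromContentType(substr($line, strlen('Content-Type:')));
        }
    }
    return $charset;
}
```

Calling charsetOfUrl() with the address of one of your own pages should then tell you whether the server, PHP, or neither is adding a charset.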
You can use the header() solution that William suggested; however, if you are running Apache and the Apache config sets a default charset, that will win every time (Internet Explorer will go crazy). See: AddDefaultCharset
Keep in mind that content-types and encodings are two different things. text/html is a content-type; ISO-8859-1 and UTF-8 are encodings.
The HTTP response header that the server sends typically looks like this:
Content-Type: text/html; charset=utf-8
"charset" is actually the character encoding. It's not in a separate header; however, there is a header called "Content-Encoding", which specifies what kind of compression the response uses (e.g. gzip).
If you want to change the character encoding to UTF-8, in a file that contains HTML:
<?php
// Must be called before any output is sent to the browser.
header("Content-Type: text/html; charset=utf-8");
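Since header() only works before any output has been sent, a slightly more defensive version might look like this. It is only a sketch; contentTypeHeader and sendContentType are hypothetical helper names, while headers_sent() and header() are standard PHP functions.

```php
<?php
// Hypothetical helper: build the header line for a given type and charset.
function contentTypeHeader(string $mimeType, string $charset): string
{
    return sprintf('Content-Type: %s; charset=%s', $mimeType, $charset);
}

// Hypothetical helper: send the header only if that is still possible.
// Returns false when output has already started and headers can no longer change.
function sendContentType(string $mimeType = 'text/html', string $charset = 'utf-8'): bool
{
    if (headers_sent()) {
        return false;
    }
    header(contentTypeHeader($mimeType, $charset));
    return true;
}
```

Call sendContentType() at the very top of the script, before any HTML or whitespace is echoed.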
You can set your own with header('Content-type: xxx/yyy');, but I believe that text/html is sent by default.
AFAIK, PHP sends strings bytewise. That is, if your variables hold UTF-8, it will send UTF-8. If you have iso-8859-1, it will send that too. If you mix them, it won't be pretty.
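To illustrate the "bytewise" point, here is a small sketch using raw byte escapes for the character "é". The helper isValidUtf8 is a made-up name; it relies on PCRE's /u modifier rejecting malformed UTF-8, which is an assumption about how you might detect the problem, not something PHP does on output.

```php
<?php
// The same character "é" as raw bytes in two encodings.
$utf8   = "\xC3\xA9"; // UTF-8: two bytes
$latin1 = "\xE9";     // ISO-8859-1: one byte

// Hypothetical helper: true if the bytes form valid UTF-8.
// Relies on PCRE's /u modifier failing on malformed input.
function isValidUtf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

// What a script that mixes both encodings would end up sending.
$mixed = $utf8 . $latin1;
```

PHP would echo all three strings unchanged; only $utf8 is valid UTF-8, so a browser told to expect UTF-8 would mangle the other two.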
If your server is not configured to have a default content or charset, and neither is PHP, PHP will send only Content-Type: text/html - it won't specify a charset at all, and will send the bytes as it sees them in the script.
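As a rough approximation of that behaviour, the default header value can be derived from the two ini settings involved. This is a sketch of my understanding, not PHP's actual implementation; defaultContentType is a hypothetical name. Note also that since PHP 5.6 default_charset defaults to "UTF-8", so newer installs usually do send a charset.

```php
<?php
// Approximation of the Content-Type value PHP sends when the script never
// calls header(). An assumption-based sketch, not PHP's own code.
function defaultContentType(): string
{
    $mime    = (string) ini_get('default_mimetype'); // "text/html" unless changed
    $charset = (string) ini_get('default_charset');

    // Assumed rule: the charset is appended only for text/* types
    // and only when default_charset is non-empty.
    if ($charset !== '' && stripos($mime, 'text/') === 0) {
        return $mime . '; charset=' . $charset;
    }
    return $mime;
}
```

With default_charset empty this yields just text/html, which is the case described above.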
If a browser receives a page without charset specified, various things can happen:
- most browsers have an "Encoding/Charset" menu; if the user explicitly selects one, the browser will try to apply it. Doesn't happen too often, so:
- some browsers try to render it with a default charset (which is locale-dependent, e.g. for FF and cs_CZ it used to be iso-8859-2; YMMV)
- IE will try to determine the charset heuristically (it will take a guess, based on character distribution - and many times it gets it right; sometimes it gets it wrong and you get a page in Romanian interpreted as Chinese text, which usually means "unreadable")
- some old browsers will fall back on us-ascii
If, with this procedure, the PHP script's charset and the browser's charset match, the text will - accidentally - be readable. If not, there will be weird signs and similar phenomena.