Here's a link I found, which even has a character I need to play with for other projects of mine.

http://www.fileformat.info/info/unicode/char/2446/index.htm

That page has a box titled "Encodings", and I am wondering about some of its rows.

I obviously need a course on this sort of thing, but what is the difference between "HTML Entity (decimal)" and "HTML Entity (hex)"?

What confuses me is that when I put those characters on a web page, they display fine, even though I haven't specified a UTF-8 encoding in the PHP page.

<?php
$string1 = '&#x2446;';
$string2 = '&#9286;';

echo $string1;
echo '<br>';
echo $string2;
?>

Does the browser know how to display both automatically? To make it weirder, I can only see those characters on my Mac, in Firefox. My Windows box doesn't show them, and I've tested it there in both Chrome and Firefox. Do I need to tell the browsers how to display them correctly, or is this something I need to change in the operating system?

3 Answers

They're both valid numeric HTML entities, and the browser does indeed know how to decode them. The difference is that the first is written in hexadecimal, while the second is written in decimal.

0x2446 = 9286

Note that 0x means hexadecimal.
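
As a quick check of that equivalence, here is a minimal sketch using PHP's built-in hexdec() and html_entity_decode(). The variable names are only illustrative, and the ENT_HTML5 flag and 'UTF-8' charset are passed explicitly as an assumption, so the result does not depend on your PHP defaults.

```php
<?php
// Both references name the same code point, U+2446.
$hexRef = '&#x2446;';
$decRef = '&#9286;';

// hexdec() converts a hexadecimal string to an integer.
$codePoint = hexdec('2446');

// html_entity_decode() also resolves numeric character references.
$fromHex = html_entity_decode($hexRef, ENT_QUOTES | ENT_HTML5, 'UTF-8');
$fromDec = html_entity_decode($decRef, ENT_QUOTES | ENT_HTML5, 'UTF-8');

var_dump($codePoint === 9286);    // bool(true)
var_dump($fromHex === $fromDec);  // bool(true)
var_dump(bin2hex($fromHex));      // string(6) "e29186", the UTF-8 bytes of U+2446
```

Either spelling ends up as the same three UTF-8 bytes, which is why the browser treats them identically.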

Also note that it is good practice to always have your server explicitly specify an encoding. The W3C explains how to do so. UTF-8 is a good choice.

If you use any Unicode encoding, you can always put the character right on your page, so you don't have to use entities.
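
A minimal sketch of that approach, assuming PHP 7 or later (for the "\u{...}" escape) and that nothing has been output before the header call. The page markup here is only an illustration, not something from the question.

```php
<?php
// Declare the encoding in the HTTP header before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// "\u{2446}" (PHP 7+) yields the UTF-8 bytes for U+2446, so this line
// does not depend on the encoding the source file was saved in.
$literal = "\u{2446}";

echo '<!DOCTYPE html><html><head><meta charset="utf-8"><title>U+2446</title></head><body>';
echo '<p>' . $literal . '</p>';
echo '</body></html>';
```

You could also paste the character directly into the source, as long as the file itself is saved as UTF-8. Whether it then renders still depends on the fonts installed on the viewing machine.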

To be exact, neither is an entity reference. &amp; is an entity reference that refers to the entity named amp that is defined as:

<!ENTITY amp     CDATA "&#38;"   -- ampersand, U+0026 ISOnum -->

Here you can see that the entity’s value is just another reference: &#38;.

&#x2446; and &#9286; are “just” character references (numeric character references, to be exact). They refer to a character by specifying its code position in the Universal Character Set, i.e. the Unicode character set.
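
To make the distinction concrete, here is a small sketch in PHP. The helper numericCharRefs() is hypothetical (it is not part of PHP or of this thread); it only formats a code point as both kinds of numeric character reference. The second half shows that the entity reference &amp; and the character reference &#38; decode to the same character.

```php
<?php
// Hypothetical helper: build both numeric character reference forms
// for a given Unicode code point.
function numericCharRefs(int $codePoint): array
{
    return [
        'decimal' => sprintf('&#%d;', $codePoint),
        'hex'     => sprintf('&#x%X;', $codePoint),
    ];
}

$refs = numericCharRefs(0x2446);
// $refs['decimal'] is '&#9286;' and $refs['hex'] is '&#x2446;'

// A named entity reference is looked up by name, while a numeric
// character reference states the code position directly.
$named   = html_entity_decode('&amp;', ENT_QUOTES | ENT_HTML5, 'UTF-8');
$numeric = html_entity_decode('&#38;', ENT_QUOTES | ENT_HTML5, 'UTF-8');

var_dump($named === $numeric);  // bool(true), both are "&"
```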

8 Comments

You still haven't told me where your Gravatar is from. Do I need to start a Meta question? :)
What causes browsers on separate computers to translate them differently?
@Pekka: Was my hint about reading the Titanic more carefully not helpful?
@user271619: Character references only describe characters, i.e. atoms of textual information. This information is separate from the visual representation of characters, the glyphs. A computer needs fonts that contain those glyphs to display the text. If the font in use does not contain a glyph for a certain character and the system has no other font containing it, a replacement character or glyph is displayed instead. In Unicode that’s represented by the REPLACEMENT CHARACTER U+FFFD.
@Gumbo nope! I did work through several issues, but didn't catch the clue. The only recurring theme that's in there since... forever is "Die endgültige Teilung Deutschlands ist unser Auftrag." No beavers :)
You can use any "HTML Entity" in any encoding, and in practice, if you have the appropriate fonts installed, every browser will work fine. They were created for displaying characters that are not included in the current encoding. In your situation, it looks like you have to install some fonts on your Windows box.

On the other hand, it has almost nothing to do with PHP.

2 Comments

Seems that's the thing. The fonts on my Windows box are not as hardy as my Mac's fonts. Out of curiosity, where does one go to update their fonts?
You can buy them, download free fonts from the Internet, install an optional OS pack, or install an application that ships with additional fonts.