0

I'm building a website that fetches text from another page and inserts it into the database.

The problem is that all the special characters are saved in the database as HTML entities, so I then need to convert the output using:

<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />

I mean that right now, instead of just the character " ' ", the HTML version " &#x27; " is saved in the database. This also happens when Spanish characters or other special ones are saved. Instead of the letter " ñ ", for example, I get " &ntilde; " saved.

This wastes space in the database, and I also need to convert the output later using the content-type, so:

How can I convert or set the charset before the text is saved, or just let MySQL convert it?

In case you need to know here's how I connect to the database:

function dbConnect() {
    $conn = new mysqli(DB_SERVER, DB_USER, DB_PASSWORD, DB_NAME) or die ('Error.');
    return $conn;
}

$conn = dbConnect();
$stmt = $conn->stmt_init();

Hope you can help me!! Thanks.

1
  • Agree that storing HTML-encoded data (with no actual markup in it) in the database is totally the Wrong Thing (the amount of extra space it takes not really being the important part of that). Text should stay as plain text until the point it needs to be encoded into some other output format. Commented Apr 19, 2009 at 15:32

3 Answers

1

You can use html_entity_decode() to convert from HTML to a (real) character encoding.

<?php echo html_entity_decode("&ntilde;", ENT_COMPAT, "UTF-8"); ?>
ñ

Please note that "HTML" isn't a character encoding in the usual sense, so isn't understood by libraries such as iconv, nor by MySQL itself.

I'd also recommend (per the example above) having the whole application use UTF-8. Single-byte character encodings such as ISO-8859-1 are effectively obsolete now that Unicode is so widely supported.
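As a rough sketch of how the decoding step could sit in front of the database insert: the helper name decodeFetchedText() is made up for illustration and is not from the question. It passes ENT_QUOTES rather than ENT_COMPAT, because ENT_COMPAT leaves single-quote references such as &#x27; undecoded, and that is one of the cases the question mentions.

```php
<?php
// Hypothetical helper (the name is an assumption): turn HTML entities back
// into plain UTF-8 text before the string is handed to the database layer.
function decodeFetchedText(string $html): string
{
    // ENT_QUOTES decodes both double- and single-quote references
    // (for example &quot; and &#x27;); ENT_HTML5 selects the HTML5 entity table.
    return html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

echo decodeFetchedText('Espa&ntilde;a &#x27;ok&#x27;'), PHP_EOL;
// prints: España 'ok'
```

The decoded string is what would then be bound to the INSERT statement, so the table only ever holds plain text.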


1

I suggest using UTF-8 if there are any non-English characters. You can run the SQL

SET NAMES utf8mb4

to put your database connection in UTF-8 just after you connect to the database. (MySQL spells its charset names without a hyphen; utf8mb4 is its full UTF-8 character set.)

When you do this, you shouldn't use "htmlspecialchars" or "htmlentities" while saving the data.
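A minimal sketch of that setup with mysqli, assuming the DB_* constants from the question are defined elsewhere and that the server supports utf8mb4 (MySQL 5.5.3 or later). It calls mysqli's set_charset() method, which sets the connection charset much like SET NAMES does. The table pages and its column body are hypothetical names used only for illustration.

```php
<?php
// Sketch only: DB_SERVER, DB_USER, DB_PASSWORD and DB_NAME are the
// question's constants and are assumed to be defined elsewhere.
function dbConnect(): mysqli
{
    $conn = new mysqli(DB_SERVER, DB_USER, DB_PASSWORD, DB_NAME);
    if ($conn->connect_error) {
        die('Error.');
    }
    // Set the connection charset right after connecting.
    if (!$conn->set_charset('utf8mb4')) {
        die('Error.');
    }
    return $conn;
}

// Hypothetical table `pages` with a text column `body` (assumed names).
function saveText(mysqli $conn, string $text): bool
{
    // The text is bound as-is: no htmlentities()/htmlspecialchars() here.
    $stmt = $conn->prepare('INSERT INTO pages (body) VALUES (?)');
    if ($stmt === false) {
        return false;
    }
    $stmt->bind_param('s', $text);
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

A call would look like saveText(dbConnect(), $plainText), with HTML encoding applied only later, when the text is printed into a page.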

1 Comment

It's better to use mysqli_set_charset() instead.

0

Maybe you should use htmlspecialchars rather than htmlentities. The former replaces only the HTML special characters &, <, > and ", whereas the latter replaces every character that can be represented by a named character reference.

4 Comments

Can you explain how to use htmlspecialchars in my case?
Well how do you store the data into the database? Or are you just reading the data from it?
htmlspecialchars doesn't help because it's for encoding HTML entities, not decoding them.
But not encoding them in the first place would avoid this problem.
