0

I want to display user-submitted data on screen. Since it can contain dangerous code, it seems best to clean this data with HTML entities.

Is there a better way to do html entities, besides this:

$name = clean($name, 40);
$email = clean($email, 40);
$comment = clean($comment, 40);

and this:

$data = array("name", "email", "comment");

function confHtmlEnt($data)
{
return htmlentities($data, ENT_QUOTES, 'UTF-8');
}

$cleanPost = array_map('confHtmlEnt', $_POST);

If so, how? And how does my wannabe structure for HTML entities look?

Thank you for not flaming the newb :-).

5
  • I don't understand your question. Do you want a string with HTML entities "unencoded" or do you want to "batch encode" some strings? What do you mean with "do html entities"? Commented Nov 24, 2009 at 10:44
  • Now, the edit hasn't made it better. What's the function clean for? Commented Nov 24, 2009 at 10:45
  • I only have one function confHtmlEnt Commented Nov 24, 2009 at 10:47
  • No, you don't. You also have a function clean(). Commented Nov 24, 2009 at 10:50
  • 2
    kinda hasty on the accept there champ Commented Nov 24, 2009 at 10:56

6 Answers 6

4

The only problem with a "clean POST" approach is that you might not know in what context your data will appear. I have a chat server that works via a browser client and a desktop client, and each needs the data in a different form. So make sure you save the data as "raw" as possible in the DB, and worry about filtering it on output.
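As a rough sketch of that idea (the message text is made up, and the assumption that the desktop client consumes JSON is mine, not something stated above), the same raw string can be encoded differently per output context:

```php
<?php
// Hypothetical chat message, stored "raw" exactly as the user typed it.
$stored = 'I <3 PHP & "quotes"';

// Browser client: escape for HTML at the moment of output.
$forBrowser = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// Desktop client (assumed here to consume JSON): encode for JSON instead.
$forDesktop = json_encode(array('message' => $stored));

echo $forBrowser, "\n";
echo $forDesktop, "\n";
```

Had the message been HTML-escaped before saving, the desktop client would receive literal `&lt;` and `&amp;` sequences it never asked for.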


1 Comment

+1 HTML-escaping is an output problem and must be dealt with in the output stage.
4

Do not encode everything in $_POST/$_GET. HTML-escaping is an output-encoding issue, not an input-checking one.

Call htmlentities (or, usually better, htmlspecialchars) only at the point where you're taking some plain text and concatenating or echoing it into an HTML page. That applies whether the text you are using comes from a submitted parameter, or from the database, or somewhere else completely. Call mysql_real_escape_string only at the point you insert plain text into an SQL string literal.

It's tempting to shove all that escaping stuff in its own box at the top of the script and then forget about it. But text preparation really doesn't work like that, and if you pretend it does you'll find your database irreparably full of double-encoded crud, backslashes on your HTML page and security holes you didn't spot because you were taking data from a source other than the (encoded) parameters.
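A small sketch of the double-encoding problem, using a made-up input string: escaping once at input time and again at output time mangles the text, while escaping only at output gives what the browser should actually render.

```php
<?php
// Hypothetical user input; the variable names are illustrative only.
$raw = 'Tom & "Jerry" <b>';

// Escaped once, at output time: the browser renders the original text.
$once = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// Escaped at input time and then again at output time: double-encoded.
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');

echo $once, "\n";   // Tom &amp; &quot;Jerry&quot; &lt;b&gt;
echo $twice, "\n";  // Tom &amp;amp; &amp;quot;Jerry&amp;quot; &amp;lt;b&amp;gt;
```

The second line is what ends up on the page (or in the database) when the "escape everything at the top of the script" approach meets a template that also escapes.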

You can make the burden of remembering to call mysql_real_escape_string go away by using mysqli's parameterised queries or another higher-level data access layer. You can make the burden of typing htmlspecialchars every time less bothersome by defining a shorter-named function for it, e.g.:

<?php
    function h($s) {
        echo(htmlspecialchars($s, ENT_QUOTES));
    }
?>
<h1> Blah blah </h1>
<p>
    Blah blah <a href="<?php h($link); ?>"><?php h($title); ?></a> blah.
</p>

or using a different templating engine that encodes HTML by default.


1

If you wish to convert the five special HTML characters to their equivalent entities, use the following method:

function filter_HTML($mixed)
{
 return is_array($mixed)
  ? array_map('filter_HTML',$mixed)
  : htmlspecialchars($mixed,ENT_QUOTES);
}

That works for both UTF-8 and single-byte encoded strings.

But if the string is UTF-8 encoded, make sure to filter out any invalid character sequences before calling the filter_HTML() function:

function make_valid_UTF8($str)
{
 return iconv('UTF-8','UTF-8//IGNORE',$str);
}
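
A quick usage sketch of filter_HTML() on a nested array (the sample data is invented for illustration). The function is repeated here only so the snippet runs on its own:

```php
<?php
// Same function as above, repeated so this snippet is self-contained.
function filter_HTML($mixed)
{
    return is_array($mixed)
        ? array_map('filter_HTML', $mixed)
        : htmlspecialchars($mixed, ENT_QUOTES);
}

// Hypothetical nested input, e.g. a form with a multi-value field.
$input = array(
    'name' => '<b>Al</b>',
    'tags' => array('a&b', "it's"),
);

// array_map() with a single array keeps the keys, so the shape is preserved.
$output = filter_HTML($input);

print_r($output);
```

Every string at any depth gets escaped, and the array keys are left as they were.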

Also see: http://www.phpwact.org/php/i18n/charsets#character_sets_character_encoding_issues


0

You need to clean every element before displaying it. I usually do it with a function and an array, like your second example.

2 Comments

That's mostly true. In some cases, you'd want the user to be able to write HTML code.
Well yeah, you're basically right, but Newb said in his post that he wants to display the user input. So I actually just said that he needs to clean every element he wants to display; how much you clean, and what exactly you clean, depends on the result you want to have. And this is a solution for this case, in my opinion.
0

If you use a framework with a template engine, there is quite likely a possibility to auto-encode strings. Apart from that, what's simpler than calling a function and getting the entity-"encoded" string back?


0

Check out the filter extension in PHP, in particular filter_input_array.

filter_input_array(INPUT_POST, FILTER_SANITIZE_SPECIAL_CHARS);
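
filter_input_array() reads from the real request, so it is awkward to try from the command line. As a sketch, the sibling function filter_var_array() applies the same filter to an ordinary array; the `$fakePost` data below is made up to stand in for $_POST:

```php
<?php
// Hypothetical stand-in for $_POST, so the snippet runs without a web request.
$fakePost = array(
    'name'    => 'Al <script>',
    'comment' => 'fish & "chips"',
);

// Passing a single filter constant applies it to every entry of the array.
$clean = filter_var_array($fakePost, FILTER_SANITIZE_SPECIAL_CHARS);

print_r($clean);
```

The keys stay the same, and the HTML-special characters in each value are replaced with entities.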

2 Comments

Quite apart from the badness of filtering $_POST, that's a URL-encoder, not an HTML-encoder.
Fair call, fixed. His given code filters $_POST, I'm just showing a way to do it using a standard library as opposed to custom functions and array_map
