I have the following input:

Hi! How are you? <script>//NOT EVIL!</script>

Wassup? :P

LOOOL!!! :D :D :D

This is then run through an emoticon library and becomes:

Hi! How are you? <script>//NOT EVIL!</script>

Wassup? <img class="smiley" alt="" title="tongue, :P" src="ui/emoticons/15.gif">

LOOOL!!! <img class="smiley" alt="" title="big grin, :D" src="ui/emoticons/5.gif"> <img class="smiley" alt="" title="big grin, :P" src="ui/emoticons/5.gif"> <img class="smiley" alt="" title="big grin, :P" src="ui/emoticons/5.gif">

I have a function that escapes HTML entities to prevent XSS. Running it on the raw input for the first line would produce:

Hi! How are you? &lt;script&gt;//NOT EVIL!&lt;/script&gt;

Now I need to escape all the input while preserving the emoticons in their initial state, so that a <:-P emoticon stays as it is and does not become &lt;:-P.
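For reference, the escaping step described above could look roughly like the sketch below. The real function is not shown in the question, so the name escapeHtml and the exact set of replaced characters are assumptions.

```javascript
// Hypothetical stand-in for the escaping function mentioned in the question.
// It replaces the five characters that matter in HTML text and attribute values.
function escapeHtml(text) {
    var entities = {
        "&": "&amp;",
        "<": "&lt;",
        ">": "&gt;",
        '"': "&quot;",
        "'": "&#39;"
    };
    return String(text).replace(/[&<>"']/g, function (ch) {
        return entities[ch];
    });
}
```

Run on a raw emoticon such as <:-P, this yields &lt;:-P, which is exactly the conflict described here: the emoticon library no longer sees the original characters.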

I was thinking of running a regex split on the emotified text, processing each part on its own, and then concatenating the parts back together, but I am not sure how easily the regex could be bypassed. I know the format will always be this:

[<img class="smiley" alt="]
[empty string]
[" title="]
[one of the values from a big list]
[, ]
[another value from the list (may be matching original emoticon)]
[" src="ui/emoticons/]
[integer from Y to X]
[.gif">]
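The split-and-escape idea can be sketched as follows, under the assumption that the tag always has exactly the shape listed above and that the title never contains a double quote. The names smileyTag, escapeHtml and escapeAroundSmileys are illustrative, not part of any library.

```javascript
// Matches only the fixed tag shape listed above. Assumption: the title text
// never contains a double quote, and the file name is always digits + ".gif".
var smileyTag = /(<img class="smiley" alt="" title="[^"]*" src="ui\/emoticons\/\d+\.gif">)/;

// Hypothetical stand-in for the existing escaping function.
function escapeHtml(text) {
    var entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
    return String(text).replace(/[&<>"']/g, function (ch) {
        return entities[ch];
    });
}

function escapeAroundSmileys(emotified) {
    // split() with one capturing group keeps the matched tags in the result,
    // always at odd indexes; everything at even indexes is plain text.
    return emotified.split(smileyTag).map(function (part, index) {
        return index % 2 === 1 ? part : escapeHtml(part);
    }).join("");
}
```

On the bypass question: a user who types the exact tag by hand would have it passed through unescaped, but because the pattern admits nothing except a smiley image with a quote-free title and a numeric file name, the worst case under these assumptions appears to be a forged smiley rather than script injection. Anything that deviates from the pattern is escaped as ordinary text.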

Using the list MAY be slow, since I need to run that regex on text that may contain 20 to 40 emoticons, and there may be 5 to 15 text messages to process. What would be an elegant solution to this? I am willing to use a third-party library or jQuery. PHP preprocessing is possible as well.

  • Are the emoticons inserted in JavaScript? Why don't you do that in PHP too, so you can call htmlentities() before that and have a simpler, safer and cleaner solution? You would even reduce both bandwidth and CPU usage that way. Commented Oct 24, 2013 at 22:26
  • As for emoticons that use special symbols, you can simply make your script treat &lt;:-P as the emoticon instead of <:-P. Commented Oct 24, 2013 at 22:30
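The suggestion in these comments (escape first, then look for the escaped form of each emoticon) might be sketched like this in JavaScript. The table below is illustrative: ":P" and ":D" use the titles and file numbers from the question, while the "<:-P" entry (title "party", file 99) is a made-up placeholder, and all function and variable names are assumptions.

```javascript
// Hypothetical stand-in for the existing escaping function.
function escapeHtml(text) {
    var entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
    return String(text).replace(/[&<>"']/g, function (ch) {
        return entities[ch];
    });
}

// Illustrative emoticon table; the "<:-P" entry is a placeholder, not real data.
var emoticons = {
    ":P": { title: "tongue", file: 15 },
    ":D": { title: "big grin", file: 5 },
    "<:-P": { title: "party", file: 99 }
};

// Index the table by the ESCAPED form of each emoticon ("<:-P" -> "&lt;:-P").
var byEscaped = {};
Object.keys(emoticons).forEach(function (key) {
    byEscaped[escapeHtml(key)] = emoticons[key];
});

// One alternation, built once; longest entries first, regex metacharacters escaped.
var emoticonPattern = new RegExp(
    Object.keys(byEscaped)
        .sort(function (a, b) { return b.length - a.length; })
        .map(function (s) { return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); })
        .join("|"),
    "g"
);

function emotify(rawText) {
    // Escape everything first, then swap escaped emoticons for image tags.
    return escapeHtml(rawText).replace(emoticonPattern, function (match) {
        var entry = byEscaped[match];
        return '<img class="smiley" alt="" title="' + entry.title + ", " + match +
            '" src="ui/emoticons/' + entry.file + '.gif">';
    });
}
```

One caveat with this ordering: the semicolon that ends an entity can create false matches for emoticons that start with ";". For example, a quote followed by ")" becomes &quot;) after escaping, which contains ";)". A table containing such emoticons would need an extra check.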

1 Answer

Maybe this will help you:

//TODO:Add the rest of emoticons here
var regExpEmoticons = /(\:P|\:\-P|\:D|\:\-D)/img;

function emoticonTag(title, filename) {
    return "<img class=\"smiley\" alt=\"\" title=\"" + title + "\" src=\"ui/emoticons/" + filename + "\">";
}

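function replaceEmoticon(emoticon) {
    switch (emoticon.toUpperCase()) {
    case ':P':
    case ':-P':
        return emoticonTag("tongue, :P", "15.gif");
    case ':D':
    case ':-D':
        return emoticonTag("big grin, :D", "5.gif");
    //TODO: Add more emoticons
    default:
        // Matched by the regex but not handled above: fall back to escaped
        // text instead of returning undefined.
        return escapeHtml(emoticon);
    }
}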

function escapeHtml(string) {
    //TODO: Insert your HTML escaping code here
    return string;
}

function escapeString(string) {
    if (string == "") {
        return string;
    }
    var splittedString = string.split(regExpEmoticons);

    var result = "";
    for (var i = 0; i < splittedString.length; i++) {
        if (splittedString[i].match(regExpEmoticons)) {
            result += replaceEmoticon(splittedString[i]);
        } else {
            result += escapeHtml(splittedString[i]);
        }
    }
    return result;
}

There are 3 places you must change:

  1. Add all your emoticons to the regExpEmoticons variable.
  2. Add all your emoticons to the switch statement in the replaceEmoticon function, or replace the whole function with the one you already use to turn an emoticon string into the HTML string containing the tag.
  3. Add your HTML escaping code into the escapeHtml function, or change the call to this function to the one you are using.

After that, calling the escapeString function with your string should do the job.
