JavaScript text highlight function

Question

Scenario

I am trying to develop a JavaScript text highlight function. It takes as input the text to search, an array of tokens to search for, and a class name for the element that wraps each match:

var fmk = fmk || {};

fmk.highlight = function (target, tokens, cls) {
    var token, re;
    if (tokens.length > 0) {
        token = tokens.pop();
        re = new RegExp(token, "gi");
        return this.highlight(
            target.replace(re, function (matched) {
                return "<span class=\"" + cls + "\">" + matched + "</span>";
            }), tokens, cls);
    }
    else { return target; }
};

It is based on a recursive replace that wraps a <span> tag around the found matches.

JsFiddle demo.

Issues

If there are two tokens and the latter is a substring of the former, only the latter token is highlighted. In the jsFiddle example, try these tokens: 'ab b'.
If the tokens contain a substring of the wrapper sequence (i.e. <span class="[className]"></span>) and another matching token, the highlight fails and returns a dirty result. In the jsFiddle example, try these tokens: 'red ab'.

Note that single-character tokens are allowed in the actual application.
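Since the jsFiddle demo is not reproduced here, the two issues can be replayed with a small script. This is only a sketch: the sample text "abc" and the class name "red" are assumptions chosen to trigger the behaviour described above, not values taken from the demo.

```javascript
// The function from the question, repeated so the snippet runs on its own.
var fmk = {};

fmk.highlight = function (target, tokens, cls) {
    var token, re;
    if (tokens.length > 0) {
        token = tokens.pop();              // note: this empties the caller's array
        re = new RegExp(token, "gi");
        return this.highlight(
            target.replace(re, function (matched) {
                return "<span class=\"" + cls + "\">" + matched + "</span>";
            }), tokens, cls);
    }
    else { return target; }
};

// Issue 1: "b" is processed first, so "ab" no longer exists as plain text.
var issue1 = fmk.highlight("abc", ["ab", "b"], "red");
console.log(issue1); // a<span class="red">b</span>c

// Issue 2: "ab" is wrapped first, then "red" matches inside class="red".
var issue2 = fmk.highlight("abc", ["red", "ab"], "red");
console.log(issue2); // <span class="<span class="red">red</span>">ab</span>c
```

Both problems come from the same cause: each recursive step searches the output of the previous step, markup included.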

Questions

How can I avoid these errors? I came up with these approaches:

Pre-process the tokens, removing those that are substrings of others. Disadvantages: it requires O(n^2) searches in the pre-processing phase for n tokens, and good matches are cut off.
Pre-process the matches BEFORE applying the wrapper, in order to cut off only the substring matches. Disadvantages: again, further computation is required. Also, I don't know where to start implementing this inside the replace callback function.
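The first approach could be sketched roughly as follows. The helper name dropSubstringTokens is hypothetical, and the comparison is case-insensitive only because the question's RegExp uses the "i" flag.

```javascript
// Hypothetical helper: keep only tokens that are not contained in another token.
// Performs O(n^2) substring checks for n tokens, as noted in the question.
function dropSubstringTokens(tokens) {
    var lowered = tokens.map(function (t) { return t.toLowerCase(); });
    return tokens.filter(function (token, i) {
        return !lowered.some(function (other, j) {
            if (i === j) { return false; }
            // Identical tokens: drop every copy except the first one.
            if (other === lowered[i]) { return j < i; }
            // Drop this token if it occurs inside another token.
            return other.indexOf(lowered[i]) !== -1;
        });
    });
}

console.log(dropSubstringTokens(["ab", "b", "cd"])); // [ 'ab', 'cd' ]
```

This also shows the drawback mentioned above: once "b" is dropped, a standalone "b" elsewhere in the text is no longer highlighted.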

This isn't as simple as what you're doing, and you've already proven that. I don't think jQuery is the route to take with this. You'll need to take an element, recursively search through its descendant nodes, and for each text node, replace it with a new span node
– Ian, Commented May 13, 2013 at 14:22
Why do you think jQuery needs to be used? It's great for certain things, but I wouldn't say this is one. I mean, you can use jQuery to initially select the elements you want to do this highlighting in, but I wouldn't use jQuery for the actual operation
– Ian, Commented May 13, 2013 at 14:28
@Ian. I do not think jQuery is necessary in this scenario. I mean, the algorithm does not use jQuery at all, and I did not even add jQuery to the question's tag list. It is a pure JavaScript solution, though I eventually use jQuery to interact with the DOM.
– Alberto De Caro, Commented May 13, 2013 at 14:34
Ahh okay, I was just confused why you asked "where do I use jQuery" (or whatever the question was). I swear I answered a question similar to this recently, so let me see if I can find my answer
– Ian, Commented May 13, 2013 at 15:00
I just posted an answer. I realized that when I tried doing something like this before, and when I went to post the answer, someone had already figured it out, so I gave up because I couldn't get it working. So I took it upon myself to create it all myself, because this kind of manipulation intrigues me, and I kind of went overboard with how much I did. Nonetheless, I want to make a mini library out of it and maintain it, improving wherever and adding if possible, so that's why I developed it so much. I hope it helps, and let me know if you have any questions
– Ian, Commented May 13, 2013 at 17:32

Ian · Accepted Answer · 2013-05-13 17:30:52Z

I think the way to handle this is to loop through all descendants of an element, check whether each one is a text node, and replace the matching content with the same text wrapped in a span carrying the class.

var MyApp = {};

MyApp.highlighter = (function () {
    "use strict";

    var checkAndReplace, func,
        id = {
            container: "container",
            tokens: "tokens",
            all: "all",
            token: "token",
            className: "className",
            sensitiveSearch: "sensitiveSearch"
        };

    checkAndReplace = function (node, tokenArr, classNameAll, sensitiveSearchAll) {
        var nodeVal = node.nodeValue, parentNode = node.parentNode,
            i, j, curToken, myToken, myClassName, mySensitiveSearch,
            finalClassName, finalSensitiveSearch,
            foundIndex, begin, matched, end,
            textNode, span;

        for (i = 0, j = tokenArr.length; i < j; i++) {
            curToken = tokenArr[i];
            myToken = curToken[id.token];
            myClassName = curToken[id.className];
            mySensitiveSearch = curToken[id.sensitiveSearch];

            finalClassName = (classNameAll ? myClassName + " " + classNameAll : myClassName);

            finalSensitiveSearch = (typeof sensitiveSearchAll !== "undefined" ? sensitiveSearchAll : mySensitiveSearch);
            if (finalSensitiveSearch) {
                foundIndex = nodeVal.indexOf(myToken);
            } else {
                foundIndex = nodeVal.toLowerCase().indexOf(myToken.toLowerCase());
            }

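            // The bounds check guards against empty tokens and against case
            // folding that changes the string length.
            if (foundIndex > -1 && myToken && foundIndex + myToken.length <= nodeVal.length) {
                begin = nodeVal.substring(0, foundIndex);
                matched = nodeVal.substr(foundIndex, myToken.length);
                end = nodeVal.substring(foundIndex + myToken.length, nodeVal.length);

                span = document.createElement("span");
                span.className += finalClassName;
                span.appendChild(document.createTextNode(matched));
                parentNode.insertBefore(span, node);

                if (begin) {
                    textNode = document.createTextNode(begin);
                    parentNode.insertBefore(textNode, span);
                    // The text before the match may still contain other tokens.
                    checkAndReplace(textNode, tokenArr, classNameAll, sensitiveSearchAll);
                }

                if (end) {
                    textNode = document.createTextNode(end);
                    parentNode.insertBefore(textNode, node);
                    // The text after the match may contain further occurrences.
                    checkAndReplace(textNode, tokenArr, classNameAll, sensitiveSearchAll);
                }

                parentNode.removeChild(node);
                // The original node is gone, so it cannot be used as a reference
                // for insertBefore again; the recursive calls cover the rest.
                return;
            }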
        }
    };

    func = function (options) {
        var iterator,
            tokens = options[id.tokens],
            allClassName = options[id.all][id.className],
            allSensitiveSearch = options[id.all][id.sensitiveSearch];

        iterator = function (p) {
            var children = Array.prototype.slice.call(p.childNodes),
                i, cur;

            if (children.length) {
                for (i = 0; i < children.length; i++) {
                    cur = children[i];
                    if (cur.nodeType === 3) {
                        checkAndReplace(cur, tokens, allClassName, allSensitiveSearch);
                    } else if (cur.nodeType === 1) {
                        iterator(cur);
                    }
                }
            }
        };

        iterator(options[id.container]);
    };

    return func;
})();

window.onload = function () {
    var container = document.getElementById("container");
    MyApp.highlighter({
        container: container,
        all: {
            className: "highlighter"
        },
        tokens: [{
            token: "sd",
            className: "highlight-sd",
            sensitiveSearch: false
        }, {
            token: "SA",
            className: "highlight-SA",
            sensitiveSearch: true
        }]
    });
};

DEMO: http://jsfiddle.net/UWQ6r/1/

I set it up so that you can change the values in id if you want to use different keys in the {} passed to highlighter.

The two settings in the all object refer to a class being added no matter what, as well as a case sensitive search override. For each token, you specify the token, class, and whether the match should be case sensitive.
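The splitting step can also be isolated from the DOM so that it is easy to test. The following is a minimal sketch, not the answer's code: splitByTokens and highlightTextNode are hypothetical names, a single class is used instead of per-token classes, matching is always case-insensitive, and it assumes lowercasing does not change the length of the text.

```javascript
// Hypothetical helper: split text into plain and matched segments.
// The earliest match wins; on a tie, the longer token wins.
function splitByTokens(text, tokens) {
    var segments = [], pos = 0, lower = text.toLowerCase();
    var best, bestToken;
    var consider = function (token) {
        if (!token) { return; }                       // skip empty tokens
        var at = lower.indexOf(token.toLowerCase(), pos);
        if (at === -1) { return; }
        if (best === -1 || at < best ||
                (at === best && token.length > bestToken.length)) {
            best = at;
            bestToken = token;
        }
    };
    while (pos < text.length) {
        best = -1;
        bestToken = null;
        tokens.forEach(consider);
        if (best === -1) { break; }                   // no more matches
        if (best > pos) {
            segments.push({ text: text.slice(pos, best), match: false });
        }
        segments.push({ text: text.slice(best, best + bestToken.length), match: true });
        pos = best + bestToken.length;
    }
    if (pos < text.length) {
        segments.push({ text: text.slice(pos), match: false });
    }
    return segments;
}

// Browser-only part: replace one text node with text nodes and spans.
function highlightTextNode(node, tokens, cls) {
    var fragment = document.createDocumentFragment();
    splitByTokens(node.nodeValue, tokens).forEach(function (seg) {
        var child = document.createTextNode(seg.text);
        if (seg.match) {
            var span = document.createElement("span");
            span.className = cls;
            span.appendChild(child);
            child = span;
        }
        fragment.appendChild(child);
    });
    node.parentNode.replaceChild(fragment, node);
}
```

Because the matched text is inserted through createTextNode, tokens such as 'span' or 'red' can never collide with the wrapper markup.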

References:

nodeType: https://developer.mozilla.org/en-US/docs/DOM/Node.nodeType
childNodes: https://developer.mozilla.org/en-US/docs/DOM/Node.childNodes
substr: https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/String/substr
substring: https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/String/substring
insertBefore: https://developer.mozilla.org/en-US/docs/DOM/Node.insertBefore

גלעד ברקן · Accepted Answer · 2013-05-14 03:29:01Z

This seems to work for me:

(line 17 in your JsFiddle demo)

Issue 1: var tokens = [['ab','b'].join("|")];

Issue 2: var tokens = ['<span'.replace(/</g,"&lt;")];

All together, then:

var tokens = [[..my tokens..].sort().join("|").replace(/</g,"&lt;")];

(by the way, I did test tokens such as '"', '"s' or 'span' and they seem to work fine. Also, I'm not sure why .sort() is important here but I left it in since I like to stay close to the original code.)
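On the .sort() question: the default sort is lexicographic, so 'a' sorts before 'ab', and a RegExp alternation tries its alternatives from left to right. With /a|ab/ the text "ab" only ever matches "a". Below is a sketch of the same single-pass idea that sorts by length instead and escapes RegExp metacharacters. The names escapeRegExp and highlightOnce are hypothetical, and the sketch assumes the target is plain text (it does not do the "&lt;" replacement shown above).

```javascript
// Hypothetical helper: escape characters that are special inside a RegExp.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical single-pass highlighter: one RegExp, one replace call.
function highlightOnce(target, tokens, cls) {
    var pattern = tokens
        .filter(function (t) { return t.length > 0; })          // returns a new array
        .sort(function (a, b) { return b.length - a.length; })  // longest first
        .map(escapeRegExp)
        .join("|");
    if (!pattern) { return target; }
    return target.replace(new RegExp(pattern, "gi"), function (matched) {
        return "<span class=\"" + cls + "\">" + matched + "</span>";
    });
}

console.log("ab".match(/a|ab/g)); // [ 'a' ]
console.log("ab".match(/ab|a/g)); // [ 'ab' ]
console.log(highlightOnce("abc red", ["b", "ab"], "red"));
// <span class="red">ab</span>c red
console.log(highlightOnce("abc red", ["red", "ab"], "red"));
// <span class="red">ab</span>c <span class="red">red</span>
```

Since the text is scanned only once, the inserted markup is never searched again, which sidesteps both issues from the question.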

edited May 14, 2013 at 3:29

answered May 13, 2013 at 20:50
