0

Basically, I need to get the words from the array which are contained in the main string

I have a looping code here but I think there is a one-liner to do the trick. I need the code to be optimized not only in code length but also in performance.

Thanks

var aValidWords = ["ex", "exes", "expert", 
                   "experts", "expertise", "sex", "sexes", 
                   "exchange", "change", "changes"];
var sMainWord = "expertsExchange";
var aPossibleWords = new Array();

var sMainWordLower = sMainWord.toLowerCase();
for(i=0; i < aValidWords.length; i++){
    var sCurrentWord = aValidWords[i].toLowerCase();
    if(sMainWordLower.indexOf(sCurrentWord) != -1){
        aPossibleWords.push(aValidWords[i]);
    }
}

document.write(aPossibleWords.join("<br />"));
6
  • My first thought was to convert the array into a regex. But that will just find the matching substrings, not tell you which valid words were found. It would be OK if there weren't overlapping substrings in the valid words, but there are lots of them. Commented Dec 23, 2012 at 8:55
  • You could create a kind of mapping table. Since "experts" contains "ex", "expert" and "experts" itself, you would only have to process "experts" to know three words are valid. Commented Dec 23, 2012 at 9:46
  • @asgoth thing is, the sMainWord will be changed from time to time. Commented Dec 23, 2012 at 10:01
  • I know, but it is independent from the main word. Let's say you have such mapping table (pseudo): (experts -> expert, expert, ex), (expertise -> expert, ex), (exchange -> change), ... Then you would only have to traverse the largest words first and if any of these largest words match, you can remove the smaller words from the mapping table too. Commented Dec 23, 2012 at 10:13
  • 2
    Unless that mapping table comes from a server, no ;) Otherwise one small optimization: replace new Array() by [] Commented Dec 23, 2012 at 10:32

3 Answers 3

3

I compared three possible implementations:

Alternative 1 - Using for loop:

function alternative1(aValidWords, sMainWordLower) {
    var aPossibleWords1 = [];
    for(i=0; i < aValidWords.length; i++){
        if(sMainWordLower.indexOf(aValidWords[i]) != -1){
            aPossibleWords1.push(aValidWords[i]);
        }
    }
    return aPossibleWords1;
}

Alternative 2 - Using jQuery grep function:

function alternative2(aValidWords, sMainWordLower) {
    return $.grep(aValidWords, function(word) {
              return sMainWordLower.indexOf(word) != -1;                    
         }
    )
}

Alternative 3 - Using JavaScript native filter method (IE9, Chrome, Firefox, Opera, Safari):

function alternative3(aValidWords, sMainWordLower) {
    return aValidWords.filter(function(word) {
              return sMainWordLower.indexOf(word) != -1;                    
         }
    )
}

I measured the execution times with Chrome Profile tool. Each alternative was executed 10 times with array of random milion words. The results are:

  • Alternative 1: 20 executions -> 21,68s
  • Alternative 2: 20 executions -> 26,31s
  • Alternative 3: 20 executions -> 34,66s

I was surprised by the fact, that native JavaScript Filter function was so slow.

If you wan't to measure execution times by yourself, there is jsFiddle - it will take some time for script to finish.

In general those three alternatives are easiest. If those execution times suits you, use one of them, otherwise the answer of @Pumbaa80 is the right one.

[UPDATE]

For the explanation of the results (why JQuery grep funcion is faster then native JavaScript filter function) please take a look at this question/answer. jsFiddle code was also ported to jsPerf thanks to @Alexander.

Sign up to request clarification or add additional context in comments.

1 Comment

I once thought of that but i was afraid the performance would not be optimal. +1 for shorter code though;
1

I think your code is decent.

If you're really worried about performance, you may want to try some sophisticated method like Boyer-Moore. But, if you just have a handful of patterns and relatively short strings, then the initialization overhead is higher than the benefit, so you should stick with the simple approach. See Wikipedia's comparison table http://en.wikipedia.org/wiki/String_searching_algorithm#Single_pattern_algorithms

Comments

1

This loop is definitely more concise than what you have, but it'd be worth running some tests in various browsers to find out which is faster. I imagine Regex matching is probably faster, but I'm curious whether there's any performance penalty for compiling the Regex.

for(var i=0; i<aValidWords.length; i++) {
    if (new RegExp(aValidWords[i], 'i').test(sMainWord))
        aPossibleWords.push(aValidWords[i]);
}

1 Comment

well, using regular expressions would be slower, but +1 for the shorter code :-)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.