0

I'm trying to write code that takes a sentence as a parameter, splits that sentence into an array of words, and then loops over them to check whether any of these words matches a word in some other arrays.

In the example below, I have a sentence that contains the word "ski". This means that the return value should be categories.type3.

How can I make the loop check this? Could I have a function that switches between the different categories (i.e., if a word is not in action, look in adventure, and so on)?

var categories = {

    type1: "action",
    type2: "adventure",
    type3: "sport"
}

var Sentence = "This sentence contains the word ski";

var sport = ["soccer", "tennis", "Ski"];
var action = ["weapon", "explosions"];
var adventure = ["puzzle", "exploring"];

var myFreeFunc = function (Sentence) {

    for (var i = 0; i < arrayLength; i++) {

        if (typeArr[i] == word) {

        }
    }
}
5
  • a word may not be so obvious as you think. Is "mountain bike" a word? Commented Nov 5, 2014 at 13:02
  • You are right of course. In this scenario "mountain" and "bike" are two different words. Commented Nov 5, 2014 at 13:06
  • have you considered using regex? Commented Nov 5, 2014 at 13:09
  • I'm not familiar with such a thing but I can have a look at it. I'm a js-novice. Commented Nov 5, 2014 at 13:12
  • @Goodword if he did, he'd have two problems... Commented Nov 5, 2014 at 13:17

3 Answers

3

You appear to want to know which categories match the sentence.

To start with, get rid of the meaningless type1 etc. identifiers and rearrange your fixed data into objects that directly represent it: a Map of key/value pairs, where each key is a category name and each value is a Set of the keywords associated with that category:

var categories = new Map([
    ['action', new Set(['weapon', 'explosions'])],
    ['adventure', new Set(['puzzle', 'exploring'])],
    ['sport', new Set(['soccer', 'tennis', 'ski'])]
]);

[NB: Set and Map are new ES6 features. Polyfills are available]

You now have the ability to iterate over the categories map to get the list of categories, and over the contents of each category to find the key words:

function getCategories(sentence) {
    var result = new Set();
    var words = new Set(sentence.toLowerCase().split(/\b/g)); /* "\b" matches a word boundary */
    categories.forEach(function(wordset, category) {
        wordset.forEach(function(word) {
             if (words.has(word)) {
                 result.add(category);
             }
        });
    });
    return result.values();  // NB: Iterator interface
}
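
As a usage sketch: result.values() returns an iterator rather than an array, so the caller has to consume it somehow. The example below repeats the data and function from above so that it runs on its own. Using Array.from (another ES6 feature) to drain the iterator is my assumption about how a caller might want to do this, not the only option.

```javascript
// Same data and function as above, repeated so the snippet is self-contained.
var categories = new Map([
    ['action', new Set(['weapon', 'explosions'])],
    ['adventure', new Set(['puzzle', 'exploring'])],
    ['sport', new Set(['soccer', 'tennis', 'ski'])]
]);

function getCategories(sentence) {
    var result = new Set();
    var words = new Set(sentence.toLowerCase().split(/\b/g));
    categories.forEach(function(wordset, category) {
        wordset.forEach(function(word) {
            if (words.has(word)) {
                result.add(category);
            }
        });
    });
    return result.values();
}

// Array.from drains the iterator into a plain array.
var matched = Array.from(getCategories("This sentence contains the word ski"));
console.log(matched); // [ 'sport' ]

// More than one category can match; order follows the Map's insertion order.
var both = Array.from(getCategories("Ski past the explosions"));
console.log(both); // [ 'action', 'sport' ]
```

Because the result is built as a Set, a category is reported once even if several of its keywords appear in the sentence.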

NB: I've avoided for .. of because it's not possible to polyfill that, whereas Set.prototype.forEach and Map.prototype.forEach can be.


4 Comments

This looks interesting and complete. I´ll have a go at this one too. Thank you
i like this ECMAScript 6 (Map and Set) approach - just make sure, the target environment supports it (as well as forEach and .has()).
yep, just empathising it :)
@BananaAcid ITYM "emphasising" ;-)
1

I would rewrite the code (you should always combine var statements).

I've added a small snippet showing how I would rewrite the function, just as an example of how you could iterate over your data. Of course, you should check out the other posts to optimise this code snippet (e.g. to handle multiple spaces).

// make sure your dictionary contains lower-case words
var categories = {
    action: ["weapon", "explosions"],
    adventure: ["puzzle", "exploring"],
    sport: ["soccer", "tennis", "ski"]
};

var myFreeFunc = function myFreeFunc(Sentence) {

    // iterates over all keys on the categories object
    for (var key in categories) {

        // convert the sentence to lower case and split it on spaces
        var words = Sentence.toLowerCase().split(' ');

        // iterates the positions of the words-array            
        for (var wordIdx in words)
        {
            // output debug infos
            console.log('test:', words[wordIdx], categories[key], categories[key].indexOf(words[wordIdx]) != -1, '('+categories[key].indexOf(words[wordIdx])+')');

            // lets the array function 'indexOf' check for the word on position wordIdx in the words-array
            if (categories[key].indexOf(words[wordIdx]) != -1 ) {
                // output the found key
                console.log('found', key);

                // return the found key and stop searching by leaving the function
                return key;
            }

        }//-for words


    }//-for categories

    // nothing found while iterating categories with all words
    return null;
}

Stripped-down version of the function (no comments, no extra spaces, no console.log):

var myFreeFunc = function myFreeFunc(Sentence) {
    for (var key in categories) {
        var words = Sentence.toLowerCase().split(' ');          
        for (var wordIdx in words)
        {
            if (categories[key].indexOf(words[wordIdx]) != -1 ) {
                return key;
            }
        }
    }
    return null;
}

Topics covered in the comments, accumulated here:

  • check if the Object really owns the property: obj.hasOwnProperty(prop)
  • split the string on word boundaries, as mentioned by Alnitak (using RegExp): /\b/g
  • collect all matching categories when more than one matches

Snippet:

var myFreeFunc = function myFreeFunc(Sentence) {
    var result = []; // collection of results.
    for (var key in categories) {
        if (categories.hasOwnProperty(key)) { // check if it really is an owned key
            var words = Sentence.toLowerCase().split(/\b/g);  // splitting on word bounds        
            for (var wordIdx in words)
            {
                if (categories[key].indexOf(words[wordIdx]) != -1 ) {
                    result.push(key);
                }
            }
        }
    }
    return result;
}
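
A possible variation on the snippet above, offered as a sketch rather than a drop-in replacement: it walks the words with an index-based loop instead of for .. in, splits the sentence only once, and stops checking a category after its first hit so that each category is reported at most once. The name matchCategories and the dictionary layout are assumptions made for this example.

```javascript
// Assumed dictionary layout: category name -> lower-case keywords.
var categories = {
    action: ["weapon", "explosions"],
    adventure: ["puzzle", "exploring"],
    sport: ["soccer", "tennis", "ski"]
};

// Hypothetical name; returns every category with at least one matching word.
var matchCategories = function matchCategories(sentence) {
    var result = [];
    // Split once, outside the category loop.
    var words = sentence.toLowerCase().split(/\b/g);
    for (var key in categories) {
        if (categories.hasOwnProperty(key)) {
            for (var i = 0; i < words.length; i++) {
                if (categories[key].indexOf(words[i]) !== -1) {
                    result.push(key);
                    break; // one hit is enough for this category
                }
            }
        }
    }
    return result;
};

console.log(matchCategories("Ski and tennis, then explosions")); // [ 'action', 'sport' ]
```

Without the break, "sport" would be pushed twice for this sentence, once for "ski" and once for "tennis".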

10 Comments

This one looks very nice and I can actually understand most of it =)
it's actually incorrect, though - it's using for .. in to iterate over an array.
also, what if more than one category is matched?
var result = []; and in case of a return, do a result.push(key);, and at the end return result; .. but thats the usual - left to the reader - exercise. i actually removed it to make the code cleaner to read.
for .. in: objects may be addressed like arrays. obj[accessor] as well as obj.accessor - and sporting the same iteration interfaces. btw for .. in support is broader
0

One simple way would be to do it like this:

function determineCategory(word){

    var dictionnary = {
       // I assume here you don't need category1 and such

        action: ["weapon", "explosions"],
        adventure: ["puzzle", "exploring"],
        sport: ["soccer", "tennis", "ski"]
    }
    var categories = Object.keys(dictionnary);
    for(var i = 0; i<categories.length; i++){
        for(var j = 0; j<dictionnary[categories[i]].length; j++){
            var wordCompared = dictionnary[categories[i]][j];
            if(wordCompared == word){
                return categories[i];
            }
        }
    }
    return "not found"; 
}

var sentence = "This sentence contains the word ski";
var words = sentence.split(" "); // simple separation into words
var result = [];
for(var i=0; i<words.length; i++){
    result[i] = determineCategory(words[i]);
}

A few notes on this approach:

  • it requires you to change your existing structure (I don't know if that's possible)
  • it doesn't do much for your sentence splitting (it just uses the white space). For a more clever approach, see Alnitak's answer, or look into tokenization/lemmatization methods.
  • it is up to you to determine what to do when a word doesn't belong to a category (right now, it just stores "not found").
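
To illustrate the last two notes, here is a rough sketch that splits on runs of non-word characters, lower-cases the sentence, and keeps only the categories that were actually found instead of storing "not found". The names dictionary, findCategory and categoriesIn are made up for this example, and the splitting is still far from real tokenization.

```javascript
// Hypothetical dictionary; keywords are stored in lower case.
var dictionary = {
    action: ["weapon", "explosions"],
    adventure: ["puzzle", "exploring"],
    sport: ["soccer", "tennis", "ski"]
};

// Returns the category containing the word, or null when there is none.
function findCategory(word) {
    var names = Object.keys(dictionary);
    for (var i = 0; i < names.length; i++) {
        if (dictionary[names[i]].indexOf(word) !== -1) {
            return names[i];
        }
    }
    return null;
}

// Splits on runs of non-word characters, so punctuation does not stick to words.
function categoriesIn(sentence) {
    var words = sentence.toLowerCase().split(/\W+/);
    var found = [];
    for (var i = 0; i < words.length; i++) {
        var category = findCategory(words[i]);
        // Skip unmatched words and categories already recorded.
        if (category !== null && found.indexOf(category) === -1) {
            found.push(category);
        }
    }
    return found;
}

console.log(categoriesIn("This sentence contains the word Ski!")); // [ 'sport' ]
```

Returning null for an unmatched word, rather than a string such as "not found", keeps the "no category" case from being confused with a real category name.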
