2

Hi I have a very long list of key value pairs in json key:value, key:value and so on

car <--> wheel
wheel <--> tyre
bed <--> sheets
guitar <--> strings
guitar <--> pickup
tyre <--> rubber

What I want is to group all relations into arrays no matter how distant like this

[car, wheel, tyre, rubber]
[guitar, strings, pickup]
[bed, sheets]

What is an efficient way to do this with Javascript?

19
  • 1
    Why the java tag? what does this have to do with programming in the Java programming language? Commented Aug 20, 2012 at 22:27
  • What have you tried? Do you have some code which you're having trouble with? Commented Aug 20, 2012 at 22:28
  • 1
    What do you mean "in json". Do you mean you have an array of single-property objects? And is it actual JSON - a string that will need to be parsed - or an array or object? Please show your actual input. @elclanrs: "tyre" is the correct spelling where I live... Commented Aug 20, 2012 at 22:29
  • 1
    What would you want to do if the item "truck -> wheel" was added to your sample input above? Can "wheel" be in two output arrays at once? (Also, I don't think people are actually angry about the Java thing - at least I'm not. We're just trying to make sure the appropriate tags apply, which is why I removed the Java tag. If you'd like to put it back that's fine, but perhaps edit your question to ask for a solution in either language...) Commented Aug 20, 2012 at 22:38
  • 1
    If you really need an efficient algorithm, what you're trying to do is called "partitioning an undirected and unweighted graph". Efficient and correct language agnostic algorithms exist that you don't need to reinvent. Commented Aug 20, 2012 at 23:57

2 Answers 2

1

First of all, I would store the relationships as arrays so that you can have duplicate "keys." Key methods: an initial dictionary including every word related to each individual word; a recursive chain expander using map and reduce; filtering chains based on equivalency.

Array.prototype.getUnique = function(){
   var u = {}, a = [];
   for(var i = 0, l = this.length; i < l; ++i){
      if(u.hasOwnProperty(this[i])) {
         continue;
      }
      a.push(this[i]);
      u[this[i]] = 1;
   }
   return a;
}
var links = {};
var pairs = [
    ["car", "wheel"],
    ["wheel", "tyre"],
    ["bed", "sheets"],
    ["guitar", "strings"],
    ["guitar", "pickup"],
    ["rubber", "tyre"],
    ["truck", "wheel"],
    ["pickup", "car"]
];
pairs.map(function(pair) {
    links[pair[0]] = links[pair[0]] || [];
    links[pair[1]] = links[pair[1]] || [];

    links[pair[0]].push(pair[1]);
    links[pair[1]].push(pair[0]);
});
var append = function(list) {
    var related = list.map(function(item) {
        return links[item];
    }).reduce(function(listA, listB) {
        return listA.concat(listB);
    }).filter(function(item) {
        // make sure related only includes new links
        return list.indexOf(item) == -1
    }).getUnique();

    return related.length ? append(list.concat(related)) : list.concat(related);
};
var branches = [];
for( var word in links ) {
    branches.push(append(links[word].concat(word)));
}
var compareArrays = function(listA, listB) {
    if( listA.length != listB.length ) return false;
    return listA.map(function(element) {
        if( listB.indexOf(element) == -1 ) return 0;
        return 1;
    }).filter(function(el) {
        return el == 1;
    }).length == listA.length;
};
var _branches = branches;
var chains = branches.filter(function(branch1, i) {     
    var isUnique = _branches.filter(function(branch2) {
        // are they equivalent
        return compareArrays(branch1, branch2);
    }).length == 1; 
    delete _branches[i];
    return isUnique;
});
Sign up to request clarification or add additional context in comments.

7 Comments

this seems to give me 4 empty arrays and an array containing just rubber jsfiddle.net/r4tdN
thanks for the heads up, it's now fully working - at least with the example.
Sometimes the relationships are quite deep. How do I know how many times to loop through match_chain so that it does not miss these out? jsfiddle.net/smghf
It cycles through all chains including either of the pair in the relationship. I don't understand the issue.
So are you saying the two arrays should be merged?
|
0

I'd go with a map of words, linking the sets they are currently in. The map (a javascript object) with nearly O(1) runtime for accessing a key should help the performance. Start with the same format as proposed by @matt3141:

var pairs = [
    ["car", "wheel"],
    ["wheel", "tyre"],
    ["bed", "sheets"],
    ["guitar", "strings"],
    ["guitar", "pickup"],
    ["rubber", "tyre"],
    ["truck", "wheel"],
    ["pickup", "car"]
];

var setsByWord = {};
for (var i=0; i<pairs.length; i++) {
    var pair = pairs[i];
    if (pair[0] in setsByWord && pair[1] in setsByWord) {
        // both words are already known
        if (setsByWord[pair[0]] === setsByWord[pair[1]]) {
             ; // We're lucky, they are in the same set
        } else {
             // combine the two sets
             var sets = [setsByWord[pair[0]], setsByWord[pair[1]]];
             var larger = sets[1].length > sets[0].length ? sets[1] : sets[0],
                 smaller = sets[+(larger===sets[0])];
             for (var j=0; j<smaller.length; j++)
                 setsByWord[smaller[j]] = larger;
             Array.prototype.push.apply(larger, smaller);
        }
    } else {
        // add the missing word to the existing set
        // or create a new set
        var set = setsByWord[pair[0]] || setsByWord[pair[1]] || [];
        if (!(pair[0] in setsByWord)) {
            set.push(pair[0]);
            setsByWord[pair[0]] = set;
        }
        if (!(pair[1] in setsByWord)) {
            set.push(pair[1]);
            setsByWord[pair[1]] = set;
        }
    }
}
return setsByWord;

This will split your graph in its connected components (In the setsByWord object these component arrays are indexed by the nodes):

> var results = [];
> for (var word in setsByWord)
>     if (results.indexOf(setsByWord[word])<0)
>         results.push(setsByWord[word]);
> return results;

[
    ["car","wheel","tyre","rubber","truck","guitar","strings","pickup"],
    ["bed","sheets"]
]

If you have a directed graph, and want arrays of all successors by word, you could use this:

var pairs = […],
    graph = pairs.reduce(function(map, pair) {
         (map[pair[0]] || (map[pair[0]] = [])).push(pair[1]);
         return map;
    }, {});

var successors = {};
for (var word in graph) (function getSuccessors(word) {
    if (word in successors)
        return successors[word];
    successors[word] = [true]; // some marker against circles
    return successors[word] = word in graph
        ? [].concat.apply(graph[word], graph[word].map(getSuccessors))
        : [];
})(word);
return successors;

If you are sure to have no circles in the graph and only want lists for the beginners of paths, you might add this:

var results = [];
for (var word in successors)
    for (var i=0; word in successors && i<successors[word].length; i++)
        delete successors[successors[word][i]];
for (var word in successors)
    results.push([word].concat(successors[word]));
return results;

// becomes:
[
   ["bed","sheets"],
   ["guitar","strings","pickup","car","wheel","tyre"],
   ["rubber","tyre"],
   ["truck","wheel","tyre"]
]

2 Comments

It appears the OP would rather only have the unique sets, you can see how I implemented this with compareArrays, branches.filter, etc.
In my first solution the sets are unique, with the second approach this if not a must of course - the OP needs to explain his needs more exact.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.