I need your expertise on the following situation.

I have a collection as such:

"array" : {
    "item" : 1,
    "1" : [100, 130, 255],
}

"array" : {
    "item" : 2,
    "1" : [0, 70, 120],
}

"array" : {
    "item" : 3,
    "1" : [100, 90, 140],

}

I am querying this collection as such:

db.test.find({"array.1" : {$in : [100, 80, 140]}});

This returns items 1 and 3, since $in matches any document whose array contains at least one of the given values. However, I would like the results sorted so that the documents with the most matching numbers come first. The result should be items 3 and 1, in that order.

I could fetch the results and sort them on the client with a k-nearest-neighbor algorithm, but that seems undesirable with huge datasets (or is it?). Does MongoDB have any built-in function for this? I am using Java; are there any algorithms that would do this fast enough? Any help is appreciated.

Thanks.

1 Answer

You can do this with the aggregation framework, although it's not easy. The trouble is that the aggregation framework has no $in operator that can be used inside expressions such as $cond, so you have to programmatically compare each element of the array against every target value, which gets very messy. Edit: reordered so that the $match comes first, in case $in helps you filter out a good portion of the documents.

db.test.aggregate(
  {$match:{"array.1":{$in:[100, 140,80]}}}, // filter to the ones that match
  {$unwind:"$array.1"}, // unwinds the array so we can match the items individually
  {$group: { // groups the array back, but adds a count for the number of matches
    _id:"$_id", 
    matches:{
      $sum:{
        $cond:[
          {$eq:["$array.1", 100]}, 
          1, 
          {$cond:[
            {$eq:["$array.1", 140]}, 
            1, 
            {$cond:[
              {$eq:["$array.1", 80]}, 
              1, 
              0
              ]
            }
            ]
          }
          ]
        }
      }, 
    item:{$first:"$array.item"}, 
    "1":{$push:"$array.1"}
    }
  }, 
  {$sort:{matches:-1}}, // sorts by the number of matches descending
  {$project:{matches:1, array:{item:"$item", 1:"$1"}}} // rebuilds the original structure
);

outputs:

{
"result" : [
    {
        "_id" : ObjectId("50614c02162d92b4fbfa4448"),
        "matches" : 2,
        "array" : {
            "item" : 3,
            "1" : [
                100,
                90,
                140
            ]
        }
    },
    {
        "_id" : ObjectId("50614bb2162d92b4fbfa4446"),
        "matches" : 1,
        "array" : {
            "item" : 1,
            "1" : [
                100,
                130,
                255
            ]
        }
    }
],
"ok" : 1
}

You can leave the matches field out of the result if you leave it out of the $project at the end.
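
If you end up ranking on the client instead, the same idea (count how many of the target values each document contains, then sort by that count) can be sketched in plain JavaScript. This is only an illustration under assumptions: countMatches and rankByMatches are hypothetical helper names, the documents are the sample data from the question held in memory, and nothing here talks to MongoDB.

```javascript
// Hypothetical helper: count how many entries of `values` appear in `wanted`.
function countMatches(values, wanted) {
  const wantedSet = new Set(wanted);
  return values.filter((v) => wantedSet.has(v)).length;
}

// Hypothetical helper: keep documents with at least one match and
// order them by match count, highest first.
function rankByMatches(docs, wanted) {
  return docs
    .map((doc) => ({ ...doc, matches: countMatches(doc.array["1"], wanted) }))
    .filter((doc) => doc.matches > 0)
    .sort((a, b) => b.matches - a.matches);
}

// Sample data copied from the question (assumed to be already fetched).
const docs = [
  { array: { item: 1, "1": [100, 130, 255] } },
  { array: { item: 2, "1": [0, 70, 120] } },
  { array: { item: 3, "1": [100, 90, 140] } },
];

const ranked = rankByMatches(docs, [100, 140, 80]);
console.log(ranked.map((doc) => doc.array.item));
```

For the sample data this yields items 3 and 1, in that order, which is the same ordering the pipeline above produces.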

3 Comments

Hey, thanks @Stennie -- maybe I should put in a request for $in functionality in the $cond expressions; this would be a lot cleaner!
There doesn't appear to be a request for $in yet, so please do add one.
Related feature suggestion in the MongoDB issue tracker, if anyone wants to watch or vote on this: SERVER-7162.
