seasons.json

/* 1 */
{
  "_id" : "unique_1",
  "spring" : [{
      "fruit" : "mango",
      "person_id" : [101.0, 102.0, 103.0, 104.0]
    }, {
      "fruit" : "banana",
      "person_id" : [151.0, 152.0, 153.0, 154.0]
    }],
  "summer" : [{
      "fruit" : "mango",
      "person_id" : [201.0, 202.0, 203.0, 204.0]
    }, {
      "fruit" : "banana",
      "person_id" : [251.0, 252.0, 253.0, 254.0]
    }],
  "fall" : [{
      "fruit" : "mango",
      "person_id" : [301.0, 302.0, 303.0, 304.0]
    }, {
      "fruit" : "banana",
      "person_id" : [351.0, 352.0, 353.0, 354.0]
    }],
  "winter" : [{
      "fruit" : "mango",
      "person_id" : [401.0, 402.0, 403.0]
    }, {
      "fruit" : "banana",
      "person_id" : [451.0, 452.0, 453.0]
    }]
}

/* 2 */
{
  "_id" : "unique_2",
  "spring" : [{
      "fruit" : "banana",
      "person_id" : [151.0, 152.0, 153.0, 154.0]
    }],
  "summer" : [{
      "fruit" : "mango",
      "person_id" : [201.0, 202.0, 203.0, 204.0]
    }, {
      "fruit" : "banana",
      "person_id" : [251.0, 252.0, 253.0, 254.0]
    }],
  "fall" : [{
      "fruit" : "banana",
      "person_id" : [351.0, 352.0, 353.0, 354.0]
    }],
  "winter" : [{
      "fruit" : "mango",
      "person_id" : [401.0, 402.0, 403.0]
    }, {
      "fruit" : "banana",
      "person_id" : [451.0, 452.0, 453.0]
    }]
}

The JSON records above show which people ate mango and which ate banana in each season.

Here's what I want to find, given that I know the _id (primary key) of the record in advance:

1) all the unique person_id values in the range 101 - 350
2) the person_id values of people eating only mango
3) the total number of people in a record eating either mango or banana.

  • Please show your effort (i.e. code) first. Commented Apr 17, 2015 at 12:45
  • I can find a person_id range within a season for a particular fruit: db.season.find({ "spring.person_id" : { "$elemMatch" : { "$gt" : 101, "$lt" : 104 } } }); Commented Apr 17, 2015 at 12:55

1 Answer

With a schema like this, queries of the kind you need are going to be pretty difficult to run. Consider changing the schema so that each document has one main key, say seasons, holding an array with one element per season (spring, summer, fall and winter), where each element lists the fruits eaten in that season. Change the schema to this:

/* 1 */
{
    "_id" : "unique_1",
    "seasons" : [ 
        {
            "name" : "spring",
            "fruits" : [ 
                {
                    "name" : "mango",
                    "person_id" : [ 
                        101, 
                        102, 
                        103, 
                        104
                    ]
                }, 
                {
                    "name" : "banana",
                    "person_id" : [ 
                        151, 
                        152, 
                        153, 
                        154
                    ]
                }
            ]
        }, 
        {
            "name" : "summer",
            "fruits" : [ 
                {
                    "name" : "mango",
                    "person_id" : [ 
                        201, 
                        202, 
                        203, 
                        204
                    ]
                }, 
                {
                    "name" : "banana",
                    "person_id" : [ 
                        251, 
                        252, 
                        253, 
                        254
                    ]
                }
            ]
        }, 
        {
            "name" : "fall",
            "fruits" : [ 
                {
                    "name" : "mango",
                    "person_id" : [ 
                        301, 
                        302, 
                        303, 
                        304
                    ]
                }, 
                {
                    "name" : "banana",
                    "person_id" : [ 
                        351, 
                        352, 
                        353, 
                        354
                    ]
                }
            ]
        }, 
        {
            "name" : "winter",
            "fruits" : [ 
                {
                    "name" : "mango",
                    "person_id" : [ 
                        401, 
                        402, 
                        403
                    ]
                }, 
                {
                    "name" : "banana",
                    "person_id" : [ 
                        451, 
                        452, 
                        453
                    ]
                }
            ]
        }
    ]
}
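
If you already have documents in the original layout, the reshaping described above could be sketched roughly as follows. This is only an illustration: reshapeSeasons is a hypothetical helper name, and it assumes the four season keys and the fruit / person_id fields shown in the question.

```javascript
// Hypothetical helper: reshape one document from the original layout
// (season names as top-level keys) into the "seasons" array layout.
function reshapeSeasons(doc) {
    var seasonNames = ["spring", "summer", "fall", "winter"];
    var seasons = [];
    seasonNames.forEach(function (season) {
        // Skip seasons that are missing from the document.
        if (!Array.isArray(doc[season])) {
            return;
        }
        seasons.push({
            "name": season,
            // Rename "fruit" to "name" and keep the person_id array as is.
            "fruits": doc[season].map(function (entry) {
                return { "name": entry.fruit, "person_id": entry.person_id };
            })
        });
    });
    return { "_id": doc._id, "seasons": seasons };
}
```

In the mongo shell you could then loop over db.season.find() and write each reshaped document back, for example with db.season.save(reshapeSeasons(doc)) in the shell of that era. Try it on a copy of the collection first.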

With this schema it becomes much easier to run the following aggregation queries:

1) all the person_id ranging from 101 - 350 in which person_id is unique

var pipeline1 = [
    { "$match": { "_id": "unique_1" } },
    { "$unwind": "$seasons" },
    { "$unwind": "$seasons.fruits" },
    { "$unwind": "$seasons.fruits.person_id" },
    {
        "$match": {
            "seasons.fruits.person_id": {
                "$gte": 101,
                "$lte": 350
            }
        }
    },    
    {
        "$group": {
            "_id": 0,
            "person_ids": {
                "$addToSet": "$seasons.fruits.person_id"
            }
        }
    },
    {
        "$project": {
            "_id": 0,
            "person_ids": 1
        }
    }
];

db.season.aggregate(pipeline1);

Output:

/* 1 */
{
    "result" : [ 
        {
            "person_ids" : [ 
                304, 
                253, 
                201, 
                251, 
                301, 
                203, 
                252, 
                204, 
                152, 
                102, 
                202, 
                154, 
                254, 
                101, 
                302, 
                153, 
                104, 
                103, 
                303, 
                151
            ]
        }
    ],
    "ok" : 1
}

2) person_id eating only mango

var pipeline2 = [
    { "$match": { "_id": "unique_1" } },
    { "$unwind": "$seasons" },
    { "$unwind": "$seasons.fruits" },
    { "$unwind": "$seasons.fruits.person_id" },
    {
        "$match": {
            "seasons.fruits.name": "mango"
        }
    },    
    {
        "$group": {
            "_id": 0,
            "person_ids": {
                "$addToSet": "$seasons.fruits.person_id"
            }
        }
    },
    {
        "$project": {
            "_id": 0,
            "person_ids": 1
        }
    }
];

db.season.aggregate(pipeline2);

Output:

/* 1 */
{
    "result" : [ 
        {
            "person_ids" : [ 
                402.0000000000000000, 
                304.0000000000000000, 
                303.0000000000000000, 
                302.0000000000000000, 
                301.0000000000000000, 
                204.0000000000000000, 
                202.0000000000000000, 
                201.0000000000000000, 
                203.0000000000000000, 
                104.0000000000000000, 
                102.0000000000000000, 
                103.0000000000000000, 
                403.0000000000000000, 
                401.0000000000000000, 
                101.0000000000000000
            ]
        }
    ],
    "ok" : 1
}
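
Note that the pipeline above returns everyone who ate mango at least once. In the sample document nobody appears under both fruits, so that is also the "only mango" set. If an id could show up under banana as well, one possible variation (a sketch only, with pipelineOnlyMango as an illustrative name) is to collect the set of fruits per person first and keep the people whose set is exactly mango:

```javascript
var pipelineOnlyMango = [
    { "$match": { "_id": "unique_1" } },
    { "$unwind": "$seasons" },
    { "$unwind": "$seasons.fruits" },
    { "$unwind": "$seasons.fruits.person_id" },
    // One group per person, collecting the distinct fruits they ate.
    {
        "$group": {
            "_id": "$seasons.fruits.person_id",
            "fruits": { "$addToSet": "$seasons.fruits.name" }
        }
    },
    // Keep only people whose fruit set is exactly ["mango"].
    { "$match": { "fruits": ["mango"] } },
    // Fold the remaining ids back into a single array.
    {
        "$group": {
            "_id": null,
            "person_ids": { "$addToSet": "$_id" }
        }
    },
    { "$project": { "_id": 0, "person_ids": 1 } }
];
```

It runs the same way as the others, i.e. db.season.aggregate(pipelineOnlyMango).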

3) total number of person in a record eating fruit either mango or banana.

var pipeline3 = [
    { "$match": { "_id": "unique_1" } },
    { "$unwind": "$seasons" },
    { "$unwind": "$seasons.fruits" },
    { "$unwind": "$seasons.fruits.person_id" },
    {
        "$match": {
            "seasons.fruits.name": {
                "$in": ["mango", "banana"]
            }
        }
    },    
    {
        "$group": {
            "_id": "$_id",
            "count": {
                "$sum": 1
            }
        }
    },
    {
        "$project": {
            "_id": 0,
            "count": 1
        }
    }
];

db.season.aggregate(pipeline3);

Output:

/* 1 */
{
    "result" : [ 
        {
            "count" : 30
        }
    ],
    "ok" : 1
}
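
Keep in mind that this counts every person_id entry, so a person listed in more than one season or under both fruits is counted more than once. If you want distinct people instead, a possible variation (again just a sketch, pipelineDistinctCount is an illustrative name) is to group by person_id before counting:

```javascript
var pipelineDistinctCount = [
    { "$match": { "_id": "unique_1" } },
    { "$unwind": "$seasons" },
    { "$unwind": "$seasons.fruits" },
    { "$unwind": "$seasons.fruits.person_id" },
    {
        "$match": {
            "seasons.fruits.name": { "$in": ["mango", "banana"] }
        }
    },
    // One group per person, so repeats across seasons or fruits collapse.
    { "$group": { "_id": "$seasons.fruits.person_id" } },
    // Count the per-person groups.
    { "$group": { "_id": null, "count": { "$sum": 1 } } },
    { "$project": { "_id": 0, "count": 1 } }
];
```

On the sample document no id repeats, so this should agree with the count above.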

6 Comments

chrisdam - thanks a lot for your detailed answer. But something got lost while pasting the aggregate queries into your post; they are not running properly. I would be very grateful if you could explain any one of the above queries in detail.
@joshi No worries mate. Before you ran the aggregation queries, did you manage to change the schema? Sorry I didn't explain what each aggregation pipeline entailed; there were quite a handful of queries you needed and I assumed you knew the aggregation framework concepts, which is why I delved straight into the implementation.
@chrisdam - yes, I made that change to the db schema, but I am not able to run the aggregation query. I think some brackets are misplaced. I am new to this business and still getting used to the MongoDB framework, so it is not easy for me to walk through your code, especially the aggregation query, but I highly appreciate your speedy responses.
Thanks @joshi. Please read the aggregation framework documentation to grasp the basic concepts; that may help you understand the aggregation above better because it uses the same principles: when dealing with arrays, your aggregation is bound to have some $unwind operations that will allow you to easily do aggregation operations further down the pipeline.
@chrisdam - I found a tutorial at thefourtheye.in/2013/04/… which I found very good for beginners. Once again, thanks chris for your valuable input.