I have a MongoDB collection whose documents have the following format:

{
    "_id" : ObjectId("595f5661f34ae7b2adee31bc"),
    "app_userUpdatedOn" : "2017-03-09T12:01:07.615Z",
    "appId" : 31625,
    "app_lastCommunicatedAt" : "2017-03-09T12:18:53.067Z",
    "currentDate" : "2017-03-09T12:19:28.626Z",
    "objectId" : "58c14850e4b0b2406992b29e",
    "name" : "APPSESSION",
    "action" : "START",
    "installationId" : "98088f6641a0fa79",
    "userName" : "98088f6641a0fa79",
    "properties" : [
        [
            "userid",
            "98088f6641a0fa79"
        ],
        [
            "app_os_version",
            "6.0.1"
        ],
        [
            "app_installAt",
            "2017-03-09T12:01:01.307Z"
        ],
        [
            "app_model",
            "SM-J210F"
        ],
        [
            "app_lastCommunicatedAt",
            "2017-03-09T12:18:53.067Z"
        ],
        [
            "app_carrier",
            "Jio 4G"
        ],
        [
            "app_counter",
            1
        ],
        [
            "app_brand",
            "samsung"
        ],
        [
            "app_lib_version",
            "1.0"
        ],
        [
            "app_app_version",
            "3.0.2"
        ],
        [
            "app_os",
            "Android"
        ]
    ],
    "date" : "2017-03-09"
}
{
    "_id" : ObjectId("595f5661f34ae7b2adee31bd"),
    "app_userUpdatedOn" : "2017-02-05T07:38:32.866Z",
    "appId" : 31625,
    "app_lastCommunicatedAt" : "2017-03-09T08:09:05.342Z",
    "currentDate" : "2017-03-09T12:19:28.806Z",
    "objectId" : "58c14850e4b06ec88ecaa9c6",
    "name" : "APPINSTALL",
    "action" : "START",
    "installationId" : "eef436554fbdf4ac",
    "userName" : "eef436554fbdf4ac",
    "properties" : [
        [
            "userid",
            "eef436554fbdf4ac"
        ],
        [
            "app_os_version",
            "5.1"
        ],
        [
            "app_installAt",
            "2017-02-05T11:20:49.809Z"
        ],
        [
            "app_model",
            "Micromax Q465"
        ],
        [
            "app_lastCommunicatedAt",
            "2017-03-09T08:09:05.342Z"
        ],
        [
            "app_carrier",
            "JIO 4G"
        ],
        [
            "app_counter",
            1
        ],
        [
            "app_brand",
            "Micromax"
        ],
        [
            "app_lib_version",
            "1.0"
        ],
        [
            "app_app_version",
            "3.0.2"
        ],
        [
            "app_os",
            "Android"
        ]
    ],
    "date" : "2017-03-09"
}

I want to fetch the count and the unique count of the documents, grouped by userName, where currentDate lies between startDate and endDate, name is a given value (e.g. APPSESSION), and properties contains several nested arrays for which I know only the first element, while the second can be any non-null value (like ["app_installAt", "any value instead of null"], ["app_model", "any value instead of null"], and so on).

Previously I wrote a query for the case where both elements of each nested array are known. It is as follows:

db.testing.aggregate(
      [
            {$match: {currentDate: {$gte:"2017-03-01T00:00:00.000Z", $lt:"2017-03-02T00:00:00.000Z"},name:"INSTALL"}},
            {$match: {properties: ["app_os_version","4.4.2"]}},
            {$match: {properties: ["app_carrier","telenor"]}},
            {$match: {properties: ["app_brand","Micromax"]}},
            {$group: {_id: "$userName"}},
            {$count: "uniqueCount"}
      ]
);

But I am unable to find the documents when I know only the element at index 0 of each nested array in properties.

Please do Help.

Thanks in Advance.... :)

  • Can I point out that the only way the data would look like this is because of an error in the code writing it in the first place. Would it not be more logical to fix the code writing it incorrectly instead? This is completely the wrong way to store this. Commented Jul 8, 2017 at 10:17
  • Yes, I know, but the algorithm writing the data is still the same, and the data is so huge that changing its structure is not feasible either, as it contains terabytes of data. Commented Jul 8, 2017 at 10:43
  • If it's terabytes of data then all the more reason to fix it. Right now you cannot effectively use an index to aid in the query results. If all the criteria were instead based on arrays with keys and values at specific paths, then an index would be far more effective and speed up results. Anyhow, I have already answered the question with the query for the current format. Commented Jul 8, 2017 at 10:51
  • I got the answer to my question; your answer below did the job. I am working on the required structural changes and on migrating the existing data source. Thanks again. Commented Jul 20, 2017 at 12:54

1 Answer

The query for this essentially uses $all for the multiple conditions to match in the array, and then $elemMatch with $eq to match the individual array elements.

For example, to match and count "only" the first document supplied in your question, the parameters would be:

db.testing.find({
  "currentDate": { 
    "$gte": "2017-03-09T00:00:00.000Z",
    "$lt": "2017-03-10T00:00:00.000Z"
  },
  "properties": {
    "$all": [
      { "$elemMatch": { "$eq": ["app_os_version","6.0.1"] } },
      { "$elemMatch": { "$eq": ["app_carrier", "Jio 4G"] } },
      { "$elemMatch": { "$eq": ["app_brand", "samsung"] } }
    ]   
  }
})

With .aggregate() you put the whole query into a single $match stage, as in:

db.testing.aggregate([
  // Select documents in the date range that contain all three [key, value] pairs
  { "$match": {
    "currentDate": { 
      "$gte": "2017-03-09T00:00:00.000Z",
      "$lt": "2017-03-10T00:00:00.000Z"
    },
    "properties": {
      "$all": [
        { "$elemMatch": { "$eq": ["app_os_version","6.0.1"] } },
        { "$elemMatch": { "$eq": ["app_carrier", "Jio 4G"] } },
        { "$elemMatch": { "$eq": ["app_brand", "samsung"] } }
      ]   
    }
  }},
  // One group per distinct userName
  { "$group": { "_id": "$userName" } },
  // Count the groups, i.e. the unique users
  { "$count": "unique_count" }
])

So $elemMatch in this context examines each "inner" array to see whether it matches the supplied condition, which we give as an "array" argument to the $eq operator.

The wrapping $all means that "all" the provided $elemMatch conditions "must" be met in order to fulfill the query conditions. And that is how the selection gets made with this type of structure.
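
To make the selection logic above concrete, here is a minimal plain-JavaScript sketch that mirrors what $all combined with $elemMatch and $eq asks for. The function names pairEquals and matchesAllPairs are made up for this illustration, and the code only emulates the matching rule on an in-memory array; it is not how MongoDB evaluates the query internally.

```javascript
// Emulation only: mirrors the $all + $elemMatch + $eq rule on a plain array.
// pairEquals and matchesAllPairs are hypothetical helper names.

// True when an inner array equals the wanted [key, value] pair element by element.
function pairEquals(inner, wanted) {
  return Array.isArray(inner) &&
    inner.length === wanted.length &&
    inner.every((item, i) => item === wanted[i]);
}

// $elemMatch + $eq: at least one inner array equals the wanted pair.
// $all: that must hold for every wanted pair.
function matchesAllPairs(properties, wantedPairs) {
  return wantedPairs.every(wanted =>
    properties.some(inner => pairEquals(inner, wanted)));
}

// A shortened version of the "properties" array from the first sample document.
const properties = [
  ["app_os_version", "6.0.1"],
  ["app_carrier", "Jio 4G"],
  ["app_brand", "samsung"]
];

console.log(matchesAllPairs(properties, [
  ["app_os_version", "6.0.1"],
  ["app_brand", "samsung"]
])); // true: both pairs are present

console.log(matchesAllPairs(properties, [
  ["app_os_version", "6.0.1"],
  ["app_carrier", "telenor"]
])); // false: the second pair is missing
```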

If you need to match on only one part of a pair, the "inner" match addresses the element of the inner array by its position. So to match on the key alone, the condition would use "0" for the index position, i.e.:

   { "$elemMatch": { "0": "app_os_version" } },
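
Putting the index-position form together with the earlier pipeline, the key-only case from the question could be sketched as below. The helper names buildKeyOnlyFilter and buildPipeline are hypothetical, and the dates, name, and keys are sample values taken from the question. The code only builds the pipeline as a plain object, so it runs without a database; passing it to db.testing.aggregate() is assumed to behave like the hand-written pipeline above.

```javascript
// Hypothetical helper: one $elemMatch per known key, matching on index 0 only,
// so the value at index 1 can be anything.
function buildKeyOnlyFilter(keys) {
  return {
    "$all": keys.map(key => ({ "$elemMatch": { "0": key } }))
  };
}

// Hypothetical helper: assembles the full pipeline for a date range, an event
// name, and a list of property keys whose values are not known.
function buildPipeline(startDate, endDate, name, keys) {
  return [
    { "$match": {
      "currentDate": { "$gte": startDate, "$lt": endDate },
      "name": name,
      "properties": buildKeyOnlyFilter(keys)
    }},
    { "$group": { "_id": "$userName" } },   // one group per distinct userName
    { "$count": "unique_count" }            // number of distinct users
  ];
}

const pipeline = buildPipeline(
  "2017-03-09T00:00:00.000Z",
  "2017-03-10T00:00:00.000Z",
  "APPSESSION",
  ["app_installAt", "app_model"]
);

// Print the pipeline for inspection.
console.log(JSON.stringify(pipeline, null, 2));

// In the mongo shell the pipeline would then be run as:
// db.testing.aggregate(pipeline)
```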

2 Comments

And what about the case in which I know "app_os_version" in the array but not "6.0.1"? I want to take all the results after filtering on the properties array of arrays, where I know only the first element of each inner array and not the second.
@Shashank Use the "index" position inside the $elemMatch. Added to answer to demonstrate.
