Given the following data:

> db.users.find({}, {name: 1, createdAt: 1, updatedAt: 1}).limit(5).pretty()
{
    "_id" : ObjectId("5ec8f74f32973c7b7cb7cce9"),
    "createdAt" : ISODate("2020-05-23T10:13:35.012Z"),
    "updatedAt" : ISODate("2020-08-20T13:37:09.861Z"),
    "name" : "Patrick Jere"
}
{
    "_id" : ObjectId("5ec8ef8a2b6e5f78fa20443c"),
    "createdAt" : ISODate("2020-05-23T09:40:26.089Z"),
    "updatedAt" : ISODate("2020-07-23T07:54:01.833Z"),
    "name" : "Austine Wiga"
}
{
    "_id" : ObjectId("5ed5e1a3962a3960ad85a1a2"),
    "createdAt" : ISODate("2020-06-02T05:20:35.090Z"),
    "updatedAt" : ISODate("2020-07-29T14:02:52.295Z"),
    "name" : "Biasi Phiri"
}
{
    "_id" : ObjectId("5ed629ec6d87382c608645d9"),
    "createdAt" : ISODate("2020-06-02T10:29:00.204Z"),
    "updatedAt" : ISODate("2020-06-02T10:29:00.204Z"),
    "name" : "Chisambwe Kalusa"
}
{
    "_id" : ObjectId("5ed8d21f42bc8115f67465a8"),
    "createdAt" : ISODate("2020-06-04T10:51:11.546Z"),
    "updatedAt" : ISODate("2020-06-04T10:51:11.546Z"),
    "name" : "Wakun Moyo"
}
...

Sample Data

I use the following query to return new_users by months:

db.users.aggregate([
    {
        $group: {
            _id: {$dateToString: {format: '%Y-%m', date: '$createdAt'}},
            new_users: {
                $sum: {$ifNull: [1, 0]}
            }
        }
    }
])

Example result:

[
  {
    "_id": "2020-06",
    "new_users": 125
  },
  {
    "_id": "2020-07",
    "new_users": 147
  },
  {
    "_id": "2020-08",
    "new_users": 43
  },
  {
    "_id": "2020-05",
    "new_users": 4
  }
]

This query returns new_users, active_users and total_users for a specific month:

db.users.aggregate([
    {
        $group: {
            _id: null,
            new_users: {
                $sum: {
                    $cond: [{
                        $gte: ['$createdAt', ISODate('2020-08-01')]
                    }, 1, 0]
                }
            },
            active_users: {
                $sum: {
                    $cond: [{
                        $gt: ['$updatedAt', ISODate('2020-02-01')]
                    }, 1, 0]
                }
            },
            total_users: {
                $sum: {$ifNull: [1, 0]}
            }
        }
    }
])

How can I get the second query to return results by months just like in the first query?

Expected results based on a one-month filter:

[
  { _id: '2020-09', new_users: 0, active_users: 69},
  { _id: '2020-08', new_users: 43, active_users: 219},
  { _id: '2020-07', new_users: 147, active_users: 276},
  { _id: '2020-06', new_users: 125, active_users: 129},
  { _id: '2020-05', new_users: 4, active_users: 4}
]

2 Answers

You can try the aggregation below.

It first counts new users per month, then uses a $lookup to count the active users in the time window for each year-month.

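db.users.aggregate([
  // Stage 1: count new users per calendar month of createdAt.
  // _id is a Date set to the first day of that month.
  {"$group": {
    "_id": {"$dateFromParts": {"year": {"$year": "$createdAt"}, "month": {"$month": "$createdAt"}}},
    "new_users": {"$sum": 1}
  }},
  // Stage 2: for each month, count users whose updatedAt falls in the
  // one-month window just before that month: [start_date, end_date).
  {"$lookup": {
    "from": "users",
    "let": {
      "end_date": "$_id",
      "start_date": {"$dateFromParts": {
        "year": {"$year": "$_id"},
        "month": {"$subtract": [{"$month": "$_id"}, 1]}
      }}
    },
    "pipeline": [
      {"$match": {"$expr": {"$and": [
        {"$gte": ["$updatedAt", "$$start_date"]},
        {"$lt": ["$updatedAt", "$$end_date"]}
      ]}}},
      {"$count": "activeUserCount"}
    ],
    "as": "activeUsers"
  }},
  // Stage 3: format the month as a string and unwrap the single count.
  {"$project": {
    "year-month": {"$dateToString": {"format": "%Y-%m", "date": "$_id"}},
    "new_users": 1,
    "active_users": {"$arrayElemAt": ["$activeUsers.activeUserCount", 0]},
    "_id": 0
  }}
])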

4 Comments

What happens when the month in _id is 1 and you subtract using "$subtract": [{"$month":"$_id"}, 1]? Will $dateFromParts handle it, or will it fail?
What about something like this: {$subtract: [ _id, 2592000000 ]}? The number is a month (30 days) in milliseconds.
@francis - no need for months in milliseconds - $dateFromParts can handle overflow and will adjust the year accordingly. You can verify this on your data.
@turivishal - it will not fail; in this case it adjusts to December of the previous year.
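
The rollover discussed in these comments can be illustrated with a small sketch. It assumes MongoDB 4.0 or later for the out-of-range month handling in $dateFromParts. The names previousMonthExpr, rolledOver and thirtyDaysEarlier are made up for illustration, and the Date arithmetic is only a plain-JavaScript analogue of what the server does, not the server itself.

```javascript
// Expression as it might appear inside a pipeline stage. Month 0 is out of
// range, so $dateFromParts is expected to carry it into December of 2019.
const previousMonthExpr = { $dateFromParts: { year: 2020, month: 0 } };

// Plain-JavaScript analogue of the same carry-over. JavaScript months are
// zero-based, so MongoDB month 0 corresponds to JavaScript month -1.
const rolledOver = new Date(Date.UTC(2020, -1, 1));
console.log(rolledOver.toISOString()); // 2019-12-01T00:00:00.000Z

// A fixed 30-day offset (2592000000 ms) drifts away from calendar months.
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;
const marchFirst = new Date(Date.UTC(2020, 2, 1));
const thirtyDaysEarlier = new Date(marchFirst.getTime() - THIRTY_DAYS_MS);
console.log(thirtyDaysEarlier.toISOString()); // 2020-01-31T00:00:00.000Z, not 2020-02-01
```

The second half shows why calendar arithmetic is usually the safer choice here: a fixed millisecond offset does not land on the first day of the previous month.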
0

You can do the same as in your first query: group by createdAt. There is no need for the $ifNull operator in total_users.
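
Since the linked playground is not reproduced here, the following is only a guess at what such a grouping might look like. The variable name byMonth and the final $sort stage are assumptions added for illustration. The one certain point is that {$ifNull: [1, 0]} always evaluates to 1, because the literal 1 is never null, so a plain {$sum: 1} counts the same documents.

```javascript
// Group users by the year-month of createdAt and count them.
const byMonth = [
  {
    $group: {
      _id: { $dateToString: { format: "%Y-%m", date: "$createdAt" } },
      // Equivalent to $sum: {$ifNull: [1, 0]}; the same applies to total_users.
      new_users: { $sum: 1 }
    }
  },
  // Assumed ordering (newest month first); not part of the original query.
  { $sort: { _id: -1 } }
];

// In the mongo shell this would be run as: db.users.aggregate(byMonth)
console.log(JSON.stringify(byMonth, null, 2));
```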

Playground


Updated:

  • $facet to group by month and compute both counts
  • $project to concatenate both arrays using $concatArrays
  • $unwind to deconstruct the combined array
  • $group by month to merge the two counts for each month
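
The four steps above could be sketched roughly as follows. This is a reconstruction under assumptions, not the playground code: it assumes active users are counted by the month of updatedAt, and the names monthOf, facetPipeline and all, as well as the final $sort, are made up for illustration.

```javascript
// Hypothetical helper: year-month string for a date field.
const monthOf = (field) => ({ $dateToString: { format: "%Y-%m", date: field } });

const facetPipeline = [
  // 1. Compute both per-month counts side by side.
  {
    $facet: {
      new_users: [{ $group: { _id: monthOf("$createdAt"), new_users: { $sum: 1 } } }],
      active_users: [{ $group: { _id: monthOf("$updatedAt"), active_users: { $sum: 1 } } }]
    }
  },
  // 2. Concatenate the two result arrays into one.
  { $project: { all: { $concatArrays: ["$new_users", "$active_users"] } } },
  // 3. Emit one document per array element.
  { $unwind: "$all" },
  // 4. Merge the counts per month; $sum treats a missing field as 0.
  {
    $group: {
      _id: "$all._id",
      new_users: { $sum: "$all.new_users" },
      active_users: { $sum: "$all.active_users" }
    }
  },
  // Assumed ordering (newest month first).
  { $sort: { _id: -1 } }
];

// In the mongo shell this would be run as: db.users.aggregate(facetPipeline)
console.log(facetPipeline.map((stage) => Object.keys(stage)[0]).join(" -> "));
```

Note that this variant counts a user as active only in the month of their last update, which differs from the rolling window requested in the comments.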

Playground

12 Comments

I've updated the question to include an example result. So active_users should be users where updatedAt is $gt the date in the _id minus 6 months. For example, if _id is 2020-07 then the $cond should be { $gt: ["$updatedAt", ISODate("2020-01-01")] }.
ISODate should be dynamically generated based on the _id.
OK, can you add the expected result based on the documents from your find query?
I've added a link to sample data and expected results based on 1 month filter since the data only goes back 5 months.
I don't fully understand how $project works, but I can't help feeling that it could help. In the first stage, we can grab all the dates, and in the second stage apply a condition based on those dates.
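
One way the dynamically generated cutoff from these comments might be expressed is sketched below. It assumes _id holds a Date for the first day of the month (as produced by a $dateFromParts grouping), not a "%Y-%m" string. The helper windowStart and the names sixMonthStart and cutoff are hypothetical, and the Date arithmetic is only a plain-JavaScript analogue used to check the example from the comment.

```javascript
// Hypothetical helper: expression for "first day of the month in $_id,
// moved back by `months` calendar months".
function windowStart(months) {
  return {
    $dateFromParts: {
      year: { $year: "$_id" },
      month: { $subtract: [{ $month: "$_id" }, months] }
    }
  };
}

// Six-month window, as described in the comment above.
const sixMonthStart = windowStart(6);
console.log(JSON.stringify(sixMonthStart));

// Plain-JavaScript analogue for _id = 2020-07: month 7 minus 6 is month 1.
// JavaScript months are zero-based, hence the extra "- 1".
const cutoff = new Date(Date.UTC(2020, 7 - 1 - 6, 1));
console.log(cutoff.toISOString()); // 2020-01-01T00:00:00.000Z
```

An expression like this could stand in for start_date in a $lookup "let" block, with the comparison adjusted to the intended window.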