I am stuck and confused with my current aggregate expression, and I was hoping for some input or a solution in Mongo itself.
The original data from Mongo (simplified to only the fields I need right now):
[
    {
        'status': 'Cancelled',
        'CIC Package': 'Test Gallery Cafe'
    },
    {
        'status': 'Completed',
        'CIC Package': 'Design Thinking workshop'
    },
    {
        'status': 'Tentative',
        'CIC Package': 'Design Thinking workshop'
    },
    {
        'status': 'Confirmed',
        'CIC Package': 'Product / solution demonstration'
    },
    ....etc
]
In general, there are thousands of records spread over probably 8 'CIC packages', each with one of several statuses (Confirmed, Cancelled, Tentative, Completed), plus other data that I have excluded for now.
The end result I am looking for is something like this:
[
    {
        "_id" : "Test Gallery Café",
        "package" : "Test Gallery Café",
        "status" : [
            { "Cancelled": 1 },
            { "Completed": 1 }
        ]
    },
    {
        "_id" : "Design Thinking workshop",
        "package" : "Design Thinking workshop",
        "status" : [
            { "Cancelled": 3 },
            { "Completed": 2 }
        ]
    },
    {
        "_id" : "Product / solution demonstration",
        "package" : "Product / solution demonstration",
        "status" : [
            { "Completed": 10 },
            { "Cancelled": 3 },
            { "Confirmed": 1 }
        ]
    }
]
So per CIC package (which I renamed to package in the $group) I want a count of each status that exists in the dataset. The statuses and packages are not under my control, so new ones could be added over time. It needs to be a dynamic group.
I came as far as this:
db.reportData.aggregate([
    {
        $project: {
            'CIC package': 1,
            'Status': 1
        }
    },
    {
        $group: {
            _id: '$CIC package',
            package: { $first: '$CIC package' },
            status: { $push: '$Status' }
        }
    }
]).toArray()
which resulted in something like this:
[
    {
        "_id" : "Test Gallery Café",
        "package" : "Test Gallery Café",
        "status" : [
            "Cancelled",
            "Completed"
        ]
    },
    {
        "_id" : "Design Thinking workshop",
        "package" : "Design Thinking workshop",
        "status" : [
            "Cancelled",
            "Cancelled",
            "Cancelled",
            "Completed",
            "Completed"
        ]
    },
    {
        "_id" : "Product / solution demonstration",
        "package" : "Product / solution demonstration",
        "status" : [
            "Completed",
            "Completed",
            "Cancelled",
            "Processing",
            "Cancelled",
            "Cancelled",
            "Completed",
            "Completed",
            "Completed",
            "Completed",
            "Completed",
            "Completed",
            "Completed",
            "Completed",
            "Completed",
            "Tentative"
        ]
    }
]
This is a small extract of a much larger set, but it is a good sample of the result so far.
I have tried $unwind after the last $group, which does create new records that I could possibly group again, but I am getting a bit confused right now. And maybe I am doing it inefficiently.
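For reference, the $unwind-then-regroup idea could look roughly like the sketch below. It is untested against the real collection and assumes the field names used in the pipeline above ('CIC package' and 'Status'); the variable name unwindPipeline is made up for illustration.

```javascript
// Untested sketch. Field names 'CIC package' and 'Status' are assumed to match
// the real documents; unwindPipeline is a hypothetical name.
var unwindPipeline = [
  { $project: { 'CIC package': 1, 'Status': 1 } },
  // Existing stage: one document per package, every status pushed into an array.
  {
    $group: {
      _id: '$CIC package',
      status: { $push: '$Status' }
    }
  },
  // One document per (package, status occurrence).
  { $unwind: '$status' },
  // Count the occurrences of each status within each package.
  {
    $group: {
      _id: { package: '$_id', status: '$status' },
      count: { $sum: 1 }
    }
  },
  // Fold the counts back into one document per package.
  {
    $group: {
      _id: '$_id.package',
      package: { $first: '$_id.package' },
      status: { $push: { status: '$_id.status', count: '$count' } }
    }
  }
];

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== 'undefined') {
  printjson(db.reportData.aggregate(unwindPipeline).toArray());
}
```

With this shape each entry in status is a { status, count } pair rather than an object keyed by the status name. The $push followed directly by $unwind also cancel each other out, which is probably where the inefficiency comes from.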
I think I am almost there but I would love some input.
Any ideas?
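One possible direction, as a hedged sketch rather than a verified solution: group on the (package, status) pair first to get the counts, then group again on the package alone. The dynamic keys rely on $arrayToObject, which as far as I know needs MongoDB 3.4.4 or later and a string value for every Status. The field names are again assumed to be 'CIC package' and 'Status', and countPipeline is a made-up name.

```javascript
// Untested sketch. Assumes every document has string 'CIC package' and
// 'Status' fields; countPipeline is a hypothetical name.
var countPipeline = [
  // Count each (package, status) pair directly; no intermediate array needed.
  {
    $group: {
      _id: { package: '$CIC package', status: '$Status' },
      count: { $sum: 1 }
    }
  },
  // One document per package, with one single-key object per status,
  // e.g. an object keyed by the status name holding its count.
  {
    $group: {
      _id: '$_id.package',
      package: { $first: '$_id.package' },
      status: {
        $push: { $arrayToObject: [[{ k: '$_id.status', v: '$count' }]] }
      }
    }
  }
];

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== 'undefined') {
  printjson(db.reportData.aggregate(countPipeline).toArray());
}
```

Because both the package and the status come from the data, new packages or statuses would show up without changing the pipeline. The order of the entries inside status is not guaranteed unless a $sort stage is added between the two $group stages.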
$group. Start Date is not used, I will remove that. I just wanted to show that the original data is a bigger object, but none of these fields matter for the query/question. I will do my best to clear it up.