I have some documents:
{name: 'apple', type: 'fruit', color: 'red'}
{name: 'banana', type: 'fruit', color: 'yellow'}
{name: 'orange', type: 'fruit', color: 'orange'}
{name: 'eggplant', type: 'vege', color: 'purple'}
{name: 'brocoli', type: 'vege', color: 'green'}
{name: 'rose', type: 'flower', color: 'red'}
{name: 'cauli', type: 'vege', color: 'white'}
{name: 'potato', type: 'vege', color: 'brown'}
{name: 'onion', type: 'vege', color: 'white'}
{name: 'strawberry', type: 'fruit', color: 'red'}
{name: 'cashew', type: 'nut', color: ''}
{name: 'almond', type: 'nut', color: ''}
{name: 'lemon', type: 'vege', color: 'yellow'}
{name: 'tomato', type: 'vege', color: 'red'}
{name: 'tomato', type: 'fruit', color: 'red'}
{name: 'fig', type: 'fruit', color: 'pink'}
{name: 'nectarin', type: 'fruit', color: 'pink'}
I want to group them by first letter, like below:
{
_id:'a',
name:['apple','almond'],
type:[],
color:[]
}
{
_id:'b',
name:['banana','brocoli'],
type:[],
color:['brown']
}
...
{
_id:'f',
name:['fig'],
type:['fruit','flower'],
color:[]
}
...
{
_id:'n',
name:['nectarin'],
type:['nut'],
color:[]
}
...
{
_id:'p',
name:['potato'],
type:[],
color:['pink','purple']
}
...
The result can be saved into another collection, so that I can query the newly created collection with find({_id:'a'}) to return every name, type and color that begins with the letter 'a'.
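To make the intended result concrete, here is a minimal sketch in plain JavaScript (not an aggregation pipeline) of the grouping I am after. The function name groupByFirstLetter and the small sample array are hypothetical, and I am assuming that empty strings such as color: '' should simply be skipped.

```javascript
// Hypothetical helper: builds one summary document per first letter,
// collecting the distinct name, type and color values that start with it.
function groupByFirstLetter(docs) {
  const fields = ['name', 'type', 'color'];
  const byLetter = new Map();
  for (const doc of docs) {
    for (const field of fields) {
      const value = doc[field];
      if (!value) continue; // assumption: skip '' and missing values
      const letter = value[0];
      if (!byLetter.has(letter)) {
        byLetter.set(letter, { _id: letter, name: [], type: [], color: [] });
      }
      const bucket = byLetter.get(letter)[field];
      if (!bucket.includes(value)) bucket.push(value); // set-like, as with $addToSet
    }
  }
  // Sort by letter so the output order is predictable.
  return [...byLetter.values()].sort((a, b) => a._id.localeCompare(b._id));
}

// A few of the documents from above.
const sample = [
  { name: 'apple', type: 'fruit', color: 'red' },
  { name: 'fig', type: 'fruit', color: 'pink' },
  { name: 'rose', type: 'flower', color: 'red' },
  { name: 'potato', type: 'vege', color: 'brown' },
];

const summary = groupByFirstLetter(sample);
console.log(JSON.stringify(summary.find((d) => d._id === 'f')));
// prints {"_id":"f","name":["fig"],"type":["fruit","flower"],"color":[]}
```

This only describes the shape of the output; what I am looking for is a way to get the same result from MongoDB itself.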
I have thought about using $group:
$group: {
_id: {$substr: ['$name', 0, 1]},
name: {$addToSet: '$name'}
}
Then another command:
$group: {
_id: {$substr: ['$type', 0, 1]},
type: {$addToSet: '$type'}
}
And:
$group: {
_id: {$substr: ['$color', 0, 1]},
color: {$addToSet: '$color'}
}
But I am stuck on how to combine all three results and save them into a new collection. Or is the aggregation framework not suitable for this kind of data summary?
In a real-world example, e.g. an e-commerce site, the front page displays something like: "currently we have 135636 products under 231 categories from 111 brands". Surely these numbers should be cached somewhere (in memory or in another collection), because running $group on every request is resource intensive? What would be the optimal schema/design for these situations?
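As an illustration of the kind of precomputed summary I mean, here is a rough sketch in plain JavaScript. The product shape ({name, category, brand}), the function name buildStats and the _id value 'frontPageStats' are all made up for the example. The idea is that a single document like this would be stored and read by the front page, rather than recomputed per request.

```javascript
// Hypothetical: computes the front-page counters once, so the result can be
// cached (in memory or as a single document in another collection).
function buildStats(products) {
  return {
    _id: 'frontPageStats', // hypothetical key for the cached document
    products: products.length,
    categories: new Set(products.map((p) => p.category)).size, // distinct categories
    brands: new Set(products.map((p) => p.brand)).size, // distinct brands
  };
}

// Made-up sample data, only to show the shape.
const products = [
  { name: 'phone', category: 'electronics', brand: 'acme' },
  { name: 'laptop', category: 'electronics', brand: 'globex' },
  { name: 'kettle', category: 'kitchen', brand: 'acme' },
];

const stats = buildStats(products);
console.log(
  `currently we have ${stats.products} products under ` +
    `${stats.categories} categories from ${stats.brands} brands`
);
// prints currently we have 3 products under 2 categories from 2 brands
```

Whether such a document should be rebuilt periodically or updated on every write is part of what I am asking.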
Sorry if my questions are a bit confusing.