I need to model a website with users and articles, where each user can interact (read, open, etc.) with any article many times. I want to store this data in a single Elasticsearch index using the following nested mapping:
{
  "mappings": {
    "user": {
      "properties": {
        "user_id": {"type": "string"},
        "interactions": {
          "type": "nested",
          "properties": {
            "article_id": {"type": "string"},
            "interact_date": {"type": "date"}
          }
        }
      }
    }
  }
}
Example of an indexed document:
{
  "user_id": 20,
  "interactions": [
    {"article_id": "111", "interact_date": "2015-01-01"},
    {"article_id": "111", "interact_date": "2015-01-02"},
    {"article_id": "222", "interact_date": "2015-01-01"}
  ]
}
I need to run the following aggregations on the data:
Total number of interactions per day, which I already do with a nested aggregation:
GET /_search
{
  "size": 0,
  "aggs": {
    "by_date": {
      "nested": {"path": "interactions"},
      "aggs": {
        "m_date": {"terms": {"field": "interactions.interact_date"}}
      }
    }
  }
}
Number of unique users interacting per day. If a specific user interacted with several articles on the same date, that user should be counted only once. In Postgres this is a simple query against a table with 3 columns [user_id, article_id, interact_date]:
SELECT dt, count(uid)
FROM (
  SELECT interact_date::TIMESTAMP::DATE dt, user_id uid
  FROM interactions
  GROUP BY interact_date::TIMESTAMP::DATE, user_id
) by_date
GROUP BY dt;
How can I do the same in an Elasticsearch index?
- How can I add interactions via _update without re-indexing the whole document?
- How can I filter users by specific articles, i.e. count a user once in the aggregation by date only if they interacted with one of the specified articles?
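To make the expected results concrete, here is a minimal plain-Python sketch of the counts I am after, computed over the example document. It does not use Elasticsearch at all; the function names and the `article_ids` parameter are only illustrative assumptions, not part of any API.

```python
from collections import Counter, defaultdict

# Example document from the question (one document per user).
docs = [
    {
        "user_id": 20,
        "interactions": [
            {"article_id": "111", "interact_date": "2015-01-01"},
            {"article_id": "111", "interact_date": "2015-01-02"},
            {"article_id": "222", "interact_date": "2015-01-01"},
        ],
    },
]


def total_interactions_per_day(docs):
    """Count every interaction, grouped by day (the first aggregation)."""
    counts = Counter()
    for doc in docs:
        for hit in doc["interactions"]:
            counts[hit["interact_date"][:10]] += 1  # keep the YYYY-MM-DD part
    return dict(sorted(counts.items()))


def unique_users_per_day(docs, article_ids=None):
    """Count each user at most once per day (the second aggregation).

    If article_ids is given (hypothetical filter parameter), only
    interactions with those articles are considered.
    """
    users_by_day = defaultdict(set)
    for doc in docs:
        for hit in doc["interactions"]:
            if article_ids is not None and hit["article_id"] not in article_ids:
                continue
            users_by_day[hit["interact_date"][:10]].add(doc["user_id"])
    return {day: len(users) for day, users in sorted(users_by_day.items())}


print(total_interactions_per_day(docs))
print(unique_users_per_day(docs))
print(unique_users_per_day(docs, article_ids={"222"}))
```

For the example document, user 20 has two interactions on 2015-01-01 but should contribute only one to the unique-user count for that day, and with the filter on article "222" only 2015-01-01 should appear at all.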
Thank you