I want to filter and aggregate data from Elasticsearch. I have tried a date histogram aggregation, but it does not do what I need. My data looks like this:

[
   {
      "id":1,
      "title":"Sample news",
      "date":"2020-09-17",
      "regulation":[
         {
            "id":1,
            "name":"sample name",
            "date":"2020-09-17"
         },
         {
            "id":2,
            "name":"sample name 1",
            "date":"2020-09-18"
         }
      ]
   },
   {
      "id":2,
      "title":"Sample news 1",
      "date":"2020-09-17",
      "regulation":[
         {
            "id":1,
            "name":"sample name",
            "date":"2020-09-18"
         },
         {
            "id":2,
            "name":"sample name 1",
            "date":"2020-09-17"
         }
      ]
   }
]

I want to get the data back in this shape:

year: {
  month: {
    day: {
      news: int,
      regulations: int
    }
  }
}

That is, a per-day count of news and regulations, organised as a date hierarchy. So far I can get data like this:

        "2020-09-17" : {
          "key_as_string" : "2020-09-17",
          "key" : 1600300800000,
          "doc_count" : 1
        },
        "2020-09-18" : {
          "key_as_string" : "2020-09-18",
          "key" : 1600387200000,
          "doc_count" : 0
        },
        "2020-09-19" : {
          "key_as_string" : "2020-09-19",
          "key" : 1600473600000,
          "doc_count" : 0
        },

using

GET /news/_search?size=0
{
  "aggs": {
    "news_over_time": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "day",
        "keyed": true,
        "format": "yyyy-MM-dd"
      }
    }
  }
}

But this does not solve my problem. How can I do this with Elasticsearch and Elasticsearch DSL?

Expected response:

2020: {
  09: {
    17: {
      news: 2,
      regulation: 2
    },
    18: {
      news: 0,
      regulation: 2
    }
  }
}

  • Could you make this clearer? What final response do you expect for the example above? The number of news items on a specific day? Commented Oct 7, 2020 at 10:40
  • Is 'regulation' a nested object or a multi-value field? Can you please share the index mapping? Does the regulation date also need to be taken into account, or only the news date? Commented Oct 7, 2020 at 11:08
  • @CoderL I have updated my expected response. Please have a look. Commented Oct 7, 2020 at 11:31
  • @SahilGupta regulation is a nested object, and the regulation date is also needed, to count regulations for a specific day. Commented Oct 7, 2020 at 11:34

2 Answers

2

I'm not sure what your expected response is, but if you want the number of news items for each day, this is the request you are looking for:

GET /news/_search?size=0
{
  "aggs": {
    "news_over_time": {
      "date_histogram": {
        "field": "regulation.date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      }
    }
  }
}

1

The news date and the regulation date are two different fields: one belongs to the parent document and the other to a nested document. I am not completely sure that what you are asking for can be done directly (I am still exploring this myself). However, the query below should work for you:

GET news/_search
{
  "size": 0,
  "aggs": {
    "news_over_time": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "day",
        "keyed": true,
        "format": "yyyy-MM-dd"
      }
    },
    "regulations_over_time": {
      "nested": {
        "path": "regulation"
      },
      "aggs": {
        "regulation": {
          "date_histogram": {
            "field": "regulation.date",
            "calendar_interval": "day",
            "keyed": true,
            "format": "yyyy-MM-dd"
          }
        }
      }
    }
  }
}
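
For anyone working from Python, the same request body can be assembled as a plain dictionary. This is only a sketch: the function name `build_daily_counts_body` and its parameters are made up for illustration, the field names are taken from the sample data in the question, and the format string uses the four-letter year pattern `yyyy`.

```python
import json


def build_daily_counts_body(news_date_field="date",
                            nested_path="regulation",
                            nested_date_field="regulation.date"):
    """Build a search body with one daily histogram per date field.

    Hypothetical helper: the aggregation names mirror the query above,
    and the defaults assume the document layout shown in the question.
    """

    def daily_histogram(field):
        # keyed=True makes Elasticsearch return buckets as an object
        # keyed by the formatted date instead of as a list.
        return {
            "date_histogram": {
                "field": field,
                "calendar_interval": "day",
                "keyed": True,
                "format": "yyyy-MM-dd",
            }
        }

    return {
        "size": 0,  # only aggregations are wanted, no hits
        "aggs": {
            "news_over_time": daily_histogram(news_date_field),
            "regulations_over_time": {
                # The nested wrapper is needed to reach fields of nested documents.
                "nested": {"path": nested_path},
                "aggs": {"regulation": daily_histogram(nested_date_field)},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_daily_counts_body(), indent=2))
```

The resulting dictionary should be usable as the search body with the Python client, or be loaded into elasticsearch-dsl via `Search.from_dict`; neither call is shown here because both need a running cluster.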

The query returns results in the following form:

  "aggregations" : {
    "regulations_over_time" : { // <=== regulations over time, based on the regulation date
      "doc_count" : 9,
      "regulation" : {
        "buckets" : {
          "2020-09-17" : {
            "key_as_string" : "2020-09-17",
            "key" : 1600300800000,
            "doc_count" : 5
          },
          "2020-09-18" : {
            "key_as_string" : "2020-09-18",
            "key" : 1600387200000,
            "doc_count" : 4
          }
        }
      }
    },
    "news_over_time" : { // <=== news over time, based on the news date
      "buckets" : {
        "2020-09-17" : {
          "key_as_string" : "2020-09-17",
          "key" : 1600300800000,
          "doc_count" : 2
        },
        "2020-09-18" : {
          "key_as_string" : "2020-09-18",
          "key" : 1600387200000,
          "doc_count" : 2
        }
      }
    }
  }
}

You can then merge these two sets of stats together if required.
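
One way to do that merge on the client side is sketched below in Python. It assumes the response was produced with `"keyed": true`, so that each bucket key looks like `2020-09-17` and splits cleanly into year, month and day. The function name `merge_daily_counts` and the output keys `news` and `regulations` are illustrative choices, not part of any Elasticsearch API.

```python
from collections import defaultdict


def merge_daily_counts(aggregations):
    """Combine keyed news and regulation buckets into year -> month -> day.

    `aggregations` is the "aggregations" object of the search response,
    using the aggregation names from the query above.
    """
    news = aggregations["news_over_time"]["buckets"]
    regulations = aggregations["regulations_over_time"]["regulation"]["buckets"]

    tree = defaultdict(lambda: defaultdict(dict))
    # Take the union of days so a day that appears in only one
    # histogram still shows up, with a count of 0 for the other.
    for day_key in sorted(set(news) | set(regulations)):
        year, month, day = day_key.split("-")
        tree[year][month][day] = {
            "news": news.get(day_key, {}).get("doc_count", 0),
            "regulations": regulations.get(day_key, {}).get("doc_count", 0),
        }
    # Convert the nested defaultdicts to plain dicts for clean serialisation.
    return {year: dict(months) for year, months in tree.items()}


if __name__ == "__main__":
    import json

    # Counts copied from the sample response above.
    sample = {
        "news_over_time": {"buckets": {
            "2020-09-17": {"doc_count": 2},
            "2020-09-18": {"doc_count": 2},
        }},
        "regulations_over_time": {"doc_count": 9, "regulation": {"buckets": {
            "2020-09-17": {"doc_count": 5},
            "2020-09-18": {"doc_count": 4},
        }}},
    }
    print(json.dumps(merge_daily_counts(sample), indent=2))
```

This keeps both aggregations in a single request and does the reshaping in application code, which sidesteps the problem that the parent date and the nested date live in different document scopes.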

4 Comments

It's partially working. The results come back side by side. Thank you.
I am not completely sure whether what you are asking for is feasible. Please do accept the answer if it works for your case.
I can't accept the answer because it's not the answer to what I'm asking.
Sure, thanks. Please do let us know here if you find the answer you are looking for.
