Here is an aggregation query that works as expected when I run it in the Dev Tools console on Elasticsearch:

   search_query = {
      "aggs": {
        "SHAID": {
          "terms": {
            "field": "identiferid",
            "order": {
              "sort": "desc"
            },
    #         "size": 100000
          },
          "aggs": {
            "update": {
              "date_histogram": {
                "field": "endTime",
                "calendar_interval": "1d"
              },
              "aggs": {
                "update1": {
                      "sum": {
                        "script": {
                          "lang": "painless",
                          "source":"""
                              if (doc['distanceIndex.att'].size()!=0) { 
                                  return doc['distanceIndex.att'].value;
                              } 
                              else { 
                                  if (doc['distanceIndex.att2'].size()!=0) { 
                                  return doc['distanceIndex.att2'].value;
                              }
                              return null;
                              }
                              """
                        }
                      }
                    },
                "update2": {
                         "sum": {
                        "script": {
                          "lang": "painless",
                          "source":"""
                              if (doc['distanceIndex.att3'].size()!=0) { 
                                  return doc['distanceIndex.att3'].value;
                              } 
                              else { 
                              if (doc['distanceIndex.att4'].size()!=0) { 
                                  return doc['distanceIndex.att4'].value;
                              }
                              return null;
                              }
                              """
                        }
                      }
                  },
              }
            },
            "sort": {
              "sum": {
                "field": "time2"
              }
            }
          }
        }
      },
    "size": 0,
      "query": {
        "bool": {
          "filter": [
            {
              "match_all": {}
            },
            {
              "range": {
                "endTime": {
                  "gte": "2021-11-01T00:00:00Z",
                  "lt": "2021-11-03T00:00:00Z"
                }
              }
            }
          ]
        }
      }
    }

When I attempt to execute this aggregation using the Python Elasticsearch client (https://elasticsearch-py.readthedocs.io/en/v7.15.1/), I receive the exception:

exception search() got multiple values for keyword argument 'size'

If I remove the attribute

"size": 0,

from the query, the exception is no longer thrown, but the aggregation does not run, as "size": 0 is required for an aggregation.

Is there a different query format I should use for performing aggregations with the Python Elasticsearch client?

Update :

Here is code used to invoke the query :

import elasticsearch
from elasticsearch import Elasticsearch, helpers

es_client = Elasticsearch(
    ["https://test-elastic.com"],
    scheme="https",
    port=443,
    http_auth=("test-user", "test-password"),
    maxsize=400,
    timeout=120,
    max_retries=10,
    retry_on_timeout=True
)

query_response = helpers.scan(client=es_client,
                                     query=search_query,
                                     index="test_index",
                                     clear_scroll=False,
                                     request_timeout=1500)

rows = []
try:
    for row in query_response:
        rows.append(row)
except Exception as e:
    print('exception' , e)
        

Using es_client :

es_client.search(index="test_index", query=search_query)

results in error :

/opt/oss/conda3/lib/python3.7/site-packages/elasticsearch/connection/base.py in _raise_error(self, status_code, raw_data)
    336 
    337         raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
--> 338             status_code, error_message, additional_info
    339         )
    340 

RequestError: RequestError(400, 'parsing_exception', 'unknown query [aggs]')

Is aggs valid for the search API?

  • Please add the actual call made in Python. Commented Nov 7, 2021 at 13:35
  • @JasonS please see question update. Commented Nov 7, 2021 at 14:05

1 Answer

helpers.scan is a

Simple abstraction on top of the scroll() api - a simple iterator that yields all hits as returned by underlining scroll requests.

It is meant for iterating through large result sets and comes with a default keyword argument of size=1000, which collides with the "size": 0 in your query.

To run an aggregation, call es_client.search() directly and pass the whole request as body (not as query, which only sets the query key within the body). Including "size": 0 in that body is fine.
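As a minimal sketch of the call described above: the helper name run_aggregation is hypothetical, and es_client, "test_index" and search_query are the objects from the question. It passes the complete request (query, aggs and size) as body and reads the result from the "aggregations" key of the response rather than from the hits.

```python
def run_aggregation(es_client, index, search_query):
    """Run an aggregation by sending the whole request as the search body.

    `search_query` is the full request dict from the question, i.e. it
    contains the top-level keys "aggs", "size" and "query".
    """
    # body= carries the complete request, so "aggs" and "size": 0 are
    # sent to Elasticsearch unchanged.
    response = es_client.search(index=index, body=search_query)
    # Aggregation results are returned under "aggregations", not "hits".
    return response["aggregations"]
```

With the objects from the question, the buckets of the top-level terms aggregation would then be available as run_aggregation(es_client, "test_index", search_query)["SHAID"]["buckets"].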


4 Comments

Thanks, please see the update; the search client does not accept 'aggs' as is.
The argument is body; query only sets the query key within the body. Updated.
An example would be useful.
I have been using body. I now get a Warning saying that body is going to be deprecated. I want to move to using query. But I am unable to do so because of this error. Are there any updates on how to do aggregates without using body?
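One possible way to avoid body, sketched under the assumption that the installed client is version 7.15 or later, where search() accepts the top-level request keys (such as query, aggs and size) as keyword arguments: unpack the request dict instead of passing it as a single argument. The helper name run_aggregation_without_body is hypothetical.

```python
def run_aggregation_without_body(es_client, index, search_query):
    """Run an aggregation without the body= parameter.

    Assumes an elasticsearch-py client (7.15+) whose search() accepts
    "query", "aggs" and "size" as separate keyword arguments.
    """
    # Each top-level key of the request becomes its own keyword argument,
    # e.g. search(index=..., aggs={...}, size=0, query={...}).
    response = es_client.search(index=index, **search_query)
    return response["aggregations"]
```

This also shows why query=search_query fails with 'unknown query [aggs]': the whole request dict is then treated as the query clause, and aggs is not a query type.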
