I'm trying to find the minimum (smallest) value in a 2-level nesting (separate minimum value per document).

So far I've been able to build an aggregation that computes the min value across all the nested values in my search results, but not separately per document.

My example schema:

class MyExample(DocType):
    myexample_id = Integer()
    nested1 = Nested(
        properties={
            'timestamp': Date(),
            'foo': Nested(
                properties={
                    'bar': Float(),
                }
            )
        }
    )
    nested2 = Nested(
        multi=False,
        properties={
            'x': String(),
            'y': String(),
        }
    )

And this is how I'm searching and aggregating:

from elasticsearch_dsl import Search, Q

search = Search().filter(
    'nested', path='nested1', inner_hits={},
    query=Q(
        'range', **{
            'nested1.timestamp': {
                'gte': exampleDate1,
                'lte': exampleDate2
            }
        }
    )
).filter(
    'nested', path='nested2', inner_hits={'name': 'x'},
    query=Q(
        'term', **{
            'nested2.x': x
        }
    )
).filter(
    'nested', path='nested2', inner_hits={'name': 'y'},
    query=Q(
        'term', **{
            'nested2.y': y
        }
    )
)

search.aggs.bucket(
    'nested1', 'nested', path='nested1'
).bucket(
    'nested_foo', 'nested', path='nested1.foo'
).metric(
    'min_bar', 'min', field='nested1.foo.bar'
)

Basically, what I need is the min of all the nested nested1.foo.bar values for each unique MyExample (each one has a unique myexample_id field).

1 Answer

If you want the minimum value per document, put all the nested buckets inside a terms bucket aggregation on the myexample_id field:

search.aggs.bucket(
  'docs', 'terms', field='myexample_id'
).bucket(
  'nested1', 'nested', path='nested1'
).bucket(
  'nested_foo', 'nested', path='nested1.foo'
).metric(
  'min_bar', 'min', field='nested1.foo.bar'
)
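
To read the per-document minimums back out, the result has to be walked bucket by bucket. The following is a minimal sketch, assuming the aggregation response has been converted to a plain dict (for example via something like search.execute().aggregations.to_dict(), which is an assumption about your client version). The helper name min_bar_per_document and the sample data are hypothetical; only the aggregation names docs, nested1, nested_foo and min_bar come from the snippet above.

```python
from typing import Any, Dict, Optional


def min_bar_per_document(aggregations: Dict[str, Any]) -> Dict[Any, Optional[float]]:
    """Map each terms-bucket key (a myexample_id) to its min_bar value."""
    result: Dict[Any, Optional[float]] = {}
    for bucket in aggregations["docs"]["buckets"]:
        # Descend through the two nested aggregations down to the min metric.
        min_metric = bucket["nested1"]["nested_foo"]["min_bar"]
        # The value is None when a document has no nested1.foo.bar values.
        result[bucket["key"]] = min_metric["value"]
    return result


# Hand-written sample (not real output) in the shape of the aggregation above.
sample_aggregations = {
    "docs": {
        "buckets": [
            {
                "key": 1,
                "doc_count": 1,
                "nested1": {
                    "doc_count": 2,
                    "nested_foo": {"doc_count": 3, "min_bar": {"value": 0.5}},
                },
            },
            {
                "key": 2,
                "doc_count": 1,
                "nested1": {
                    "doc_count": 1,
                    "nested_foo": {"doc_count": 0, "min_bar": {"value": None}},
                },
            },
        ]
    }
}

minimums = min_bar_per_document(sample_aggregations)
```

Keep in mind that a terms aggregation returns only the top 10 buckets by default, so a size parameter on the docs aggregation is probably needed to cover more documents.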

Note that this aggregation might be extremely expensive to compute, since it has to create a bucket for each document. For a use case like this, it might be easier to compute the minimum on a per-document basis, either as a script_field or in the app.
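
The "in the app" option could look roughly like the sketch below. It assumes each hit's source is available as a plain dict in which nested1 and foo are lists of objects, matching the schema in the question; the function names min_bar_for_source and min_bar_by_id and the sample data are hypothetical.

```python
from typing import Any, Dict, Iterable, Optional


def min_bar_for_source(source: Dict[str, Any]) -> Optional[float]:
    """Return the smallest nested1.foo.bar value in one document source."""
    values = [
        foo["bar"]
        for entry in source.get("nested1", [])
        for foo in entry.get("foo", [])
        if foo.get("bar") is not None
    ]
    # A document without any bar values has no minimum.
    return min(values) if values else None


def min_bar_by_id(sources: Iterable[Dict[str, Any]]) -> Dict[Any, Optional[float]]:
    """Map each myexample_id to the minimum bar value of that document."""
    return {src["myexample_id"]: min_bar_for_source(src) for src in sources}


# Hand-written sample sources (not real data) following the question's schema.
sample_sources = [
    {
        "myexample_id": 1,
        "nested1": [
            {"timestamp": "2020-01-01", "foo": [{"bar": 3.0}, {"bar": 1.5}]},
            {"timestamp": "2020-01-02", "foo": [{"bar": 2.0}]},
        ],
    },
    {"myexample_id": 2, "nested1": []},
]

per_document_minimums = min_bar_by_id(sample_sources)
```

Like the aggregation as written, this looks at every nested1 entry of a matching document, not only the entries that satisfied the timestamp filter.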

1 Comment

I was even going to suggest that this minimum value should be figured out at indexing time and stored at the root level of the document. It would make the whole thing a lot more performant than several levels of aggs on nested documents or using scripting.
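
A rough sketch of that indexing-time approach, assuming documents are built as plain dicts before being sent to Elasticsearch. The root-level field name min_bar and the helper with_root_min_bar are hypothetical, and the mapping would need a matching numeric field added.

```python
from typing import Any, Dict


def with_root_min_bar(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc with the minimum nested1.foo.bar stored at the root."""
    bars = [
        foo["bar"]
        for entry in doc.get("nested1", [])
        for foo in entry.get("foo", [])
        if foo.get("bar") is not None
    ]
    enriched = dict(doc)  # shallow copy, so the caller's dict is left untouched
    # Hypothetical root-level field; None when the document has no bar values.
    enriched["min_bar"] = min(bars) if bars else None
    return enriched


# Hand-written sample document (not real data).
raw_doc = {
    "myexample_id": 7,
    "nested1": [
        {"timestamp": "2020-01-01", "foo": [{"bar": 4.0}, {"bar": 0.25}]},
    ],
}

doc_to_index = with_root_min_bar(raw_doc)
```

With the value stored this way, a query can read it directly from each hit instead of running nested aggregations.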
