I want to implement an aggregation that returns only the values whose document count is above a certain threshold.

For instance, here is the aggregation that returns every value with its count:

AggregationBuilder aggregation = AggregationBuilders
                .terms("agg").field("column_name");

This gives me the document count for each value in column_name:

[{"doc_count":30,"key":"val1"},{"doc_count":29,"key":"val2"},{"doc_count":23,"key":"val3"}]

Now, let's say I don't want all of these buckets. I only want those that have a doc_count greater than 25.

So the ideal result would be:

[{"doc_count":30,"key":"val1"},{"doc_count":29,"key":"val2"}]

How do I apply such a filter to my aggregation? I was looking at FilterBuilders and filter aggregations, but they apply filters to values within the documents. For instance, I can apply a filter to get only the documents where column_name == val1.

But that is not what I am looking for. I want to apply a threshold to the doc_count values after the aggregation has been computed.

Is this possible? I am using the Elasticsearch Java API, version 1.7.2.

1 Answer

The terms aggregation has a built-in option called min_doc_count; see the Elasticsearch documentation for the terms aggregation. I haven't used the Java API, but the example I found appears to set it with .minDocCount() on the terms builder (search that page for 'minDocCount').
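
To make the threshold semantics concrete, here is a minimal plain-Java sketch that applies the same rule to the bucket counts from the question. It does not call Elasticsearch; the class and method names (MinDocCountSketch, filterByMinDocCount) are hypothetical and exist only for illustration. Note that min_doc_count is inclusive (a bucket is kept when its count is at least that value), so "greater than 25" corresponds to a threshold of 26. The server-side call shown in the comment is an assumption based on the answer above and has not been run against 1.7.2.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that mimics min_doc_count on already-fetched buckets.
// Assumed server-side equivalent (needs the Elasticsearch 1.x client, not run here):
//   AggregationBuilders.terms("agg").field("column_name").minDocCount(26);
class MinDocCountSketch {

    // Keeps only the buckets whose count is at least minDocCount (inclusive),
    // preserving the original bucket order.
    static Map<String, Long> filterByMinDocCount(Map<String, Long> buckets, long minDocCount) {
        Map<String, Long> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Long> bucket : buckets.entrySet()) {
            if (bucket.getValue() >= minDocCount) {
                kept.put(bucket.getKey(), bucket.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Bucket counts taken from the question.
        Map<String, Long> buckets = new LinkedHashMap<>();
        buckets.put("val1", 30L);
        buckets.put("val2", 29L);
        buckets.put("val3", 23L);

        // "Greater than 25" means an inclusive threshold of 26.
        System.out.println(filterByMinDocCount(buckets, 26)); // prints {val1=30, val2=29}
    }
}
```

Filtering on the server with min_doc_count is generally preferable, since the dropped buckets never leave the cluster; the client-side loop is only a fallback and a way to check the expected result.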
