I am querying Elasticsearch using the Java API and getting a lot of duplicate values. I want only the unique (distinct) values from the query. How can I get distinct values using the query builder?

Here is my Java code, which returns duplicate values:

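QueryBuilder qb2 = null;
// Collect the link IDs to match against the "id" field.
List<Integer> link_id_array = new ArrayList<Integer>();
for (Replacement link_id : linkIDList) {
    link_id_array.add(link_id.getLink_id());
}

// Match documents whose "id" is any of the collected link IDs.
qb2 = QueryBuilders.boolQuery()
        .must(QueryBuilders.termsQuery("id", link_id_array));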

I am using Elasticsearch 6.2.3 with RestHighLevelClient.

1 Answer

Way 1: Use the aggregations API with a terms aggregation. Each bucket key in the response is one distinct value of the field.

Sample query to get the distinct email clients:

{
  "query" : {
    "match_all" : { }
  },
  "aggregations" : {
    "label_agg" : {
      "terms" : {
        "field" : "Email_client",
        "size" : 100
      }
    }
  }
}

Java code sample:

// Note: prepareSearch() belongs to the transport Client, not RestHighLevelClient.
SearchRequestBuilder aggregationQuery =
    client.prepareSearch("emails")
        .setQuery(QueryBuilders.matchAllQuery())
        .addAggregation(AggregationBuilders.terms("label_agg")
            .field("Email_client").size(100));

SearchResponse response = aggregationQuery.execute().get();
// Each bucket key is one distinct value of Email_client.
// The Terms interface avoids a cast that only works for string fields.
Terms terms = response.getAggregations().get("label_agg");
return terms.getBuckets().stream()
    .map(bucket -> bucket.getKeyAsString())
    .collect(Collectors.toList());
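As a rough plain-Java analogy of what the terms aggregation computes (this is only an illustration and makes no Elasticsearch calls): equal values are grouped into buckets, and the bucket keys are the distinct values. The class name `TermsBucketsAnalogy`, the method `countByValue`, and the sample data are made up for this sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class TermsBucketsAnalogy {

    // Groups equal values and counts them.
    // Key = distinct value (like a bucket key), value = occurrences (like doc_count).
    // A LinkedHashMap keeps first-seen order; Elasticsearch itself orders buckets
    // by document count by default, which this sketch does not reproduce.
    public static Map<String, Long> countByValue(List<String> values) {
        return values.stream()
            .collect(Collectors.groupingBy(
                Function.identity(), LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Hypothetical field values, one per document.
        List<String> emailClients =
            Arrays.asList("gmail", "outlook", "gmail", "yahoo", "gmail");

        Map<String, Long> buckets = countByValue(emailClients);
        System.out.println(buckets.keySet()); // prints [gmail, outlook, yahoo]
        System.out.println(buckets);          // prints {gmail=3, outlook=1, yahoo=1}
    }
}
```

In the real aggregation, the "size" parameter caps how many buckets come back, so a field with more than 100 distinct values would need a larger size.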

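Way 2: Use the cardinality aggregation. Note that it returns only an approximate count of the distinct values, not the values themselves. Sample Elasticsearch query: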

{
  "size": 0,
  "aggs": {
    "distinct": {
      "cardinality": {
        "field": "Email_client"
      }
    }
  }
}

Java code sample:

// query11 is not defined in this snippet; it is assumed to be a QueryBuilder built elsewhere.
AggregationBuilder agg11 = AggregationBuilders.cardinality("distinct").field("Email_client");
SearchResponse response11 = client.prepareSearch("emails") // multiple index names can be given here
        .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
        .setQuery(query11)
        .addAggregation(agg11)
        .setExplain(true)
        .setSize(0) // return no hits, only the aggregation result
        .get();
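If an aggregation is not an option, one fallback is to de-duplicate on the client after fetching the hits. The following is a minimal sketch, not the answer's method: it assumes each hit's source has already been read into a Map (the shape that SearchHit.getSourceAsMap() returns), and the class name `DistinctHitValues`, the method `distinctFieldValues`, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DistinctHitValues {

    // Returns the distinct values of `field` in first-seen order.
    // Each map stands in for one hit's _source; hits without the field are skipped.
    public static List<String> distinctFieldValues(
            List<Map<String, Object>> sources, String field) {
        Set<String> seen = new LinkedHashSet<>();
        for (Map<String, Object> source : sources) {
            Object value = source.get(field);
            if (value != null) {
                seen.add(value.toString());
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Hypothetical hit sources containing a repeated value.
        List<Map<String, Object>> sources = Arrays.asList(
            Collections.<String, Object>singletonMap("Email_client", "gmail"),
            Collections.<String, Object>singletonMap("Email_client", "outlook"),
            Collections.<String, Object>singletonMap("Email_client", "gmail"));

        System.out.println(distinctFieldValues(sources, "Email_client"));
        // prints [gmail, outlook]
    }
}
```

This only covers the hits that were actually fetched, so it suits small result sets; for a large index, the terms aggregation does the same work on the server.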

3 Comments

Way 1 throws a search_phase_execution_exception: "Invalid term-aggregator order path [_key]. Unknown aggregation [_key]"
@rogger2016 I am facing the same error: {"type":"aggregation_execution_exception","reason":"Invalid term-aggregator order path [_key]. Unknown aggregation [_key]"}. Did you find a solution?
The second approach returns duplicate values. Neither solution works for me.
