How can the Elasticsearch query shown below be expressed in Spark with Scala?

Request

GET importsmethods/typeimportsmethods/_search?search_type=count
{
  "size": 0,
  "aggs": {
    "group_by_imports": {
      "terms": {
        "field": "tokens.importName"
      }
    }
  }

}

Response

{
   "took": 2064,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 1297362,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "group_by_imports": {
         "doc_count_error_upper_bound": 4939,
         "sum_other_doc_count": 1960640,
         "buckets": [
            {
               "key": "java.util.list",
               "doc_count": 129986
            },
            {
               "key": "java.util.map",
               "doc_count": 103525
            }
         ]
      }
   }
}

Spark Code

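import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // adds esRDD() to SparkContext

val conf = new SparkConf().setMaster("local[2]").setAppName("test")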

conf.set("es.nodes", "localhost")
conf.set("es.port", "9200")
conf.set("es.index.auto.create","true")
conf.set("es.resource","importsmethods/typeimportsmethods/_search")
conf.set("es.query","""?search_type=count&ignore_unavailable=true {
  "size": 0,
     "aggs": {
       "group_by_imports": {
         "terms": {
           "field": "tokens.importName"
         }
       }
     }
}""")

val sc = new SparkContext(conf)
val importMethodsRDD = sc.esRDD()
// Keep only the document body; the first tuple element is the document id
val rddVal = importMethodsRDD.map(x => x._2)

rddVal.saveAsTextFile("../")

Exception

Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Index [importsmethods/typeimportsmethods/_search] missing and settings [es.field.read.empty.as.null] is set to false

1 Answer

You only need to fix one line: es.resource should contain just index/type, without the _search endpoint.

conf.set("es.resource","importsmethods/typeimportsmethods")

Also, es.query does not need the query-string parameters, only the query DSL body:

conf.set("es.query","""{
  "size": 0,
     "aggs": {
       "group_by_imports": {
         "terms": {
           "field": "tokens.importName"
         }
       }
     }
}""")

2 Comments

I am getting this exception: ElasticsearchIllegalArgumentException[aggregations are not supported with search_type=scan]}]
Actually, aggregations are not supported by elasticsearch-hadoop releases before 2.2. This feature is slated to be available in 2.2.0-rc1 (current estimate).
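
Until aggregations can be pushed down, one possible workaround is to read the documents without the "aggs" part and do the grouping on the Spark side. The block below is only a sketch of that grouping logic on plain Scala collections. It assumes each document has a "tokens" field holding one object or a list of objects with an "importName" string, which is a guess based on the field path tokens.importName. The names ImportCounts, importNames and countByImport are made up for illustration. Names are lower-cased and de-duplicated per document to roughly imitate a terms aggregation on an analyzed field, so the counts may still differ from what Elasticsearch reports.

```scala
import scala.collection.{Map => AnyMap}

// Sketch only: the document shape ("tokens" -> objects with "importName")
// is an assumption, not something confirmed by the index mapping.
object ImportCounts {

  // Extract lower-cased, de-duplicated import names from a list of token objects.
  private def namesOf(tokens: Iterable[Any]): Seq[String] =
    tokens.collect { case t: AnyMap[_, _] =>
      t.asInstanceOf[AnyMap[String, Any]].get("importName")
    }.flatten.map(_.toString.toLowerCase).toSeq.distinct

  // Import names of one document. Distinct per document, because a terms
  // aggregation counts documents rather than occurrences.
  def importNames(doc: AnyMap[String, Any]): Seq[String] =
    doc.get("tokens") match {
      case Some(single: AnyMap[_, _]) => namesOf(Seq(single)) // one object
      case Some(many: Iterable[_])    => namesOf(many)        // list of objects
      case _                          => Seq.empty            // field missing
    }

  // Local equivalent of the grouping: (importName, docCount), largest first.
  def countByImport(docs: Iterable[AnyMap[String, Any]]): Seq[(String, Long)] =
    docs.toSeq
      .flatMap(importNames)
      .groupBy(identity)
      .map { case (name, hits) => (name, hits.size.toLong) }
      .toSeq
      .sortBy { case (name, count) => (-count, name) }
}
```

On the RDD from the question, the same idea would presumably look like rddVal.flatMap(ImportCounts.importNames).map((_, 1L)).reduceByKey(_ + _), with es.query left unset or limited to a plain query. Older Spark versions may also need import org.apache.spark.SparkContext._ for reduceByKey.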
