
I'm trying to launch Spark jobs that use Elasticsearch input from the command line with spark-submit, as described in http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/spark.html

I'm setting the properties in a file, but when I launch spark-submit it prints the following warnings:

~/spark-1.0.1-bin-hadoop1/bin/spark-submit --class Main --properties-file spark.conf SparkES.jar

Warning: Ignoring non-spark config property: es.resource=myresource
Warning: Ignoring non-spark config property: es.nodes=mynode
Warning: Ignoring non-spark config property: es.query=myquery
...
Exception in thread "main" org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed

My config file looks like this (with the real values in place):

es.nodes      nodeip:port
es.resource   index/type
es.query      query

Setting the properties on the Configuration object in the code works (roughly as sketched below), but I need to avoid that workaround.
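A minimal sketch of that in-code workaround, assuming the SparkConf-based setup from the elasticsearch-hadoop Spark support (the values are placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark.rdd.EsSpark

// Hard-coding the es.* settings on the SparkConf works...
val conf = new SparkConf().setAppName("SparkES")
  .set("es.nodes", "nodeip:port")
  .set("es.resource", "index/type")
  .set("es.query", "query")
val sc = new SparkContext(conf)

// ...and the connector then reads from Elasticsearch with those settings,
// but I want to keep these values out of the code.
val rdd = EsSpark.esRDD(sc, conf.get("es.resource"), conf.get("es.query"))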

Is there a way to set those properties via command line?

• Can you add your spark.conf file to the question? (commented Aug 12, 2014 at 20:09)

2 Answers


I don't know if you resolved your issue (if so, how?), but I found this solution:

import org.elasticsearch.spark.rdd.EsSpark

// Pass the es.* settings per call, instead of relying on --properties-file.
EsSpark.saveToEs(rdd, "spark/docs", Map("es.nodes" -> "10.0.5.151"))
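
The question was about reading rather than writing, but the same per-call settings map should apply there too; a sketch, assuming the EsSpark.esRDD overload that takes a settings Map is available in your connector version (the resource and query values are placeholders):

// Read side: pass es.nodes (and the other es.* settings) per call
// instead of relying on --properties-file. sc is your SparkContext.
val rdd = EsSpark.esRDD(sc, "index/type", "myquery", Map("es.nodes" -> "10.0.5.151"))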

Bye


When you pass a properties file to spark-submit, it only loads the properties whose names start with spark. and ignores the rest, which is exactly what the warnings above are complaining about.

So in my config file I simply use

spark.es.nodes <es-ip>

and in the code itself I have to do

val conf = new SparkConf()
// spark-submit keeps spark.es.nodes from the properties file;
// copy it into the es.nodes key that elasticsearch-hadoop expects.
conf.set("es.nodes", conf.get("spark.es.nodes"))
