I am using an Apache Spark DataFrame and I want to upsert data into Elasticsearch. I found I can overwrite it like this:
val df = spark.read.option("header","true").csv("/mnt/data/akc_breed_info.csv")
df.write
.format("org.elasticsearch.spark.sql")
.option("es.nodes.wan.only","true")
.option("es.port","443")
.option("es.net.ssl","true")
.option("es.nodes", esURL)
.option("es.mapping.id", index)
.mode("overwrite")
.save("index/dogs")
But what I've noticed so far is that `mode("overwrite")` actually deletes all the existing data and inserts the new data.

Is there a way to upsert the documents instead of deleting and re-writing them? I need to query this data in near real time. Thanks in advance.
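For context, this is roughly what I was hoping would work. It's only a sketch based on the connector's `es.write.operation` setting, and I'm assuming here that `id` is the name of a column in the DataFrame that uniquely identifies each document (in my real code the `index` variable holds that column name). I haven't been able to confirm this is the right approach:

```scala
// Hypothetical attempt: ask the ES-Hadoop connector to issue upserts
// keyed on a document-id column, rather than truncating the index.
df.write
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes.wan.only", "true")
  .option("es.port", "443")
  .option("es.net.ssl", "true")
  .option("es.nodes", esURL)
  .option("es.mapping.id", "id")           // assumed id column name
  .option("es.write.operation", "upsert")  // update if the id exists, insert otherwise
  .mode("append")                          // append so Spark doesn't truncate first
  .save("index/dogs")
```

If this is valid, I'd also like to know whether upserting this way keeps the index queryable while the write is in progress.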