10

I have a use case in which concurrent update requests may hit my Elasticsearch cluster. To make sure that a stale event (one made irrelevant by a newer request) does not update a document after a newer event has already reached the cluster, I would like to pass a script with my update requests that compares a field to determine whether the incoming request is still relevant. The request would look like this:

curl -XPOST 'localhost:9200/test/type1/1/_update' -d '
{
  "script": " IF ctx._source.user_update_time > my_new_time THEN do not update ELSE proceed with update",
  "params": {
    "my_new_time": "2014-09-01T17:36:17.517"
  },
  "doc": {
    "name": "new_name"
  },
  "doc_as_upsert": true
}'

Is the pseudocode I wrote in the "script" field possible in Elasticsearch? If so, I would love some help with the syntax (Groovy, Python, or JavaScript).

Any alternative approach suggestions would be greatly appreciated too.

4
  • Were you able to find a solution to this? I tried this approach but it didn't work. Commented Nov 4, 2016 at 1:36
  • @animageofmine Did you find a solution? Commented May 30, 2017 at 21:34
  • @Anant Look at my post here: discuss.elastic.co/t/conditional-update-to-the-document/64964/… Commented Jun 13, 2017 at 18:30
  • @animageofmine Thanks! Commented Jun 13, 2017 at 21:32

2 Answers

8

Elasticsearch has built-in optimistic concurrency control.

The way it works is that the Update API allows you to use the version parameter to control whether the update should proceed or not.

Taking your example, the first index/update operation creates a document with version: 1. Now consider two concurrent requests: components A and B both retrieved the document at version: 1, and each specifies that version in its update request (see version=1 in the query string below). Elasticsearch will update the document if and only if the provided version matches the current one.

Component A and B both send this, but A's request is the first to make it:

curl -XPOST 'localhost:9200/test/type1/1/_update?version=1' -d '{
  "doc": {
    "name": "new_name"
   },
  "doc_as_upsert": true
}'

At this point the version of the document will be 2 and B's request will end up with HTTP 409 Conflict, because B assumed the document was still at version 1, even though the version increased in the meantime due to A's request.

B can then retrieve the document with the new version (i.e. 2) and try its update again, this time with ?version=2 in the URL. If it's the first one to reach ES, the update will succeed.
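
The read-then-retry flow described above can be sketched without a cluster. This is only an illustrative in-memory simulation in Python: FakeIndex, VersionConflict, and update_with_retry are hypothetical names invented for this sketch, not part of any Elasticsearch client, and VersionConflict merely stands in for the HTTP 409 response.

```python
class VersionConflict(Exception):
    """Stand-in for the HTTP 409 Conflict response (hypothetical name)."""


class FakeIndex:
    """Hypothetical in-memory stand-in for a versioned document store."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source dict)

    def get(self, doc_id):
        version, source = self._docs[doc_id]
        return version, dict(source)

    def index(self, doc_id, source):
        # The first write creates version 1; later unconditional writes bump it.
        version = self._docs.get(doc_id, (0, {}))[0] + 1
        self._docs[doc_id] = (version, dict(source))
        return version

    def update(self, doc_id, partial_doc, expected_version):
        # Apply the partial doc only if the caller's version is still current.
        version, source = self._docs[doc_id]
        if version != expected_version:
            raise VersionConflict(f"expected {expected_version}, current {version}")
        self._docs[doc_id] = (version + 1, {**source, **partial_doc})
        return version + 1


def update_with_retry(index, doc_id, partial_doc, max_attempts=3):
    """Re-read the current version before each attempt and retry on conflict."""
    for _ in range(max_attempts):
        version, _source = index.get(doc_id)
        try:
            return index.update(doc_id, partial_doc, expected_version=version)
        except VersionConflict:
            continue
    raise VersionConflict(f"gave up after {max_attempts} attempts")
```

In this sketch, component B would call update_with_retry after its first attempt with version 1 is rejected; the helper re-reads the document, sees version 2, and resubmits against that version.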


2 Comments

Thanks for the answer @Val. I definitely considered relying on optimistic concurrency control for this. The mechanism making the updates acts as a middleman, and I would prefer that it didn't have to retrieve a document before updating it. I'd rather it just send an update and have Elasticsearch decide whether the update is relevant or not. This middleman will only know the ID of the document and the data to put in.
@bkahler That is not possible. You will have to first retrieve the doc and then write. All optimistic locks work on this principle.
2

I think the script should be like this (it moves the stored timestamp forward only when the incoming one is newer):

"script": "if (ctx._source.user_update_time < my_new_time) ctx._source.user_update_time = my_new_time;"

or

"script": "ctx._source.user_update_time > my_new_time ? ctx.op=\"none\" : ctx._source.user_update_time=my_new_time"

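The decision these scripts are trying to express can be checked outside Elasticsearch. The following is a minimal Python sketch of that logic only, not an Elasticsearch script: apply_if_newer is a hypothetical helper, the "none" label mirrors the ctx.op value used above, and "update" is just a label chosen for this illustration. It assumes the timestamps are ISO-8601 strings in the format shown in the question.

```python
from datetime import datetime


def apply_if_newer(source, new_fields, my_new_time, time_field="user_update_time"):
    """Return (op, source): skip stale requests, otherwise merge the new fields.

    Hypothetical helper that mimics the intended script logic in plain Python.
    """
    stored = source.get(time_field)
    if stored is not None and datetime.fromisoformat(stored) > datetime.fromisoformat(my_new_time):
        # The stored document is newer than the incoming request: leave it untouched.
        return "none", source
    # The incoming request is newer (or no timestamp is stored): merge and stamp it.
    return "update", {**source, **new_fields, time_field: my_new_time}
```

For example, with a stored user_update_time of 2014-09-01T17:36:17.517, a request stamped 2014-09-01T17:00:00.000 is skipped, while one stamped 2014-09-01T18:00:00.000 is applied and becomes the new stored timestamp.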