27

I am using Elasticsearch as a document database, and each record I create has a GUID that the system uses as the record ID. The business wants to offer a feature that lets users define their own automatic file-naming convention based on the date and the number of records created so far that day or month.

What I need is to prevent duplicate user file names. Is there a way to set up an indexed field to be unique, like a SQL unique constraint?

  • I believe the only unique constraint applies to the _id field. Commented Jan 30, 2014 at 17:07
  • Your question is wrong: Elasticsearch is not a database but a search engine based on Apache Lucene, which does not support such features. Also keep in mind that ES is "near" real time. Commented Oct 23, 2016 at 9:17
  • Document-oriented databases tend not to do this, and Elasticsearch is no different. Take a look at "Elasticsearch as a NoSQL Database". Commented Jun 24, 2022 at 20:50

5 Answers

23

You'd need to use the field that is supposed to be unique as the _id of your documents. By default, indexing a new document with an existing id overwrites the existing document, but you can switch to op_type=create to get an error back if a document with the same id already exists.

There's no way to get the same behaviour with arbitrary fields, though; only the _id field works that way. I would probably handle this logic in the application layer instead of within Elasticsearch.
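As a rough illustration of the op_type=create idea, here is a minimal Python sketch that uses only the standard library. The function names, the base URL, the index name and the typeless `_doc` path (Elasticsearch 7 and later) are assumptions made for this example, not part of the answer above.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def build_create_request(base_url, index, doc_id, doc):
    """Build a PUT that only succeeds if no document with doc_id exists."""
    # Assumption: typeless "_doc" endpoint; older clusters use a type name here.
    url = "{}/{}/_doc/{}?op_type=create".format(
        base_url.rstrip("/"), index, urllib.parse.quote(doc_id, safe="")
    )
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def create_if_absent(base_url, index, doc_id, doc):
    """Return True if the document was created, False if the id was taken."""
    request = build_create_request(base_url, index, doc_id, doc)
    try:
        with urllib.request.urlopen(request):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 409:  # conflict: a document with this _id already exists
            return False
        raise
```

Calling `create_if_absent(...)` with the user's file name as `doc_id` would let the application treat a `False` result as "name already in use" and ask for a different name.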

1 Comment

I would suggest having a separate collection/type that is simply a pointer to the original document. This way your originals still have the UUID, and you can even keep the unique name as a field in the original; the separate type/document acts as a unique index to the original.
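The pointer-document idea from this comment could look roughly like the Python sketch below. `InMemoryIndex` is a hypothetical stand-in for an Elasticsearch index whose `create` method mimics op_type=create, so the example runs without a cluster; `save_record` and the field names are likewise invented for illustration.

```python
import uuid


class InMemoryIndex:
    """Stand-in for an Elasticsearch index; create() mimics op_type=create."""

    def __init__(self):
        self.docs = {}

    def create(self, doc_id, doc):
        if doc_id in self.docs:
            return False  # Elasticsearch would answer with a conflict error
        self.docs[doc_id] = doc
        return True


def save_record(records, name_pointers, file_name, body):
    """Reserve file_name in the pointer index, then store the record.

    Returns the new record's GUID, or None if the file name is taken.
    """
    record_id = str(uuid.uuid4())
    # The pointer's id is the file name itself, so the uniqueness check on
    # ids doubles as a uniqueness check on names.
    if not name_pointers.create(file_name, {"record_id": record_id}):
        return None
    records.create(record_id, dict(body, file_name=file_name))
    return record_id
```

The two writes are not atomic, so an application using this pattern would probably also need to clean up a pointer whose record write failed.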
2

One solution is to use the value of the uniqueId field as the document ID and pass op_type=create when storing documents in ES. This ensures that each uniqueId value is unique and cannot be overwritten by another document with the same value.

The Elasticsearch documentation says:

The index operation also accepts an op_type that can be used to force a create operation, allowing for "put-if-absent" behavior. When create is used, the index operation will fail if a document by that id already exists in the index.

Here is an example of using the op_type parameter:

$ curl -XPUT 'http://localhost:9200/es_index/es_type/unique_a?op_type=create' -H 'Content-Type: application/json' -d '{
    "user" : "kimchy",
    "uniqueId" : "unique_a"
}'

The first time you run this request it succeeds; running it again returns an error (HTTP 409 Conflict) because a document with that ID already exists.

1

You can map the column that needs the unique constraint to _id. Here is a sample river that uses PostgreSQL. You can change the database driver and DB URL to suit your setup.

curl -XPUT localhost:9200/_river/simple_jdbc_river/_meta -d "{\"type\":\"jdbc\",\"jdbc\":{\"strategy\":\"simple\",\"poll\":\"1s\",\"driver\":\"org.postgresql.Driver\",\"url\":\"jdbc:postgresql://DB-URL/DB-INSTANCE\",\"user\":\"USERNAME\",\"password\":\"PASSWORD\",\"sql\":\"select t.id as _id,t.name from topic as t \",\"digesting\" : true},\"index\":{\"index\":\"jdbc\",\"type\":\"topic_jdbc_river1\"}}"

1

As of ES 7.5, there is no mapping-level "constraint" that enforces uniqueness on a custom field.

You can still work around this by using your own application UUID directly as the _id, which is implicitly unique. Note that a plain PUT to _doc replaces an existing document with the same _id; add op_type=create if you want a duplicate to be rejected instead.

PUT <your_index_name>/_doc/<your_app_uuid>
{
  "a_field": "a_value"
}

0

Another approach is to build the value of the field that should be unique around an auto-incrementing integer. That way the field values are unique from the start.

You would put your file name together like this:

<current day/month>_<auto-incremented integer>

Auto-incrementing integers are not supported by Elasticsearch per se, but you could mimic them using this approach. If you happen to use Node.js, you can use the es-sequence module.
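To make the naming scheme concrete, here is a small Python sketch. `InMemorySequence` is a hypothetical, single-process stand-in for whatever shared counter is actually used (it is not the es-sequence API), and the `YYYY-MM-DD` prefix is an assumed date format.

```python
import datetime
import itertools
from collections import defaultdict


class InMemorySequence:
    """Stand-in for a shared counter; one independent sequence per key."""

    def __init__(self):
        self._counters = defaultdict(lambda: itertools.count(1))

    def next(self, key):
        return next(self._counters[key])


def next_file_name(sequence, day):
    """Compose <current day>_<auto-incremented integer> for the given date."""
    prefix = day.strftime("%Y-%m-%d")  # assumed date format
    return "{}_{}".format(prefix, sequence.next(prefix))
```

Each date prefix gets its own counter starting at 1, so names restart every day. With several application instances the counter would have to live in a shared store rather than in process memory.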
