I can't find an answer to this, so I'm posting the question in the hope that someone can help.

In my application, I have an array of ids that comes from the backend already in the order I want, for example: [0] => 23, [1] => 12, [2] => 45, [3] => 21

I then query Elasticsearch for the information corresponding to each id in this array, using a terms filter. The problem is that the results don't come back in the order of the ids I sent, so they get mixed up, for example: [0] => 21, [1] => 45, [2] => 23, [3] => 12

Note that I can't sort in Elasticsearch by the criterion that orders the array in the backend.

I also can't order them in PHP, because I'm retrieving paginated results from Elasticsearch: if each page had 2 results, Elasticsearch could give me the info only for [0] => 21, [1] => 45, so sorting in PHP would not produce the right order either.

How can I get the results ordered by the input array? Any ideas?

Thanks in advance

  • I don't think it is possible to do this in Elasticsearch, which means you need to do it once you get the results back. (Commented Jan 15, 2014 at 19:31)

1 Answer

Here is one way you can do it, with custom scripted scoring.

First I created some dummy data:

curl -XPUT "http://localhost:9200/test_index"

curl -XPOST "http://localhost:9200/test_index/_bulk" -d'
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 1 } }
{ "name" : "Document 1", "id" : 1 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 2 } }
{ "name" : "Document 2", "id" : 2 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 3 } }
{ "name" : "Document 3", "id" : 3 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 4 } }
{ "name" : "Document 4", "id" : 4 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 5 } }
{ "name" : "Document 5", "id" : 5 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 6 } }
{ "name" : "Document 6", "id" : 6 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 7 } }
{ "name" : "Document 7", "id" : 7 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 8 } }
{ "name" : "Document 8", "id" : 8 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 9 } }
{ "name" : "Document 9", "id" : 9 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 10 } }
{ "name" : "Document 10", "id" : 10 }
'

I added an "id" field even though it duplicates "_id", because the "_id" field gets converted to a string and the scripting is easier with integers.

You can get back a specific set of docs by id with the ids filter:

curl -XPOST "http://localhost:9200/test_index/_search" -d'
{
   "filter": {
      "ids": {
         "type": "docs",
         "values": [ 1, 8, 2, 5 ]
      }
   }
}'

However, these will not necessarily come back in the order you want. Using a script-based sort, you can define your own ordering based on document ids.

Here I pass in a parameter that is a list of objects that relate ids to score. The scoring script simply loops through them until it finds the current document id and returns the predetermined score for that document (or 0 if it isn't listed).

curl -XPOST "http://localhost:9200/test_index/_search" -d'
{
   "filter": {
      "ids": {
         "type": "docs",
         "values": [ 1, 8, 2, 5 ]
      }
   },
   "sort" : {
        "_script" : {
            "script" : "for(i:scoring) { if(doc[\"id\"].value == i.id) return i.score; } return 0;",
            "type" : "number",
            "params" : {
                "scoring" : [
                    { "id": 1, "score": 1 },
                    { "id": 8, "score": 2 },
                    { "id": 2, "score": 3 },
                    { "id": 5, "score": 4 }
                ]
            },
            "order" : "asc"
        }
    }
}'
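
In an application, the "scoring" parameter would normally be generated from the ordered id array rather than written by hand. Here is a minimal Python sketch of that step, assuming the same request format as above. The helper name build_ordered_search and its page/page_size arguments are hypothetical; "from" and "size" are Elasticsearch's standard pagination parameters, added here because the question asks about paginated results.

```python
import json


def build_ordered_search(ordered_ids, doc_type="docs", page=0, page_size=10):
    """Build a search body that returns documents in the order of ordered_ids.

    Hypothetical helper: it mirrors the request above and uses the
    1-based position of each id in the input list as its sort value.
    """
    scoring = [
        {"id": doc_id, "score": position}
        for position, doc_id in enumerate(ordered_ids, start=1)
    ]
    return {
        "from": page * page_size,  # offset of the first hit on this page
        "size": page_size,         # number of hits per page
        "filter": {"ids": {"type": doc_type, "values": list(ordered_ids)}},
        "sort": {
            "_script": {
                # Same script as in the curl request above.
                "script": 'for(i:scoring) { if(doc["id"].value == i.id) return i.score; } return 0;',
                "type": "number",
                "params": {"scoring": scoring},
                "order": "asc",
            }
        },
    }


# First page of two results for the ids used in this answer.
body = build_ordered_search([1, 8, 2, 5], page=0, page_size=2)
print(json.dumps(body, indent=2))
```

Because the sort runs inside Elasticsearch, each page should already come back in the requested order, so nothing has to be re-sorted on the client.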

and the documents are returned in the proper order:

{
   "took": 11,
   "timed_out": false,
   "_shards": {
      "total": 2,
      "successful": 2,
      "failed": 0
   },
   "hits": {
      "total": 4,
      "max_score": null,
      "hits": [
         {
            "_index": "test_index",
            "_type": "docs",
            "_id": "1",
            "_score": null,
            "_source": {
               "name": "Document 1",
               "id": 1
            },
            "sort": [
               1
            ]
         },
         {
            "_index": "test_index",
            "_type": "docs",
            "_id": "8",
            "_score": null,
            "_source": {
               "name": "Document 8",
               "id": 8
            },
            "sort": [
               2
            ]
         },
         {
            "_index": "test_index",
            "_type": "docs",
            "_id": "2",
            "_score": null,
            "_source": {
               "name": "Document 2",
               "id": 2
            },
            "sort": [
               3
            ]
         },
         {
            "_index": "test_index",
            "_type": "docs",
            "_id": "5",
            "_score": null,
            "_source": {
               "name": "Document 5",
               "id": 5
            },
            "sort": [
               4
            ]
         }
      ]
   }
}
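
To see why this ordering comes out, the scoring loop can be emulated outside Elasticsearch. The Python sketch below only illustrates the script's logic; it is not how Elasticsearch executes it, and the names script_score and docs are made up for the example.

```python
def script_score(doc, scoring):
    """Emulate the sort script: return the score listed for this doc's id, else 0."""
    for entry in scoring:
        if doc["id"] == entry["id"]:
            return entry["score"]
    return 0


scoring = [
    {"id": 1, "score": 1},
    {"id": 8, "score": 2},
    {"id": 2, "score": 3},
    {"id": 5, "score": 4},
]

# Documents matching the ids filter, in arbitrary order.
docs = [{"id": 5}, {"id": 2}, {"id": 1}, {"id": 8}]

# Ascending sort on the emulated script value, as with "order": "asc".
ordered = sorted(docs, key=lambda d: script_score(d, scoring))
ordered_ids = [d["id"] for d in ordered]
print(ordered_ids)  # [1, 8, 2, 5]
```

A document that is not listed gets 0 and would sort first under ascending order. That does not matter here, because the ids filter only lets listed documents through.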

Here is a runnable example: http://sense.qbox.io/gist/01b28e5c038c785f0844abb7c01a71d69a32a2f4

2 Comments

Thank you very much @Sloan Ahrens. I tested it right now and got it working, thanks, worked very well. I would vote up if I could! :)
Upvote, because your custom script sort works. It would be really nice if Elasticsearch natively supported sorting by a given id array someday. I think there are many cases where you need to sort by information that is not stored in the documents you're searching, and sorting an already fetched collection of documents in code is not a good option when you request only a chunk of the documents (e.g. for pagination).
