I have several different Elasticsearch function_score queries, but I'm not sure how to combine them

This is the test set I'm looking at (I added comments to be able to refer to specific items in the question, these comments are not actually in the index)

[
    { // Item 1
        "priority": 0.7,
        "classification": [
            {
                "feature": "A",
                "confidence": 0.4
            },
            {
                "feature": "C",
                "confidence": 0.3
            },
            {
                "feature": "B",
                "confidence": 0.6
            }
        ]
    },
    { // Item 2
        "priority": 0.8,
        "classification": [
            {
                "feature": "A",
                "confidence": 0.3
            },
            {
                "feature": "C",
                "confidence": 0.6
            }
        ]
    },
    { // Item 3
        "priority": 0.4,
        "classification":  [
            {
                "feature": "D",
                "confidence": 0.6
            },
            {
                "feature": "C",
                "confidence": 0.8
            }
        ]
    }
]

Now assume I want to score items with the following weights:

  • "A" with weight of 2
  • "B" with weight of 3

I would like to do the following:

  1. Calculate average confidence for each item only for features "A" and "B" (e.g. average confidence of 0.5 for item 1)
  2. Calculate priority for each item (e.g. priority of 0.8 for item 2)
  3. Calculate the sum of weights for each item feature (if item has feature "A" it receives a weight of 2, if it has feature "B" it receives a weight of 3. e.g. item 1 would receive a weight of 5 and item 2 a weight of 2)
  4. Combine the different calculations into a final score
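To make the expected numbers concrete, here is a minimal Python sketch of the four steps above, run outside Elasticsearch against the test set. The function names and the choice to give an item with no matching feature a confidence of 0 are assumptions made for illustration; the real query would simply not match such an item.

```python
# Test set from the question, keyed by the item labels used in the comments.
ITEMS = {
    "item 1": {"priority": 0.7, "classification": [
        {"feature": "A", "confidence": 0.4},
        {"feature": "C", "confidence": 0.3},
        {"feature": "B", "confidence": 0.6}]},
    "item 2": {"priority": 0.8, "classification": [
        {"feature": "A", "confidence": 0.3},
        {"feature": "C", "confidence": 0.6}]},
    "item 3": {"priority": 0.4, "classification": [
        {"feature": "D", "confidence": 0.6},
        {"feature": "C", "confidence": 0.8}]},
}

FEATURE_WEIGHTS = {"A": 2, "B": 3}


def components(item, weights=FEATURE_WEIGHTS):
    """Return (average confidence, priority, sum of feature weights)."""
    matched = [c for c in item["classification"] if c["feature"] in weights]
    # Assumption: no matching feature means a confidence of 0.0.
    avg_conf = sum(c["confidence"] for c in matched) / len(matched) if matched else 0.0
    weight_sum = sum(weights[c["feature"]] for c in matched)
    return avg_conf, item["priority"], weight_sum


def average_score(item):
    """Step 4, using a plain average of the three components."""
    return sum(components(item)) / 3


for name, item in ITEMS.items():
    print(name, components(item), round(average_score(item), 3))
```

This is only a reference for checking what the Elasticsearch query should return, not a replacement for it.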

I know how to create the function_score for the average confidence. It would be something like this:

{
  "nested": {
    "path": "classification",
    "query": {
       "function_score": {
          "functions": [
              {
                  "field_value_factor": {
                      "field": "classification.confidence",
                      "missing": 0
                  },
                  "weight": 1
              }
          ],
          "query": {
              "terms": {
                  "classification.feature": [
                      "A",
                      "B"
                  ]
              }
          },
          "score_mode": "avg"
        }
    }
  }
}

I also know how to create the function_score for the priority field. It would be something like this:

{
    "function_score": {
        "functions": [
            {
                "field_value_factor": {
                    "field": "priority",
                    "missing": 0
                },
                "weight": <some-weight>
            }
        ],
        "score_mode": "sum"
    }
}

I think (but am not sure) I know how to create the function_score for the sum of feature weights (ignoring features that don't match "A" or "B"). It would probably be something like this:

{
  "query": {
        "function_score": {
            "query": {
                "bool": {
                    "should": [
                        { "match": { "classification.feature": "A" } },
                        { "match": { "classification.feature": "B" } }
                    ]
                }
            },
            "functions": [
              {
                  "filter": { "match": { "classification.feature": "A" } },
                  "weight": 2
              },
              {
                  "filter": { "match": { "classification.feature": "B" } },
                  "weight": 3
              }
            ],
            "score_mode":"sum"
        }
    }
}

But I have no idea how to combine these 3 different function_score queries. I'm not yet sure what the actual combining function should be; I will need to experiment with different functions and decide which one works best for me. For the sake of the question, let's say I would like to average the results of my 3 function_score queries.

And so my questions are:

  1. Is it possible to define multiple function_score queries and then define how to combine them?
  2. If it's not possible to combine multiple function_score queries, what approach should I take to solve this issue? (I'm not fixated on using 3 different function_score queries, but I'm not sure how to do it otherwise)
  3. Although I said I want to average all the function_score results, I may later want to do something a bit more complicated like this: score("priority") + (score("feature-weight") * score("confidence")). Is there a way to achieve this?
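For reference, the more complicated formula in item 3 works out as follows when applied to the component values already stated in the question (item 1: priority 0.7, feature weight 5, average confidence 0.5; item 2: priority 0.8, feature weight 2, confidence 0.3). The function name is hypothetical and the sketch runs outside Elasticsearch.

```python
def combined_score(priority, feature_weight, confidence):
    """priority + (feature weight * confidence), as in item 3 above."""
    return priority + feature_weight * confidence


item_1 = combined_score(priority=0.7, feature_weight=5, confidence=0.5)
item_2 = combined_score(priority=0.8, feature_weight=2, confidence=0.3)

# Rounded for display to hide floating-point noise.
print(round(item_1, 2), round(item_2, 2))  # prints: 3.2 1.4
```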

I'm currently testing this on ES 2.4.5 (which I know is deprecated). We are going to upgrade pretty soon anyway but:

  • Is it only possible to achieve with later ES versions?
  • Even if it's only possible in later ES versions, I would still like to know how to accomplish it (and use it after we upgrade)

Googling this didn't turn up any useful information.

Thanks in advance

1 Answer

I think you should use script_score. It lets you compute the score from the values of the document's fields, so you do not need to write multiple function_score queries.

You can also pass parameters to the script to set the weights for your features at query time.

There is a good example of advanced script_score usage for Elasticsearch 2 in the documentation: https://www.elastic.co/guide/en/elasticsearch/guide/current/script-score.html
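As a rough illustration of passing a weight as a script parameter, the sketch below builds a request body as a Python dict. It assumes a cluster recent enough to support Painless and the `source` key (not the ES 2.4.5 from the question), and `priority_weight` is a hypothetical parameter name. It only covers the top-level `priority` field; how to read the `classification` entries from a script depends on the mapping and is not shown here.

```python
import json


def priority_script_query(priority_weight):
    """Build a function_score body that scores by weighted priority."""
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "script_score": {
                    "script": {
                        "lang": "painless",
                        # Assumes every document has a numeric "priority" field.
                        "source": "params.priority_weight * doc['priority'].value",
                        # The weight is supplied at query time, not hard-coded.
                        "params": {"priority_weight": priority_weight},
                    }
                },
                # Use the script result as the final score.
                "boost_mode": "replace",
            }
        }
    }


print(json.dumps(priority_script_query(1.5), indent=2))
```

Keeping the weight in `params` and out of the script source means the same script text can be reused with different weights.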

4 Comments

1. Do you happen to know if there's a big performance impact for using the script_score function? 2. Do you happen to know if there's another way to accomplish the same thing? Or is using script_score the best way (that you know of)?
1. I would not expect much lower performance; it may even be better than combining several function_score queries. I do not know the exact performance impact of using scripts, but they are highly optimized. 2. The only alternative I can think of is the rank_feature query, available since Elasticsearch 7, but you would need to index the expected score and you would lose the ability to set your feature weights dynamically.
Also, you may get warnings when using scripts in Elasticsearch 2. Scripting was completely rewritten in Elasticsearch 5 with a new language called Painless, so your script is unlikely to work after you upgrade. I also think scripts are disabled by default in ES 2.
Thanks for the heads up on the rewriting of scripts in ES 5, and you are correct about scripts being disabled by default in ES 2 :) I'll give it a shot and keep you updated once I get it working
