Mapping used:

{
  "mappings": {
    "properties": {
      "attribute_must_1": {
        "type": "nested"
      },
      "attribute_1": {
        "type": "nested"
      },
      "attribute_2": {
        "type": "nested"
      }
    }
  }
}

Input documents for testing:

POST _bulk
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":1},"attribute_1":{"id":9},"attribute_2":{"id":3}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":1},"attribute_1":{"id":9},"attribute_2":{"id":3}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":1},"attribute_1":{"id":8},"attribute_2":{"id":3}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":1},"attribute_1":{"id":7},"attribute_2":{"id":3}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":1},"attribute_1":{"id":11},"attribute_2":{"id":3}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":1},"attribute_1":{"id":5},"attribute_2":{"id":3}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":1},"attribute_1":{"id":10},"attribute_2":{"id":3}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":1},"attribute_1":{"id":6},"attribute_2":{"id":3}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":1},"attribute_1":{"id":7},"attribute_2":{"id":3}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":1},"attribute_1":{"id":7},"attribute_2":{"id":3}}

Actual Query:

q = {
    "size": 10,
    "query": {
        "function_score": {
            "query": {
                "bool": {
                    "filter": [],
                    "must": [
                        {
                            "nested": {
                                "path": "attribute_must_1",
                                "query": {
                                    "term": {
                                        "attribute_must_1.id": "1"
                                    }
                                }
                            }
                        }
                    ]
                }
            },
            "boost": 1,
            "functions": [
                {
                    "filter": {
                        "nested": {
                            "path": "attribute_1",
                            "query": {
                                "script_score": {
                                    "query": {
                                        "match_all": {}
                                    },
                                    "script": {
                                        "source": "decayNumericLinear(params.origin, params.scale, params.offset, params.decay, doc['attribute_1.id'].value)",
                                        "params": {
                                            "origin": 10,
                                            "scale": 5,
                                            "decay": 2,
                                            "offset": 0
                                        }
                                    }
                                }
                            }
                        }
                    },
                    "weight": 30
                },
                {
                    "filter": {
                        "nested": {
                            "path": "attribute_2",
                            "query": {"term": {"attribute_2.id": "3"}}
                        }
                    },
                    "weight": 70
                }
            ],
            "score_mode": "sum",
            "boost_mode": "replace"
        }
    },
    "sort": [
        "_score",
        {
            "date_deposit": {
                "order": "desc"
            }
        }
    ]
}

I am trying to add a new function filter on the nested field "attribute_1" that scores each document by the distance between the input value and the value stored in that document, but I can't see any influence on the scores:

For the following attribute_1 values of the documents found:

documents = [9, 9, 9, 10, 9, 9, 4, 9, 3, 9]

I get these scores (the sum of the 30% and 70% weights of the two attributes):

scores = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]

So the result looks binary, while it should follow a linear function. What I want is something like this:

for found documents values: [10, 9, 8, 3, 10] and the input value of 10 -> I would like to have:

scores (let's say in percentage): [100%, 90%, 80%, 30%, 100%]

I would like to have a simple score as an output ranging from 0-100% but including partial scores from multiple attributes (attribute_1, attribute_2, ...) in a way that:

  • score from attribute_1 is a linear score based on the distance (i.e. any value from 0% to 30%)
  • score from attribute_2 is either 0% or 70% (term query)
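
The target arithmetic can be written down outside Elasticsearch first. The following is a minimal Python sketch, not a query. The function names closeness and combined_score are hypothetical, and the formula 1 - |value - origin| / origin is an assumption, chosen because it reproduces the percentages listed above.

```python
def closeness(value, origin):
    """Linear closeness in [0, 1]: 1.0 at the origin, 0.0 once the distance reaches origin."""
    return max(0.0, 1.0 - abs(value - origin) / origin)


def combined_score(attr1, attr2, origin=10, wanted_attr2=3, w1=30, w2=70):
    """Sum of a continuous part (0..w1) and an all-or-nothing part (0 or w2)."""
    partial_1 = w1 * closeness(attr1, origin)         # distance-based, 0% to 30%
    partial_2 = w2 if attr2 == wanted_attr2 else 0    # term match, 0% or 70%
    return partial_1 + partial_2


if __name__ == "__main__":
    # Closeness alone, as a rounded percentage, for the example values.
    for value in [10, 9, 8, 3, 10]:
        print(value, round(100 * closeness(value, origin=10)))
    # Both attributes combined for one document.
    print(round(combined_score(attr1=9, attr2=3)))
```

With the default weights, a document with attribute_1 = 9 and a matching attribute_2 comes out at roughly 27 + 70 = 97.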

I have tried different variations, but nothing works - what is the correct way of doing that? I have the impression that the filter query can't do script_scores somehow ...

I hope somebody can help me with this. Huge thanks!

2 Answers

"I have tried different variations, but nothing works - what is the correct way of doing that? I have the impression that the filter query can't do script_scores somehow ..."

Yes, you are right. As mentioned in documentation - "In a filter context, a query clause answers the question “Does this document match this query clause?” The answer is a simple Yes or No — no scores are calculated. Filter context is mostly used for filtering structured data, e.g."

I recommend not using a filter for clauses that need to contribute to the score.
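
One possible way to act on this is sketched below as a Python dict, in the same style as the question's q. It is an untested assumption, not a verified solution: the scoping clause moves into the bool filter, and each scored attribute becomes a should clause whose nested query produces its own score (a function_score with a script for attribute_1, a constant_score for attribute_2). The helper name build_query, the Painless expression, and the choice of "score_mode": "max" are all illustrative.

```python
def build_query(origin=10.0, attr2_id=3, w1=30.0, w2=70.0):
    """Build a search body whose score is meant to be w1 * closeness + w2 * match."""
    # Hypothetical Painless script: linear closeness to origin, scaled by the weight.
    closeness_script = (
        "params.weight * Math.max(0.0, 1.0 - "
        "Math.abs(doc['attribute_1.id'].value - params.origin) / params.origin)"
    )
    return {
        "size": 10,
        "query": {
            "bool": {
                # Scope only: filter context, so no score is calculated here.
                "filter": [
                    {"nested": {"path": "attribute_must_1",
                                "query": {"term": {"attribute_must_1.id": 1}}}}
                ],
                # Scored clauses: bool adds up the scores of the matching ones.
                "should": [
                    {"nested": {
                        "path": "attribute_1",
                        "score_mode": "max",
                        "query": {"function_score": {
                            "query": {"match_all": {}},
                            "script_score": {"script": {
                                "source": closeness_script,
                                "params": {"origin": origin, "weight": w1},
                            }},
                            "boost_mode": "replace",
                        }},
                    }},
                    {"nested": {
                        "path": "attribute_2",
                        "score_mode": "max",
                        # Constant contribution: w2 on a match, nothing otherwise.
                        "query": {"constant_score": {
                            "filter": {"term": {"attribute_2.id": attr2_id}},
                            "boost": w2,
                        }},
                    }},
                ],
            }
        },
    }


q = build_query()
```

The sort on date_deposit from the question is left out here and could be added back unchanged.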

2 Comments

The only working version I was able to build within "functions" uses a plain script_score function: "script_score": { "script": { "source": "decayNumericLinear(params.origin, params.scale, params.offset, params.decay, doc['attribute_1'].value)", "params": { "origin": 10, "scale": 5, "decay": 0.5, "offset": 0 } } } but in this case, attribute_1 can't be nested...
I was wondering how I should implement the same function for a nested attribute, so the main question still remains.
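
For reference, the linear decay that decayNumericLinear is understood to compute can be sketched in Python. This follows the linear decay formula given in the Elasticsearch function_score documentation. The function name linear_decay and the range check on decay are additions of this sketch, not part of the Elasticsearch API.

```python
def linear_decay(origin, scale, offset, decay, value):
    """Linear decay: 1.0 at the origin, `decay` at distance scale + offset, floor at 0.0."""
    # Guard added by this sketch: outside (0, 1) the formula no longer describes a decay.
    if not 0.0 < decay < 1.0:
        raise ValueError("decay is expected to lie strictly between 0 and 1")
    s = scale / (1.0 - decay)                          # distance at which the score hits 0
    distance = max(0.0, abs(value - origin) - offset)  # offset is a no-penalty zone
    return max(0.0, (s - distance) / s)


if __name__ == "__main__":
    # Parameters from the comment above: origin 10, scale 5, decay 0.5, offset 0.
    for value in [10, 9, 8, 3]:
        print(value, round(100 * linear_decay(10, 5, 0, 0.5, value)))
```

With these parameters s is 10, so the result reduces to 1 - |value - 10| / 10, which lines up with the 100%, 90%, 80%, 30% requested in the question. With "decay": 2, as in the question's query, s would be negative and the formula would stop behaving like a decay.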
0

I'm not sure what the difference between attribute_must_1 and attribute_1 is in your example. But taking a step back, a rudimentary pivoted percentage calculation can be achieved much more simply:

Set up a nested mapping:

PUT scores
{
  "mappings": {
    "properties": {
      "attribute_must_1": {
        "type": "nested"
      }
    }
  }
}

Sync the sample docs ([9, 9, 9, 10, 9, 9, 4, 9, 3, 9]):

POST _bulk
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":9}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":9}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":9}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":10}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":9}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":9}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":4}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":9}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":3}}
{"index":{"_index":"scores","_type":"_doc"}}
{"attribute_must_1":{"id":9}}

Use a function_score query whose script_score expresses the value as a percentage of the origin:

GET scores/_search
{
  "query": {
    "nested": {
      "path": "attribute_must_1",
      "query": {
        "function_score": {
          "query": {
            "match_all": {}
          },
          "script_score": {
            "script": {
              "source": "((float)doc['attribute_must_1.id'].value / params.origin) * 100",
              "params": {
                "origin": 10.0
              }
            }
          },
          "boost_mode": "replace"
        }
      }
    }
  }
}

Check the scores:

[
  {
    "_score":100.0,
    "_source":{
      "attribute_must_1":{
        "id":10
      }
    }
  },
  {
    "_score":90.0,
    "_source":{
      "attribute_must_1":{
        "id":9
      }
    }
  },
  ...
  {
    "_score":40.0,
    "_source":{
      "attribute_must_1":{
        "id":4
      }
    }
  },
  {
    "_score":30.0,
    "_source":{
      "attribute_must_1":{
        "id":3
      }
    }
  }
]
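
The scores above can be reproduced offline with a small Python sketch of the same arithmetic. The names ratio_score and distance_score are hypothetical. The second function is an assumed distance-based alternative, included only to show where a ratio and a distance differ.

```python
def ratio_score(value, origin=10.0):
    """Same arithmetic as the script above: value as a percentage of origin."""
    return 100.0 * value / origin


def distance_score(value, origin=10.0):
    """Assumed alternative: percentage that falls off on both sides of origin."""
    return max(0.0, 100.0 * (1.0 - abs(value - origin) / origin))


sample = [9, 9, 9, 10, 9, 9, 4, 9, 3, 9]
ranked = sorted((ratio_score(v) for v in sample), reverse=True)

if __name__ == "__main__":
    print(ranked)
    # The two variants agree up to the origin and diverge above it.
    print(ratio_score(11), round(distance_score(11)))
```

For values at or below the origin both functions agree. Above it, the ratio keeps growing (a value of 11 gives 110), while the distance version drops back to about 90, which matters because the question's test data contains an attribute_1 value of 11.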

5 Comments

In my case, I need to combine attributes of two types: (must) attributes, which have to match (bool) and define the scope of the search, and (should match) attributes, which should match and have predefined weights. As an output I need a score of 0-100% that combines the (should match) attributes and their weights, i.e.:
If I have two (should match) attributes, attribute_1 and attribute_2, with corresponding weights weight_1=3 and weight_2=7, I would like the output score to work like this: attribute_1 is a distance function (as in your query) contributing to the total score in the range 0-30%, and attribute_2 has a constant contribution of either 0% or 70%. I need a way of combining several of these attributes, some with just a predefined bool weight (0% or the weight) and some as a distance function (continuous values from 0 to the weight).
In order to achieve this, I needed to use a list of functions with filters (as shown in my original code). Do you have any idea how to integrate your solution into this case? Thank you so much for your help!
Thanks for the explanation. Can you please edit the original question and share a few of your actual docs -- not just abstract lists of integers? And also the difference between attribute_must_1 and attribute_1 -- it's still not clear.
The difference between attribute_must_1 and attribute_1 is that the first one defines the scope of the search and is simply a must-match argument without any influence on the score. attribute_1 and attribute_2 are the actual matching criteria that participate in the output score calculation.
