I have an Elasticsearch index that contains member documents. Each member document contains a membership object with fields describing that individual membership. For example:

{"membership": {"join_date": "2015-01-01", "status": "A"}}

Membership status can be 'A' (active) or 'I' (inactive); both are Unicode string values. I'd like to give a slight boost to the score of documents with an active membership status.

In my Groovy script, alongside other custom boosts on various numeric fields, I have added the following:

String status = doc['membership.status'].value; 
float status_boost = 0.0;

if (status=='A') {status_boost = 2.0} else {status_boost=0.0}; 

return _score + status_boost

For some reason related to how strings behave in Groovy, the check (status=='A') never evaluates to true. I've tried (status.toString()=='A'), (status.toString()=="A"), (status.equals('A')), plus a number of other variations.

How should I go about troubleshooting this productively and efficiently? I don't have a stand-alone installation of Groovy, but when I pull the response data in Python the status is definitely either a Unicode 'A' or 'I', with no additional spacing or characters.

  • Are you sure the value is 'A'? Analysis may have lowercased it to 'a'. Can you try with that? Commented Aug 17, 2015 at 18:25
  • You were absolutely correct - thank you. Commented Aug 17, 2015 at 19:08

1 Answer

@VineetMohan is most likely right about the value being 'a' rather than 'A'.

You can check how the values are indexed by spitting them back out as script fields:

$ curl -XGET localhost:9200/test/_search -d '
{
  "script_fields": {
    "status": {
      "script": "doc[\"membership.status\"].values"
    }
  }
}
'

That should show what you're actually working with. Based on the field name and your usage, you will most likely want to reindex (recreate) your data so that membership.status is mapped as a not_analyzed string. Once that is done, you won't need to worry about lowercasing at all.
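
As a rough sketch of what such a mapping body could look like, assuming the string mapping syntax that offers a not_analyzed option: the Python snippet below only builds and prints the JSON, it does not contact a cluster. The type name `member` is a placeholder chosen for illustration, not something taken from the question, and the body would still need to be sent when creating the new index.

```python
import json

# Hypothetical type name; substitute the type used in your own index.
TYPE_NAME = "member"

# Mapping body that keeps membership.status as a single, unmodified term,
# so 'A' stays 'A' instead of being analyzed (lowercased) at index time.
mapping_body = {
    "mappings": {
        TYPE_NAME: {
            "properties": {
                "membership": {
                    "properties": {
                        "status": {"type": "string", "index": "not_analyzed"}
                    }
                }
            }
        }
    }
}

# Print the JSON that would be used as the request body.
print(json.dumps(mapping_body, indent=2))
```

With a mapping along these lines, the original comparison against 'A' should behave as expected after reindexing, since the indexed term would match the _source value.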

In the meantime, you can probably get by with:

return _score + (doc['membership.status'].value == 'a' ? 2 : 0)

As a big aside, you should not be using dynamic scripting. Use stored scripts in production to avoid security issues.

2 Comments

By checking the value via script_fields instead of the value stored in the _source document, I found that not only was the 'A' lowercased, but it was also wrapped in brackets. The former makes perfect sense, but I find the latter a bit strange (given that the value is mapped as a string). Setting s = doc['membership.status'].value[0], then checking s against "a" did the trick.
The bracketed nature of it is largely to do with the fact that I used values instead of value in the output, as well as just how script_fields displays content. Don't worry too much about that unless it was also returning more than one value, which analyzed strings are prone to do.
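
To illustrate why the script sees something different from _source, here is a small Python sketch. The function `approx_standard_terms` is a made-up, very rough stand-in for what analysis does to a string (split on non-alphanumeric characters, lowercase each piece); it is not Elasticsearch's actual analyzer, and ordering and deduplication of terms are not modeled. `status_boost` is likewise a hypothetical helper that mirrors the check in the script.

```python
import re


def approx_standard_terms(text):
    """Rough approximation of an analyzed string field:
    split on non-alphanumeric characters and lowercase each piece."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


def status_boost(terms):
    """Mirror the script's check against the first indexed term."""
    return 2.0 if terms and terms[0] == "a" else 0.0


source_value = "A"  # what _source (and a Python client) shows
indexed = approx_standard_terms(source_value)

print(indexed)                # the terms a script would compare against
print(status_boost(indexed))  # boost when comparing against lowercase 'a'

# A multi-word value produces several terms, which is why an analyzed
# field can return more than one value.
print(approx_standard_terms("Active Member"))
```

The point of the sketch is only that the value in _source and the terms visible to doc[...] can differ, so comparisons in a script have to target the indexed form.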
