
I have an array of JSON objects that all share the same keys.

Each value is either a string or a number in string form, and it is indexed as text in Elasticsearch in the following pattern:

[{
  "key" : "foo",
  "value" : "lisa"
}, {
  "key" : "bar",
  "value" : "19"
}]

I'm comparing on the basis of the following:

1. match key as "bar"
2. range { "value" : {gt:"10"}}

This does not work: the value is indexed as a string (which it should be), and because the string "2" compares greater than the string "10", the range query fails, which is expected.
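To make the ordering problem concrete, here is a small sketch in plain Python (not Elasticsearch; the sample values are made up for illustration) that applies "greater than 10" to the same values first as strings and then as integers:

```python
# Lexicographic vs numeric comparison; the sample values are illustrative only.
values = ["2", "9", "10", "19"]

# Compared as strings, character by character: "2" > "10" because "2" > "1".
as_strings = [v for v in values if v > "10"]

# Compared as integers after conversion.
as_numbers = [v for v in values if int(v) > 10]

print(as_strings)  # ['2', '9', '19']
print(as_numbers)  # ['19']
```

The string comparison lets "2" and "9" through, which is the same kind of result a range query over a text-typed value produces.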

Any suggestions on how to move ahead in order to solve this use case?

Additional Info:

I see that the documentation on using strings with TermRangeQuery has been removed in Elasticsearch 7.0+.

References:

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html

  • Have you defined your mappings for this case? The string type is no longer available in Elasticsearch 7+. The types generally used are text or keyword, depending on what kind of searching you want for that particular field. Commented Apr 21, 2020 at 5:57
  • By String, I meant text. Commented Apr 21, 2020 at 7:50

2 Answers

2

As you have already seen, not using the correct data type leads to unexpected behaviour. I didn't quite get why value can be either a string or a number, but given the use case I would suggest defining a separate field for each type of value. The query you are trying to run also requires the relationship between the key and value fields to be preserved, so I would suggest defining a nested field instead of a plain object field.

The reason for not using an object field is that Elasticsearch flattens the object before indexing it, and flattening loses the relationship between the properties. Read more on this here.

Now, consider the following example (Elasticsearch 7.x):
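The flattening can be pictured with a rough Python model. This is a simplified sketch, not Elasticsearch's actual internals, and the field name pairs and the helper flatten are hypothetical names used only for illustration:

```python
# Simplified model of how an array of objects ends up under an "object" field:
# each property becomes one multi-valued field, so the pairing is lost.
def flatten(field_name, objects):
    flat = {}
    for obj in objects:
        for prop, val in obj.items():
            flat.setdefault(f"{field_name}.{prop}", []).append(val)
    return flat

doc = [
    {"key": "foo", "value": "lisa"},
    {"key": "bar", "value": "19"},
]
flat = flatten("pairs", doc)  # "pairs" is a hypothetical field name
print(flat)

# key "foo" and value "19" come from different objects, yet both are found.
cross_match = "foo" in flat["pairs.key"] and "19" in flat["pairs.value"]
print(cross_match)
```

In this model a search for key foo together with value 19 succeeds even though no single object holds that pair, which is the relationship loss a nested field avoids.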

Step 1: Define mapping with correct type for fields

PUT test
{
  "mappings": {
    "properties": {
      "nestedField": {
        "type": "nested",
        "properties": {
          "key": {
            "type": "keyword"
          },
          "stringValue": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          },
          "numericValue": {
            "type": "integer"
          }
        }
      }
    }
  }
}

This creates nestedField with three sub-fields: key of type keyword (not analysed), stringValue of type text (default standard analyser, with a keyword sub-field in case an exact match is required), and numericValue of type integer.

Step 2: Index document

PUT test/_doc/1
{
  "nestedField": [
    {
      "key": "foo",
      "stringValue": "lisa"
    },
    {
      "key": "bar",
      "numericValue": 19
    }
  ]
}

PUT test/_doc/2
{
  "nestedField": [
    {
      "key": "foo",
      "stringValue": "mary"
    },
    {
      "key": "bar",
      "numericValue": 9
    }
  ]
}

Note how the values are indexed: string values go into stringValue, and numeric values go into numericValue as JSON numbers rather than strings.

Step 3: Query as required.

To query a field of type nested you have to use a nested query.

GET test/_search
{
  "query": {
    "nested": {
      "path": "nestedField",
      "query": {
        "bool": {
          "filter": [
            {
              "term": {
                "nestedField.key": "bar"
              }
            },
            {
              "range": {
                "nestedField.numericValue": {
                  "gt": 10
                }
              }
            }
          ]
        }
      }
    }
  }
}

The above query returns only doc 1: doc 2 also has key: bar, but its related numericValue (9) is not greater than 10.
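The matching rule can be mimicked in plain Python as a sanity check. This is only a sketch of the logic, assuming both filters must hold on the same nested object; the helper name matches is hypothetical and nothing here talks to Elasticsearch:

```python
# The two documents indexed above, keyed by document id.
docs = {
    1: [{"key": "foo", "stringValue": "lisa"}, {"key": "bar", "numericValue": 19}],
    2: [{"key": "foo", "stringValue": "mary"}, {"key": "bar", "numericValue": 9}],
}

def matches(nested_objects):
    # Both conditions are checked against the same nested object.
    return any(
        obj.get("key") == "bar"
        and "numericValue" in obj
        and obj["numericValue"] > 10
        for obj in nested_objects
    )

hits = [doc_id for doc_id, objs in docs.items() if matches(objs)]
print(hits)  # [1]
```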


3 Comments

Hi Nishanth, thanks for taking the time to write this answer. My concern was the range query on a "text" field: it doesn't give a numeric range, since "2" > "10". Apart from indexing the numeric value separately, do you see any other approach?
If the data type is string (text), Elasticsearch compares the values lexicographically and not as numbers. For range to work as expected with numeric values, the field type must be defined as one of the numeric data types given here.
Yes, it seems the only way is to have a corresponding numeric field [{ "key" : "foo", "value" : "lisa" }, { "key" : "bar", "value" : "19", "numericValue" : 19 }] and index the numeric value as well, then use it later when fetching from ES.
0

Many Stack Overflow, GitHub and other resources confirm that this feature is not available for text fields.

The only way to get this behaviour is to add a corresponding numeric field while indexing:

[
 { "key" : "foo", "value" : "lisa" }, 
 { "key" : "bar", "value" : "19", "numericValue" : 19}
] 

Index the numeric value as well, and use it later when querying ES:

1. match key as "bar"
2. range { "numericValue" : {gt:10}}
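As a rough sketch of how the extra field could be added before indexing, the Python below copies each pair and attaches numericValue when the string parses as an integer. The helper name with_numeric_value is hypothetical, and handling integers only is an assumption (decimals would need a float or decimal conversion and a matching numeric mapping):

```python
def with_numeric_value(pairs):
    """Return copies of the key/value pairs, adding numericValue where possible."""
    enriched = []
    for pair in pairs:
        entry = dict(pair)  # copy so the input is not mutated
        try:
            entry["numericValue"] = int(pair["value"])
        except ValueError:
            pass  # not an integer string; leave the entry without numericValue
        enriched.append(entry)
    return enriched

pairs = [
    {"key": "foo", "value": "lisa"},
    {"key": "bar", "value": "19"},
]
enriched = with_numeric_value(pairs)
print(enriched)
```

The original string value is kept, so text searches on value continue to work while range queries target numericValue.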
