Elasticsearch version: 7.1.1

Hi, I have tried a lot but could not find a solution. In my index, I have a field that contains an array of strings.

For example, I have two documents containing different values in the locations array.

Document 1:

"doc" : {
    "locations" : [
        "Cloppenburg",
        "Berlin"
    ]
}

Document 2:

"doc" : {
    "locations" : [
        "Landkreis Cloppenburg",
        "Berlin"
    ]
}

A user searches for the term Cloppenburg, and I want to return only those documents that contain the term Cloppenburg and not Landkreis Cloppenburg. The results should contain only Document 1, but my query returns both documents.

I am using the following query and getting both documents back. Can someone please help me out with this?

GET /my_index/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "doc.locations": {
                            "query": "cloppenburg",
                            "operator": "and"
                        }
                    }
                }
            ]
        }
    }
}

1 Answer

The issue is that you are using a text field with a match query.

Match queries are analyzed: the search terms go through the same analyzer that was used at index time, which is the standard analyzer for text fields. It splits text into words and lowercases them, so in your case Landkreis Cloppenburg produces two tokens, landkreis and cloppenburg, at index time. A search for cloppenburg therefore matches that document as well.
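
To make the tokenization concrete, here is a rough Python sketch of what happens to the two documents from the question. Note that simple_standard_analyze is a hypothetical stand-in that only splits on non-word characters and lowercases; the real standard analyzer follows Unicode text segmentation rules, so treat this as an approximation, not as Elasticsearch's implementation.

```python
import re


def simple_standard_analyze(text):
    """Hypothetical, simplified stand-in for the standard analyzer:
    split on non-word characters and lowercase each token."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def indexed_tokens(values):
    """Collect the tokens produced for every value of an array field."""
    tokens = set()
    for value in values:
        tokens.update(simple_standard_analyze(value))
    return tokens


def match_all_terms(values, query):
    """Approximate a match query with operator "and":
    every analyzed query token must be present in the field's tokens."""
    field_tokens = indexed_tokens(values)
    return all(token in field_tokens for token in simple_standard_analyze(query))


doc1 = ["Cloppenburg", "Berlin"]
doc2 = ["Landkreis Cloppenburg", "Berlin"]

for name, locations in (("Document 1", doc1), ("Document 2", doc2)):
    print(name, sorted(indexed_tokens(locations)),
          "matches:", match_all_terms(locations, "cloppenburg"))
```

Both documents end up with a cloppenburg token in the index, which is why the match query cannot tell them apart.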

Solution: Use a keyword field.

Index definition

{
    "mappings": {
        "properties": {
            "location": {
                "type": "keyword"
            }
        }
    }
}

Index both of your docs and then use the same search query

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "location": {
                            "query": "Cloppenburg"
                        }
                    }
                }
            ]
        }
    }

}
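
As a rough illustration of why the keyword field behaves differently, the sketch below models keyword matching in Python using the locations arrays from the question (not the single-valued location field in this example). keyword_match is a hypothetical helper, not an Elasticsearch API; it assumes a keyword field without a normalizer, where each array element is indexed as one unmodified term.

```python
def keyword_match(values, query):
    """Hypothetical model of matching against a keyword field:
    values are not analyzed, so the query must equal one array
    element exactly, including letter case."""
    return query in values


doc1 = ["Cloppenburg", "Berlin"]
doc2 = ["Landkreis Cloppenburg", "Berlin"]

# The exact value is present only in Document 1.
print("Document 1:", keyword_match(doc1, "Cloppenburg"))
print("Document 2:", keyword_match(doc2, "Cloppenburg"))

# No lowercasing happens on a plain keyword field, so the
# lowercase query from the question would not match either document.
print("Document 1, lowercase query:", keyword_match(doc1, "cloppenburg"))
```

This is also why the query here uses Cloppenburg with a capital C: under the assumption above, the search term has to match the stored value character for character.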

Result

"hits": [
    {
        "_index": "location",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.6931471,
        "_source": {
            "location": "Cloppenburg"
        }
    }
]

4 Comments

is there any way out?
@Jawad, no, you need to make these changes; this is how it works :) It is also a clean way of doing things.
Thanks, this makes sense and it works. There is one more thing: what if the document has a publishing date? ``` "doc" : { "datePublished": "2014-11-22T15:06:00.000Z", "locations" : [ "Landkreis Cloppenburg", "Berlin" ] } ``` I want to sort the results by latest publication date, but the result comes back with Landkreis Cloppenburg when I have this in the query: ` "sort": [ { "doc.datePublished": { "order": "desc" } } ] `
