I have the following simple mapping:

curl -XPUT localhost:9200/transaciones/ -d '{
    "mappings": {
        "ventas": {
            "properties": {
                "tipo": { "type": "string" },
                "cantidad": { "type": "double" }
            }
        }
    }
}'

Adding data:

curl -XPUT localhost:9200/transaciones/ventas/1 -d '{
    "tipo": "Ingreso bancario",
    "cantidad": 80
}'

curl -XPUT localhost:9200/transaciones/ventas/2 -d '{
    "tipo": "Ingreso bancario",
    "cantidad": 10
}'

curl -XPUT localhost:9200/transaciones/ventas/3 -d '{
    "tipo": "PayPal",
    "cantidad": 30
}'

curl -XPUT localhost:9200/transaciones/ventas/4 -d '{
    "tipo": "Tarjeta de credito",
    "cantidad": 130
}'

curl -XPUT localhost:9200/transaciones/ventas/5 -d '{
    "tipo": "Tarjeta de credito",
    "cantidad": 130
}'

When I try to get the aggs with:

curl -XGET 'localhost:9200/transaciones/ventas/_search?pretty=true' -d '{
    "size": 0,
    "aggs": {
        "tipos_de_venta": {
            "terms": {
                "field": "tipo"
            }
        }
    }
}'

The response is:

{
  "took" : 15,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "tipos_de_venta" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "bancario",
        "doc_count" : 2
      }, {
        "key" : "credito",
        "doc_count" : 2
      }, {
        "key" : "de",
        "doc_count" : 2
      }, {
        "key" : "ingreso",
        "doc_count" : 2
      }, {
        "key" : "tarjeta",
        "doc_count" : 2
      }, {
        "key" : "paypal",
        "doc_count" : 1
      } ]
    }
  }
}

As you can see, it splits the string Tarjeta de credito into the separate terms tarjeta, de and credito. How can I aggregate on the entire string without setting tipo to not_analyzed in the mapping? My desired buckets would be Ingreso bancario, PayPal and Tarjeta de credito, so the response would look something like this:

 "aggregations" : {
    "tipos_de_venta" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "Ingreso bancario",
        "doc_count" : 2
      }, {
        "key" : "PayPal",
        "doc_count" : 1
      }, {
        "key" : "Tarjeta de credito",
        "doc_count" : 2
      } ]
    }
  }

PS: I'm using ES 2.3.2

1 Answer

That's because your tipo field is an analyzed string, so the terms aggregation sees the individual lowercased tokens rather than the original value. The right way to get what you want is to add a not_analyzed sub-field:

curl -XPUT localhost:9200/transaciones/_mapping/ventas -d '{
    "properties": {
        "tipo": { 
           "type": "string",
           "fields": {
               "raw": {
                   "type": "string",
                   "index": "not_analyzed"
               }
           }
        }
    }
}'
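
To illustrate why the analyzed field produced the buckets shown in the question, here is a rough local approximation of what happens to the five tipo values. It only lowercases and splits on whitespace, which is enough for this sample data but is not the real standard analyzer; the function names tokenize and bucket_counts are made up for this sketch.

```shell
# Rough stand-in for the analyzer on this sample data (assumption:
# lowercasing plus whitespace splitting is enough for these values).
tokenize() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -s ' ' '\n'
}

# The five "tipo" values from the question, one per document.
tipos='Ingreso bancario
Ingreso bancario
PayPal
Tarjeta de credito
Tarjeta de credito'

# Count how many documents contain each token, which is roughly what
# a terms aggregation on the analyzed "tipo" field reports.
bucket_counts() {
    printf '%s\n' "$tipos" | while IFS= read -r tipo; do
        tokenize "$tipo" | sort -u
    done | sort | uniq -c | awk '{ print $2 ": " $1 }'
}

bucket_counts
```

The counts match the doc_count values in the question's response (paypal once, every other token twice). The tipo.raw sub-field skips this step entirely and keeps each value as a single term.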

Then you need to reindex your documents so that tipo.raw gets populated, and finally you'll be able to run this and get the desired results:

curl -XGET 'localhost:9200/transaciones/ventas/_search?pretty=true' -d '{
    "size": 0,
    "aggs": {
        "tipos_de_venta": {
            "terms": {
                "field": "tipo.raw"
            }
        }
    }
}'
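
For the reindexing step, with only five documents you can simply re-run the original PUT commands. For larger indices, one option on ES 2.3 is the update-by-query API, which (as far as I know) was introduced in 2.3 as an experimental feature and re-indexes each document in place so it picks up the new sub-field. This is a minimal sketch; the helper names and the ES_URL variable are assumptions of mine, not part of Elasticsearch.

```shell
# Assumed base URL of the cluster; adjust as needed.
ES_URL="localhost:9200"

# Build the update-by-query URL for an index (hypothetical helper).
# conflicts=proceed tells the request to continue past version conflicts.
update_by_query_url() {
    printf '%s/%s/_update_by_query?conflicts=proceed&pretty' "$ES_URL" "$1"
}

# Re-index every document of the given index in place, without
# changing its _source, so that tipo.raw gets populated.
reindex_in_place() {
    curl -XPOST "$(update_by_query_url "$1")"
}

# Usage (needs a running cluster):
#   reindex_in_place transaciones
```

Once that completes, the aggregation on tipo.raw above should see the existing documents as well as any new ones.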

UPDATE

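If you really don't want to create a not_analyzed field, there is another way: a terms aggregation driven by a script that reads the value from _source. Be aware that this loads and parses the source of every matching document, so it can really kill the performance of your cluster, and on ES 2.x it also requires inline scripting to be enabled (script.inline: true in elasticsearch.yml):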

curl -XGET 'localhost:9200/transaciones/ventas/_search?pretty=true' -d '{
    "size": 0,
    "aggs": {
        "tipos_de_venta": {
            "terms": {
                "script": "_source.tipo"
            }
        }
    }
}'

7 Comments

I know I can do this using not_analyzed (as I state on the post). I want to know if there's any other possibility to achieve my goal without using it.
A quick question. The reason for not using not_analyzed on the tipo field is that someone may want to search on the individual terms. Would using tipo.raw mean that I can search directly on tipo and run the aggs on tipo.raw (as two separate things)?
Yes that's correct, you can still search on tipo and run your aggregation on tipo.raw
Thanks a lot! Just one more question: how can I easily reindex my data?
Yes, you can do it