I am trying to aggregate over dynamic fields (fields that differ from document to document) in Elasticsearch. The documents look like this:

[{
   "name": "galaxy note",
   "price": 123,
   "attributes": {
      "type": "phone",
      "weight": "140gm"
   }
},{
   "name": "shirt",
   "price": 123,
   "attributes": {
      "type": "clothing",
      "size": "m"
   }
}]

As you can see, the attributes change across documents. What I'm trying to achieve is to aggregate over the fields of these attributes, like so:

{
     aggregations: {
         types: {
             buckets: [{key: 'phone', count: 123}, {key: 'clothing', count: 12}]
         }
     }
}

I am trying to use Elasticsearch's aggregation feature to achieve this, but I have not been able to find the correct way. Is it possible to achieve this via aggregations? Or should I start looking into facets, even though they seem to be deprecated?

2 Answers

You have to define attributes as nested in your mapping and change the layout of the individual attribute values to the fixed layout { key: DynamicKey, value: DynamicValue }:

PUT /catalog
{
  "settings" : {
    "number_of_shards" : 1
  },
  "mappings" : {
    "article": {
      "properties": {
        "name": { 
          "type" : "string", 
          "index" : "not_analyzed" 
        },
        "price": { 
          "type" : "integer" 
        },
        "attributes": {
          "type": "nested",
          "properties": {
            "key": {
              "type": "string",
              "index": "not_analyzed"
            },
            "value": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }  
    }
  }
}

You can then index your articles like this:

POST /catalog/article
{
  "name": "shirt",
  "price": 123,
  "attributes": [
    { "key": "type", "value": "clothing"},
    { "key": "size", "value": "m"}
  ]
}

POST /catalog/article
{
  "name": "galaxy note",
  "price": 123,
  "attributes": [
    { "key": "type", "value": "phone"},
    { "key": "weight", "value": "140gm"}
  ]
}
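
If your documents currently use the object layout from the question, the conversion to the key/value list could be sketched roughly as follows. The function names are hypothetical, and casting every value to a string is an assumption made to match the string mapping above.

```python
def to_key_value_pairs(attributes):
    """Turn {"type": "phone"} into [{"key": "type", "value": "phone"}]."""
    # Values are cast to str because "value" is mapped as a string field
    # (an assumption; adjust if you need typed values).
    return [{"key": k, "value": str(v)} for k, v in attributes.items()]


def to_nested_layout(doc):
    """Return a copy of doc whose "attributes" object is a key/value list."""
    converted = dict(doc)  # shallow copy, the input document is left untouched
    converted["attributes"] = to_key_value_pairs(doc.get("attributes", {}))
    return converted


original = {
    "name": "galaxy note",
    "price": 123,
    "attributes": {"type": "phone", "weight": "140gm"},
}
print(to_nested_layout(original))
```

The converted document can then be sent as the body of the POST requests shown above.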

After that, you can aggregate over the nested attributes:

GET /catalog/_search
{
  "query":{
    "match_all":{}
  },     
  "aggs": {
    "attributes": {
      "nested": {
        "path": "attributes"
      },
      "aggs": {
        "key": {
          "terms": {
            "field": "attributes.key"
          },
          "aggs": {
            "value": {
              "terms": {
                "field": "attributes.value"
              }
            }
          }
        }
      }
    }
  }
}

This gives you the information you requested, in a slightly different form:

[...]
"buckets": [
  {
    "key": "type",
    "doc_count": 2,
    "value": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
      {
        "key": "clothing",
        "doc_count": 1
      }, {
        "key": "phone",
        "doc_count": 1
      }
      ]
    }
  },
[...]
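
To get closer to the shape asked for in the question, the nested buckets can be flattened on the client side. This is only a sketch: the function name is hypothetical, and the sample response is hand-written to mirror the excerpt above (it contains just the "type" bucket, not the elided parts).

```python
def flatten_attribute_buckets(response):
    """Map each attribute key to a {value: doc_count} dict."""
    # Path follows the aggregation names used in the request:
    # "attributes" (nested) -> "key" (terms) -> "value" (terms).
    key_buckets = response["aggregations"]["attributes"]["key"]["buckets"]
    return {
        kb["key"]: {vb["key"]: vb["doc_count"] for vb in kb["value"]["buckets"]}
        for kb in key_buckets
    }


# Hand-written sample, limited to the bucket shown in the excerpt.
sample_response = {
    "aggregations": {
        "attributes": {
            "key": {
                "buckets": [
                    {
                        "key": "type",
                        "doc_count": 2,
                        "value": {
                            "doc_count_error_upper_bound": 0,
                            "sum_other_doc_count": 0,
                            "buckets": [
                                {"key": "clothing", "doc_count": 1},
                                {"key": "phone", "doc_count": 1},
                            ],
                        },
                    }
                ]
            }
        }
    }
}

counts = flatten_attribute_buckets(sample_response)
print(counts["type"])
```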

1 Comment

Could you please explain how to filter over dynamic fields and get the appropriate fields' counts in the response? The question is here: stackoverflow.com/questions/55839096/…

Not sure if this is what you mean, but this is fairly simple with basic aggregation functionality. Beware: I did not include a mapping, so a type made up of multiple words will be split and produce one bucket per word.

POST /product/atype
{
   "name": "galaxy note",
   "price": 123,
   "attributes": {
      "type": "phone",
      "weight": "140gm"
   }
}

POST /product/atype
{
   "name": "shirt",
   "price": 123,
   "attributes": {
      "type": "clothing",
      "size": "m"
   }
}

GET /product/_search?search_type=count
{
  "aggs": {
    "byType": {
      "terms": {
        "field": "attributes.type",
        "size": 10
      }
    }
  }
}
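
Note that this request names the field explicitly. If the attribute names are known at query time, one terms aggregation per name could be generated along these lines. This is a sketch only: the function name, the "by_" prefix, and the list of names are assumptions, and it does not discover attribute names that are not passed in.

```python
import json


def terms_aggs_for(attribute_names, size=10):
    """Build a request body with one terms aggregation per attribute name."""
    return {
        "aggs": {
            "by_" + name: {
                "terms": {"field": "attributes." + name, "size": size}
            }
            for name in attribute_names
        }
    }


# Hypothetical list of attribute names, taken from the example documents.
body = terms_aggs_for(["type", "size", "weight"])
print(json.dumps(body, indent=2))
```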

4 Comments

I was trying this, but the problem is that the attributes are not fixed (and are not meant to be fixed). Is it possible to do something like attributes.*?
So you are saying the key attributes is not fixed? Or the key attributes.type is not fixed? From your example, everything looks pretty fixed to me. What exactly is not fixed?
Same here! I need an answer to this question.
Meaning attributes is fixed, but everything inside attributes is not: one doc might have attributes.size and attributes.weight, another might have attributes.width, attributes.color, attributes.release and attributes.material, and god knows what other docs might have.
