For example, let's assume we have a product index with the following mapping:

{
  "product": {
    "mappings": {
      "producttype": {
        "properties": {
          "id": {
            "type": "keyword"
          },
          "productAttributes": {
            "type": "nested",
            "properties": {
              "name": {
                "type": "keyword"
              },
              "value": {
                "type": "keyword"
              }
            }
          },
          "title": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "text",
                "analyzer": "keyword"
              }
            },
            "analyzer": "standard"
          }
        }
      }
    }
  }
}

I am trying to find how many products have specific product attributes using the following query (I am using a fuzzy query to allow some edit distance):

{
  "size": 0,
  "query": {
    "nested": {
      "query": {
        "fuzzy": {
          "productAttributes.name": {
            "value": "SSD"
          }
        }
      },
      "path": "productAttributes"
    }
  },
  "aggs": {
    "product_attribute_nested_agg": {
      "nested": {
        "path": "productAttributes"
      },
      "aggs": {
        "terms_nested_agg": {
          "terms": {
            "field": "productAttributes.name"
          }
        }
      }
    }
  }
}

But the aggregation returns every product attribute of each matched document, not only the matched ones. Here is the response I get:

  "aggregations" : {
    "product_attribute_nested_agg" : {
      "doc_count" : 6,
      "terms_nested_agg" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "SSD",
            "doc_count" : 3
          },
          {
            "key" : "USB 2.0",
            "doc_count" : 3
          }
        ]
      }
    }
  }

Could you please guide me on how to filter the buckets so that only the matched attributes are returned?

Edit: Here are some document samples:

  "hits" : {
    "total" : 12,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "product",
        "_type" : "producttype",
        "_id" : "677d1164-c401-4d36-8a08-6aa14f7f32bb",
        "_score" : 1.0,
        "_source" : {
          "title" : "Dell laptop",
          "productAttributes" : [
            {
              "name" : "USB 2.0",
              "value" : "4"
            },
            {
              "name" : "SSD",
              "value" : "250 GB"
            }
          ]
        }
      },
      {
        "_index" : "product",
        "_type" : "producttype",
        "_id" : "2954935a-7f60-437a-8a54-00da2d71da46",
        "_score" : 1.0,
        "_source" : {
          "productAttributes" : [
            {
              "name" : "USB 2.0",
              "value" : "3"
            },
            {
              "name" : "SSD",
              "value" : "500 GB"
            }
          ],
          "title" : "HP laptop"
        }
      }
    ]
  }
  • Can you provide a sample doc and specify what needs to be filtered from the buckets? Commented Jul 10, 2020 at 9:12
  • For example, if I search for "SSD", I want to return how many products (documents) have an attribute with this name. Commented Jul 10, 2020 at 9:20

1 Answer

To keep only specific attributes in the buckets, you can use a filter aggregation inside the nested aggregation.

Query:

{
  "size": 0,
  "aggs": {
    "product_attribute_nested_agg": {
      "nested": {
        "path": "productAttributes"
      },
      "aggs": {
        "inner": {
          "filter": {
            "terms": {
              "productAttributes.name": [
                "SSD"
              ]
            }
          },
          "aggs": {
            "terms_nested_agg": {
              "terms": {
                "field": "productAttributes.name"
              }
            }
          }
        }
      }
    }
  }
}

This is the part that does the trick:

"filter": {
  "terms": {
    "productAttributes.name": [
      "SSD"
    ]
  }
}

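The filter has to be part of the aggregation: a query only selects the matching products, while the nested aggregation still sees every attribute of those products.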

Output:

"aggregations": {
  "product_attribute_nested_agg": {
    "doc_count": 4,
    "inner": {
      "doc_count": 2,
      "terms_nested_agg": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "SSD",
            "doc_count": 2
          }
        ]
      }
    }
  }
}
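
The counts above can be reproduced with a small simulation. This is only a sketch in plain Python, not Elasticsearch itself: it assumes the index holds just the two sample products from the question, and `nested_terms` is a hypothetical helper that mimics a terms aggregation over the nested attributes, with `wanted` standing in for the filter aggregation.

```python
from collections import Counter

# The two sample products from the question, reduced to their nested attributes.
products = [
    {"title": "Dell laptop", "productAttributes": [
        {"name": "USB 2.0", "value": "4"},
        {"name": "SSD", "value": "250 GB"},
    ]},
    {"title": "HP laptop", "productAttributes": [
        {"name": "USB 2.0", "value": "3"},
        {"name": "SSD", "value": "500 GB"},
    ]},
]


def nested_terms(docs, wanted=None):
    """Count attribute names over all nested objects of the given docs.

    `wanted` plays the role of the filter aggregation: when it is set,
    only nested objects whose name is in it are counted.
    """
    counts = Counter()
    for doc in docs:
        for attr in doc["productAttributes"]:
            if wanted is None or attr["name"] in wanted:
                counts[attr["name"]] += 1
    return dict(counts)


# A query-level match only selects parent documents that contain an "SSD" attribute.
matched = [d for d in products
           if any(a["name"] == "SSD" for a in d["productAttributes"])]

unfiltered = nested_terms(matched)               # every attribute of the matched products
filtered = nested_terms(matched, wanted={"SSD"})  # only the attributes that pass the filter

print(unfiltered)  # {'USB 2.0': 2, 'SSD': 2}
print(filtered)    # {'SSD': 2}
```

The unfiltered result corresponds to the four nested documents reported by `product_attribute_nested_agg`, and the filtered one to the two counted under `inner`.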

Filtering using fuzziness:

GET /product/_search
{
  "size": 0,
  "aggs": {
    "product_attribute_nested_agg": {
      "nested": {
        "path": "productAttributes"
      },
      "aggs": {
        "inner": {
          "filter": {
            "fuzzy": {
              "productAttributes.name": {
                "value": "SSt", // one edit away from "SSD", so it matches
                "fuzziness": 2 // largest supported edit distance; remove it to use AUTO
              }
            }
          },
          "aggs": {
            "terms_nested_agg": {
              "terms": {
                "field": "productAttributes.name"
              }
            }
          }
        }
      }
    }
  }
}
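
One caveat: the `doc_count` of each bucket counts matching nested attribute objects, not products. If a product could list the same attribute name more than once, a `reverse_nested` sub-aggregation can report the number of parent documents instead. The following is a minimal sketch that only builds the request body; `build_attribute_count_body` and the `product_count` aggregation name are hypothetical, and the body has not been run against a cluster here.

```python
import json


def build_attribute_count_body(name, fuzziness="AUTO"):
    """Build a search body that counts products per fuzzily matched attribute name."""
    return {
        "size": 0,
        "aggs": {
            "product_attribute_nested_agg": {
                "nested": {"path": "productAttributes"},
                "aggs": {
                    "inner": {
                        # Keep only nested attributes whose name fuzzily matches `name`.
                        "filter": {
                            "fuzzy": {
                                "productAttributes.name": {
                                    "value": name,
                                    "fuzziness": fuzziness,
                                }
                            }
                        },
                        "aggs": {
                            "terms_nested_agg": {
                                "terms": {"field": "productAttributes.name"},
                                "aggs": {
                                    # Hypothetical name; reverse_nested joins back
                                    # to the parent (product) documents.
                                    "product_count": {"reverse_nested": {}}
                                },
                            }
                        },
                    }
                },
            }
        },
    }


body = build_attribute_count_body("SSt")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent to `/product/_search` in the same way as the queries above; the product count for each attribute name then appears as the `doc_count` of `product_count` inside each bucket.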

2 Comments

Thanks. It solved my problem. I edited your answer to support my use case (fuzziness). Could you please upvote the question if you found it helpful?
@Gibbs Can you post the bodybuilder.js syntax for the first query in your answer?