
I am using Elasticsearch to store click traffic, and each row includes the topics of the page that was visited. A typical row looks like this:

{
  "date": "2017-09-10T12:26:53.998Z",
  "pageid": "10263779",
  "loc_ll": [
    -73.6487,
    45.4671
  ],
  "ua_type": "Computer",
  "topics": [
    "Trains",
    "Planes",
    "Electric Cars"
  ]
}

I want each topic to be a keyword, so that a search for cars returns nothing and only a search for Electric Cars returns a result.

I also want to run a distinct query on all topics in all rows so I have a list of all topics used.

Doing this on pageid would look like the following, but I am unsure how to approach it for the topics array.

{
  "aggs": {
    "ids": {
      "terms": {
        "field": "pageid",
        "size": 10
      }
    }
  }
}


Your approach to querying the available terms looks fine, so the likely problem is your mapping. If a search for cars returns results, the topics field is probably mapped as an analyzed string (type text) rather than keyword. Please check the mapping for this field; a mapping like the following indexes each topic as a single exact term:

PUT keywordarray
{
  "mappings": {
    "item": {
      "properties": {
        "id": {
          "type": "integer"
        },
        "topics": {
          "type": "keyword"
        }
      }
    }
  }
}

With this sample data

POST keywordarray/item
{
  "id": 123,
  "topics": [
    "first topic", "second topic", "another"
  ]
}

and this aggregation:

GET keywordarray/item/_search
{
  "size": 0,
  "aggs": {
    "topics": {
      "terms": {
        "field": "topics"
      }
    }
  }
}

will result in this:

"aggregations": {
  "topics": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "another",
        "doc_count": 1
      },
      {
        "key": "first topic",
        "doc_count": 1
      },
      {
        "key": "second topic",
        "doc_count": 1
      }
    ]
  }
}
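
To make the result above easier to reason about, here is a small Python sketch of what a terms aggregation conceptually does with a keyword array. It is only a model under stated assumptions, not Elasticsearch code: the function name `terms_aggregation` is hypothetical, and the ordering (doc_count descending, then key ascending) is an assumption about the default that happens to match the output shown above.

```python
from collections import Counter

def terms_aggregation(docs, field, size=10):
    """Hypothetical model: count how many documents contain each distinct value."""
    counts = Counter()
    for doc in docs:
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]  # a single value behaves like a one-element array
        # set() so a document is counted once per distinct value it contains
        counts.update(set(values))
    # Assumed ordering: highest doc_count first, ties broken by key
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered[:size]]

docs = [{"id": 123, "topics": ["first topic", "second topic", "another"]}]
buckets = terms_aggregation(docs, "topics")
```

Each array element ends up in its own bucket, which is why the array is not treated as one concatenated string.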

1 Comment

keyword was it. I was mapping it as text, thinking that keyword would concatenate the array.

It is very therapeutic asking on SO. Simply changing the mapping type to keyword allowed me to achieve what I needed.

A part of me thought that it would concatenate the array into a string, but it doesn't.

{
  "mappings": {
    "view": {
      "properties": {
        "topics": {
          "type": "keyword"
        },...
      }
    }
  }
}

and a search query like

{
  "aggs": {
    "topics": {
      "terms": {
        "field": "topics",
        "size": 10
      }
    }
  }
}

will return a distinct list of all elements in the field's array.
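
For anyone wondering why the mapping type changes the search behaviour, the following Python sketch models the difference. It rests on simplifying assumptions: `index_terms` and `term_matches` are hypothetical helpers, and the "text" branch is only a rough stand-in for an analyzer that lowercases and splits on non-word characters.

```python
import re

def index_terms(value, mapping_type):
    """Hypothetical model of which terms are stored for one field value."""
    if mapping_type == "keyword":
        return {value}  # the whole string is a single exact term
    # Rough stand-in for an analyzed text field: lowercase word tokens
    return set(re.findall(r"\w+", value.lower()))

def term_matches(topics, search_term, mapping_type):
    """True if search_term equals one of the terms indexed for the array."""
    indexed = set()
    for topic in topics:
        indexed |= index_terms(topic, mapping_type)
    return search_term in indexed

topics = ["Trains", "Planes", "Electric Cars"]
keyword_cars = term_matches(topics, "cars", "keyword")
keyword_exact = term_matches(topics, "Electric Cars", "keyword")
text_cars = term_matches(topics, "cars", "text")
```

In this model, only the full value Electric Cars matches under keyword, while cars matches under text because the value was split into separate tokens.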
