I have an index with the following mapping:

"my_index": {
  "mappings": {
    "my_index": {
      "properties": {
        "adId": {
          "type": "keyword"
        },
        "name": {
          "type": "keyword"
        },
        "title": {
          "type": "keyword"
        },
        "creativeStatistics": {
          "type": "nested",
          "properties": {
            "clicks": {
              "type": "long"
            },
            "creativeId": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}

I need to drop the nested object in a new index and keep only creativeId, as a top-level keyword field (to be clear: I know I will lose the clicks data, and that is not important). The mapping of the new index would be:

"my_new_index": {
  "mappings": {
    "my_new_index": {
      "properties": {
        "adId": {
          "type": "keyword"
        },
        "name": {
          "type": "keyword"
        },
        "title": {
          "type": "keyword"
        },
        "creativeId": {
          "type": "keyword"
        }
      }
    }
  }
}

Right now each document has exactly one creativeStatistics entry, so there is no ambiguity in choosing a creativeId.

I know it is possible to reindex with a Painless script, but I don't know how to do that. Any help will be appreciated.

2 Answers

You can do it like this:

POST _reindex
{
  "source": {
    "index": "my_index"
  },
  "dest": {
    "index": "my_new_index"
  },
  "script": {
    "source": "if (ctx._source.creativeStatistics != null && ctx._source.creativeStatistics.size() > 0) {ctx._source.creativeId = ctx._source.creativeStatistics[0].creativeId; ctx._source.remove('creativeStatistics')}",
    "lang": "painless"
  }
}
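
It is probably worth creating the destination index with an explicit mapping before running the reindex. If the index does not exist yet, dynamic mapping will typically index the string fields as text with a keyword sub-field instead of plain keyword. A minimal sketch, assuming Elasticsearch 7 or later (typeless mappings); on 6.x the properties block would sit under a type name, as in the mapping from the question:

```json
# Assumption: Elasticsearch 7+ (no mapping type level).
# Field names are taken from the target mapping in the question.
PUT my_new_index
{
  "mappings": {
    "properties": {
      "adId":       { "type": "keyword" },
      "name":       { "type": "keyword" },
      "title":      { "type": "keyword" },
      "creativeId": { "type": "keyword" }
    }
  }
}
```

Once the index exists, the reindex request writes into it and creativeId is indexed as a keyword.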

You can also create an ingest pipeline with a script processor, as follows:

PUT _ingest/pipeline/my_pipeline
{
  "description" : "My pipeline",
  "processors" : [
    {
      "script": {
        "source": "for (item in ctx.creativeStatistics) { if (item.creativeId != null) { ctx.creativeId = item.creativeId; } }"
      }
    },
    {
      "remove": {
        "field": "creativeStatistics"
      }
    }
  ]
}

Note that if a document has multiple nested objects, each iteration overwrites creativeId, so the value from the last object wins. Also, creativeId is only added if at least one object in the document's creativeStatistics has one.

You can then reference the pipeline in the reindex request:
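To check this behaviour before touching any real data, the pipeline can be run against sample documents with the simulate API. The sketch below uses a made-up document (the values, the `_id`, and the two creativeStatistics entries are hypothetical) and assumes the pipeline has already been created under the name my_pipeline:

```json
# Hypothetical sample document; only the field names come from the question.
POST _ingest/pipeline/my_pipeline/_simulate
{
  "docs": [
    {
      "_index": "creativeindex_src",
      "_id": "1",
      "_source": {
        "adId": "ad-1",
        "name": "sample",
        "title": "sample title",
        "creativeStatistics": [
          { "clicks": 3, "creativeId": "c1" },
          { "clicks": 5, "creativeId": "c2" }
        ]
      }
    }
  ]
}
```

With two entries like these, the returned document should carry creativeId set to c2 and no creativeStatistics field.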

POST _reindex
{
  "source": {
    "index": "creativeindex_src"
  },
  "dest": {
    "index": "creativeindex_dest",
    "pipeline": "my_pipeline"
  }
}
