I have an Elasticsearch mapping similar to the one below, and I'm trying to update it using the reindex API. I've learned how to use pipelines to do various things, such as removing fields or changing types, but I've found nothing on removing fields from nested types. For instance, in the descriptions field, how would I set up a pipeline to remove badfield?

{
    "mappings": {
        "all": {
            "_all": {
                "enabled": false
            },
            "dynamic": "strict",
            "properties": {
                "address": {
                    "type": "text"
                },
                "businessName": {
                    "type": "text"
                },
                "descriptions": {
                    "type": "nested",
                    "properties": {
                        "dateSeen": {
                            "type": "date",
                            "format": "date_time"
                        },
                        "source": {
                            "type": "text"
                        },
                        "value": {
                            "type": "text"
                        },
                        "badfield": {
                            "type": "text"
                        }
                    }
                },
                "dateAdded": {
                    "type": "date",
                    "format": "date_time||date_time_no_millis"
                }
            }
        }
    }
}

Documentation on re-indexing

Documentation on removing fields

Using ES 6 btw.

I set up a script processor based on a comment, and I'm running into issues where the field is null even though it's plainly there.

{
    "processors": [{
            "script": {
                "source": """
                if (ctx._source.descriptions != null) { 
                    for(item in ctx._source.descriptions) { 
                         item.remove('badfield'); 
                    } 
                }
                """
            }
        }
    ]
}

EDIT: Removing _source from the script fixed the issue. That means I don't fully understand its usage, but I was able to create a nested field removal script.
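
For reference, here is a minimal sketch of what the working pipeline could look like once _source is dropped. In an ingest script processor, the document's fields are read directly from ctx, while ctx._source is the form used in update and reindex scripts. The pipeline name remove_badfield and the index names old_index and new_index are hypothetical, and the script assumes descriptions holds an array of objects.

```
PUT _ingest/pipeline/remove_badfield
{
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": """
          // In an ingest pipeline, fields live directly on ctx (no _source).
          if (ctx.descriptions != null) {
            // Assumes descriptions is an array of objects.
            for (item in ctx.descriptions) {
              item.remove('badfield');
            }
          }
        """
      }
    }
  ]
}

POST _reindex
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "new_index",
    "pipeline": "remove_badfield"
  }
}
```

The destination index would need to be created beforehand with a mapping that no longer contains badfield, since the mapping shown above is strict.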

  • Didn't get it. Do you want to nullify the data in the field, or "update" the mapping? You won't be able to update the mapping unless you reindex the index, so I just want to confirm what you are looking for. Commented Feb 5, 2020 at 2:17
  • When you want to remove a field from a mapping, re-indexing with a pipeline is necessary to remove both the field and the data present. I'm assuming the same is possible if the field is nested; however, the pipeline structure is not as obvious. I'll post a reference doc to illustrate. @AndreyBorisko Commented Feb 5, 2020 at 4:20

1 Answer

I answered a similar question earlier. The logic used there to remove the nested field could be moved to a script processor.

The documentation may need to clarify the nested use case, but if you just give it a try, it will work (tested on ES 7.5). If you want a deeper understanding of what happens, I guess you need to check the source code.

PUT src
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "name": {
        "type": "keyword"
      },
      "nestedField": {
        "type": "nested",
        "properties": {
          "field1": {
            "type": "boolean"
          },
          "field2": {
            "type": "boolean"
          }
        }
      }
    }
  }
}

PUT dst
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "name": {
        "type": "keyword"
      },
      "nestedField": {
        "type": "nested",
        "properties": {
          "field2": {
            "type": "boolean"
          }
        }
      }
    }
  }
}

POST src/_doc
{
  "name": "name1",
  "nestedField": {
    "field1": true,
    "field2": false
  }
}

POST src/_doc
{
  "name": "name2",
  "nestedField": {
    "field1": false,
    "field2": true
  }
}

GET src
GET src/_search
GET dst/_search
GET dst

PUT _ingest/pipeline/test_pipeline
{
  "processors": [
    {
      "remove": {
        "field": "nestedField.field1"
      }
    }
  ]
}

POST _reindex
{
  "source": {
    "index": "src"
  },
  "dest": {
    "index": "dst",
    "pipeline": "test_pipeline"
  }
}

GET dst/_search
GET dst

#DELETE src
#DELETE dst
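
One caveat: the example above indexes nestedField as a single object. When the field holds an array of objects, a dotted path such as nestedField.field1 is resolved against the list, and the path element after the list is expected to be a numeric index, so the remove processor may fail. A possible alternative in that case is to wrap remove in a foreach processor, which exposes each array element as _ingest._value. The sketch below is untested here; the pipeline name is hypothetical, and ignore_failure is added on the assumption that some elements may not contain field1.

```
PUT _ingest/pipeline/remove_field1_from_array
{
  "processors": [
    {
      "foreach": {
        "field": "nestedField",
        "processor": {
          "remove": {
            "field": "_ingest._value.field1",
            "ignore_failure": true
          }
        }
      }
    }
  ]
}
```

Note that foreach expects nestedField to be an array, so this variant would not fit the two sample documents above as written; they would need nestedField wrapped in [ ].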

5 Comments

I'm actually on 6, so not sure if that matters (I'll update my post), but my attempt to use that exact logic returned errors. Will test again.
Sure, otherwise you could use a script processor as I mentioned.
Finally got a test put together for the remove processor; it did not work. I keep getting this error: field1 is not an integer, cannot be used as an index as part of path nestedfield.field1
Just realized that was my question you answered, ha.
Figured out the issue: _source is not acting how I believed, and removing it allowed the script to run perfectly. It would be nice if the remove processor worked; however, maybe versioning or something else is preventing that.
