I'm trying to delete a field from every object in an array in Elasticsearch. The index mapping was generated dynamically.

This is the mapping:

{
  "mapping": {
    "_doc": {
      "properties": {
        "age": {
          "type": "long"
        },
        "name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "result": {
          "properties": {
            "resultid": {
              "type": "long"
            },
            "resultname": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        },
        "timestamp": {
          "type": "date"
        }
      }
    }
  }
}

This is a document:

{
    "result": [
        {
            "resultid": 69,
            "resultname": "SFO"
        },
        {
            "resultid": 151,
            "resultname": "NYC"
        }
    ],
    "age": 54,
    "name": "Jorge",
    "timestamp": "2020-04-02T16:07:47.292000"
}

My goal is to remove the resultid field from every object in result, in all documents of the index. After the update, the document should look like this:

{
    "result": [
        {
            "resultname": "SFO"
        },
        {
            "resultname": "NYC"
        }
    ],
    "age": 54,
    "name": "Jorge",
    "timestamp": "2020-04-02T16:07:47.292000"
}

I tried the approaches from the following Stack Overflow questions, but with no luck:

Remove elements/objects From Array in ElasticSearch Followed by Matching Query
remove objects from array that satisfying the condition in elastic search with javascript api
Delete nested array in elasticsearch
Removing objects from nested fields in ElasticSearch

Hopefully someone can help me find a solution.

3 Answers

You should reindex into a new index with the _reindex API and use a script to remove the field:

POST _reindex
{
  "source": {
    "index": "my-index"
  },
  "dest": {
    "index": "my-index-reindex"
  },
  "script": {
    "source": """
      // Drop "resultid" from every object in the result array
      for (int i = 0; i < ctx._source.result.length; i++) {
        ctx._source.result[i].remove("resultid");
      }
    """
  }
}

Afterwards, you can delete the original index:

DELETE my-index

And reindex back into it:

POST _reindex
{
  "source": {
    "index": "my-index-reindex"
  },
  "dest": {
    "index": "my-index"
  }
}
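
Before running the script against real data, it can help to mimic the Painless loop locally and check the result on a sample document. The following is only a sketch in Python, assuming the document shape shown in the question; `strip_resultid` is a hypothetical helper name, not part of any Elasticsearch client. Unlike the script above, it tolerates documents that have no `result` field.

```python
import copy


def strip_resultid(source):
    """Return a copy of a document source with "resultid" removed from each result entry."""
    doc = copy.deepcopy(source)  # leave the caller's document untouched
    # `or []` covers documents where "result" is missing or null
    for entry in doc.get("result") or []:
        entry.pop("resultid", None)  # no error if the key is already absent
    return doc


sample = {
    "result": [
        {"resultid": 69, "resultname": "SFO"},
        {"resultid": 151, "resultname": "NYC"},
    ],
    "age": 54,
    "name": "Jorge",
    "timestamp": "2020-04-02T16:07:47.292000",
}

cleaned = strip_resultid(sample)
print(cleaned["result"])
```

If the printed array matches the expected document from the question, the same logic in Painless is likely doing what you want when run through _reindex.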

3 Comments

Thank you so much! I used your script with an "update by query" so that I don't have to reindex.
Perfect! It's just safer to reindex into a new index. Can you mark my answer as resolved?
I agree that it is safer to reindex to a new index, so that if something goes wrong you'll always have your old index.
I combined the answer from Luc E with some of my own knowledge in order to reach a solution without reindexing.

POST INDEXNAME/TYPE/_update_by_query?wait_for_completion=false&conflicts=proceed
{
  "script": {
    "source": "for (int i=0;i<ctx._source.result.length;i++) { ctx._source.result[i].remove(\"resultid\")}"
  },
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "result.resultid"
          }
        }
      ]
    }
  }
}
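
When this request is issued from a script rather than from the console, the body can be built as a plain dictionary and serialized. A minimal Python sketch under a few assumptions: `build_update_body` is a hypothetical helper, the Painless source is a null-safe variant of the loop above (it skips documents without `result`), and the `exists` clause targets `result.resultid`, the field name from the mapping in the question. Sending the request is left to whichever HTTP or Elasticsearch client you already use.

```python
import json

# Null-safe variant of the loop: skip documents that have no "result" array.
PAINLESS_SOURCE = (
    "if (ctx._source.result != null) {"
    " for (def r : ctx._source.result) { r.remove('resultid'); }"
    " }"
)


def build_update_body(field="result.resultid"):
    """Build the JSON body for POST <index>/_update_by_query."""
    return {
        "script": {"lang": "painless", "source": PAINLESS_SOURCE},
        # Only touch documents that still contain the field.
        "query": {"exists": {"field": field}},
    }


body = build_update_body()
print(json.dumps(body, indent=2))
```

Restricting the query to documents that still contain the field keeps a repeated run from rewriting documents that were already cleaned.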

Thanks again Luc!

If your array contains more than one copy of the element you want to remove, use this: ctx._source.some_array.removeIf(tag -> tag == params['c'])
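
Note that `removeIf` drops whole elements from the array, so it suits arrays of plain values (tags, for example) rather than stripping one field out of each object. As a rough illustration, the Python sketch below shows the equivalent list operation and how the value might be passed through the script's `params`; `some_array` and `c` come from the snippet above, while `remove_all` and the sample values are made up for the example.

```python
import json


def remove_all(values, target):
    """Local equivalent of removeIf(tag -> tag == target): drop every matching element."""
    return [v for v in values if v != target]


# Request body for _update_by_query; the value to remove travels in "params"
# so the script source stays the same for every value.
body = {
    "script": {
        "lang": "painless",
        "source": "ctx._source.some_array.removeIf(tag -> tag == params['c'])",
        "params": {"c": "obsolete"},
    }
}

tags = ["keep", "obsolete", "keep-too", "obsolete"]
print(remove_all(tags, body["script"]["params"]["c"]))
print(json.dumps(body))
```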
