I'm trying to strip unnecessary text from a field.

For example I have this:

_id:12345678901,
name:"Company Z"

_id:12345678902,
name: "Corp Y"

_id:12345678902,
name: "Corporation X"

I want to remove "Corp", "Corporation" and "Company" from the name field and store the result in a new field, but I can't get it to work with a regex.

Target:

_id:12345678901,
name: "Company Z",
newName: "Z"

_id:12345678902,
name: "Corp Y",
newName: "Y"

_id:12345678902,
name: "Corporation X",
newName: "X"

Currently I have this:


db.customers.updateMany(
  {  },
  [{
    $set: { newName: {
      $replaceAll: { input: "$name", find: {"$regexFind": { input: "$name", regex: '/(Corp)|(Corporation)|(Company)/gi' } }, replacement: "" }
    }}
  }]
)

But it doesn't seem to work.

By the way, I'm using mongod 4.4.14.

2 Answers

The problem is that $regexFind doesn't return a string but an object (see the docs). You first have to run $regexFind, then pass the match field of the returned object to $replaceAll. Here is an example aggregation pipeline that transforms your documents into the desired ones:

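db.customers.aggregate([
  {
    // Store the $regexFind result object (or null if nothing matches)
    $addFields: {
      "regexResObject": {
        "$regexFind": {
          "input": "$name",
          "regex": "(Company )|(Corporation )|(Corp )"
        }
      }
    }
  },
  {
    // Keep only documents where the regex matched
    "$match": {
      regexResObject: {
        $ne: null
      }
    }
  },
  {
    $addFields: {
      newName: {
        $replaceAll: {
          input: "$name",
          // Field path needs the leading "$", otherwise it is a literal string
          find: "$regexResObject.match",
          replacement: ""
        }
      }
    }
  },
  {
    // Drop the temporary field
    "$project": {
      regexResObject: 0
    }
  }
])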
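
To make the "object, not a string" point concrete, here is a rough plain-JavaScript stand-in for $regexFind. The helper name regexFind is hypothetical and this is not MongoDB code. It only mirrors the documented result shape (match, idx, captures, or null when nothing matches) and assumes JavaScript regex semantics are close enough to MongoDB's PCRE for these simple patterns. It also suggests why the slash-delimited string from the question probably fails: the slashes and the trailing gi are read as part of the pattern text.

```javascript
// Hypothetical stand-in for $regexFind, for illustration only (not a MongoDB API).
function regexFind({ input, regex, options = "" }) {
  const m = new RegExp(regex, options).exec(input);
  if (m === null) return null; // no match: null, as with $regexFind
  return {
    match: m[0],
    idx: m.index,
    // Unmatched groups are undefined in JS; $regexFind reports them as null.
    captures: m.slice(1).map((c) => (c === undefined ? null : c)),
  };
}

const pattern = "(Company )|(Corporation )|(Corp )";

// A match yields an object, so only its match field is usable as a "find" string.
const hit = regexFind({ input: "Corp Y", regex: pattern });
console.log(hit);

// No match yields null.
const miss = regexFind({ input: "Acme", regex: pattern });
console.log(miss);

// The question's pattern string: "/" and "/gi" are literal pattern characters here.
const literal = regexFind({ input: "Corp Y", regex: "/(Corp)|(Corporation)|(Company)/gi" });
console.log(literal);
```

If case-insensitive matching is wanted, $regexFind takes it through a separate options field (for example options: "i") rather than through flags embedded in the pattern string.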

3 Comments

It kind of works, but somehow, if name doesn't contain a match for the regex, newName comes back as null. Any suggestions?
Yes, that's why I filtered them out. You could write a conditional that falls back to the original field if the value is null.
I don't believe you can use $match inside updateMany. So if you take this approach, you'll have to (bulk) write them all back. It may be simpler to just do it in code, depending on the database size and how many matches you expect.
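
The conditional suggested in these comments could be sketched roughly as follows. It is modeled on the pipeline above but drops the $match stage and wraps $replaceAll in $ifNull, on the assumption that $replaceAll evaluates to null when its find argument is null or missing. The variable name pipeline is only for illustration, and the sketch has not been run against 4.4.14.

```javascript
// Sketch only: no $match stage, so it should also be usable as an update pipeline.
const pipeline = [
  {
    $addFields: {
      regexResObject: {
        $regexFind: { input: "$name", regex: "(Company )|(Corporation )|(Corp )" }
      }
    }
  },
  {
    $addFields: {
      newName: {
        // Assumption: with no match, find is null and $replaceAll yields null,
        // so $ifNull falls back to the original name.
        $ifNull: [
          { $replaceAll: { input: "$name", find: "$regexResObject.match", replacement: "" } },
          "$name"
        ]
      }
    }
  },
  // Drop the temporary field
  { $project: { regexResObject: 0 } }
];

// In mongosh this would be applied with:
// db.customers.updateMany({}, pipeline)
```

With this shape, documents whose name has no match should keep their name unchanged in newName instead of getting null.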

Building on aaronlukacs's answer, but accounting for the fact that you can't use $match inside an update pipeline:

db.customers.updateMany(
  {  },
  [
    {
      $addFields: {
        regexResObject: {
          $regexFindAll: {
            input: '$name',
            regex: '(Company )|(Corporation )|(Corp )'
          }
        }
      }
    },
    {
      $addFields: {
        newName: {
          $reduce: {
            input: '$regexResObject',
            initialValue: '$name',
            in: {
              $replaceAll: {
                input: '$$value',
                find: '$$this.match',
                replacement: ''
              }
            }
          }
        }
      }
    },
    {
      $project: {
        regexResObject: 0
      }
    }
  ]
)

Note that if there are no matches, $regexFindAll returns an empty array, so $reduce leaves newName equal to name. I also used $regexFindAll, which removes every occurrence of the regex match rather than only the first; that isn't relevant here but might be in other cases.
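
For intuition, the $regexFindAll plus $reduce combination behaves roughly like the plain-JavaScript analogue below. This is not MongoDB code: stripMatches is a hypothetical helper, and it assumes JavaScript's regex engine treats this simple pattern the same way MongoDB's does.

```javascript
// Plain-JavaScript analogue of $regexFindAll + $reduce (illustration only).
function stripMatches(name, pattern) {
  // Like $regexFindAll: every matched substring, or an empty array if none.
  const matches = [...name.matchAll(new RegExp(pattern, "g"))].map((m) => m[0]);
  // Like $reduce with initialValue "$name": remove each matched substring in turn.
  return matches.reduce((value, match) => value.split(match).join(""), name);
}

const pattern = "(Company )|(Corporation )|(Corp )";

for (const name of ["Company Z", "Corp Y", "Corporation X", "Acme", "Corp Company Q"]) {
  console.log(name, "->", stripMatches(name, pattern));
}
```

With an empty match list the reduce step has nothing to fold over, so the initial value (the original name) comes back untouched, which is the no-match behaviour described above.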

1 Comment

This uses an aggregation pipeline as well, so there's no major difference between using updateMany and aggregate in this case.
