  • I need to replace all occurrences of a string within another string, but only if the original string matches some filter
  • I can only use a single regex in an s command, because I need to send the assembled command to a 3rd-party API

I have tried to use a positive lookahead so as not to consume the string in which I want to replace characters, but somehow I cannot get the replacement to work as expected.

Here is what I have tried so far and what the outcome was. (Note that the filter, here [0-9]+, is just an example; it will be passed in from the call site and I cannot directly influence it.)

expected result: 9999997890

perl -e '$x = "4564567890"; $x =~ s/(?=^[0-9]+$)456/999/g; print $x'

actual result: 9994567890

  1. This replaces only the first occurrence of 456. Why is this happening?
  2. Even less understandable to me is that if I change the filter lookahead to (?=.*), both occurrences of 456 are replaced. Why does changing the filter have any effect on the replacing portion of the regex?

I seem to be missing some very basic point about how mixing filtering and replacing in one s command works.

  • I think you need s/(?:\G(?!^)|^(?=\d+$))\d*?\K456/999/g Commented Dec 11, 2019 at 10:17
  • @WiktorStribiżew This seems to work. Can you write it as an answer so I can accept it? Also, would you care to explain why your regex works and mine does not? :D Commented Dec 11, 2019 at 10:27
  • Probably simpler to understand if you could use eg ^.*?\D.*(*SKIP)(*F)|456 to skip strings that don't contain only digits. Commented Dec 11, 2019 at 10:38

2 Answers


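Your lookahead (?=^[0-9]+$) contains the ^ anchor, so it can only succeed at the very start of the string. That is why your regex only replaces the 456 at the start of a string that consists only of digits: at every later position the lookahead fails before 456 is even tried. The (?=.*) variant succeeds at every position, so both occurrences are replaced, but it no longer filters anything.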

You may use

s/(?:\G(?!^)|^(?=\d+$))\d*?\K456/999/g

See the regex demo

Pattern details

  • (?:\G(?!^)|^(?=\d+$)) - a custom boundary that matches either the end of the previous successful match (\G(?!^)) or (|) the start of string (^) that only contains digits ((?=\d+$))
  • \d*? - zero or more digits, but as few as possible
  • \K - discards the text matched so far from the reported match, so only the 456 that follows is replaced
  • 456 - a 456 substring.

The idea is:

  • Use the \G based pattern to pre-validate the string: (?:\G(?!^)|^(?=<YOUR_VALID_LINE_FORMAT>$))
  • Then adjust the consuming pattern that follows it (here \d*?\K456) so that it can skip over whatever characters the filter allows.

2 Comments

You can combine the non-capturing group I think: s/(?=\G(?!^)|^\d+$)\d*?\K456/999/g
@justhalf Yes, that is also working here, just I - personally - do not like using zero-width assertions inside zero-width assertions.

Alternatively, you can probably use (*SKIP)(*F) to skip strings that are not composed only of digits.

s/^\d*\D.*(*SKIP)(*F)|456/999/g

See this demo at regex101 or your demo at tio.run

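The left-hand alternative ^\d*\D.* matches from the start of the string up to the first non-digit (\D) and then consumes the rest of the string with .*. If it matches, (*SKIP)(*F) fails the match and prevents any retry inside the consumed text, so nothing in that string is replaced. Otherwise (|) the right-hand alternative matches the specified substring 456.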
