2

I'm trying to use RegEx to get all instances of varying strings that exist in between a particular pair set of strings. E.g. in the following string:

"The Start. Hello. Jamie. Bye. The Middle. Hello. Sarah. Bye. The End"

I want to get a collection / array consisting of "Jamie" and "Sarah" by checking in between "Hello. " and ". Bye. "

My RegEx object is working fine and I feel I'm nearly successful:

Sub Reggie()
Dim x As String: x = "The Start. Hello. Jamie. Bye. The Middle. Hello. Sarah. Bye. The End"
Dim regEx As RegExp
Set regEx = New RegExp
Dim rPat1 As String: rPat1 = "Hello. "
Dim rPat2 As String: rPat2 = " Bye."
Dim rPat3 As String: rPat3 = ".*"
With regEx
    .Global = True
    .ignorecase = True
    .Pattern = "(^.*" & rPat1 & ")(" & rPat3 & ")(" & rPat2 & ".*)"
    .MultiLine = True
    ' COMMAND HERE
End With
End Sub

But the last bit COMMAND HERE I'm trying .replace(x, "$2") which gives me a string of the last instance of a match i.e. Sarah

I've tried .Execute(x) which gives me a MatchCollection object and when browsing the immediate window I see that object only has the last instance of a match.

Is what I'm requiring possible and how?

2
  • 1
    That is because .* matches as many any chars as possible. You need rPat1 = "Hello\. ", rPat2 = " Bye\.", rPat3 = ".*?" and .Pattern = rPat1 & "(" & rPat3 & ")" & rPat2. Note that you need to escape special chars in regex patterns (here, . is special). Commented May 14, 2021 at 10:39
  • Awesome, thank you, I am very inexperienced with RegEx, I was trying a few different combos with no success. That works perfectly. Commented May 14, 2021 at 10:43

1 Answer 1

3

That is because .* matches as many any chars as possible and you should not match the whole string by adding .* on both ends of your regular expression.

Besides, you need to escape special chars in the regex pattern, here, . is special as it matches any char other than a line break char.

You need to fix your regex declaration like

rPat1 = "Hello\. "
rPat2 = " Bye\."
rPat3 = ".*?"`
.Pattern = rPat1 & "(" & rPat3 & ")" & rPat2

Or, to further enhance the regex, you may

  • Replace literal spaces with \s* (zero or more whitespaces) or \s+ (one or more whitespaces) to support any whitespace
  • Match any non-word chars after the captures string with \W+ or \W*.
rPat1 = "Hello\.\s*"
rPat2 = "\W+Bye\."
rPat3 = ".*?"`
.Pattern = rPat1 & "(" & rPat3 & ")" & rPat2

See the regex demo. Details:

  • Hello\. - Hello. string
  • \s* - zero or more whitespaces
  • (.*?) - Group 1: any zero or more chars other than line break chars as few as possible
  • \W+ - one or more chars other than ASCII letters/digits/_
  • Bye\. - Bye. string.
Sign up to request clarification or add additional context in comments.

3 Comments

OP says they want Jamie not Jamie. so would have suggested .Pattern = "Hello\.\s*(.*?)\.\s*Bye\." ?
@RobinMackenzie I think any non-word chars can be matched then, Hello\.\s*(.*?)\W+Bye\. (demo).
Thanks, accepted, works perfectly and thanks for the link to regex101 - great for experimenting and testing

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.