I understand how to match a single string against multiple regex patterns using the pipe symbol (|), as explained in some of the answers to the question "Match a string against multiple regex patterns".

My question is about what happens when I have the following string:

this_isAnExample of What nav-input a-autoid-9-announce thisIsAnExampleToo

And I use the following regex to extract text:

[A-Z][a-z]*|(?<=_)[A-Za-z-]*

I am expecting to get the following matches:

is
An
Example
What
Is
An
Example
Too

But what I actually get is:

isAnExample
What
Is
An
Example
Too

Basically, the engine is automatically joining the word An with Example because it matches the underscore pattern, but I want it to treat them as two words (non-greedy?) because, according to the other pattern, there is another match.

2 Answers

You probably meant the regex to be

[A-Z][a-z]*|(?<=_)[a-z-]*

The first alternative matches a lowercase word that starts with an uppercase letter; the second matches a lowercase word preceded by an underscore.

The (?<=_)[A-Za-z-]* part of your posted regex matches both lowercase and uppercase letters after the underscore, i.e. it does not stop matching when it meets an uppercase letter, which should in fact be the start of another word.
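
As a quick check, here is a minimal sketch that runs both patterns over the sample string from the question. Python's `re` module is an assumption (the question does not name a language); the variable names are illustrative only. The constructs used here (alternation, character classes, a fixed-width lookbehind) should behave the same way in other engines that support lookbehind.

```python
import re

text = "this_isAnExample of What nav-input a-autoid-9-announce thisIsAnExampleToo"

# Pattern from the question: the class after the lookbehind also accepts
# uppercase letters, so it runs straight through "An" and "Example".
original = re.compile(r"[A-Z][a-z]*|(?<=_)[A-Za-z-]*")

# Suggested pattern: the class after the lookbehind is lowercase-only,
# so the match stops at the next uppercase letter and the first
# alternative picks up "An" and "Example" separately.
corrected = re.compile(r"[A-Z][a-z]*|(?<=_)[a-z-]*")

original_matches = original.findall(text)
corrected_matches = corrected.findall(text)

print(original_matches)   # ['isAnExample', 'What', 'Is', 'An', 'Example', 'Too']
print(corrected_matches)  # ['is', 'An', 'Example', 'What', 'Is', 'An', 'Example', 'Too']
```

The only difference between the two patterns is the character class after the lookbehind, which is why greediness modifiers are not needed.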

2 Comments

Just FYI: if the input is just _, it will find a match with an empty string.
@anubhava You are right. Sadly, the OP didn't describe what should be matched and gave just one example. To match only non-empty strings, replace * with + in the regex.
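
The empty-match case from these comments can be reproduced with a short sketch, again assuming Python's `re` module (not named in the thread) and illustrative variable names. It compares the * and + quantifiers on the input _ and confirms that + leaves the result for the question's sample string unchanged.

```python
import re

text = "this_isAnExample of What nav-input a-autoid-9-announce thisIsAnExampleToo"

# With *, the second alternative may match zero characters after "_".
star = re.compile(r"[A-Z][a-z]*|(?<=_)[a-z-]*")

# With +, at least one character is required after "_".
plus = re.compile(r"[A-Z][a-z]*|(?<=_)[a-z-]+")

star_on_underscore = star.findall("_")
plus_on_underscore = plus.findall("_")
plus_on_sample = plus.findall(text)

print(star_on_underscore)  # [''] : one empty match right after the underscore
print(plus_on_underscore)  # []   : no match at all
print(plus_on_sample)      # ['is', 'An', 'Example', 'What', 'Is', 'An', 'Example', 'Too']
```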

You can use this alternation regex to capture either lowercase text that is preceded by _ or capitalized words (an uppercase letter followed by lowercase letters):

((?<=_)[a-z][a-z-]*|[A-Z][a-z]*)
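
A small sketch of this pattern in use, assuming Python's `re` module (the thread does not name a language; variable names are illustrative). It uses `finditer` and `group(0)` so the outer capturing group does not affect what is collected. Because the underscore branch starts with `[a-z]`, it cannot produce an empty match.

```python
import re

text = "this_isAnExample of What nav-input a-autoid-9-announce thisIsAnExampleToo"

# Underscore branch first: a lowercase letter followed by lowercase letters
# or hyphens; otherwise a capitalized word.
pattern = re.compile(r"((?<=_)[a-z][a-z-]*|[A-Z][a-z]*)")

words = [m.group(0) for m in pattern.finditer(text)]
on_underscore = [m.group(0) for m in pattern.finditer("_")]

print(words)          # ['is', 'An', 'Example', 'What', 'Is', 'An', 'Example', 'Too']
print(on_underscore)  # []
```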
