I am trying to understand how I can get two captured groups with a regex (JS) from the following string:

"Group: i_am_group |SubGroup: i_am_sub_group"

In the end I want to get group1: i_am_group and group2: i_am_sub_group

The rules are:

Extract the first word after "Group: " into group1
Extract the first word after "SubGroup: " into group2

I need to implement those two rules with a regex so I can run it with the match() function in JavaScript.

I tried the following:

(?<=Group:\s)(\w+) ((?<=|SubGroup:\s)(\w*))

and the result was:

[image: results]

Thanks in advance.

  • You don't need to use lookarounds if you only care about the groups. Commented Mar 20, 2022 at 8:36
  • You need to escape the | in the second lookbehind. Commented Mar 20, 2022 at 8:38
  • Try: /\bGroup: (\w+)\s*\|SubGroup: (\w+)/i Commented Mar 20, 2022 at 8:38
  • @anubhava This works! But I don't get it... when you write "Group......", why doesn't the regex include Group in the result? Commented Mar 20, 2022 at 8:40
  • Your question just got answered by 2 people with almost 2 million points between them. Commented Mar 20, 2022 at 8:51

2 Answers

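| has special meaning in regular expressions: it is used to specify alternatives. Unescaped, the lookbehind (?<=|SubGroup:\s) means "preceded by nothing or by SubGroup: ", and the empty alternative always succeeds. You need to escape the | to match it literally.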

There's no need to use lookbehinds when you're capturing the part after them. The purpose of a lookaround is to keep the text it checks out of the overall match, but if you're only interested in the capture groups, this is irrelevant.
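As a small sketch of that difference (the variable names are made up for illustration, and lookbehind support is assumed, which means an ES2018+ engine), compare the overall match with the capture group in both variants:

```javascript
const str = "Group: i_am_group |SubGroup: i_am_sub_group";

// With a lookbehind, "Group: " must precede the word but is excluded from the overall match.
const withLookbehind = str.match(/(?<=Group:\s)(\w+)/);
// Without it, "Group: " becomes part of the overall match; the capture group is unchanged.
const withoutLookbehind = str.match(/Group:\s(\w+)/);

console.log(withLookbehind[0]);    // overall match: "i_am_group"
console.log(withoutLookbehind[0]); // overall match: "Group: i_am_group"
console.log(withLookbehind[1] === withoutLookbehind[1]); // true
```

Only index 0 of the result differs, so code that reads just the capture groups sees the same values either way.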

This regexp should work for you:

Group:\s(\w+) \|SubGroup:\s(\w*)
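One possible way to use this pattern with match(), sketched with the string from the question. The names group1 and group2 follow the question's wording, and the ?? fallback assumes an ES2020+ environment:

```javascript
const str = "Group: i_am_group |SubGroup: i_am_sub_group";
const rex = /Group:\s(\w+) \|SubGroup:\s(\w*)/;

// Without the g flag, match() returns [overallMatch, capture1, capture2] or null.
// The ?? [] fallback keeps the destructuring from throwing when nothing matches.
const [, group1, group2] = str.match(rex) ?? [];

console.log(group1); // "i_am_group"
console.log(group2); // "i_am_sub_group"
```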

2 Comments

Your answer is exactly what I was looking for. I added group names: "Group:\s(?<group>\w+) \|SubGroup:\s(?<subGroup>\w*)". Can I ask if there is a way to define a default value for a group if the value is not found?
I don't think so.
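The pattern itself has no syntax for a default, but the calling code can supply one. A minimal sketch, assuming an ES2020+ environment; the helper name withDefaults and the fallback value "none" are hypothetical:

```javascript
const rex = /Group:\s(?<group>\w+) \|SubGroup:\s(?<subGroup>\w*)/;

// withDefaults and the "none" fallback are hypothetical, for illustration only.
function withDefaults(str) {
    // groups is undefined when there is no match at all, hence the ?? {}.
    const groups = str.match(rex)?.groups ?? {};
    // || (rather than ??) also replaces the empty string that \w* can capture.
    return {
        group: groups.group || "none",
        subGroup: groups.subGroup || "none",
    };
}

console.log(withDefaults("Group: i_am_group |SubGroup: i_am_sub_group"));
console.log(withDefaults("Group: i_am_group |SubGroup: ")); // subGroup falls back to "none"
```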
If you're happy with \w as the definition of a "word" character (it is [A-Za-z0-9_]; more below), you can do it like this:

const rex = /Group:\s*(\w+).*?SubGroup:\s*(\w+)/;

Add the i flag if you want to allow Group and SubGroup to be in lower case.

That:

  1. Looks for Group:
  2. \s* - allows for optional whitespace after it
  3. (\w+) captures all "word" characters that follow that
  4. .*? - skips over anything in between, matching as few characters as possible (the ? makes it non-greedy; note that . does not match line breaks)
  5. Looks for SubGroup:
  6. \s* optional whitespace again
  7. (\w+) captures all "word" chars after that

Live Example:

const str = "Group: i_am_group |SubGroup: i_am_sub_group";
const rex = /Group:\s*(\w+).*?SubGroup:\s*(\w+)/;
console.log(str.match(rex));
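Note that match() returns null when the pattern is not found, so indexing into the result directly can throw. One way to guard against that is sketched below; extractGroups is a hypothetical helper name, not something from the question:

```javascript
// extractGroups is a hypothetical helper, shown only to illustrate null handling.
function extractGroups(str) {
    const match = str.match(/Group:\s*(\w+).*?SubGroup:\s*(\w+)/);
    if (match === null) {
        return null; // the string did not contain both parts
    }
    return { group1: match[1], group2: match[2] };
}

console.log(extractGroups("Group: i_am_group |SubGroup: i_am_sub_group"));
console.log(extractGroups("no groups here")); // null
```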

If you want a different definition for "word" character than \w, use [something_here]+ instead of \w+, where within the [ and ] you list the characters / character ranges you want to consider "word" characters.
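As a quick sketch of that substitution, here [\w-] is an assumed definition of "word" (the \w characters plus the hyphen), and the input string is made up for illustration:

```javascript
const str = "Group: my-group |SubGroup: my-sub-group";

// \w+ stops at the first hyphen, because - is not a \w character.
const plain = str.match(/Group:\s*(\w+).*?SubGroup:\s*(\w+)/);
// [\w-]+ treats the hyphen as part of the word.
const custom = str.match(/Group:\s*([\w-]+).*?SubGroup:\s*([\w-]+)/);

console.log(plain[1], plain[2]);   // my my
console.log(custom[1], custom[2]); // my-group my-sub-group
```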

For instance, in English, we usually don't consider _ as part of a word (though your examples use it, so I'll leave it in), but we often consider - to be part of a word. We also frequently allow letters borrowed from other languages like é and ñ, so you might want those in the character class. You might go further and use Unicode's definition of a "letter", which is written \p{Letter}. Unicode property escapes need an ES2018+ environment and require the u flag on the expression:

const rex = /Group:\s*([-\p{Letter}0-9_]+).*?SubGroup:\s*([-\p{Letter}0-9_]+)/u;

(The - at the very beginning is treated literally, not as an indicator of a range.)

Live Example:

const str = "Group: i_am_group |SubGroup: i_am_sub_group";
const rex = /Group:\s*([-\p{Letter}0-9_]+).*?SubGroup:\s*([-\p{Letter}0-9_]+)/u;
console.log(str.match(rex));
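To see where the two definitions differ, here is a sketch with a made-up input containing accented letters. The letters are written as \u escapes so the example does not depend on how the source file is encoded:

```javascript
// "café" and "señor-niño", with the accented letters as precomposed code points.
const str = "Group: caf\u00e9 |SubGroup: se\u00f1or-ni\u00f1o";

const ascii = /Group:\s*(\w+).*?SubGroup:\s*(\w+)/;
const unicode = /Group:\s*([-\p{Letter}0-9_]+).*?SubGroup:\s*([-\p{Letter}0-9_]+)/u;

// \w stops at the first non-ASCII letter; \p{Letter} keeps going.
console.log(str.match(ascii).slice(1));   // [ 'caf', 'se' ]
console.log(str.match(unicode).slice(1)); // [ 'café', 'señor-niño' ]
```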
