2

I have the following little example with the regex /-+|(?<=: ?).*/gm. It leads to an infinite loop in Node/Chrome and an "Invalid regex group" error in Firefox.

When I change this to /-+|(?<=: ).*/gm (leaving out the ? quantifier in the lookbehind), it runs, but of course I don't get the lines which contain no value after the :.

If I change the regex to /-+|(?<=:).*/gm (leaving the space out of the lookbehind), I again run into an infinite loop/error.

Can anyone explain this behaviour to me, and tell me what regex I would have to use to also match the lines which end with a colon? I'd love to understand...

const text = `
-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------
`;

const pattern = /-+|(?<=: ?).*/gm;

let res;
while((res = pattern.exec(text)) !== null)
{
    console.log(`"${res[0]}"`);
} 

EDIT:

The expected output is:

"-------------------------------------"
"5048603"
""
"asjhgg | a3857"
"Something..."
"-------------------------------------"
"5048603"
""
"asjhgg | a3857"
"Something..."
"-------------------------------------"
4
  • What output do you expect to get for the sample string you posted? Commented May 16, 2020 at 22:05
  • @WiktorStribiżew Did add the expected output Commented May 16, 2020 at 22:09
  • Use text.replace(/^[^:\r\n]+:[^\S\r\n]*/gm, ''), if necessary, then .split("\n") Commented May 16, 2020 at 22:11
  • It's generally clearest to begin questions with a statement of what you are trying to achieve, then, where appropriate, give one or more examples, showing the desired result for each, then present code you have tried and explain why it doesn't work, then recap what you are asking of readers. Commented May 16, 2020 at 22:32

4 Answers

3

The (?<=...) construct is a positive lookbehind, and it is not yet supported in Firefox (see supported environments here), so you will always get an exception there until it is implemented.

The /-+|(?<=: ?).*/gm pattern can match an empty string, which is a very typical "pathological" kind of pattern. With the g flag, each exec call resumes from lastIndex, which is set to the end of the previous match. When a match has zero length, its end equals its start, so lastIndex does not move, the engine finds the same empty match at the same location all over again, and you end up in an infinite loop. Here that happens right after the colon of "Prop2 Name:", where .* has nothing left to match. With (?<=: ) no empty match occurs in this text, because no line ends in a colon followed by a space, which is why that variant terminates. See here how to move the lastIndex properly to avoid infinite loops in these cases.
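The stall described above can be reproduced on a much smaller input. This is a minimal sketch, assuming an engine that supports lookbehind (V8 in Node/Chrome, for instance); the two-line `sample` string and the variable names are made up for illustration. It also shows `String.prototype.matchAll`, which steps past empty matches on its own.

```javascript
// Hypothetical two-line sample: the first line ends in a colon.
const sample = "k1:\nk2: v";
const stalling = /(?<=: ?).*/gm;

// First call: an empty match right after "k1:".
const first = stalling.exec(sample);
const lastIndexAfterFirst = stalling.lastIndex;

// Second call resumes from the same lastIndex and finds the same empty match.
const second = stalling.exec(sample);

console.log(first.index, lastIndexAfterFirst, second.index); // 3 3 3

// matchAll advances past empty matches itself, so no manual guard is needed.
const values = [...sample.matchAll(/(?<=: ?).*/gm)].map((m) => m[0]);
console.log(values); // [ '', ' v' ]
```

Note that the second value keeps its leading space: the lookbehind already succeeds directly after the colon, so .* starts matching at the space.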

From what I see, you want to remove the beginning of each line up to and including the first :, plus any whitespace after it. You may use

text.replace(/^[^:\r\n]+:[^\S\r\n]*/gm, '')
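As a rough sketch of how the replace-then-split approach might be used, here it is on a shortened sample (the `shortText` string and the `lines` name are made up for illustration):

```javascript
// Shortened, hypothetical sample in the same shape as the question's text.
const shortText = `-----
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
-----`;

// Strip "<label>:" plus any horizontal whitespace at the start of each line,
// then split the remainder into lines.
const lines = shortText
  .replace(/^[^:\r\n]+:[^\S\r\n]*/gm, '')
  .split('\n');

console.log(lines);
// [ '-----', '5048603', '', 'asjhgg | a3857', '-----' ]
```

Lines without a colon are left untouched because the pattern requires a : after the label, and a label-only line collapses to an empty string.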

Or, if you want to actually extract the lines that consist only of -s together with everything that follows the :, you may use

const text = `
-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------
`;

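const pattern = /^-+$|:[^\S\r\n]*(.*)/gm;

let res;
while((res = pattern.exec(text)) !== null)
{
    // Group 1 is set only when the colon alternative matched
    if (res[1] !== undefined) {
      console.log(res[1]); // value after the colon (may be empty)
    } else {
      console.log(res[0]); // a line made up of dashes only
    }
}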


0

Try this pattern: /(.*):(.*)/mg

const regex = /(.*):(.*)/mg;
const str = `-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}


0

Up front: Wiktor's answer is the answer to make it work cross-browser.

For anyone who is interested in how to get this to work in Chrome with the "original" pattern (thanks to Wiktor's answer for pointing out that lastIndex is not advanced on zero-length matches):

const pattern = /-+|(?<=: ?).*/gm;

let res;
while((res = pattern.exec(text)) !== null)
{
    if(res.index === pattern.lastIndex)
        pattern.lastIndex++;
    console.log(`"${res[0]}"`);
}
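One detail worth checking with this approach: because the lookbehind already succeeds directly after the colon, the matched values appear to keep their leading space. The sketch below, on a shortened sample, collects the matches into an array and trims them with `trimStart()`; the `shortText` and `results` names and the trimming step are additions for illustration, not part of the original pattern.

```javascript
// Shortened, hypothetical sample in the same shape as the question's text.
const shortText = `-----
Prop Name: 5048603
Prop2 Name:
Location: Something...
-----`;

const pattern = /-+|(?<=: ?).*/gm;
const results = [];

let res;
while ((res = pattern.exec(shortText)) !== null) {
  // Step past zero-length matches so exec cannot stall at one position.
  if (res.index === pattern.lastIndex) {
    pattern.lastIndex++;
  }
  // The raw match starts at the space after the colon, so trim it.
  results.push(res[0].trimStart());
}

console.log(results);
// [ '-----', '5048603', '', 'Something...', '-----' ]
```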


-1

A Regex lookahead is defined like this (?=pattern) and not (pattern?)

https://www.regular-expressions.info/lookaround.html

5 Comments

I'm not using (pattern?), I'm using (?<=pattern), which should be a lookbehind. (Fixed the question title, where I was saying 'lookahead')
Lookbehind is (?!pattern)
(?=foo) => lookahead, (?<=foo) => lookbehind, (?!foo) => negative lookahead, (?<!foo) => negative lookbehind
In PHP but not in JavaScript I think. Or only in a few browsers: caniuse.com/#feat=js-regexp-lookbehind
developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/… (Lookahead does not seem to exist, lookbehind seems to be the same)
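To make the four lookaround forms listed in the comments concrete, here is a small sketch on a made-up "key: value" string. It assumes an engine with lookbehind support; the `line` variable and the object keys are illustrative names only.

```javascript
const line = "key: value";

const found = {
  // (?=...)  lookahead: a word that is followed by a colon
  lookahead: /\w+(?=:)/.exec(line)[0],
  // (?<=...) lookbehind: a word that is preceded by ": "
  lookbehind: /(?<=: )\w+/.exec(line)[0],
  // (?!...)  negative lookahead: a whole word not followed by a colon
  negativeLookahead: /\b\w+\b(?!:)/.exec(line)[0],
  // (?<!...) negative lookbehind: a word start not preceded by ": "
  negativeLookbehind: /(?<!: )\b\w+/.exec(line)[0],
};

console.log(found);
```

Under these assumptions the lookahead and the negative lookbehind both pick out "key", while the lookbehind and the negative lookahead pick out "value".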
