2

I would like to use a regex to parse files in the following format, in order to retrieve content1a, content1b, content2a, content2b, etc.:

===
content1a
===
content1b
===
content2a
===
content2b

Important: the end of the file does not contain ===

This regex almost does the job:

/[===[\s\S]*?===[.]*/g 

but it does not retrieve the last content (content2b).

Thank you for your help.

3
  • Do you mean like this? ^===\n(.+) regex101.com/r/2CCNGP/1 Commented Jun 25, 2020 at 7:34
  • perfect in php, almost in js. Thanks Commented Jun 25, 2020 at 7:45
  • ... for the provided use case something like ... str.match(/^(?!\=).*$/gm) ... already should be sufficient enough. Commented Jun 25, 2020 at 10:11
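The line-based approach from the last comment can be sketched as follows. This is only an illustration under the assumption that every delimiter line starts with = and that no content line does; the variable names are made up for the example. Note that a block spanning several lines comes back as separate entries, one per line, rather than as one grouped block.

```javascript
// Sample input: the third block spans three lines.
const str = `===
content1a
===
content1b
===
content2a
content2a
content2a
===
content2b`;

// With the m flag, ^ and $ anchor at every line; (?!=) rejects lines starting with "=".
// match() returns null when nothing matches, so fall back to an empty array.
const lines = str.match(/^(?!=).*$/gm) || [];

console.log(lines);
```

If blocks have to stay grouped, a pattern that captures everything up to the next delimiter is a better fit than this per-line match.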

2 Answers

4

The pattern that you tried starts with a character class, [===[\s\S], which matches any single character, so the pattern can also be written as [\s\S]*?=== or, in JavaScript, ([^]*?)===

It expects === to be there at the end, which is why it does not match the last content.

But if you have, for example, five equals signs (=====), you will also match the last two equals signs, so you could add a newline to the pattern to prevent that.

Instead of using [\s\S]*?, you could use a capturing group to capture all lines that do not start with ===:

^===\n((?:(?!===\n).*\n?)*)

const regex = /^===\n((?:(?!===\n).*\n?)*)/gm;
const str = `===
content1a
===
content1b
===
content2a
content2a
content2a
===
content2b`;
let m;

while ((m = regex.exec(str)) !== null) {
  // This is necessary to avoid infinite loops with zero-width matches
  if (m.index === regex.lastIndex) {
    regex.lastIndex++;
  }
  console.log(m[1]);
}
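
As a possible alternative to the exec loop, the same pattern can be used with String.prototype.matchAll, assuming an environment that supports it (ES2020 or later). This is a sketch, not part of the original answer; the name contents and the stripping of the trailing newline are choices made for the example.

```javascript
const regex = /^===\n((?:(?!===\n).*\n?)*)/gm;
const str = `===
content1a
===
content1b
===
content2a
content2a
content2a
===
content2b`;

// matchAll requires the g flag and yields one match object per block.
// m[1] is the captured block; the replace drops its single trailing newline, if any.
const contents = [...str.matchAll(regex)].map((m) => m[1].replace(/\n$/, ""));

console.log(contents);
```

matchAll works on a copy of the regex and advances past empty matches itself, so the manual lastIndex adjustment is not needed here.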

2 Comments

Same : perfect in php, almost in js. Thanks
@tit I have added a js example
3

You don't need a regex; you can just split the string:

const str = `===
content1a
===
content1b
===
content2a
===
content2b`;

const contents = str.split('===\n').filter(f => f !== "");

console.log(contents);

3 Comments

this is not a regex, but nice!
No and by not being a regex it's easier to read and probably marginally more efficient. "Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems." (though I don't really subscribe to this opinion myself)
I would change the split value to (/\n*===\n/); that way the values in the filtered list will no longer be terminated by \n.
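
The suggestion in the last comment could look like the sketch below. It assumes the same sample input as the answer; the \n* in front of === consumes the newline that ends the previous block, so the entries come back without a trailing \n.

```javascript
const str = `===
content1a
===
content1b
===
content2a
===
content2b`;

// Split on a delimiter line together with any newlines directly before it,
// then drop the empty string produced by the delimiter at the very start.
const contents = str.split(/\n*===\n/).filter((part) => part !== "");

console.log(contents);
```

Because \n* is greedy, blank lines directly before a delimiter are swallowed as well, which may or may not be wanted depending on the files.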
