
I need to extract the content between two XML tags, excluding the tags.

PS: I won't be using this just to parse XML. I'll be using the RegEx in JavaScript, so the lookbehind won't work.

What am I doing wrong?

XML:

<location maps="">
    RewriteMap map txt:map.txt
    RewriteMap lower int:tolower
    RewriteCond %{REQUEST_URI} ^/([^/.]+)\.html$ [NC]
    RewriteCond ${map:${lower:%1}|NOT_FOUND} !NOT_FOUND
    RewriteRule .? /index.php?q=${map:${lower:%1}} [NC,L]
</location>

RegEx:

/(?:(?=(\<(?!\/)(.*?)\>)))([\s\S]*?)(?=(\<(?=\/)(.*?)\>))/igm

Results:

<location maps="">
    RewriteMap map txt:map.txt
    RewriteMap lower int:tolower
    RewriteCond %{REQUEST_URI} ^/([^/.]+)\.html$ [NC]
    RewriteCond ${map:${lower:%1}|NOT_FOUND} !NOT_FOUND
    RewriteRule .? /index.php?q=${map:${lower:%1}} [NC,L]

What I want:

RewriteMap map txt:map.txt
RewriteMap lower int:tolower
RewriteCond %{REQUEST_URI} ^/([^/.]+)\.html$ [NC]
RewriteCond ${map:${lower:%1}|NOT_FOUND} !NOT_FOUND
RewriteRule .? /index.php?q=${map:${lower:%1}} [NC,L]
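A quick way to see what goes wrong: a lookahead is zero-width, so the leading `(?=...)` only checks that an opening tag starts at the current position and consumes nothing. The lazy `[\s\S]*?` then begins at that same position and takes the opening tag with it. A minimal sketch reproducing this, assuming a shortened version of the sample above (the variable names are only illustrative):

```javascript
// Sample input from the question, shortened to two directives.
const xml = [
  '<location maps="">',
  '    RewriteMap map txt:map.txt',
  '    RewriteMap lower int:tolower',
  '</location>',
].join('\n');

// The regex from the question, unchanged.
const re = /(?:(?=(\<(?!\/)(.*?)\>)))([\s\S]*?)(?=(\<(?=\/)(.*?)\>))/igm;

// With the g flag, String.prototype.match returns the full matches only.
const fullMatches = xml.match(re);

// The lookahead asserted the opening tag but did not consume it,
// so the full match still begins with the tag itself.
console.log(fullMatches[0].startsWith('<location maps="">'));
```

The trailing lookahead behaves as intended, which is why the closing tag is correctly left out of the result.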
  • If you can use any decent XML parser, there must be an easy and robust way to do this instead of using regex. See: Why is it such a bad idea to parse XML with regex? Commented May 12, 2016 at 9:59
  • Actually, it's just an example. I'll be using it for anything (HTML, XML), even when I have to get multi-line content between two 'things'. Commented May 12, 2016 at 10:05
  • What environment? JS, php, editor...? Commented May 12, 2016 at 10:07
  • I'll be using it in JavaScript. Commented May 12, 2016 at 10:08

2 Answers


If the tag name is constant, you can also use the following regex; the content between the tags is in capture group 1:

<location[^>]*>([^<]+)</location>

2 Comments

That still has a result with the tags in.
I wrote the regex based on your example, and it works correctly.
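Both comments above can be true at once: the full match (index 0) includes the tags, while capture group 1 holds only the content. A minimal sketch in JavaScript, assuming a shortened version of the sample input from the question; `xml`, `whole`, and `inner` are illustrative names:

```javascript
// Sample input from the question, shortened to two directives.
const xml = [
  '<location maps="">',
  '    RewriteMap map txt:map.txt',
  '    RewriteMap lower int:tolower',
  '</location>',
].join('\n');

// The regex from the answer; the slash is escaped for a JS regex literal.
const re = /<location[^>]*>([^<]+)<\/location>/;

const match = xml.match(re);

// Index 0 is the whole match, tags included.
const whole = match[0];

// Index 1 is the first capture group: only the text between the tags.
const inner = match[1].trim();

console.log(inner);
```

Note that `[^<]+` stops at the first `<`, so this pattern will not match if the content itself contains nested tags or a literal `<`.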

How about

<(\w+)[^>]+>\n*([\s\S]*)<\/\1>

It grabs the tag name, then captures everything up to the same tag name prefixed with / (the closing tag).

The result is in capture group 2.

Check it out here at regex101.
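For reference, here is how the pattern above could be used from JavaScript. This is only a sketch, assuming a shortened version of the sample input from the question; `tagName` and `content` are illustrative names:

```javascript
// Sample input from the question, shortened to two directives.
const xml = [
  '<location maps="">',
  '    RewriteMap map txt:map.txt',
  '    RewriteMap lower int:tolower',
  '</location>',
].join('\n');

// The regex from this answer, unchanged.
// Group 1 is the tag name; \1 requires the closing tag to repeat it.
// Group 2 is everything between the opening and closing tags.
const re = /<(\w+)[^>]+>\n*([\s\S]*)<\/\1>/;

const match = xml.match(re);
const tagName = match[1];
const content = match[2];

console.log(tagName);
console.log(content);

// [^>]+ needs at least one character after the tag name,
// so an opening tag without attributes is not matched.
const matchesBareTag = re.test('<location>x</location>');
console.log(matchesBareTag);
```

Two caveats with the pattern as written: `[^>]+` requires something after the tag name (using `[^>]*` would relax that), and the greedy `[\s\S]*` runs to the last matching closing tag, so several sibling elements with the same name would be captured as one block.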

4 Comments

That's not quite what I am looking for. I already have a fixed JavaScript RegEx Template and the only thing that changes is the RegEx.
Don't quite follow... Do you mean you have to use the match, not capture groups?
The template will loop through the RegEx that was put into it. The template will find the RegEx and replace it with something. I wouldn't want to change the template just for one RegEx option when I have 123 more RegEx options. I need the RegEx to just capture what is between the two tags, excluding the tags.
Then I don't see a way of doing it, since JS doesn't have lookbehinds.
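That was the case when this was written. ECMAScript 2018 later added lookbehind assertions, so in engines that support them a match-only pattern is possible without capture groups. A minimal sketch, assuming a fixed tag name `location`, an engine with ES2018 lookbehind support, and a shortened version of the sample input; the replacement text is hypothetical:

```javascript
// Sample input from the question, shortened to two directives.
const xml = [
  '<location maps="">',
  '    RewriteMap map txt:map.txt',
  '    RewriteMap lower int:tolower',
  '</location>',
].join('\n');

// The lookbehind asserts the opening tag and the lookahead asserts the
// closing tag. Neither is consumed, so the match is only the content.
const re = /(?<=<location[^>]*>)[\s\S]*?(?=<\/location>)/;

const match = xml.match(re);
const content = match[0];
console.log(content.trim());

// A find-and-replace template can then swap the content and leave
// both tags in place (the replacement string is just an example).
const replaced = xml.replace(re, '\n    # replaced\n');
console.log(replaced);
```

Because the tags are never part of the match, the same regex works for both extracting and replacing, which is what a fixed template that only swaps the pattern needs.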
