Been beating my head against a wall trying to get this to work - help from any regex gurus would be greatly appreciated!

The text that has to be matched:

[template option="whatever"] 

<p>any amount of html would go here</p>

[/template]

I need to pull the 'option' value (i.e. 'whatever') and the html between the template tags.

So far I have:

> /\[template\s*option=["\']([^"\']+)["\']\]((?!\[\/template\]))/

Which gets me everything except the html between the template tags.

Any ideas?

Thanks, Chris

  • Which language are you using? Commented Jan 23, 2011 at 3:53
  • What happens if <p>this is how you break a parser: [/template] It's broken now! </p> is the html? Commented Jan 23, 2011 at 3:53
  • aqua: PHP. ircmaxell: doesn't matter. Commented Jan 23, 2011 at 3:55
  • I suspect that you forgot to escape brackets. Remember, they have special meaning in regex. Commented Jan 23, 2011 at 4:05
  • Well, your second parenthesis group contains only a negative lookahead for [/template], so it captures nothing; otherwise you should be able to access the contents of the parens by number. For the HTML, you can simply try a "reluctant" quantifier (probably .*? but I'm not familiar with PHP). Also be aware that your option value must not be empty or contain escaped " chars, otherwise this will not work ... Commented Jan 23, 2011 at 4:16

4 Answers

Edit: [\s\S] matches any character, whitespace or not, so unlike '.' it also matches newlines.

You may have a problem when there are consecutive blocks in a large string, because the greedy + runs on to the last [/template]. In that case you will need a more specific quantifier: either non-greedy (+?), a bounded range such as {1,200}, or a more specific character class than [\s\S].

/\[template\s*option=["\']([^"\']+)["\']\]([\s\S]+)\[\/template\]/
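
To make the consecutive-blocks caveat concrete, here is a minimal PHP sketch. The sample input and variable names are made up for illustration; it only assumes PHP's preg_match_all with the PREG_SET_ORDER flag. The greedy pattern swallows both blocks in a single match, while the +? variant stops at the first [/template] after each opening tag.

```php
<?php
// Hypothetical input: two consecutive template blocks in one string.
$text = "[template option=\"a\"]<p>one</p>[/template]\n"
      . "[template option=\"b\"]<p>two</p>[/template]";

// Greedy: [\s\S]+ runs to the LAST [/template] in the string.
$greedy = '/\[template\s*option=["\']([^"\']+)["\']\]([\s\S]+)\[\/template\]/';
// Lazy: [\s\S]+? stops at the FIRST [/template] after each opening tag.
$lazy   = '/\[template\s*option=["\']([^"\']+)["\']\]([\s\S]+?)\[\/template\]/';

$greedyCount = preg_match_all($greedy, $text, $greedyMatches, PREG_SET_ORDER);
$lazyCount   = preg_match_all($lazy, $text, $lazyMatches, PREG_SET_ORDER);

echo "greedy: $greedyCount match(es)\n";
echo "lazy: $lazyCount match(es)\n";

// Each lazy match carries the option in group 1 and the HTML in group 2.
foreach ($lazyMatches as $m) {
    echo "option={$m[1]} html={$m[2]}\n";
}
```

If blocks can be nested, neither variant is enough on its own; this sketch only covers blocks that follow one another.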

1 Comment

Good work! Yes that works. Well done. Thanks everyone else as well

Try this

/\[template\s*option=\"(.*)\"\](.*)\[\/template]/

Basically, instead of using a complex regex to match every single thing, just use (.*), which matches everything. You want everything in between anyway; it's not as if you need to validate the data in between.

3 Comments

Yes, I tried this but it doesn't work, I presume because '.' is any character except a newline, which the content may have. Replacing (.*) with ([.\n]) didn't work either.
@Chris, there's a dot-all modifier in PHP's regex: follow the expression with an s so that '.' also matches newlines. (The multi-line modifier m only changes where ^ and $ match.)
@Mark E, great, thanks for the tip! And that brilliant answer to parsing HTML with regex is going on my wall to cheer me up first thing on a Monday morning!
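
For reference, a small PHP check of which modifier actually lets the dot cross newlines. It relies on PCRE's documented modifiers: s (PCRE_DOTALL) makes '.' match newlines, while m (PCRE_MULTILINE) only changes where ^ and $ match. The sample string and variable names are invented for illustration. As a side note, ([.\n]) fails for a different reason: inside a character class the dot is a literal period, and the group has no quantifier.

```php
<?php
// Hypothetical multi-line input.
$text = "[template option=\"whatever\"]\n<p>line one</p>\n<p>line two</p>\n[/template]";

// Shared pattern body; only the trailing modifier changes below.
$base = '\[template\s*option="([^"]+)"\](.*?)\[\/template\]';

// No modifier: '.' stops at newlines, so the pattern cannot reach [/template].
$plain = preg_match('/' . $base . '/', $text);
// m only affects ^ and $, so it does not help here.
$multi = preg_match('/' . $base . '/m', $text);
// s lets '.' match newlines as well.
$dotall = preg_match('/' . $base . '/s', $text, $m);

var_dump($plain, $multi, $dotall);
echo trim($m[2]), "\n";
```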

The assertion ?! method is unneeded. Just match with .*? to get the minimum giblets.

/\[template\s*option=\pP([\h\w]+)\pP\]  (.*?)  \[\/template\]/xs

Chris,

I see you've already accepted an answer. Great!

However, I don't think regular expressions are the right solution here. I think you can get the same effect with string manipulation (substrings, etc.).

Here is some code that may help you. If not now, maybe later in your coding endeavors.

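<?php

    $string = '[template option="whatever"]<p>any amount of html would go here</p>[/template]';

    // Everything from 'option=' onward, then skip the 8 characters of 'option="'.
    $extractoptionline = strstr($string, 'option=');
    $chopoff = substr($extractoptionline, 8);
    // The option value ends where the closing '"]' begins.
    $option = substr($chopoff, 0, strpos($chopoff, '"]'));

    echo "option: $option<br />\n";

    // Everything from the first '"]' onward, then skip those 2 characters.
    $extracthtmlpart = strstr($string, '"]');
    $chopoffneedle = substr($extracthtmlpart, 2);
    // The HTML ends at the first '[/' (the start of the closing tag).
    $html = substr($chopoffneedle, 0, strpos($chopoffneedle, '[/'));

    echo "html: $html<br />\n";

?>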

Hope this helps anyone looking for a similar answer with a different flavor.
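
The same string-function idea can also be wrapped in a helper that reports missing markers instead of silently returning odd substrings. This is only a sketch under a few assumptions: the function name parseTemplateBlock is hypothetical, the option is written exactly as option="..." with double quotes and a single space after "template", and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: extract the option value and inner HTML using
// strpos/substr only. Returns null when any expected marker is missing.
function parseTemplateBlock(string $s): ?array
{
    $open = '[template option="';
    $start = strpos($s, $open);
    if ($start === false) {
        return null;
    }
    $optStart = $start + strlen($open);

    // The option value ends at the closing quote-bracket pair.
    $optEnd = strpos($s, '"]', $optStart);
    if ($optEnd === false) {
        return null;
    }

    // The HTML runs from just after '"]' to the first closing tag.
    $htmlStart = $optEnd + 2;
    $htmlEnd = strpos($s, '[/template]', $htmlStart);
    if ($htmlEnd === false) {
        return null;
    }

    return [
        'option' => substr($s, $optStart, $optEnd - $optStart),
        'html'   => substr($s, $htmlStart, $htmlEnd - $htmlStart),
    ];
}

$result = parseTemplateBlock('[template option="whatever"]<p>any amount of html would go here</p>[/template]');
print_r($result);
```

Passing the search offset to strpos avoids re-slicing the string at every step, and the explicit false checks matter because strpos can legitimately return 0.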

2 Comments

Can I ask why you don't think regular expressions are the right solution here? What is the disadvantage of using a regular expression?
@Chris: For your purposes, and because you have a valid solution now, you can use regex. In general, though, I use regular expressions when I want to find some text in a document whose formatting I cannot control (or do not know). If the format of the document is more or less statically known, or has a particular structure, you can use string manipulation functions like I did here. Notice that the functions implicitly do the same thing as the regex (find [/, etc.).
