I am trying to remove the following pattern from a string:

<div class="main_title">Content 1</div> 

where 'Content 1' may vary between strings.

The following does not seem to be working:

$output = preg_replace('<div class="main_title">.*</div>', " ", $output);

Am I missing something obvious?

  • "Am I missing something obvious?" You're trying to parse HTML with regular expressions. Commented May 28, 2013 at 21:38
  • Do not parse HTML with a regular expression! stackoverflow.com/questions/1732348/… Commented May 28, 2013 at 21:38
  • See these answers for a better way. Commented May 28, 2013 at 21:41

2 Answers

3

The DOM method is probably superior because you don't have to worry about case sensitivity, whitespace, etc.

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//div[@class="main_title"]') as $node) {
    $node->parentNode->removeChild($node);
}
$output = $dom->saveHTML();
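To illustrate the point about case and whitespace, here is a small sketch of the same approach run on a made-up input whose tag is uppercase and spread over two lines. The sample string and the `libxml_use_internal_errors(true)` call are assumptions added for the example, not part of the original snippet. Note that `saveHTML()` on a document loaded this way returns a full document, with a doctype and `<html>`/`<body>` wrappers around the fragment.

```php
<?php
// Hypothetical input: uppercase tag name and extra whitespace inside the tag.
$html = "<p>Intro</p><DIV\n   class=\"main_title\"  >Content 1</DIV><p>Body</p>";

// Assumption for this sketch: collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// The HTML parser normalizes tag names to lowercase, so 'div' matches 'DIV'.
foreach ($xpath->query('//div[@class="main_title"]') as $node) {
    $node->parentNode->removeChild($node);
}

$result = $dom->saveHTML();
echo $result;
```

The XPath test `@class="main_title"` is an exact match on the attribute value, so a div carrying additional classes would need a different expression.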

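It's possible to do this with regex, especially if you can trust that your input will follow a very specific format (no extra whitespace, perhaps no case discrepancies, etc.). Your main issue is a lack of PCRE delimiters: the pattern has to be wrapped in a delimiter character such as @ or #. Also note the lazy .*? below, which stops at the first closing </div> instead of the last one.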

$output = preg_replace('@<div class="main_title">.*?</div>@', '', $output);
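As a rough, runnable sketch of the regex route, the snippet below applies the delimited pattern to a made-up string. The sample input and the trailing `i` and `s` modifiers are assumptions added here to loosen the case and newline restrictions mentioned above; they are not part of the original one-liner.

```php
<?php
// Hypothetical sample input; the question only shows a single div.
$output = '<p>Intro</p><div class="main_title">Content 1</div><p>Body</p>';

// '@' delimiters wrap the pattern; the lazy '.*?' stops at the first </div>.
// Added for illustration: 'i' ignores case, 's' lets '.' match newlines.
$cleaned = preg_replace('@<div class="main_title">.*?</div>@is', '', $output);

echo $cleaned, PHP_EOL; // <p>Intro</p><p>Body</p>
```

This still assumes the div contains no nested `</div>`, which is one reason the DOM approach is the safer default.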

1

As others say in the comments, don't use regular expressions to parse HTML; use SimpleXML or DOMDocument instead. If you still need a regex, you need to put pattern delimiters in your code:

$output = preg_replace('#<div class="main_title">.*</div>#', " ", $output);
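One caveat worth checking with this pattern: `.*` is greedy, so when two matching divs sit on the same line it consumes everything from the first opening tag to the last `</div>`. The sketch below compares it with the lazy `.*?` on a hypothetical input; the sample string and variable names are made up for the illustration.

```php
<?php
// Hypothetical input with two title divs on a single line.
$html = '<div class="main_title">A</div><p>keep</p><div class="main_title">B</div>';

// Greedy: matches from the first <div ...> through the LAST </div>.
$greedy = preg_replace('#<div class="main_title">.*</div>#', ' ', $html);

// Lazy: each match ends at the nearest </div>, so the paragraph survives.
$lazy = preg_replace('#<div class="main_title">.*?</div>#', ' ', $html);

var_dump($greedy); // string(1) " "
var_dump($lazy);   // string(14) " <p>keep</p> "
```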
