Hi everyone, I am having an issue using regex: I cannot get it to work when there are spaces or line breaks in the content.

$content = "<dt><span>Name:</span></dt>
                      <dd>
                        John
                      </dd>
                      <dt><span>Age:</span></dt>
                      <dd>
                        40
                      </dd>
                      <dt><span>Sex:</span></dt>
                      <dd>
                        Male
                      </dd>";

The regex I am using is:

preg_match_all('/<dt><span>(.*)<\/span><\/dt><dd>(.*)<\/dd>/',$content, $output);
  • You should use a DOM parser for this, not regex. Commented Mar 20, 2013 at 18:34
  • stackoverflow.com/questions/1732348/… Commented Mar 20, 2013 at 18:36
  • You can add [\h\v]* between the tags in the pattern to match horizontal and vertical whitespace. Commented Mar 20, 2013 at 18:37

2 Answers

Don't parse HTML with regex; use DOM. Here's an example that will work if you are sure about the HTML structure.

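$dom = new DOMDocument();
@$dom->loadHTML($content); // @ suppresses parser warnings raised for the HTML fragment
$xpath = new DOMXPath($dom);
$spans = $xpath->query('//span');
$dds = $xpath->query('//dd');
// Assumes every <span> label has a matching <dd> at the same index
for ($i = 0; $i < $spans->length; $i++)
{
    // trim() strips the spaces and line breaks surrounding each value
    echo trim($spans->item($i)->nodeValue) . ' ' . trim($dds->item($i)->nodeValue) . '<br>';
}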

If you are not sure of its structure, you'll need something a bit more complicated.
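
As a rough sketch of what "a bit more complicated" could look like, the version below pairs each <dt> with the element that immediately follows it, and only when that element is a <dd>, instead of relying on matching indexes. The function name extractDefinitionPairs, the <dl> wrapper added around the fragment, and the stripping of the trailing colon are assumptions made for this illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: returns label => value pairs from a <dt>/<dd> fragment.
function extractDefinitionPairs(string $html): array
{
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // Assumption: wrap the fragment in <dl> so the items have a common parent.
    $dom->loadHTML('<dl>' . $html . '</dl>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $pairs = [];
    foreach ($xpath->query('//dt') as $dt) {
        // The next element sibling, but only if it is a <dd>.
        $dd = $xpath->query('following-sibling::*[1][self::dd]', $dt)->item(0);
        if ($dd === null) {
            continue; // skip a <dt> that has no <dd> right after it
        }
        $label = rtrim(trim($dt->textContent), ':');
        $pairs[$label] = trim($dd->textContent);
    }
    return $pairs;
}

$content = "<dt><span>Name:</span></dt>
            <dd>
              John
            </dd>
            <dt><span>Age:</span></dt>
            <dd>
              40
            </dd>
            <dt><span>Sex:</span></dt>
            <dd>
              Male
            </dd>";

$pairs = extractDefinitionPairs($content);
print_r($pairs);
```

Because the lookup is relative to each <dt>, a label without a value is skipped rather than shifting every later pair by one.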

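Agreed that you should use the DOM. However, your pattern does not account for the whitespace between </dt> and <dd>, or around the text inside <dd>. Note also that . does not match line breaks unless the s modifier is set, and that a greedy (.*) can run past the first closing tag.

Try:

preg_match_all('/<dt><span>(.*?)<\/span><\/dt>\s*<dd>\s*(.*?)\s*<\/dd>/s', $content, $output);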
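
To see how a whitespace-tolerant pattern behaves on the question's input, here is a minimal, self-contained sketch. It assumes the markup always follows the exact <dt><span>…</span></dt> / <dd>…</dd> shape from the question; the variable names $pattern, $matches, and $result are chosen for this illustration only.

```php
<?php
$content = "<dt><span>Name:</span></dt>
            <dd>
              John
            </dd>
            <dt><span>Age:</span></dt>
            <dd>
              40
            </dd>";

// \s* absorbs spaces and line breaks between and inside the tags,
// (.*?) is lazy so it stops at the first closing tag,
// and the s modifier lets . match newlines if a value spans lines.
$pattern = '/<dt><span>(.*?)<\/span><\/dt>\s*<dd>\s*(.*?)\s*<\/dd>/s';

// PREG_SET_ORDER groups the captures per match instead of per group.
preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

$result = [];
foreach ($matches as $m) {
    $result[] = [$m[1], $m[2]]; // [label, value]
}
print_r($result);
```

With PREG_SET_ORDER each entry of $matches holds one label and its value together, which is usually easier to loop over than the default layout.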
