I'm adding schema.org markup (the description property) to our product pages, all of which are dynamically generated, so I'm looking for a good general-purpose regular expression to format that description properly.

So here's what I'm currently working with (spaced a little oddly for ease of reading):

<meta itemprop="description" content="
    <?php 
        $original_desc = $_product->getShortDescription();
        $schema_desc = preg_replace('Rocking REGEX theoretically goes here','$1 $2', $original_desc);
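        // strip_tags() returns the cleaned string; it does not modify its argument in place
        $schema_desc = strip_tags($schema_desc);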
        echo $schema_desc; 
    ?>
">

Problem is, our product descriptions are being pulled from the admin of our CMS, so the formatting is a little squirrelly.

Here's what they look like:

 content="<p><strong>Product Title</strong> - Other Product Name - <em>Blah Blah</em></p>
 <p><strong>Product Heading 1</strong> </p>
 <p><strong>Product Heading 2:</strong>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras vulputate pellentesque sem, id mattis sem blandit at. 
    Suspendisse tempus sodales enim nec aliquam. Vestibulum laoreet tincidunt dui, sit amet laoreet ipsum gravida at. Nulla in tempus justo, 
    et bibendum dolor.</p>
    <p><strong>Product Heading 3:</strong> Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras vulputate pellentesque 
    sem, id mattis sem blandit at. Suspendisse tempus sodales enim nec aliquam. Vestibulum laoreet tincidunt dui, sit amet laoreet ipsum gravida at. 
    Nulla in tempus justo, et bibendum dolor.</p>"

So here's what I want to do: I want to KEEP the text inside the first <strong></strong> pair because that's the product category/title, but the text inside every subsequent <strong></strong> pair is just a heading that has no use in a search description, so I'd like to remove it. I've found ways to strip ALL the text from between ALL the <strong></strong> tags, but not a way to strip all but the first.

Thanks!

3 Answers

1

I'd recommend DOMDocument here:

$str = <<<STR
<p><strong>Product Title</strong> - Other Product Name - <em>Blah Blah</em></p>
 <p><strong>Product Heading 1</strong> </p>
 <p><strong>Product Heading 2:</strong>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras vulputate pellentesque sem, id mattis sem blandit at. 
    Suspendisse tempus sodales enim nec aliquam. Vestibulum laoreet tincidunt dui, sit amet laoreet ipsum gravida at. Nulla in tempus justo, 
    et bibendum dolor.</p>
    <p><strong>Product Heading 3:</strong> Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras vulputate pellentesque 
    sem, id mattis sem blandit at. Suspendisse tempus sodales enim nec aliquam. Vestibulum laoreet tincidunt dui, sit amet laoreet ipsum gravida at. 
    Nulla in tempus justo, et bibendum dolor.</p>
STR;

$dom = new DOMDocument();
@$dom->loadHTML($str);
$elements = $dom->getElementsByTagName('strong');

echo $elements->item(0)->nodeValue;
echo '<br>';
echo $elements->item(1)->nodeValue;

OUTPUTS:

Product Title
Product Heading 1

EDIT:

If I understand correctly, $str is populated by $_product->getShortDescription():

$dom = new DOMDocument();
@$dom->loadHTML($_product->getShortDescription());
$elements = $dom->getElementsByTagName('strong');

echo $elements->item(0)->nodeValue;
echo '<br>';
echo $elements->item(1)->nodeValue;
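
To go from reading the <strong> nodes to producing the description the question asks for, the same DOMDocument approach can be extended to remove every <strong> element after the first and then take the remaining text. The following is only a sketch under assumptions: schemaDescription() is a hypothetical helper name, the input is assumed to be an HTML fragment like the sample above, and character-encoding handling for non-ASCII descriptions is not addressed.

```php
<?php
// Hypothetical helper (not part of any CMS API): keeps the text of the first
// <strong>, drops later <strong> headings, and returns attribute-safe plain text.
function schemaDescription(string $html): string
{
    if (trim($html) === '') {
        return '';
    }

    $dom = new DOMDocument();
    // Collect libxml warnings about the HTML fragment instead of using @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName() returns a live list, so copy it before removing nodes.
    $strongs = [];
    foreach ($dom->getElementsByTagName('strong') as $node) {
        $strongs[] = $node;
    }
    foreach (array_slice($strongs, 1) as $node) {
        $node->parentNode->removeChild($node);
    }

    // textContent ignores the remaining tags; collapse runs of whitespace.
    $text = preg_replace('/\s+/', ' ', $dom->documentElement->textContent);

    // Escape quotes so the value is safe inside content="...".
    return htmlspecialchars(trim($text), ENT_QUOTES, 'UTF-8');
}
```

In the template this would be called as echo schemaDescription($_product->getShortDescription()); inside the content attribute. Removing the nodes rather than matching them with a pattern means nested tags inside a heading do not need special handling.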

1 Comment

This looks interesting, and maybe this is a stupid question, but given that the value of the description is pulled dynamically (via $_product->getShortDescription()), how would I incorporate the markup you're recommending?
0

All you need is to use one of the patterns you have found and set the limit parameter of the preg_replace() function to 1. See the documentation.
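
As a small illustration of what the limit parameter does, the sketch below runs one pattern with limit set to 1 and then shows one possible way to get the opposite effect (keep the first match, remove the rest) with preg_replace_callback() and a counter. The sample string and the pattern are assumptions made for this example, not the patterns the asker had found.

```php
<?php
// Assumed sample input and pattern, for illustration only.
$html = '<p><strong>Title</strong> - Name</p><p><strong>Heading:</strong> Body text.</p>';
$pattern = '~<strong>.*?</strong>~s';

// With limit = 1, only the FIRST match is replaced; later matches are untouched.
$firstOnly = preg_replace($pattern, '', $html, 1);

// To keep the first match and remove the rest, count matches in a callback.
$seen = 0;
$keepFirst = preg_replace_callback($pattern, function (array $m) use (&$seen) {
    // Return the first match unchanged; replace every later one with ''.
    return $seen++ === 0 ? $m[0] : '';
}, $html);

// strip_tags() returns the tag-free string for use in the meta content.
$description = strip_tags($keepFirst);
```

Here $firstOnly loses the title but keeps the heading, while $keepFirst keeps the title and drops the heading, which is the behaviour the question describes.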

2 Comments

Doesn't setting the parameter limit to one mean that preg_replace stops after the first match?
@Kale: Yes. If you don't find a good pattern, you can use '~<strong>[^>]+>~'
0

You could simply use <strong>(.*)<\/strong> and then replace with <strong><meta itemprop="description" content="$1">$1</strong>

Here's a working example: http://regex101.com/r/dV9wJ5

(I'm not sure if it's syntactically correct to your particular schema, but you get the idea).
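
One thing worth checking with a pattern like this is greediness: (.*) matches as much as it can, so if two <strong> elements ever end up on the same line, the capture runs from the first opening tag to the last closing tag. A minimal sketch of the difference, using an assumed one-line sample string:

```php
<?php
// Assumed sample: two <strong> elements on a single line.
$line = '<strong>Title</strong> - text - <strong>Heading</strong>';

// Greedy: (.*) extends to the LAST </strong> on the line.
preg_match('~<strong>(.*)</strong>~', $line, $greedy);

// Lazy: (.*?) stops at the FIRST </strong>.
preg_match('~<strong>(.*?)</strong>~', $line, $lazy);

// $greedy[1] is 'Title</strong> - text - <strong>Heading'
// $lazy[1]   is 'Title'
```

In the sample description from the question each <strong> sits on its own line, so the greedy form happens to work there, but the lazy form is the safer default if the CMS output is not guaranteed to keep that layout.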
