
I have a string with HTML tags, $paragraph:

$paragraph = '
    <p class="instruction">
        <sup id="num1" class="s">80</sup>
        Hello there and welcome to Stackoverflow! You are welcome indeed.
    </p>
';

$replaceIndex = array(0, 4);
$word = 'dingo';

I'd like to replace the words at indices defined by $replaceIndex (0 and 4) of $paragraph. By this, I mean I want to replace the words "80" and "welcome" (only the first instance) with $word. The paragraph itself may be formatted with different HTML tags in different places.

Is there a way to locate and replace certain words of the string while virtually ignoring (but not stripping) HTML tags?

Thanks!

Edit: Words are separated by (multiple) tags and (multiple) whitespace characters, while not including anything within the tags.

  • And what puts 80 at index 0 and welcome at index 4? Commented Jan 26, 2016 at 8:08
  • Do these ("80" and "welcome") appear multiple times? If not, that should be easy with str_replace. Commented Jan 26, 2016 at 8:10
  • Can you tell us what criteria are used to select the words at indices 0 and 4? For example, must a word be preceded and followed by a space? Commented Jan 26, 2016 at 8:17
  • @urban: Yes, words may appear multiple times. Commented Jan 26, 2016 at 8:24
  • @Hanky웃Panky: Words are separated by (multiple) tags or (multiple) whitespace characters, and do not include anything within tags. (I think they're called whitespace characters: spaces, newlines, tabs, etc.) Commented Jan 26, 2016 at 8:29

3 Answers

2

Thanks for all the tips. I figured it out! Since I'm new to PHP, I'd appreciate it if any PHP veterans have any tips on simplifying the code. Thanks!

$paragraph = '
    <p class="instruction">
        <sup id="num1" class="s">80</sup>
        Hello there and welcome to Stackoverflow! You are welcome indeed.
    </p>
';

// Split $paragraph into an array of tags and words
$paragraphArray = preg_split('/(<.*?>)|\s/', $paragraph, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
$wordIndices = array(0, 4);
$replaceWith = 'REPLACED';

foreach ($wordIndices as $wordIndex) {
    // The second condition stops the scan if the index lies beyond the last element
    for ($i = 0; $i <= $wordIndex && $i < count($paragraphArray); $i++) {
        // If this element starts with '<', it is a tag.
        if ($paragraphArray[$i][0] == '<') {
            // Push $wordIndex forward to compensate for the tag element
            $wordIndex++;
        }
        // When we reach the word we want, replace it!
        elseif ($i == $wordIndex) {
            $paragraphArray[$i] = $replaceWith;
        }
    }
}

// Put the string back together
$newParagraph = implode(' ', $paragraphArray);

// Test output!
echo(htmlspecialchars($newParagraph));

The only caveat is that joining the pieces with implode(' ', ...) may produce unwanted spaces in $newParagraph (for example, between a tag and the word next to it), but I'll see whether that actually causes any issues when I implement the code.
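
One possible way around the extra spaces, as a minimal sketch rather than a definitive fix: capture the whitespace runs as delimiters too, so that joining the pieces with an empty string gives back the original spacing. The function name replaceWordsAtIndices and its parameters are hypothetical, not part of the code above, and the sketch assumes that every '<' in the input opens a tag (a literal '<' in the text would need to be written as &lt;).

```php
<?php
// Hypothetical helper: replaces the words at the given zero-based indices
// while leaving tags and whitespace exactly as they were.
function replaceWordsAtIndices(string $html, array $indices, string $replacement): string
{
    // Capture tags AND whitespace runs, so no separator is lost in the split.
    $parts = preg_split(
        '/(<[^>]*>|\s+)/',
        $html,
        -1,
        PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
    );

    $wanted = array_flip($indices); // word index => position, for isset() lookups
    $wordIndex = 0;

    foreach ($parts as $i => $part) {
        // Tags and whitespace runs are kept as-is and never counted as words.
        if ($part[0] === '<' || ctype_space($part)) {
            continue;
        }
        if (isset($wanted[$wordIndex])) {
            $parts[$i] = $replacement;
        }
        $wordIndex++;
    }

    // Every captured separator is still in $parts, so an empty glue suffices.
    return implode('', $parts);
}

$paragraph = '
    <p class="instruction">
        <sup id="num1" class="s">80</sup>
        Hello there and welcome to Stackoverflow! You are welcome indeed.
    </p>
';

echo htmlspecialchars(replaceWordsAtIndices($paragraph, array(0, 4), 'dingo'));
```

Because the words are counted in a single pass, the cost no longer grows with the number of indices. As in the code above, punctuation attached to a word ("Stackoverflow!") counts as part of that word.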


1
$text = preg_replace('/\b80\b|\bwelcome\b/', $word, $paragraph);

Hope this will help you :)

2 Comments

  • A commenter helped me clarify: words may appear multiple times, and I am only trying to replace a certain instance of a word. I am interested in replacing words at specified indices, not predefined words. But thanks for your answer!
  • For that, you need to define the $NUMBER and $WORD in your string that you want to replace. A regex can do it.

0

SimpleXML could come in handy as well:

$paragraph = '
    <p class="instruction">
        <sup id="num1" class="s">80</sup>
        Hello there and welcome to Stackoverflow! You are welcome indeed.
    </p>
';

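$word = 'dingo';

$xml = simplexml_load_string($paragraph);
// Overwrites the text of the first <sup> child (selected by element name, not by word index)
$xml->sup = $word;

// Serialize the modified document back to a string
echo $xml->asXML();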
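
The same parser-based idea can be stretched to word indices. What follows is only a sketch, under the assumption (shared with the SimpleXML snippet above) that the paragraph is well-formed XML: it uses DOMDocument and DOMXPath to walk the text nodes in document order and counts words across them. The function name replaceWordsInXml is hypothetical.

```php
<?php
// Hypothetical helper: replaces words by zero-based index, looking only at
// text nodes so that tags and attributes are never touched.
function replaceWordsInXml(string $xml, array $indices, string $replacement): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml); // assumes well-formed XML with a single root element

    $wanted = array_flip($indices); // word index => position, for isset() lookups
    $wordIndex = 0;

    $xpath = new DOMXPath($doc);
    // '//text()' yields every text node in document order.
    foreach ($xpath->query('//text()') as $textNode) {
        // Keep the whitespace runs so the node's spacing is preserved.
        $parts = preg_split(
            '/(\s+)/',
            $textNode->nodeValue,
            -1,
            PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
        );
        foreach ($parts as $i => $part) {
            if (ctype_space($part)) {
                continue; // whitespace is not a word
            }
            if (isset($wanted[$wordIndex])) {
                $parts[$i] = $replacement;
            }
            $wordIndex++;
        }
        $textNode->nodeValue = implode('', $parts);
    }

    // Serialize only the root element, without an XML declaration.
    return $doc->saveXML($doc->documentElement);
}

$paragraph = '
    <p class="instruction">
        <sup id="num1" class="s">80</sup>
        Hello there and welcome to Stackoverflow! You are welcome indeed.
    </p>
';

echo htmlspecialchars(replaceWordsInXml($paragraph, array(0, 4), 'dingo'));
```

Note that a word split by inline markup (say, wel<b>come</b>) is counted as two words here, and real-world HTML that is not well-formed XML would need a different loader.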
