
Hi guys, could you please help me solve this riddle?

I have a fairly large array of keywords that I match against content.

Example: $array = array('game', 'gamers', 'gaming'); etc.

I use the foreach loop below to go through each keyword and match it against the content. In some cases the content can be really big, and with more than 50 keywords and more than 20 posts per page this slows the site down dramatically.

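foreach ($array as $vall) {
    // One regex pass over the whole content per keyword
    $pattern = '/\b' . $vall . '\b/';

    // The last argument (1) limits the replacement to the first match of this keyword
    $data['content'] = preg_replace($pattern, " <a style=\"font-weight: bold;color:rgb(20, 23, 26);\" href=\"" . $url . "\">" . $vall . "</a>", $data['content'], 1);
}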

The good thing about it is that I can set the replacement limit to one, so only the first instance of each keyword is replaced.

However, the method above does not perform well.

My other approach is below, but it has an issue: I can't limit the replacement to one per keyword.

$pattern = '/\bgamers|gaming|gamer\b/';
$data['content'] = preg_replace($pattern, " <a style=\"font-weight: bold;color:rgb(20, 23, 26);\" href=\"".$url."\">$0</a>", $data['content']);

The method above works great but it replaces all instances of matched keywords.

Question: how can I put the keywords into one pattern, separated by the or operator (|), and then replace only the first match of each keyword?

Update.

$string = 'I am a gamer and I love playing video games. Video games are awesome. I have being a gamer for a long time. I love to hang-out with other gamer buddies of mine.';

$keyWordsToMatch = 'gamer,games';

For the output, it needs to replace only the first instance of each keyword in $keyWordsToMatch.

Like this:

$string = 'I am a (gamer)_replace and I love playing video (games)_replace. Video games are awesome. I have being a gamer for a long time. I love to hang-out with other gamer buddies of mine.';
  • Can you post a very simple sample of your input data, along with what you want for the output? I'm not completely clear on the "all but first" portion. Commented Nov 17, 2021 at 17:52
  • $pattern = '/\bgamers|gaming|gamer\b/'; is a typo. You need $pattern = '/\b(?:gamers|gaming|gamer)\b/'; Commented Nov 17, 2021 at 17:55
  • Thanks guys, I have updated my question. Sorry, I did not express myself correctly. Commented Nov 17, 2021 at 18:07
  • How do I use this $pattern = '/\b(?:gamers|gaming|gamer)\b/'; Commented Nov 17, 2021 at 18:08
  • You are already using it, $data['content'] = preg_replace($pattern, " <a style=\"font-weight: bold;color:rgb(20, 23, 26);\" href=\"".$url."\">$0</a>", $data['content']);. But I guess the problem is with the number of words, the pattern gets too long for PCRE. Commented Nov 17, 2021 at 18:21
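
To illustrate the grouping point from the comments, here is a small sketch comparing the two patterns. The sample text is made up for the demonstration and is not from the question. Without the group, \b applies only to the first and last alternatives, so keywords can match inside longer words.

```php
<?php
// Made-up sample text (an assumption for illustration only).
$text = 'a progamer went pregaming with gamers';

// Ungrouped: parsed as  \bgamers  |  gaming  |  gamer\b
$ungrouped = '/\bgamers|gaming|gamer\b/';
// Grouped: both \b anchors apply to every alternative.
$grouped = '/\b(?:gamers|gaming|gamer)\b/';

preg_match_all($ungrouped, $text, $loose);
preg_match_all($grouped, $text, $strict);

print_r($loose[0]);  // gamer (inside "progamer"), gaming (inside "pregaming"), gamers
print_r($strict[0]); // gamers only
```

With the grouped pattern only the standalone word is matched, which is presumably what the keyword linking is meant to do.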

1 Answer

I think you can solve this by using preg_replace_callback and keeping track of the keywords you have already found. I've also added the grouping that @Wiktor Stribiżew suggested, and I personally like to use named captures in regex.

See the code comments for additional details:

$string = 'gamer thing gamer games test games';
$pattern = '/\b(?<keyword>gamer|games)\b/';

// We'll append to this array after we use a given keyword
$usedKeywords = [];
$finalString = preg_replace_callback(
    $pattern,

    // Remember to capture the array by-ref
    static function (array $matches) use (&$usedKeywords) {
        $thisKeyword = $matches['keyword'];
        if (in_array($thisKeyword, $usedKeywords, true)) {
            return $thisKeyword;
        }

        $usedKeywords[] = $thisKeyword;
        
        // Do your replacement here
        return '~'.$thisKeyword.'~';

    },
    $string
);

print_r($finalString);

The output is:

~gamer~ thing gamer ~games~ test games

Demo here: https://3v4l.org/j40Oq
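
If the keywords come from an array, as in the question, the pattern could be built from it rather than typed by hand. The following is only a sketch under assumptions: the $keywords, $url and $content values are placeholders invented here, and the link markup is simplified compared to the question's styled anchor. preg_quote escapes any regex metacharacters in a keyword.

```php
<?php
// Placeholder inputs (assumptions for illustration, not from the question).
$keywords = ['gamer', 'games'];
$url = 'https://example.com/';
$content = 'I am a gamer and I love games. Games are fun for any gamer.';

// Escape each keyword so it is matched literally, then join with "|".
$quoted = array_map(static fn (string $k): string => preg_quote($k, '/'), $keywords);
$pattern = '/\b(?<keyword>' . implode('|', $quoted) . ')\b/';

$used = [];
$linked = preg_replace_callback(
    $pattern,
    static function (array $m) use (&$used, $url): string {
        $kw = $m['keyword'];
        if (in_array($kw, $used, true)) {
            return $kw; // later occurrence: leave the text unchanged
        }
        $used[] = $kw;
        return '<a href="' . htmlspecialchars($url) . '">' . $kw . '</a>';
    },
    $content
);

echo $linked, PHP_EOL;
```

The match is case-sensitive, so the capitalised "Games" is left alone here; adding the i modifier would change that, but then "Games" and "games" would be tracked as different keywords unless the callback normalises case.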


4 Comments

Thank you so much. Man people around here are smart.
Works like a charm. Exactly what I was looking for, and it's fast too.
The idea is good, but the use of in_array() leads to the same problem when the number of keywords grows. You can avoid the problem by building an associative array where the keys are keywords and the values are whatever you want; then you can test whether a keyword "is in the array" using array_key_exists(), which is far faster. Or better, use a Ds\Set to store the keywords.
Although I don't disagree that for large sets in_array is slower than the alternatives, the OP mentioned 50 as their number for "big", so I would be surprised if any difference were measurable.
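
For reference, the associative-array variant suggested in the comments might look roughly like this. It is a sketch that reuses the answer's sample string and pattern; the $seen name and the true placeholder values are choices made here, not part of the original answer.

```php
<?php
$string = 'gamer thing gamer games test games';
$pattern = '/\b(?<keyword>gamer|games)\b/';

// Keys are the keywords already replaced; the values are unused placeholders.
$seen = [];
$result = preg_replace_callback(
    $pattern,
    static function (array $m) use (&$seen): string {
        $kw = $m['keyword'];
        // Key lookup instead of a linear in_array() scan
        if (array_key_exists($kw, $seen)) {
            return $kw;
        }
        $seen[$kw] = true;
        return '~' . $kw . '~';
    },
    $string
);

echo $result, PHP_EOL; // ~gamer~ thing gamer ~games~ test games
```

As noted above, with around 50 keywords the difference from in_array() is unlikely to be noticeable, so this is mostly a matter of preference at that size.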
