6

For example, it returns only the snippet of text around the search keyword.

The rest of the text is replaced by "...".

Is it possible to achieve that goal with PHP and MySQL?

1
  • Those are called ellipses: en.wikipedia.org/wiki/Ellipsis Commented Apr 27, 2014 at 23:12

5 Answers

9

I modified deceze's function slightly to allow multiple phrases. For example, your phrase can be "testa testb"; if it does not find testa, it falls back to testb.

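function excerpt($text, $phrase, $radius = 100, $ending = "...") {
    // Try each space-separated word in turn and stop at the first hit.
    // Empty words are skipped so the search never gets an empty needle.
    $pos = false;
    $phraseLen = 0;
    foreach (explode(' ', $phrase) as $word) {
        if ($word === '') {
            continue;
        }
        $pos = stripos($text, $word);
        if ($pos !== false) {
            $phraseLen = strlen($word); // length of the word that matched
            break;
        }
    }

    $textLen = strlen($text);

    // Nothing found: return the start of the text.
    if ($pos === false) {
        return ($textLen > $radius) ? substr($text, 0, $radius) . $ending : $text;
    }

    if ($radius < $phraseLen) {
        $radius = $phraseLen;
    }

    $startPos = 0;
    if ($pos > $radius) {
        $startPos = $pos - $radius;
    }

    $endPos = $pos + $phraseLen + $radius;
    if ($endPos >= $textLen) {
        $endPos = $textLen;
    }

    // The ellipsis overwrites the first/last $phraseLen characters of the excerpt.
    $excerpt = substr($text, $startPos, $endPos - $startPos);
    if ($startPos != 0) {
        $excerpt = substr_replace($excerpt, $ending, 0, $phraseLen);
    }

    if ($endPos != $textLen) {
        $excerpt = substr_replace($excerpt, $ending, -$phraseLen);
    }

    return $excerpt;
}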

Highlight function

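function highlight($c, $q) {
    // Split the query on whitespace and drop empty words.
    $words = preg_split('/\s+/', $q, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        // Escape regex metacharacters instead of stripping them;
        // the lookahead skips matches that sit inside an HTML tag.
        $pattern = '/(' . preg_quote($word, '/') . ')(?![^<]*>)/i';
        $c = preg_replace($pattern, '<span class="highlight">$1</span>', $c);
    }
    return $c;
}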

4 Comments

Can you highlight the phrases?
I'm getting Warning: strpos(): Empty needle when searching an empty string. I tried with checking if everything is empty, null, or even set, but I always get this warning...
I tested it on 3.9, and it works, however they changed it in 4.0... Any info on how to handle this?
This returns excerpts containing only the first keyword. That may be what the OP wants, but others may need an excerpt that contains as many keywords as possible (multiple "excerpts" separated by an ellipsis "...").
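One possible way to get what the last comment describes is to build a window around every keyword hit, merge windows that overlap, and join the rest with an ellipsis. The sketch below is not part of the answer above: `multi_excerpt`, `$radius`, and `$maxSnippets` are hypothetical names, and it works on bytes, so it assumes single-byte text (or that a cut inside a multibyte character is acceptable).

```php
<?php
// Hypothetical helper: one window per keyword hit, overlapping windows merged,
// at most $maxSnippets windows kept, joined with the ellipsis.
function multi_excerpt($text, $query, $radius = 30, $maxSnippets = 3, $ending = "...")
{
    $words = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
    $textLen = strlen($text);
    $windows = array();

    foreach ($words as $word) {
        $offset = 0;
        // Record a [start, end) window around every occurrence of this word.
        while (($pos = stripos($text, $word, $offset)) !== false) {
            $windows[] = array(
                max(0, $pos - $radius),
                min($textLen, $pos + strlen($word) + $radius),
            );
            $offset = $pos + strlen($word);
        }
    }
    if (!$windows) {
        return '';
    }

    // Sort by start position, then merge windows that touch or overlap.
    usort($windows, function ($a, $b) {
        return $a[0] - $b[0];
    });
    $merged = array(array_shift($windows));
    foreach ($windows as $w) {
        $last = count($merged) - 1;
        if ($w[0] <= $merged[$last][1]) {
            $merged[$last][1] = max($merged[$last][1], $w[1]);
        } else {
            $merged[] = $w;
        }
    }
    $merged = array_slice($merged, 0, $maxSnippets);

    $parts = array();
    foreach ($merged as $w) {
        $parts[] = substr($text, $w[0], $w[1] - $w[0]);
    }
    $out = implode(" $ending ", $parts);

    // Mark cuts at the very beginning and end of the result.
    if ($merged[0][0] > 0) {
        $out = $ending . $out;
    }
    $lastWindow = end($merged);
    if ($lastWindow[1] < $textLen) {
        $out .= $ending;
    }
    return $out;
}
```

Capping the result with `$maxSnippets` also keeps the output short when a long text contains many hits.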
6

My solution for multiple keywords and multiple occurrences (also works case-insensitively with accented characters):

function excerpt($text, $query)
{
    //words (empty ones dropped, each escaped for a '#'-delimited pattern)
    $parts = preg_split('/\s+/', $query, -1, PREG_SPLIT_NO_EMPTY);
    if (!$parts) {
        return '';
    }
    $quoted = array();
    foreach ($parts as $part) {
        $quoted[] = preg_quote($part, '#');
    }
    $words = join('|', $quoted);

    //lookahead/lookbehind assertions ensure the cut falls between words
    $s = '\s\x00-/:-@\[-`{-~'; //character set for start/end of words
    preg_match_all('#(?<=['.$s.']).{1,30}(('.$words.').{1,30})+(?=['.$s.'])#uis', $text, $matches, PREG_SET_ORDER);

    //delimiter between occurrences
    $results = array();
    foreach ($matches as $line) {
        $results[] = htmlspecialchars($line[0], ENT_NOQUOTES, 'UTF-8');
    }
    $result = join(' <b>(...)</b> ', $results);

    //highlight
    $result = preg_replace('#'.$words.'#iu', "<span class=\"highlight\">\$0</span>", $result);

    return $result;
}

This is an example result for the query "švihov prohlídkám":

[Image: Result]

2 Comments

FYI: this does not appear to match if the line in $text that contains $query has no newline before and after it. $text = "\n{$text}\n" is a simple workaround, but the regex may need adjusting.
This returns very long excerpts if you have a long text and many hits.
4
function excerpt($text, $phrase, $radius = 100, $ending = "...") {
    $phraseLen = strlen($phrase);
    if ($radius < $phraseLen) {
        $radius = $phraseLen;
    }

    // stripos() returns false when the phrase is absent; treat that as position 0.
    $pos = stripos($text, $phrase);
    if ($pos === false) {
        $pos = 0;
    }

    $startPos = 0;
    if ($pos > $radius) {
        $startPos = $pos - $radius;
    }

    $textLen = strlen($text);

    $endPos = $pos + $phraseLen + $radius;
    if ($endPos >= $textLen) {
        $endPos = $textLen;
    }

    $excerpt = substr($text, $startPos, $endPos - $startPos);
    if ($startPos != 0) {
        $excerpt = substr_replace($excerpt, $ending, 0, $phraseLen);
    }

    if ($endPos != $textLen) {
        $excerpt = substr_replace($excerpt, $ending, -$phraseLen);
    }

    return $excerpt;
}

Shamelessly stolen from the Cake TextHelper.
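The function above counts bytes, so with UTF-8 text a cut can land in the middle of a multibyte character. A minimal sketch of a multibyte-aware variant follows, assuming the mbstring extension is available. `excerpt_mb` is a hypothetical name, and unlike the function above it adds the ellipsis next to the cut rather than overwriting characters.

```php
<?php
// Hypothetical multibyte-aware variant; assumes the mbstring extension is loaded
// and that $text and $phrase are UTF-8.
function excerpt_mb($text, $phrase, $radius = 100, $ending = "...")
{
    $enc = 'UTF-8';
    $textLen = mb_strlen($text, $enc);
    $phraseLen = mb_strlen($phrase, $enc);

    // Skip the search for an empty phrase; mb_stripos() returns false when
    // the phrase is absent.
    $pos = ($phraseLen > 0) ? mb_stripos($text, $phrase, 0, $enc) : false;
    if ($pos === false) {
        // Nothing found: fall back to the start of the text.
        $pos = 0;
        $phraseLen = 0;
    }

    // Window of $radius characters (not bytes) on each side of the hit.
    $startPos = max(0, $pos - $radius);
    $endPos = min($textLen, $pos + $phraseLen + $radius);

    $excerpt = mb_substr($text, $startPos, $endPos - $startPos, $enc);

    // Add the ellipsis next to each cut instead of overwriting characters.
    if ($startPos > 0) {
        $excerpt = $ending . $excerpt;
    }
    if ($endPos < $textLen) {
        $excerpt .= $ending;
    }
    return $excerpt;
}
```

For example, `excerpt_mb($row['body'], 'švihov', 60)` would keep up to 60 characters on each side of the first case-insensitive hit; `$row['body']` here stands for whatever column your query returns.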


0
$snippet = "//mysql query";

function trimmer($updates, $wrds) {
    if (strlen($updates) <= $wrds) {
        return $updates;
    }
    // Keep the first $wrds characters, then cut back to the last space.
    $head = substr($updates, 0, $wrds);
    $marker = strrpos($head, ' ');
    if ($marker === false) {
        $marker = $wrds; // no space found: keep the whole chunk
    }
    return substr($head, 0, $marker) . "...";
}

// Pass the snippet string to this function: if it is longer than 200 characters,
// the function cuts it at the last space and appends "...".
echo trimmer($snippet, 200);

This is probably what you want (EDIT):

$string1="Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.";
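function trimmer($updates, $wrds, $pos) {
    if (strlen($updates) <= $wrds) {
        return $updates;
    }
    // Take $wrds characters starting at the keyword, then cut back to the last space.
    $chunk = substr($updates, $pos, $wrds);
    $marker = strrpos($chunk, ' ');
    if ($marker === false) {
        $marker = strlen($chunk); // no space found: keep the whole chunk
    }
    return substr($chunk, 0, $marker) . "...";
}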

$pos = strpos($string1, "dummy");
echo trimmer($string1,100,$pos);
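The edited version starts the snippet at the keyword, so no text before it is shown. A rough sketch that keeps context on both sides and avoids cutting through a word might look like this. `trim_around` and `$width` are hypothetical names, and the function counts bytes, so it assumes single-byte text.

```php
<?php
// Hypothetical helper: keeps up to $width bytes on each side of the first hit
// and drops a partial word at either cut.
function trim_around($text, $keyword, $width = 40, $ending = "...")
{
    $pos = ($keyword === '') ? false : stripos($text, $keyword);
    if ($pos === false) {
        return '';
    }

    $textLen = strlen($text);
    $keywordEnd = $pos + strlen($keyword);
    $start = max(0, $pos - $width);
    $end = min($textLen, $keywordEnd + $width);
    $snippet = substr($text, $start, $end - $start);

    $cutFront = $start > 0;
    $cutBack = $end < $textLen;

    // Back cut inside a word: trim to the last space after the keyword.
    if ($cutBack && $text[$end] !== ' ') {
        $lastSpace = strrpos($snippet, ' ');
        if ($lastSpace !== false && $start + $lastSpace >= $keywordEnd) {
            $snippet = substr($snippet, 0, $lastSpace);
        }
    }

    // Front cut inside a word: drop everything up to the first space before the keyword.
    if ($cutFront && $text[$start - 1] !== ' ') {
        $firstSpace = strpos($snippet, ' ');
        if ($firstSpace !== false && $start + $firstSpace < $pos) {
            $snippet = substr($snippet, $firstSpace + 1);
        }
    }

    return ($cutFront ? $ending : '') . $snippet . ($cutBack ? $ending : '');
}
```

Called as `trim_around($string1, "dummy", 10)`, it would keep a few whole words on each side of the first "dummy".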

2 Comments

Think about what happens if the keyword first appears 201 characters from the beginning. You return the first 200 characters, but with no keyword included.
I added an EDIT ... maybe this is what you want
0

This is my modified function:

function excerpt_2($text, $query, $limit_chars_between, $results_divider, $b_highlight_class)
{
    if ((trim($text) === '') || (trim($query) === '')) {
        return false;
    }

    // IMPORTANT: the text must start and end with "\n" so that matches at the very beginning or end are found
    $text = "\n" . chop(trim($text)) . "\n";

    //words (escaped for use inside a '#'-delimited pattern)
    $words = join('|', explode(' ', preg_quote(trim($query), '#')));

    //lookahead/behind assertions ensures cut between words
    $s_preg_charset = '\s\x00-/:-@\[-`{-~'; //character set for start/end of words
    preg_match_all('#(?<=[' . $s_preg_charset . ']).{0,' . $limit_chars_between . '}((' . $words . ').{0,' . $limit_chars_between . '})+(?=[' . $s_preg_charset . '])#ius', $text, $matches, PREG_OFFSET_CAPTURE);

    $result = $is_text_truncated = false;
    $first_founded_pos = $last_founded_pos = -1;
    $a_results = array();
    
    if (count($matches[0]) > 0) {
        foreach ($matches[0] as $a_line) {
            if ($first_founded_pos < 0) {
                $first_founded_pos = $a_line[1];
            }
            $a_results[] = chop(trim($a_line[0])); //htmlspecialchars($a_line[0], 0, 'UTF-8');
        }

        if (count($a_results)) {
            $result = join(' ' . $results_divider . ' ', $a_results);

            if (($first_founded_pos > $limit_chars_between)) {
                $result = $results_divider . '' . $result;
            }

            // Offset of the last full match; offsets are in bytes, so compare with strlen().
            $last_el_matches = end($matches[0]);
            $last_founded_pos = $last_el_matches[1];

            if (($last_founded_pos + $limit_chars_between) < strlen($text)) {
                $result .= $results_divider;
            }

            $is_text_truncated = true;
            $result = preg_replace('#' . $words . '#iu', '<b class="' . $b_highlight_class . '">$0</b>', $result);
        }

    }

    return array(
        'text' => $result,
        'is_text_truncated' => $is_text_truncated,
    );
}
