1

I have an array of strings and I need to check whether each string in the array exists in the database. I am using the following query:

foreach ($array as $s) {
    if (preg_match('/^(\bpho)\w+\b$/', $s)) {
        $query = "select * from dictionary where words REGEXP '^" . $s . "$'";
        $result = $db->query($query) or die($db->error);
        if ($result) {
            $count += $result->num_rows;
        }
    }
}

But this query takes a long time to execute. Please suggest a way to reduce the search time.

5
  • If you can narrow your search to a specific table or column, then you can use an index on that column. Commented Mar 12, 2014 at 7:11
  • Why are you using REGEXP when it looks like you're doing exact matches? Does $array contain regular expressions? Commented Mar 12, 2014 at 7:13
  • Yes, as you can see, I am searching for all the elements that start with 'pho'. Commented Mar 12, 2014 at 7:20
  • Does $array contain things like pho\d*, i.e. are there regular expression operators in $array? Or are they just words like phonetic? Commented Mar 12, 2014 at 7:22
  • I understand why you're using preg_match, my question is why you're using WHERE words REGEXP. Commented Mar 12, 2014 at 7:22

4 Answers

1

I don't think the problem here is in your code. I think you should optimize your database. I'm not very good at it, but I think you could add indexes to your database to speed up the search.


4 Comments

Indexes only have limited ability to optimize REGEXP matching.
And isn't that exactly what he needs?
Maybe you should be more specific. What types of optimizations do you recommend to make the regular expression matching faster?
You call that "more specific"?
0

Combine all the search strings into a single regular expression using alternation.

$searches = array();
foreach ($array as $s) {
    if (preg_match('/^(\bpho)\w+\b$/', $s)) {
        $searches[] = "($s)";
    }
}
$regexp = '^(' . implode('|', $searches) . ')$';
$query="select 1 from dictionary where words REGEXP '$regexp'";
$result=$db->query($query) or die($db->error);
$count = $result->num_rows;

If $array doesn't contain regular expressions, you don't need to use the SQL REGEXP operator. You can use IN:

$searches = array();
foreach ($array as $s) {
    if (preg_match('/^(\bpho)\w+\b$/', $s)) {
        $searches[] = "'$s'";
    }
}
$in_list = implode(',', $searches);
$query="select 1 from dictionary where words IN ($in_list)";
$result=$db->query($query) or die($db->error);
$count = $result->num_rows;
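
A variation on the IN approach, as a rough sketch only: it binds the words through a prepared statement instead of quoting them by hand, and it skips the query when nothing passes the filter, since `IN ()` is a syntax error in MySQL. The function names `filter_pho_words`, `in_placeholders` and `count_existing` are made up for illustration; the table and column names are the ones from the question. It assumes PHP 7 or later and an open mysqli connection, and leaves out error handling.

```php
<?php
// Hypothetical helper: keep only the strings that pass the question's own pattern.
function filter_pho_words(array $words): array
{
    return array_values(array_filter($words, function ($w) {
        return is_string($w) && preg_match('/^(\bpho)\w+\b$/', $w) === 1;
    }));
}

// Hypothetical helper: build "?,?,?" for $n bound values.
function in_placeholders(int $n): string
{
    return implode(',', array_fill(0, $n, '?'));
}

// Count the rows of dictionary whose words column equals any of the given words.
function count_existing(mysqli $db, array $words): int
{
    $words = filter_pho_words($words);
    if ($words === []) {
        return 0; // nothing to look up, and "IN ()" would not parse
    }
    $sql = 'SELECT COUNT(*) FROM dictionary WHERE words IN ('
         . in_placeholders(count($words)) . ')';
    $stmt = $db->prepare($sql);
    // One "s" (string) type character per bound word.
    $stmt->bind_param(str_repeat('s', count($words)), ...$words);
    $stmt->execute();
    $stmt->bind_result($matches);
    $stmt->fetch();
    $stmt->close();
    return (int) $matches;
}
```

Usage would be something like `$count = count_existing($db, $array);`. For a very large `$array` it may be worth splitting the list into chunks, which this sketch does not do.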


0

Searching the whole database is a large job. I think a better way is to cache part of the database and then search in the cache. Redis is very good for this.
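
To illustrate the caching idea without pulling in Redis, here is a minimal sketch that uses a plain PHP array as the cache (so it only lives for one request). It loads every word starting with 'pho' in a single query and then does the lookups in memory. `load_pho_cache` and `count_from_cache` are hypothetical names, the table and column names are taken from the question, and PHP 7 or later is assumed. Note that array keys are case-sensitive, which may differ from how your MySQL collation compares strings.

```php
<?php
// Load every dictionary word starting with "pho" once.
// Returns a map of word => number of rows containing that word.
function load_pho_cache(mysqli $db): array
{
    $result = $db->query("SELECT words FROM dictionary WHERE words LIKE 'pho%'");
    if ($result === false) {
        return []; // query failed; real code should report $db->error
    }
    $words = [];
    while ($row = $result->fetch_row()) {
        $words[] = $row[0];
    }
    return array_count_values($words);
}

// Sum the cached row counts for each searched word; unknown words count as 0.
function count_from_cache(array $cache, array $words): int
{
    $count = 0;
    foreach ($words as $w) {
        $count += $cache[$w] ?? 0;
    }
    return $count;
}
```

Usage would be roughly `$count = count_from_cache(load_pho_cache($db), $array);`. The same lookup could be done against Redis if the cache has to be shared between requests.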


0
  1. Modify your query so that it doesn't select all table columns, which is a waste of resources. Instead, let the database count the rows that match the search and return a single value (matches):

$query = "SELECT COUNT(id) AS matches FROM dictionary WHERE words REGEXP '^".$s."$'";

  2. How is your database indexed? If your words column is not indexed properly, then your regexp will take a long time. Examine your table structure and consider adding an index on the words column.

P.S. And don't forget to fetch the matches column instead of using num_rows
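
A sketch of what reading the matches value could look like, under a few assumptions: the searched strings are plain words rather than patterns, so the lookup uses `=` (which an index on words can serve) instead of REGEXP; the words column has an index; PHP 7 or later. `is_plain_word` and `count_word` are hypothetical names, and the index name in the comment is made up. `COUNT(*)` is used here so the sketch does not depend on an id column existing.

```php
<?php
// Hypothetical check: true when $s is a plain word with no regex operators,
// so comparing with "=" should find the same rows as REGEXP '^word$'.
function is_plain_word(string $s): bool
{
    return preg_match('/^\w+$/D', $s) === 1;
}

// Count rows whose words column equals $word and read the single "matches" value.
// Assumes an index on words, created with something like:
//   ALTER TABLE dictionary ADD INDEX idx_words (words);   -- index name is made up
function count_word(mysqli $db, string $word): int
{
    if (!is_plain_word($word)) {
        return 0; // not a literal word; this sketch does not handle patterns
    }
    $stmt = $db->prepare('SELECT COUNT(*) AS matches FROM dictionary WHERE words = ?');
    $stmt->bind_param('s', $word);
    $stmt->execute();
    $stmt->bind_result($matches); // binds the one selected column, "matches"
    $stmt->fetch();
    $stmt->close();
    return (int) $matches;
}
```

Inside the original loop this would replace the query and the num_rows line with `$count += count_word($db, $s);`.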

