I am using WordPress, and I need to create a function that checks a custom table for typos by comparing each value against a reference table. The values to be checked are animal species names, and they are stored as follows in table A:

id | qualifying_species
----------------------
1  | Dugong dugon, Delphinus delphis
2  | Balaneoptera physalus, Tursiops truncatus, Stenella coeruleoalba

etc.

These values must be checked for typos by matching them against table B, which contains a simple list of species names as a reference:

id | species_name
----------------------
1  | Dugong dugon 
2  | Delphinus delphis
3  | Balaneoptera physalus
4  | Tursiops truncatus
5  | Stenella coeruleoalba

Here's the code I prepared:

function test() {
    global $wpdb;
    $query_species = $wpdb->get_results("SELECT qualifying_species FROM A", ARRAY_A);

    foreach ($query_species as $row_species) {
        $string = implode(";", $row_species);
        $qualifying_species = explode(";", $string);

        //echo '<pre>';
        //print_r($qualifying_species);
        //echo '</pre>';

        foreach ($qualifying_species as $key => $value) {
            // I get here the single species name
            echo $value . '<br>';
            // I compare the single name with the species list table
            $wpdb->get_results("SELECT COUNT(species_name) as num_rows FROM B WHERE species_name = '$value'");
            // if the species is written correctly, it will be counted as 1 with num_rows
            // if the species is written wrongly, no match will be found and num_rows = 0
            echo $wpdb->num_rows . '<br>';
        }
    }
}

The echo statements are there to check the results of the function. The MySQL query works when I run it in phpMyAdmin, but something seems to be wrong with the PHP loop I wrote. For each $value echoed, $wpdb->num_rows echoes 1, even if $value contains a typo or doesn't exist in table B.

What am I doing wrong?

  • $string = implode (";", $row_species); $qualifying_species = explode(";", $string); – what’s the point in that? And why am I not seeing any exploding of the qualifying_species value by ', ' anywhere? If you wanted to compare the values from that “list” to individual values, then I’d assume it starts with that …? Commented Mar 25, 2021 at 8:26
  • Do it on MySQL level, not in PHP - this will be more effective. Commented Mar 25, 2021 at 8:27
  • Recommendation - do not store the value as CSV, normalize your data. Commented Mar 25, 2021 at 8:28
  • To @CBroe, that was my mistake in writing the post: the values are actually stored with a semicolon (;) as the separator. Commented Mar 25, 2021 at 8:30
  • @Akina, unfortunately the data will come in via CSV, as that is how it is output, and will be uploaded into a MySQL table. I don't understand your comment about doing it at the MySQL level. Sorry, I am not an expert with MySQL. Commented Mar 25, 2021 at 8:32
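
A likely cause of the constant 1, offered as a sketch rather than a tested fix: a `SELECT COUNT(...)` query without `GROUP BY` always returns exactly one row, and `$wpdb->num_rows` holds the number of rows returned by the last query, not the value of the column aliased `num_rows`. Reading the count itself with `$wpdb->get_var()` avoids this. The helper name `count_species_matches` is hypothetical; table `B` and column `species_name` are taken from the question.

```php
<?php
/**
 * Return how many rows in reference table B match one species name exactly.
 * $wpdb is WordPress's database object, passed in instead of read from a global.
 * The function name is hypothetical, chosen for this sketch.
 */
function count_species_matches($wpdb, string $name): int
{
    // prepare() escapes the value; %s is the string placeholder.
    $sql = $wpdb->prepare(
        "SELECT COUNT(*) FROM B WHERE species_name = %s",
        trim($name)
    );

    // get_var() returns the first column of the first row, i.e. the count itself.
    return (int) $wpdb->get_var($sql);
}
```

Inside the inner loop this would be used as `echo count_species_matches($wpdb, $value) . '<br>';`, which should print 0 for a name that has no exact match in B.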

2 Answers

A possible solution for MySQL 5.7.

Create a procedure (must be performed only once):

CREATE PROCEDURE check_table_data ()
BEGIN
    DECLARE sp_name VARCHAR(255);
    DECLARE done INT DEFAULT FALSE;
    -- Cursor over every reference species name
    DECLARE cur CURSOR FOR SELECT species_name FROM tableB;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done := TRUE;
    OPEN cur;
    -- Work on a temporary copy so the real data is never modified
    CREATE TEMPORARY TABLE t_tableA SELECT * FROM tableA;
    FETCH cur INTO sp_name;
    REPEAT
        -- Erase each known species name from the copied lists
        UPDATE t_tableA SET qualifying_species = REPLACE(qualifying_species, sp_name, '');
        FETCH cur INTO sp_name;
    UNTIL done END REPEAT;
    CLOSE cur;
    -- Whatever is left besides commas was not found in tableB
    SELECT id, qualifying_species wrong_species FROM t_tableA WHERE REPLACE(qualifying_species, ',', '') != '';
    DROP TABLE t_tableA;
END

Now, when you need to check your data for unknown or misspelled species, you simply execute one tiny query:

CALL check_table_data;

which returns the id of each row that has a species value not found in tableB, together with that species value.

fiddle

The code assumes that there is no species_name value which is a substring of another species_name value.


The procedure does the following: it makes a copy of the data, then removes every existing species name from the values. If a species is wrong (absent from tableB, or misspelled), it won't be removed. After all species have been processed, the procedure selects all rows that are not empty, i.e. that still contain non-removed species.
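
The same remove-and-check idea can be sketched in PHP, mainly to show why the substring assumption matters. This is an illustration under assumptions, not the procedure itself: the function name `leftover_after_removal` and the sample reference list are hypothetical. Removing longer names first is one way to keep a three-word name from being partly consumed by a shorter name it contains.

```php
<?php
/**
 * Remove every known species name from a stored list and return what is left.
 * An empty string means every entry matched the reference list.
 * The function name is hypothetical, chosen for this sketch.
 */
function leftover_after_removal(string $list, array $reference): string
{
    // Remove longer names first so "Delphinus delphis delphis" is not
    // partially consumed by the shorter "Delphinus delphis".
    usort($reference, fn($a, $b) => strlen($b) <=> strlen($a));

    // str_replace() applies the search strings one after another, in array order.
    $remaining = str_replace($reference, '', $list);

    // Drop separators and collapse the whitespace left behind by the removals.
    $remaining = str_replace([',', ';'], ' ', $remaining);
    return trim(preg_replace('/\s+/', ' ', $remaining));
}

// Hypothetical sample data for illustration only.
$reference = ['Dugong dugon', 'Delphinus delphis', 'Delphinus delphis delphis'];

var_dump(leftover_after_removal('Delphinus delphis delphis, Dugong dugon', $reference)); // string(0) ""
var_dump(leftover_after_removal('Dugong dugong, Delphinus delphis', $reference));        // string(1) "g"
```

Note that the second call reports only the fragment "g", not the full misspelled name, because the valid prefix "Dugong dugon" is removed from inside the typo. Substring removal flags the row but does not always show the whole offending entry.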

8 Comments

Thanks, I am going to try your option. Since I am not experienced with MySQL, where do I create the procedure, and where do I call it? I normally write my custom functions in the functions.php file of my WordPress site, so I am not sure where to write it.
"Where do I create the procedure?" It is created on the MySQL server, within the database. The best way is to create it using the MySQL console client. See dev.mysql.com/doc/refman/8.0/en/create-procedure.html (pay attention to the DELIMITER statements). But you may use Workbench, phpMyAdmin and so on; it doesn't matter. "Where do I call it?" In your PHP code: $wpdb->get_results("CALL check_table_data;");
Just one more question @Akina, as I couldn't work out how to solve it. If my qualifying species field contains values with 3 words, such as "Balaenoptera musculus intermedia", the procedure reports the last word (in this case "intermedia") as wrong, although it is not. I went through your suggested procedure, but as I am not an expert I couldn't find a way to solve this.
@ElenaPoliti I cannot reproduce. The code does not depend on the words amount. dbfiddle.uk/… fiddle, add your data (and maybe use your tables structures) and provide the link which demonstrates the issue.
Here's what I mean: check table A first row (Delphinus delphis delphis) dbfiddle.uk/…

You could perhaps do this in a single query:

SELECT *
FROM table_a
INNER JOIN table_b ON FIND_IN_SET(species_name,replace(qualifying_species,';',','))

If you want to find non-existent values, use something like this:

SELECT *
FROM table_b
LEFT OUTER JOIN table_a ON FIND_IN_SET(species_name,replace(qualifying_species,';',','))
WHERE table_a.id IS null

3 Comments

Pay attention: the CSV values contain a space after the comma. A single FIND_IN_SET will find nothing.
If there is a space after the comma, you just have to add it in the replace, no?
Maybe, but in theory there can be values with no space, with 2 spaces, or with trailing space(s). FIND_IN_SET can only use data that is strictly formatted, but the OP says typos/misprints may occur, i.e. the value may be direct user input (or the result of scraping), which is unpredictable.
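
If the check is done in PHP instead, the irregular spacing described above can be absorbed when splitting. A minimal sketch, assuming the separator may be either a comma or a semicolon; the function name `find_unknown_species` and the sample data are hypothetical.

```php
<?php
/**
 * Split a stored list on commas or semicolons, ignoring surrounding
 * whitespace, and return the entries missing from the reference list.
 * The function name is hypothetical, chosen for this sketch.
 */
function find_unknown_species(string $list, array $reference): array
{
    // \s*[;,]\s* swallows any spaces around a separator; NO_EMPTY drops blanks.
    $names = preg_split('/\s*[;,]\s*/', trim($list), -1, PREG_SPLIT_NO_EMPTY);

    // Exact, case-sensitive comparison against trimmed reference names.
    $known = array_map('trim', $reference);

    return array_values(array_diff($names, $known));
}

// Hypothetical sample data; note the stray spaces and the trailing separator.
$reference = ['Dugong dugon ', 'Delphinus delphis', 'Tursiops truncatus'];
$unknown = find_unknown_species('Dugong dugon;Delphinus delfis ,  Tursiops truncatus; ', $reference);
print_r($unknown); // contains only 'Delphinus delfis'
```

In WordPress the reference array could presumably be loaded once with `$wpdb->get_col("SELECT species_name FROM B")` and reused for every row of A, which avoids one query per species name.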
