1

I hope you can help me. I searched a lot, but unfortunately didn't find anything. What's the problem? I've got big CSV files with 1 column, which contains e-mail addresses. There are about 50000 lines in single file. I'm creating administration panel, which allows to import these files to the server, using HTML form and PHP. Importing CSV to MySQL database through PHP is simple, but I need something more - check for every e-mail does it exists, and if yes - skip it. What's the problem? Table has about million+ records, checking one e-mail lasts +/- 3 seconds. 50000 records multiplied by 3... it's gonna take min. 44 hours! PHP script stops responding after less than 10 minutes... So it's impossible do it this way:

function doesExist($email) {
    $sql = "SELECT count(*) as counter FROM mailing_subscribers WHERE subscriber_email LIKE :subscriber_email";
    $sth = $this->db->prepare($sql);
    $sth->execute(array(':subscriber_email' => $email));
    $row = $sth->fetch();
    $counter = $row->counter;
    if ($counter > 0) {
        return true;
    } else {
        return false;
    }
}

function importCSV($file,$group) {            

    $fp = fopen($file['tmp_name'], "r");
    $importsCounter = 0;

    while($csv_line = fgetcsv($fp)) {
        for ($i = 0, $j = count($csv_line); $i < $j; $i++) {
            if (!$this->doesExist($csv_line[$i])) {
                $sql = "INSERT INTO mailing_subscribers(subscriber_email,subscriber_group) VALUES('".$csv_line[$i]."','".$group."')";
                $sth = $this->db->prepare($sql);
                $sth->execute();
                $importsCounter++;
            }
        }
    }

    $_SESSION["feedback_positive"][] = FEEDBACK_FILE_IMPORT_SUCCESSFUL . " Utworzonych wpisów: " . $importsCounter;
}

$file is a $_FILE array.

Is there any other and faster method to do it?

5
  • 1
    Omit the check and just use INSERT IGNORE. A proper index on the email column could help also. Commented Aug 27, 2014 at 9:01
  • Is this valid WHERE subscriber_email LIKE :subscriber_email, having WHERE and LIKE? I usually see someting like WHERE subscriber_email = '$email' as an example. Commented Aug 27, 2014 at 9:05
  • @Fred-ii- I use LIKE because I'm not sure does = is case insensitive. Commented Aug 27, 2014 at 9:09
  • @Sirko I saw INSERT IGNORE somewhere, but... it's enough just to set subscriber_email column as unique? Commented Aug 27, 2014 at 9:10
  • @Blacky Together with INSERT IGNORE it should be enough, yes. You should also use bulk inserts instead of one SQL statement per entry. Commented Aug 27, 2014 at 9:14

1 Answer 1

1

Below is my suggestion:

1) Load your csv file in Temporary table. refer http://dev.mysql.com/doc/refman/5.1/en/load-data.html

2) It will load your bulk csv data very fast may be in seconds. Now use insert query and insert data from temporary table to master table with duplicate value check.

E.G

1) Lets assume you have load csv data in temporary table named "TempTable"

2) say your master table name is "mailing_subscribers"

3) say you do not want duplicate record to be insert.

your query will be like:

insert into mailing_subscribers (subscriber_email,cola,colb..) select subscriber_email,cola,colb.. from TempTable where subscriber_email not in (select subscriber_email from mailing_subscribers)

Please let me know if you face any issue.

Sign up to request clarification or add additional context in comments.

1 Comment

Hello @Ronak Shah nice idea but if csv have duplicate record and we are check for temptable only so might be it create problem. am i right or wrong??????

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.