
This code opens all Excel files in a folder, collects every email address found in each file, and puts them in an array. In the end I need ONE BIG array of all emails from all files, not an array of arrays.

The code below is not working. I am sure this is a simple one. Thanks

<?

$Folder = "sjc/";
$files = scandir($Folder);


function cleanFolder($file)
{
$string = file_get_contents("sjc/$file");
$pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
preg_match_all($pattern, $string, $matches);

$Emails[] = $matches[0];
return $Emails;
}



function beginClean($files)
{
    for($i=0; count($files)>$i;$i++)
        {
        $Emails = cleanFolder("$files[$i]");
        $TheEmails .= explode(",",$Emails);

        }

/// Supposed to be a big string of emails separated by comma
echo $TheEmails; // But it just echos .... ArrayArrayArrayArrayArray etc...

// WHAT I REALLY WANT IS.. one Array holding all emails, not an Array of Arrays. 
}

beginClean($files);

?>

UPDATE: GOT IT TO WORK. HOWEVER, I am having a memory issue now, as the emails total over 229911.

Fatal error: Allowed memory size of 67108864 bytes exhausted (tried to allocate 71 bytes) in /home/public_html/StatuesPlus/CleanListFolder.php on line 33

Here is the code that worked:

<?

$Folder = "sjc/";
$files = scandir($Folder);


function cleanFolder($file)
{
//echo "FILE NAME " . $file . "<br>";
$string = file_get_contents("sjc/$file");
$pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
preg_match_all($pattern, $string, $matches);

$TheEmails .= implode(',', $matches[0]);
return $TheEmails;

}



function beginClean($files)
{
    for($i=0; count($files)>$i;$i++)
        {
        $Emails .= cleanFolder("$files[$i]");
        }



$TheEmails = explode(",", $Emails);
//$UniqueEmails= array_unique($TheEmails);
echo count($TheEmails);
//file_put_contents("Emails.txt", $TheEmails);
}

beginClean($files);

?>
  • Instead of raw Excel files, at least convert to CSV; then getting the emails is a snap. Your regular expression will not match some valid email addresses. Commented May 21, 2013 at 1:58
  • Thanks Dagon, I was going to do that, but there are lots of Excel files and I only know how to do it manually. Also, this data has much more than emails; I am just taking the emails. Is there code to convert Excel to CSV via PHP? Commented May 21, 2013 at 2:01
  • Not via PHP, but from the command line: stackoverflow.com/questions/1858195/… Commented May 21, 2013 at 2:08

2 Answers


.= is used for concatenating strings, not arrays. But you can just keep them as strings for a while:

$TheEmails .= ",$Emails";

And then:

$TheEmails = explode(',', substr($TheEmails, 1));
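If the per-file function returns the flat list `$matches[0]` instead of wrapping it in another array, the string round trip can probably be skipped altogether. A minimal sketch under that assumption; `extractEmails` and `collectEmails` are hypothetical names, not functions from the question, and the sample addresses are made up:

```php
<?php
// Hypothetical helper: returns a flat list of the addresses found in one string.
function extractEmails(string $text): array
{
    // Same pattern as in the question.
    $pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
    preg_match_all($pattern, $text, $matches);
    return $matches[0]; // index 0 holds the full matches, already a flat array
}

// Hypothetical helper: merges the per-file lists into one flat array.
function collectEmails(array $texts): array
{
    $all = [];
    foreach ($texts as $text) {
        // array_merge appends the values and renumbers the keys
        $all = array_merge($all, extractEmails($text));
    }
    return $all;
}

// Stand-in for the contents of two files.
$emails = collectEmails(["a@example.com, b@example.org", "c@example.net"]);
print_r($emails);
```

In the real script, each `$text` would come from `file_get_contents()` on one file in the folder.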

3 Comments

Fast. Thanks. Where exactly would I put this code? In the first loop or second? Also what does the substr do and the 1?
@PapaDeBeau: The substr just takes off the leading comma, since the first item gets the comma too. Anyways, $TheEmails .= ",$Emails" replaces the other $TheEmails .= … line, and $TheEmails = explode… goes after the loop.
Thanks. I think there is still one issue. $Emails = cleanFolder("$files[$i]"); turns $Emails into an array, so it's not actually the emails as a string but an array returned by the other function.

Below is the final code I used to gather emails from multiple Excel sheets in any given folder. The files can be CSV, XLS, XLSX, HTML, etc. The code extracts the emails from every file in that folder and puts them into ONE HUGE ARRAY. :)

<?php
    // See below for the ARRAY output called $FinalEmails

    // SET YOUR FOLDER HERE

    $Folder = "sjc/";
    $files = scandir($Folder);


    function cleanFolder($file)
    {

    global $Folder; // $Folder is set at file scope; without this it is undefined inside the function
    $string = file_get_contents("$Folder/$file");
    $pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
    preg_match_all($pattern, $string, $matches);

    $TheEmails = implode(',', $matches[0]); // plain assignment: $TheEmails does not exist yet in this scope
    $TheEmails = strtolower($TheEmails);

    return $TheEmails;

    }



    function beginClean($files)
    {
        for($i=0; count($files)>$i;$i++)
            {
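            // Trailing comma keeps the last email of one file from fusing with the first email of the next
            $Emails .= cleanFolder("$files[$i]") . ",";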
            }



    $TheEmails = explode(",", $Emails);
    // array_unique keeps the original keys, so reindex before looping by position
    $UniqueEmails = array_values(array_unique($TheEmails));

    $Emails = implode(",", $UniqueEmails);


    function isValidEmail($email)

    {  
     return filter_var(filter_var($email, FILTER_SANITIZE_EMAIL), FILTER_VALIDATE_EMAIL);  
    }  


    for($i=0; count($UniqueEmails)>$i;$i++)
    {
        if(isValidEmail("$UniqueEmails[$i]"))
        {  
        echo $UniqueEmails[$i] . "<br>";
        $FinalEmails .= "$UniqueEmails[$i],";
        } 
    else 
        {  
        //not valid  
        }
    }


    // An ARRAY of emails from multiple Excel sheets,
    // cleaned of duplicates and checked for validity.
    $FinalEmails = explode(",", $FinalEmails);



    }

    beginClean($files);

    ?>
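The lower-casing, de-duplication and validation steps can also be sketched on arrays alone, without joining to a string and splitting again. This is only an illustration under assumptions: `cleanEmailList` is a hypothetical name, and the input is assumed to be a flat array of candidate addresses. Using the address as an array key removes duplicates as they arrive, which may also reduce peak memory, though that has not been measured against the folder from the question.

```php
<?php
// Hypothetical helper: lower-cases, validates and de-duplicates a flat list of addresses.
function cleanEmailList(array $emails): array
{
    $seen = [];
    foreach ($emails as $email) {
        $email = strtolower(trim($email));
        // filter_var returns false for anything that is not a valid address
        if (filter_var($email, FILTER_VALIDATE_EMAIL) !== false) {
            $seen[$email] = true; // the key acts as a set entry, so repeats collapse
        }
    }
    return array_keys($seen);
}

// Made-up sample input with a duplicate, an invalid entry and an empty string.
$FinalEmails = cleanEmailList(["A@Example.com", "a@example.com", "not-an-email", "", "b@example.org"]);
print_r($FinalEmails);
```

Because empty strings fail validation, no trailing empty element ends up in the result.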

2 Comments

Without substr, though, the last element of $FinalEmails will be empty. Also, you don’t have to convert from an array to a string to an array. Also, is $Emails used?
It does work without substr. Not sure why. Yes, $Emails in this example is not used. Thanks for pointing that out. I will remove it.
