0

I'm using PHP to open and parse a very small (around 1 KB) CSV file and generate an HTML table from it. I'm new to PHP and this is largely experimental. In addition to generating the HTML table, I'm trying to build an array from two of the CSV's columns (city and country), then remove duplicate values. The CSV is structured like this:

Last Name, First Name, City, Country, Language
Smith, Joe, Shanghai, China, English
Jackson, Stacey, Madrid, Spain, Spanish
Jones, Bob, London, United Kingdom, English
Seward, Elisa, Madrid, Spain, English
Harrison, Tim, Berlin, Germany, German

The idea here is that in addition to a table with all the data, I'll also have a list of all the cities/countries listed in the table:

  • Shanghai, China
  • Madrid, Spain
  • London, United Kingdom
  • Berlin, Germany

Thanks to the fgetcsv() documentation and other questions on Stack Overflow, reading the file and building the table is straightforward:

<?php
    $handle = fopen("namelist.csv", "r");
    $data = fgetcsv($handle, 1000, ",");
    echo('<table>');
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        echo("<tr>\r\n");
        foreach ($data as $index=>$val) {
            echo("\t<td>");
            echo htmlentities($val, ENT_QUOTES);
            echo("</td>\r\n");
        }
        echo("</tr>\r\n");
    }
    echo("</table>");
    fclose($handle);
?>

But I've been unable to figure out how to grab the city and country data and remove duplicates. Does anyone have suggestions?

1
  • I expect when $index == 2 you have City, and when $index == 3 you have Country. To remove duplicates you'll need to refer to the data array or store the information in a more user friendly array to keep track of what's been written. Commented Jun 24, 2014 at 15:44

6 Answers

4

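Here is a simple way to remove duplicate cities without having to filter for them: use the city as the array key, so a repeated city overwrites its earlier entry. (Two different cities that share a name would also collapse into one.)

$fHandle = fopen("namelist.csv", "r");
$aData = fgetcsv($fHandle, 1000, ","); // read and discard the header row
$aLocations = array();
while (($aData = fgetcsv($fHandle, 1000, ",")) !== FALSE) {
    // column 2 is City, column 3 is Country
    $aLocations[trim($aData[2])] = trim($aData[3]);
}
fclose($fHandle);

echo '<table>';
foreach ($aLocations as $sCity => $sCountry) {
    echo '<tr><td>'.htmlentities($sCity, ENT_QUOTES).'</td><td>'.htmlentities($sCountry, ENT_QUOTES).'</td></tr>';
}
echo '</table>';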

1

Inside the loop that reads the CSV, the city and country fields are first joined with a comma and the result is looked up in the $city_countries array. If it isn't already there, the string is appended to $city_countries. Then the TR tags are echoed and the $data array is looped over to write the TD tags and column values.

$handle = fopen("namelist.csv", "r");
$data = fgetcsv($handle, 1000, ",");
$city_countries = array();
echo('<table>');
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
    $city_country = $data[2] . ', ' . $data[3];
    if ( !in_array($city_country, $city_countries) ) {
        array_push($city_countries, $city_country);
    }
    echo("<tr>\r\n");
    foreach ($data as $index=>$val) {
        echo("\t<td>");
        echo htmlentities($val, ENT_QUOTES);
        echo("</td>\r\n");
    }
    echo("</tr>\r\n");
}
echo("</table>");
fclose($handle);

print '<pre>'; print_r($city_countries); print '</pre>';

This is the input file I'm using:

Last Name, First Name, City, Country, Language
Smith, Joe, Shanghai, China, English
Jackson, Stacey, Madrid, Spain, Spanish
Jackson, Steve, Madrid, Spain, Spanish
Jones, Bob, London, United Kingdom, English
Seward, Elisa, Madrid, Spain, English
Harrison, Tim, Berlin, Germany, German
Jones, Bill, London, United Kingdom, English
Jackson, Ralph, Madrid, Spain, Spanish

And this is the output I'm getting:

Smith    Joe     Shanghai China          English
Jackson  Stacey  Madrid   Spain          Spanish
Jackson  Steve   Madrid   Spain          Spanish
Jones    Bob     London   United Kingdom English
Seward   Elisa   Madrid   Spain          English
Harrison Tim     Berlin   Germany        German
Jones    Bill    London   United Kingdom English
Jackson  Ralph   Madrid   Spain          Spanish

Array
(
    [0] =>  Shanghai,  China
    [1] =>  Madrid,  Spain
    [2] =>  London,  United Kingdom
    [3] =>  Berlin,  Germany
)
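
The doubled spaces in the array above come from the space after each comma in the CSV, which the parser keeps as part of the field. Below is a minimal sketch of trimming each field before building the pair. It uses a few of the sample rows parsed in memory instead of the file, and `unique_locations` is a hypothetical helper name, not something from the code above.

```php
<?php
// Hypothetical helper (an assumption for illustration): returns unique,
// trimmed "City, Country" strings from already-parsed CSV rows.
function unique_locations(array $rows, $cityIdx = 2, $countryIdx = 3)
{
    $seen = array();
    foreach ($rows as $row) {
        $row = array_map('trim', $row); // drop the space that follows each comma
        // Array keys are unique, so a repeated pair just overwrites itself.
        $seen[$row[$cityIdx] . ', ' . $row[$countryIdx]] = true;
    }
    return array_keys($seen);
}

// Sample rows taken from the input file, parsed in memory.
$lines = array(
    'Smith, Joe, Shanghai, China, English',
    'Jackson, Stacey, Madrid, Spain, Spanish',
    'Jones, Bob, London, United Kingdom, English',
    'Seward, Elisa, Madrid, Spain, English',
    'Harrison, Tim, Berlin, Germany, German',
);
$rows = array();
foreach ($lines as $line) {
    $rows[] = str_getcsv($line, ',', '"', '\\');
}

print_r(unique_locations($rows));
```

Using the pair as an array key avoids the in_array() scan on every row, which makes little difference for a file this small but keeps the loop simple.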

3 Comments

Thanks for the example. It looks like the duplicate values are being printed, though.
Ah, I see: the code is parsing the entire row to find duplicates. I'd like to remove duplicates in the City, Country columns alone, so there should be just one "Madrid, Spain" in the city/country array, while the full names/locations/languages table should not be filtered for duplicates.
Ah. Code tweaked to list all records, and store unique City, Country combinations.
1

Try this (I don't have access to PHP at the moment, so I'll check for small bugs in about an hour):

<?php
    $handle = fopen("namelist.csv", "r");
    $data = fgetcsv($handle, 1000, ",");
    $csv = array();
    $csv[] = array();
    $csv[] = array();
    $csv[] = array();
    $csv[] = array();
    $csv[] = array();
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $column = 0; // reset the column counter once per row
        foreach ($data as $index=>$val) {
            $csv[$column][] = htmlentities($val, ENT_QUOTES);
            $column++;
        }
    }
    fclose($handle);
    //Now, csv[0] has all Last Names, csv[1] has all First Names, csv[2] all Cities, csv[3] all Countries and csv[4] all Languages
    //To filter duplicates..
    $cities = array_unique($csv[2]);
    $countries = array_unique($csv[3]);
?>

This creates an array that holds 5 arrays (one for each column). Those arrays are then populated from each row of the CSV. Afterwards, the city and country columns are purged of repeat values. As stated above, this code should work, but I haven't been able to test it. If it doesn't, leave me a comment and I'll fix it later this afternoon.

1

$data[2] contains the city. Pushing every city value onto an array and then applying array_unique(...) removes the duplicates.

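$handle = fopen("namelist.csv", "r");
fgetcsv($handle, 1000, ","); // skip the header row
$cities = array();

while (($data = fgetcsv($handle, 1000, ",")) !== false) {
    $cities[] = $data[2];
}
fclose($handle);

$cities = array_unique($cities);

print_r($cities);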
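
One detail worth knowing when printing the result: array_unique() keeps the original keys, so the indices of the deduplicated array can have gaps. A minimal sketch with hard-coded sample cities (the variable names are just for illustration) shows the gap and how array_values() renumbers it:

```php
<?php
// Sample city column, in file order, with one repeat.
$cities = array('Shanghai', 'Madrid', 'London', 'Madrid', 'Berlin');

// array_unique() keeps the first occurrence of each value and its key,
// so key 3 (the second 'Madrid') is simply missing afterwards.
$unique = array_unique($cities);

// array_values() renumbers the array from 0 with no gaps.
$reindexed = array_values($unique);

print_r(array_keys($unique));
print_r($reindexed);
```

Reindexing matters mainly if the list is later walked with a for loop over numeric indices; foreach works either way.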

Refer to the PHP manual, which has plenty of sample code as well.

1

When dealing with CSVs that have a header line, I prefer to match the data columns to a named key so I don't need to keep track of which index a specific column relates to. This allows you to reference $var['ColumnName'] instead of $var[2]:

<?php
$csvDelim = ',';
$csvEnclosure = '"'; // one single-byte character; '"' is the default

$csvArr = file('./namelist.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

//create array of the csv headers
$csvHeaders = str_getcsv(trim(array_shift($csvArr)), $csvDelim, $csvEnclosure);
$csvHeaders = array_map("trim", $csvHeaders);

//get the csv data and make a multi-dim array of keys/values
$dataArr = array();
foreach($csvArr as $csvLine) 
{
    $lineData = str_getcsv(trim($csvLine), $csvDelim, $csvEnclosure);
    $lineData = array_map("trim", $lineData);
    $dataArr[] = array_combine($csvHeaders, $lineData);
}

//get unique city/country values
$locations = array();
foreach($dataArr as $da)
    $locations[] = $da['City'].', '.$da['Country'];

$locations = array_unique($locations);

//output data in table
echo '<table>';
echo '<tr>';
foreach($csvHeaders as $headerValue)
    echo '<th>'.htmlentities($headerValue, ENT_QUOTES).'</th>';
echo '</tr>';

foreach($dataArr as $dataLine)
{
    echo '<tr>';
    foreach($dataLine as $dataValue)
    {
        echo '<td>'.htmlentities($dataValue, ENT_QUOTES).'</td>';
    }
    echo '</tr>';
}
echo '</table>';
?>
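
One thing to watch with this approach: array_combine() needs the header and the row to have the same number of fields. In PHP 8 a mismatch throws a ValueError; in older versions it returns false with a warning. A minimal sketch of guarding against that, using in-memory sample lines (the second, short line is made up for illustration and is not in the original file):

```php
<?php
$headers = array('Last Name', 'First Name', 'City', 'Country', 'Language');

$lines = array(
    'Smith, Joe, Shanghai, China, English',
    'Jackson, Stacey, Madrid',               // hypothetical malformed row
    'Harrison, Tim, Berlin, Germany, German',
);

$rows = array();
foreach ($lines as $line) {
    $fields = array_map('trim', str_getcsv($line, ',', '"', '\\'));
    // Skip rows whose field count does not match the header.
    if (count($fields) !== count($headers)) {
        continue;
    }
    $rows[] = array_combine($headers, $fields);
}

print_r($rows);
```

Skipping is only one option; depending on the data, logging the bad line or padding it with empty strings may be more appropriate.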

2 Comments

thanks for the great example. I've noticed that when I try implementing this and print_r($locations) I get an empty array: Array( [0] => , )
@Marcatectura: Sorry, was missing trim on the header values. I've updated the answer.
0

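Here is a working version of James Hunt's code from the answer above. The column counter is reset once per row, before the inner loop, and the column array is indexed as $csv[$column]. Note that this version reads test.csv and uses ; as the delimiter.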

<?php

$handle = fopen("test.csv", "r");
$data = fgetcsv($handle, 1000, ";");

$csv = array();
$csv[] = array();
$csv[] = array();
$csv[] = array();
$csv[] = array();
$csv[] = array();

while (($data = fgetcsv($handle, 1000, ";")) !== FALSE) {

    $column=0;

    foreach ($data as $index=>$val) {

        $csv[$column][] = htmlentities($val, ENT_QUOTES);
        $column++;
    }

}

fclose($handle);
//Now, csv[0] has all Last Names, csv[1] has all First Names, csv[2] all Cities, csv[3] all Countries and csv[4] all Languages
//To filter duplicates..
$cities = array_unique($csv[2]);
$countries = array_unique($csv[3]);

var_dump($cities); // outputs the unique values of $csv[2] (the cities)

?>

so stay tuned - greetz Robert!
