I have a PHP script that reads a CSV file (encoded as UTF-16LE). The problem is that on some lines the array PHP builds from the CSV row is cut short because of some Greek characters. An example is below (there should be 7 elements in the array, but this one has only 2). How can I solve this problem?

Array ( [0] => 205198 [1] => Label 4.2 Βάση για Σ▒ )

My code is below:

$array = file_get_contents($this->listUrl);
$array = mb_convert_encoding($array, 'UTF8', 'UTF-16LE');   // Convert the file to UTF8
$array = preg_split("/\R/", $array);                        // Split it by line breaks
$array = array_map(function ($v) {
    return str_getcsv($v, ";");
}, $array);

[edit] I now use the code below:

$array = str_getcsv($array, "\n");
foreach ($array as &$Row) {
    $Row = str_getcsv($Row, ";");
}
  • This should rather be done using fgetcsv with the proper locale set (see stackoverflow.com/a/6160934/1427878). If you just split by line breaks, you risk messing up your data if any of the cell values could ever contain a line break. Commented Jul 4, 2022 at 12:08
  • @CBroe, it seems that you are right. I now use the code below: $f = file_get_contents('file'); $f = mb_convert_encoding($f, 'UTF8', 'UTF-16LE'); $f = str_getcsv($f, "\n"); foreach($f as &$Row) { $Row = str_getcsv($Row, ";"); } Commented Jul 4, 2022 at 13:07
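
The fgetcsv suggestion from the comments could look roughly like the sketch below. It assumes the iconv extension is available, and the function name readUtf16leCsv is hypothetical, not taken from the post. The stream filter converts UTF-16LE to UTF-8 while the file is read, so fgetcsv handles quoting and embedded line breaks itself.

```php
<?php
// Hypothetical helper (not from the original post): stream a UTF-16LE CSV
// through an iconv filter and let fgetcsv() deal with quoting and line breaks.
function readUtf16leCsv(string $path, string $delimiter = ';'): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    // Assumes the iconv extension; converts the bytes to UTF-8 as they are read.
    stream_filter_append($handle, 'convert.iconv.UTF-16LE/UTF-8', STREAM_FILTER_READ);

    $rows = [];
    while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() reports a blank line as [null]
        }
        $rows[] = $row;
    }
    fclose($handle);

    // Strip the UTF-8 byte order mark that a UTF-16 BOM turns into, if present.
    if (isset($rows[0][0]) && strncmp($rows[0][0], "\xEF\xBB\xBF", 3) === 0) {
        $rows[0][0] = substr($rows[0][0], 3);
    }
    return $rows;
}
```

Usage would be along the lines of $rows = readUtf16leCsv('file'); each element of $rows is then one parsed CSV record.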

1 Answer

My best guess is this:

You need mb_split, since you are working with multibyte strings to support Greek.

Some theory :

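UTF-8 is variable-width: ASCII characters take 1 byte, all other characters take 2 to 4 bytes (Greek letters take 2).

UTF-16 is also variable-width: 2 bytes for characters in the Basic Multilingual Plane (which includes Greek), 4 bytes for everything else.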
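
To make the byte widths concrete, here is a minimal sketch that counts the bytes of a few characters in both encodings. The function name byteLengths is hypothetical, and the mbstring extension is assumed to be loaded.

```php
<?php
// Hypothetical helper: byte length of one UTF-8 character in UTF-8 and UTF-16LE.
// strlen() counts bytes, not characters.
function byteLengths(string $utf8Char): array
{
    return [
        'UTF-8'    => strlen($utf8Char),
        'UTF-16LE' => strlen(mb_convert_encoding($utf8Char, 'UTF-16LE', 'UTF-8')),
    ];
}

$samples = [
    'A'     => 'A',          // U+0041, ASCII
    'Sigma' => "\u{03A3}",   // U+03A3, Greek capital sigma
    'Emoji' => "\u{1F600}",  // U+1F600, outside the Basic Multilingual Plane
];

foreach ($samples as $name => $char) {
    $len = byteLengths($char);
    printf("%-5s UTF-8: %d, UTF-16LE: %d\n", $name, $len['UTF-8'], $len['UTF-16LE']);
}
// A     UTF-8: 1, UTF-16LE: 2
// Sigma UTF-8: 2, UTF-16LE: 2
// Emoji UTF-8: 4, UTF-16LE: 4
```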

Some action:

The PHP manual describes mb_split as "Split multibyte string using regular expression".

There are also similar functions, such as mb_ereg_replace.

Example :

$array = file_get_contents($this->listUrl);
$array = mb_convert_encoding($array, 'UTF8', 'UTF-16LE');   // Convert the file to UTF8
mb_regex_encoding('UTF-8');                                 // mb_split works in this encoding
$array = mb_split('\r\n|\r|\n', $array);                    // Split it by line breaks (mb_split patterns take no / delimiters)
$array = array_map(function ($v) {
    return str_getcsv($v, ";");
}, $array);
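
Another possible explanation for the truncated rows, offered as a hedged sketch rather than a confirmed diagnosis: without the u modifier, PCRE works on bytes, and \R then also matches the single byte 0x85 (NEL). The Greek letter υ (U+03C5) is encoded in UTF-8 as CF 85, so a byte-mode split can cut a line in the middle of that letter. The helper name splitCsvText is hypothetical.

```php
<?php
// Hypothetical helper; assumes $utf8Text is already valid UTF-8.
function splitCsvText(string $utf8Text, string $delimiter = ';'): array
{
    // With the "u" modifier, \R matches Unicode line breaks instead of raw bytes.
    $lines = preg_split('/\R/u', $utf8Text, -1, PREG_SPLIT_NO_EMPTY);
    return array_map(
        fn (string $line): array => str_getcsv($line, $delimiter, '"', '\\'),
        $lines
    );
}

// One line containing sigma + upsilon; upsilon is the bytes CF 85 in UTF-8.
$line = "205198;Label 4.2 \u{03A3}\u{03C5};x";

$byteMode = preg_split('/\R/', $line);   // byte mode: also splits on the 0x85 byte
$utf8Mode = preg_split('/\R/u', $line);  // UTF-8 mode: the line stays whole

var_dump(count($byteMode), count($utf8Mode));
```

If this is what happens in the question, adding the u modifier to the original preg_split call would be enough, and mb_split would not be required.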

Have fun !
