0

I am facing a strange issue when extracting data from a MySql database and inserting it in a CSV file. In the database, the field value is the following:

K Secure Connection 1 año 1 PC

When I echo it before writing it to the CSV file, I get the same as the above in my terminal.

I use the following code to write content to the CSV file:

fwrite($this->fileHandle, utf8_encode($lineContent . PHP_EOL));

Yet, when I open the CSV with LibreOffice Calc (and specify UTF-8 as the encoding format), the following is displayed:

K Secure Connection 1 año 1 PC

I have no idea why this happens. Can someone explain how to solve this?

REM:

SELECT @@character_set_database;

returns

latin1 

REM 2:

`var_dump($lineContent, bin2hex($lineContent))`

gives

string(39) "Kaspersky Secure Connection 1 año 1 PC"
string(78) "4b6173706572736b792053656375726520436f6e6e656374696f6e20312061c3b16f2031205043"
6
  • Can you include a dump of that CSV line after it's been written to file? It'll help determine if it's LibreOffice misinterpreting or PHP for writing it incorrectly. Commented Jun 23, 2016 at 14:35
  • I have opened the CSV with gedit (a text editor) and the bad characters are there (i.e., before opening LibreOffice) Commented Jun 23, 2016 at 14:39
  • is your database utf-8, the connect to it in utf8 etc ? Commented Jun 23, 2016 at 14:47
  • The character set seems to be latin1 Commented Jun 23, 2016 at 14:54
  • 2
    var_dump($lineContent, bin2hex($lineContent)) – what does that give you? It's very unclear what encoding the string is actually in; the easiest way to figure that out is to look at the actual bytes. – Secondarily, why are you utf8_encodeing? Are you explicitly wanting to convert from Latin1 to UTF-8? Commented Jun 23, 2016 at 15:02

2 Answers 2

3

The var_dump shows that the string is already encoded in UTF-8. Using utf8_encode on it will garble it (the function attempts a conversion from Latin-1 to UTF-8). You're therefore actually writing "año" encoded in UTF-8 into your file, which is then "correctly" picked up by LibreOffice.

Simply don't utf8_encode.

Sign up to request clarification or add additional context in comments.

1 Comment

First rule of business: know what encoding your strings are in.
0

I would try to open the csv file with other editor just to make sure te problem is not with the office...

You may be double encoding the content if it is already in UTF-8 format.

I also prefer to aways work with UTF-8, so I get the data from database already in UTF-8 and no more convertion is needed. For that I run this query right after opening the SQL connection:

"set names 'utf8'"

1 Comment

And by the way... check this related post... this may be your problem too: http://stackoverflow.com/questions/6336586/fwrite-and-utf8

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.