P.S.: This is not a duplicate question. I'm not asking how to write contents to a file, because that is already done; I'm asking how to change the type of a file to UTF-8, which is a different thing.

How do I generate a file as UTF-8 rather than ANSI? (This is about the file itself, not its contents.)

For example, most IDEs have an encoding option that lets you change the type of a file. But I'm generating a bulk export from my database, which produces a lot of individual text files, and all of them come out as ANSI by default. I'm just looking for a PHP function that makes it possible to change the encoding before the export is generated.

If the source code would help, I can post it here; just let me know.

Thanks in advance.

EDITED

Below is a screenshot of what I'm asking about.

[image: screenshot]

When I generate the file "testecli01.csv" it always gets the ANSI encoding. Whatever I do in my script, it is always ANSI, and I need UTF-8; that's all. It's simple, but I have no idea how to do it.

5
  • 1
    Except that you're generating files from a database, the question How to write file in UTF-8 format? quite matches yours. There is no magical call to change the encoding of a file, you have to read it, change the encoding, then write it back. Commented Jul 8, 2011 at 19:29
  • The above comment has it right. There's no magical, free-lunch database encoding conversion. Commented Jul 8, 2011 at 19:32
  • It is not the same question; it is about the file itself and not the contents of the file. This is something that freaks me out: there are no good resources, not even in the PHP docs. I can do it by hand, but I have thousands of files ... 0_o Commented Jul 8, 2011 at 19:49
  • 1
    @Fernando, text files don't have an 'encoding' property. The closest thing to that, for UTF-8, is a BOM marker at the beginning of a file. But even then, you still have to convert the contents of the file to UTF-8: just throwing in a BOM isn't going to fix anything unless there are no special characters in your file, in which case they were valid UTF-8 to start with. Commented Jul 9, 2011 at 0:21
  • Notepad++ worked like a charm for me for converting file encodings. It is a free download. Commented Aug 20, 2016 at 18:29

4 Answers

4

If your 3rd party program "do not support files in ANSI but UTF-8" as you mentioned in a comment then most likely it's expecting a BOM.

While the Unicode Standard does allow a BOM in UTF-8,[2] it does not require or recommend it.[3] Byte order has no meaning in UTF-8[4] so a BOM serves only to identify a text stream or file as UTF-8.

The reason the BOM is recommended against is that it defeats the ASCII back-compatibility that is part of UTF-8's design.

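So strictly speaking your 3rd party program isn't completely compliant with the standard, because the BOM should be optional. Plain ASCII is 100% valid UTF-8, and that is one of the main drivers of the design: anything that can understand UTF-8 according to the standard by definition also understands ASCII. Note that this holds only for the 7-bit ASCII range; accented characters stored in a Windows "ANSI" code page are not valid UTF-8 and still have to be converted.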

Try writing "\xEF\xBB\xBF" to the front of the file and see if that solves your problem.
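A minimal sketch of that suggestion, assuming the generated text is in Windows-1252 (what Windows tools usually label "ANSI"). The helper name `toUtf8WithBom` and the sample string are hypothetical. It also converts the contents, since a BOM on its own does not change the bytes that follow it.

```php
<?php
// Hypothetical helper: convert a string from an assumed source encoding
// (Windows-1252 by default) to UTF-8 and prefix the UTF-8 byte order mark.
function toUtf8WithBom(string $text, string $fromEncoding = 'Windows-1252'): string
{
    $utf8 = iconv($fromEncoding, 'UTF-8', $text);
    if ($utf8 === false) {
        throw new RuntimeException("Cannot convert from $fromEncoding to UTF-8");
    }
    return "\xEF\xBB\xBF" . $utf8;
}

// "\xE9" is "é" in Windows-1252; in UTF-8 it becomes the two bytes C3 A9.
$data = toUtf8WithBom("caf\xE9;1\r\n");
file_put_contents('testecli01.csv', $data);
```

If the data is already UTF-8 when it leaves the database, only the BOM prefix is needed and the `iconv()` call should be skipped, otherwise the text would be converted twice.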

1 Comment

The 3rd party program is from the government and it is a very old program; accents are not allowed, so you can imagine what type of program it is ... It doesn't matter, because I generate the data. I think ANSI is correct because it has all the accents and the data is OK, but the government program does not accept accents, so perhaps I'll remove all accents from my database.. ahahaha Thanks anyway, I didn't know about the BOM.
2

I do not know of a database that will do the encoding conversion for you easily. For example, in MySQL, you have to reset all the character encodings for the db, tables, and columns, AND THEN convert the data.

I would suggest instead that you create your database dump and use iconv to change the encoding, whether on the command line:

iconv -f original_charset -t utf-8 dumpTextData > convertedTextData

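or in PHP (adapted from How to write file in UTF-8 format?)

// Write to a separate file: opening the same path with 'w' would
// truncate it before it is read. The '.utf8' suffix is only an example.
$input = fopen($file, 'r');
$output = fopen($file . '.utf8', 'w');
// The source encoding comes first: convert.iconv.<from>/<to>.
stream_filter_append($input, 'convert.iconv.OLD-ENCODING/UTF-8');
stream_copy_to_stream($input, $output);
fclose($input);
fclose($output);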

NOTE: edited to avoid leaking file descriptors.

7 Comments

Your copied answer leaks file descriptors. If you have more than a few hundred files, this will cause problems.
@zneak thanks for pointing that out. I forget you can't trust people to know you need an fclose. Edited to include.
You're still leaking the file descriptor from fopen in the stream_copy_to_stream. :) I've fixed it for you.
...and I need to remember how to read haha. I went to edit but you beat me to it. Thanks.
@Zneak Actually there are thousands of files, not hundreds. It is not in full use yet, but I expect to use it in a production environment. The problem is that the thousands of .txt files will be imported by a 3rd party program, and it does not support files in ANSI, only UTF-8. So the txt files need to be in that encoding.
0

Excel likes CSV files to be UTF-16LE, and begin with '\xFF\xFE'.

My code to build a file for excel is:

echo "\xFF\xFE"; // byte order mark for a UTF-16LE file

foreach ($rows as $row) {
    // With no third argument, the source encoding is PHP's internal encoding.
    echo mb_convert_encoding($row, 'UTF-16LE');
}
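
For comparison, a self-contained sketch of the same idea that builds the output as a string first. It assumes each row is an already formatted line of UTF-8 text, and it uses `iconv()` in place of `mb_convert_encoding()` so the source encoding is explicit. The helper name `toExcelUtf16` and the sample row are made up for illustration.

```php
<?php
// Hypothetical helper: take UTF-8 lines and return them as UTF-16LE bytes,
// preceded by the UTF-16LE byte order mark (FF FE).
function toExcelUtf16(array $rows): string
{
    $out = "\xFF\xFE";
    foreach ($rows as $row) {
        // Source encoding is assumed to be UTF-8.
        $converted = iconv('UTF-8', 'UTF-16LE', $row);
        if ($converted === false) {
            throw new RuntimeException('Row is not valid UTF-8');
        }
        $out .= $converted;
    }
    return $out;
}

// One tab-separated sample line; every character becomes two bytes.
$bytes = toExcelUtf16(["a\tb\r\n"]);
```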

0

The old encoding comes first, as in the iconv() function. You also can't read from and write to the same file.

    $input = fopen($path, 'r');
    $output = fopen($path . '.tmp', 'w');
    stream_filter_append($input, 'convert.iconv.OLDENCODING/UTF-8');
    stream_copy_to_stream($input, $output);
    fclose($input);
    fclose($output);
    unlink($path);
    rename($path . '.tmp', $path);
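
Since the question involves thousands of files, the steps above could be wrapped in a function and applied in a loop. This is only a sketch: the function name `convertFileToUtf8`, the `export` directory, and the Windows-1252 source encoding are assumptions, not details from the question.

```php
<?php
// Hypothetical helper: convert one file to UTF-8 through a temporary
// file, then replace the original. Returns false on any failure.
function convertFileToUtf8(string $path, string $fromEncoding): bool
{
    $tmp = $path . '.tmp';
    $input = fopen($path, 'r');
    if ($input === false) {
        return false;
    }
    $output = fopen($tmp, 'w');
    if ($output === false) {
        fclose($input);
        return false;
    }
    // Source encoding first, target encoding second.
    stream_filter_append($input, 'convert.iconv.' . $fromEncoding . '/UTF-8');
    $copied = stream_copy_to_stream($input, $output);
    // Close both handles on every path so no descriptors leak.
    fclose($input);
    fclose($output);
    if ($copied === false) {
        unlink($tmp);
        return false;
    }
    // rename() overwrites an existing destination file.
    return rename($tmp, $path);
}

// Assumed layout: every .txt file in ./export is Windows-1252.
foreach (glob(__DIR__ . '/export/*.txt') ?: [] as $file) {
    if (!convertFileToUtf8($file, 'Windows-1252')) {
        error_log("Could not convert $file");
    }
}
```

Running this twice on the same directory would convert the already converted files again and corrupt any accented characters, so it should only be applied to files known to be in the old encoding.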
