
I have 12 Excel files, each one with lots of data organized in 2 fields (columns): id and text.

Each Excel file uses a different language for the text field: Spanish, Italian, French, English, German, Arabic, Japanese, Russian, Korean, Chinese, Japanese and Portuguese.

The id field is a combination of letters and numbers.

I need to import every excel into a different MySQL table, so one table per language.

I'm trying to do it the following way:

- Save the Excel file as a CSV file
- Import that CSV in phpMyAdmin

The problem is that I'm running into all sorts of issues and can't get them to import properly, probably because of encoding problems.

For example, with the Arabic one, I set everything to UTF-8 (the database table field and the CSV file), but when I do the import, I get weird characters instead of the normal Arabic ones (if I copy them manually, they show fine).
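A minimal sketch (in Python, with an assumed sample string) of what those "weird characters" usually are: correct UTF-8 bytes being decoded with the wrong encoding somewhere along the import path:

```python
# Sketch of the mojibake symptom: correct UTF-8 bytes decoded with
# the wrong encoding. The sample string is an assumption.
text = "مرحبا"  # Arabic for "hello"

raw = text.encode("utf-8")       # bytes as stored in a UTF-8 CSV
wrong = raw.decode("latin-1")    # what a latin-1 import produces
right = raw.decode("utf-8")      # what a UTF-8 import produces

print(wrong)           # garbled mojibake, not Arabic
print(right == text)   # True
```

If the imported data looks like this mojibake, some step (the file, the connection, or the table) is not actually treating the bytes as UTF-8.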

Another problem is that some texts contain commas, and since the CSV file also uses commas to separate fields, the imported texts get truncated wherever there's a comma.
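This symptom can be sketched with Python's csv module, which quotes fields containing the delimiter so they survive a round trip (the id value is made up):

```python
import csv
import io

# Sketch: csv.writer quotes fields that contain the delimiter,
# so embedded commas survive a write/read round trip.
rows = [["AB12", "johnny can't come out, can he?"]]  # made-up id

buf = io.StringIO()
csv.writer(buf).writerows(rows)
print(buf.getvalue())  # AB12,"johnny can't come out, can he?"

buf.seek(0)
assert list(csv.reader(buf)) == rows  # fields intact despite the comma
```

Truncation on import usually means the CSV was written without this quoting, or the importer was not told which quote character to expect.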

Another problem is that, when saving as CSV, the characters get messed up (like in the Chinese one), and I can't find an option to tell Excel what encoding to use for the CSV file.
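The Chinese case can be illustrated in Python: an ANSI code page simply has no code points for CJK characters, so a plain ANSI CSV export cannot hold them (cp1252 is an assumption about the Windows system code page):

```python
# Sketch: an ANSI code page (cp1252 assumed here) cannot encode CJK
# characters at all, which is why a plain ANSI CSV export mangles them.
try:
    "中文".encode("cp1252")
    print("encoded fine")          # never reached for CJK text
except UnicodeEncodeError as err:
    print("cp1252 cannot represent:", err.object)
```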

Is there any "protocol" or "rule" that I can follow to make sure that I do it the right way? Something that works for each different language? I'm trying to pay attention to the character encoding, but even with that I still get weird stuff.

Maybe I should try a different method instead of CSV files?

Any advice would be much appreciated.

2 Answers


OK, how did I solve all my issues? FORGET ABOUT EXCEL!!!

I uploaded the Excel files to Google Docs spreadsheets, downloaded them as CSV, and all the characters were perfect.

Then I just imported each CSV into the corresponding fields of the tables, using the utf8_general_ci collation, and now everything is stored perfectly in the database.
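For reference, the same import can also be sketched directly in MySQL, bypassing phpMyAdmin's wizard. The table, column, and file names below are illustrative, not from the original post (on modern MySQL, utf8mb4 is the safer character set):

```sql
-- Hedged sketch: one table per language, everything forced to UTF-8.
-- texts_ar and arabic.csv are made-up names.
CREATE TABLE texts_ar (
    id   VARCHAR(32) NOT NULL PRIMARY KEY,
    text TEXT NOT NULL
) CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Import the CSV, telling MySQL the file's encoding explicitly:
LOAD DATA LOCAL INFILE 'arabic.csv'
INTO TABLE texts_ar
CHARACTER SET utf8
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(id, text);
```

The explicit CHARACTER SET clause is the piece phpMyAdmin's wizard sometimes gets wrong: it tells MySQL how to interpret the file's bytes regardless of the connection's default.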


2 Comments

wow, I was trying to resolve this for the last 2 hours and this simple solution worked. Thanks
Nov 2019 and this is still the best approach (after hours trying other methods)

One standard thing to do in a CSV is to enclose fields containing commas with double quotes. So

ABC, johnny can't come out, can he?, newfield

becomes

ABC, "johnny can't come out, can he?", newfield

I believe Excel does this if you choose to save as file type CSV. A problem you'll have is that Excel's CSV export is ANSI-only. I think you need to use the "Unicode Text" save-as option and live with the tab delimiters, or convert them to commas. The Unicode Text option also quotes comma-containing values. (Checked using Excel 2007.)

EDIT: Add specific directions

In Excel 2007 (the specifics may be different for other versions of Excel)

Choose "Save As"

In the "Save as type:" field, select "Unicode Text"

[screenshot of the Save As dialog]

You'll get a Unicode file. UCS-2 Little Endian, specifically.

3 Comments

@DaceE I solved the problem of the commas by changing the separator to semicolons (;), in the Windows Control Panel -> Language Options. However, I still have problems with the encoding of languages with special characters. For example, when I export the Arabic ones to CSV, all I get are ????? instead of the Arabic characters, and I don't know how to tell Excel what encoding to use.
@Albert - you cannot use CSV for non-ANSI characters; you need to use the Unicode Text (tab-delimited) export type for a flat-file output. (See my answer edit for an example.)
thank you very much for your reply, I managed to make it work with Google Docs. I wonder why the CSV it provides has all the characters right...
