Encode or decode a string from MySQL using PHP

Question

In my db I have a field value looking like this:

ÎœÎ‘ÎšÎ‘Î¡Î™ÎŸÎ¥ Î“\'

I think these are Greek characters that were inserted before I had set my database to UTF-8 (I believe I was using the default, Latin-1).

Is there a way to get the actual characters?

Thank you

If this is UTF-8 that's stored inside a latin1 column, you could use utf8_decode() to bring the original encoding back.
– Ja͢ck, Commented Jan 15, 2013 at 8:41
@Jack I don't remember anymore. I think my db was in latin1, and 99% of the data inserted is Greek characters. None of my attempts to convert this back has produced any result.
– Pavlos1316, Commented Jan 15, 2013 at 9:32

Dino Babu · Accepted Answer · 2013-01-15 07:56:30Z

2

I'm not sure, but try this:

$str = "ÎœÎ‘ÎšÎ‘Î¡Î™ÎŸÎ¥ Î“\'";
$val = iconv(mb_detect_encoding($str), "UTF-8", $str);
echo $val;


2 Comments

Pavlos1316 Over a year ago

I get back exactly the same string
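
A likely reason the string comes back unchanged, sketched under the assumption that PHP received the damaged text as UTF-8: the garbled characters are themselves well-formed UTF-8, so detection reports UTF-8 and the call becomes a UTF-8 to UTF-8 conversion. The two sample characters and the explicit detection list are illustrative assumptions, not something stated in the thread.

```php
<?php
// First two garbled characters from the question ("Îœ"), written as code
// points so the example does not depend on this file's own encoding.
$str = "\u{00CE}\u{0153}";

// The garbled text is well-formed UTF-8...
$isUtf8 = mb_check_encoding($str, "UTF-8");

// ...so a detector limited to ASCII and UTF-8 (an assumed list) reports UTF-8,
$detected = mb_detect_encoding($str, ["ASCII", "UTF-8"], true);

// and converting UTF-8 to UTF-8 leaves every byte as it was.
$val = iconv($detected, "UTF-8", $str);

var_dump($isUtf8, $detected, $val === $str);
```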

Pavlos1316 Over a year ago

This fixed my problem. Since UTF-8 wasn't working, I tried all the charsets one by one until I hit the right one. I don't know how it happened, but instead of UTF-8 the charset was "windows-1252". Thank you.
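
The comment does not show the exact call that worked, so the following is only a sketch of one conversion consistent with it. It assumes the stored text is UTF-8 that was misread as Windows-1252 and then saved as UTF-8 again. The helper name `repair_cp1252_mojibake` and the sample text are hypothetical.

```php
<?php
// Hypothetical helper: undo "UTF-8 bytes misread as Windows-1252".
// Re-encoding the garbled text as Windows-1252 yields the original bytes.
function repair_cp1252_mojibake(string $garbled): string
{
    $bytes = iconv("UTF-8", "Windows-1252", $garbled);
    // iconv() returns false when a character has no Windows-1252 equivalent;
    // in that case the input is returned untouched.
    return $bytes === false ? $garbled : $bytes;
}

// Assumed sample: Greek capital letters as UTF-8, written as code points.
$original = "\u{039C}\u{0391}\u{039A}\u{0391}\u{03A1}\u{0399}\u{039F}\u{03A5} \u{0393}";

// Reproduce the damage, then reverse it.
$garbled  = iconv("Windows-1252", "UTF-8", $original);
$repaired = repair_cp1252_mojibake($garbled);

var_dump($repaired === $original);
```

If this matches the data, it would be safer to try the conversion on a copy of the table before writing repaired values back.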

DWright · Answer · 2013-01-15 08:08:19Z

0

Try saving the data into a text file and opening the text file in a hex editor (there are a bunch of good free ones). That could show you the underlying code values of the letters, which you could then match against published encodings.

For example, this page lists Unicode code points for Polytonic Greek characters (not sure you were using Polytonic, though): http://leb.net/reader/text/standards/unicode/old/MappingTables/NewTables/Polytonic_Greek.txt.

Looking at the text with a hex editor will help you to get code values to look up in lookup tables like this.
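
If a hex editor is not at hand, roughly the same inspection can be done from PHP itself. This is a minimal sketch; the helper name `hex_bytes` and the sample strings are hypothetical, and real input would be the value fetched from the database.

```php
<?php
// Hypothetical helper: show the raw bytes of a string as spaced hex pairs.
function hex_bytes(string $bytes): string
{
    return strtoupper(implode(" ", str_split(bin2hex($bytes), 2)));
}

// The garbled pair "Îœ" (U+00CE U+0153) encoded as UTF-8.
echo hex_bytes("\u{00CE}\u{0153}"), "\n"; // C3 8E C5 93

// The Greek capital Mu (U+039C) encoded as UTF-8.
echo hex_bytes("\u{039C}"), "\n"; // CE 9C
```

Dumping the bytes inside PHP also avoids the extra encoding step a text editor may apply when the file is saved.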


5 Comments

Pavlos1316 Over a year ago

I tried, but the hex I got is: FFFECE005301CE001820CE006101CE00A100CE002221CE007801CE00A5002000CE001C205C002700 How do I convert this?

DWright Over a year ago

So those first two bytes, FFFE, mean that your file is encoded in little-endian UTF-16. See the discussion here: en.wikipedia.org/wiki/Byte_order_mark. The next two bytes are CE00, which in little endian is the value 00CE, i.e. just CE. When I look at the standard Unicode Greek chart, unicode.org/charts/PDF/U0370.pdf, CE doesn't seem to fit into the Greek range. So perhaps this is reflecting a different original code page, such as Windows-1253?
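
The byte-order reading described here can be sketched in PHP. The helper name `utf16le_dump_to_units` is hypothetical; the hex string is the one quoted in the earlier comment. The dump contains no surrogate pairs, so each 16-bit unit is one code point.

```php
<?php
// Hex dump quoted in the earlier comment.
$hex = "FFFECE005301CE001820CE006101CE00A100CE002221CE007801CE00A5002000CE001C205C002700";

// Hypothetical helper: list the 16-bit units of a UTF-16LE dump that starts
// with a byte order mark.
function utf16le_dump_to_units(string $hex): array
{
    $bytes = hex2bin($hex);
    if (substr($bytes, 0, 2) !== "\xFF\xFE") {
        throw new InvalidArgumentException("expected the UTF-16LE byte order mark FF FE");
    }
    // 'v*' reads unsigned 16-bit little-endian integers.
    return array_values(unpack("v*", substr($bytes, 2)));
}

$units = utf16le_dump_to_units($hex);

// Prints the units as U+XXXX; the list starts U+00CE U+0153 U+00CE U+2018.
echo implode(" ", array_map(fn(int $u): string => sprintf("U+%04X", $u), $units)), "\n";
```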

DWright Over a year ago

BTW, 00CE is quite regular in this stream. Interesting. Seems like it's every other two-byte pair!

DWright Over a year ago

Looking 00CE up in the Unicode charts by code here, unicode.org/charts, steers us to unicode.org/charts/PDF/U0080.pdf, in which 00CE really is an I with a circumflex, which is showing up in your snippet of text. But what was the significance of CE in your original encoding? Did you originally encode, perhaps, in Windows-1253, en.wikipedia.org/wiki/Windows-1253, in which CE is a capital Xi? Or something else? We'll have to figure that out.

DWright Over a year ago

Setting aside the significance of 00CE, what are the intervening byte pairs, the first three of which are 0153, 2018, 0161, etc.? Let's look those up at unicode.org/charts . . . I'm out of time for now, but maybe you'll find something interesting.
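
One way to follow up on this lookup, offered as a sketch rather than a confirmed diagnosis: the asker reported under the other answer that the charset turned out to be windows-1252. Under that assumption, each code point in the dump can be turned back into its single Windows-1252 byte and the resulting byte stream inspected.

```php
<?php
// Hex dump quoted in the earlier comment.
$hex = "FFFECE005301CE001820CE006101CE00A100CE002221CE007801CE00A5002000CE001C205C002700";

// Drop the 2-byte byte order mark, then re-encode: every code point becomes
// the single byte it occupies in Windows-1252 (U+00CE -> CE, U+0153 -> 9C).
$utf16 = substr(hex2bin($hex), 2);
$bytes = iconv("UTF-16LE", "Windows-1252", $utf16);

// Prints CE9CCE91CE9ACEA1CE99CE9FCEA520CE935C27
echo strtoupper(bin2hex($bytes)), "\n";

// CE followed by a byte in 80..BF is a two-byte UTF-8 sequence inside the
// Greek block, so the stream can be checked as UTF-8 (the /u flag makes
// preg_match fail on malformed UTF-8).
$validUtf8 = preg_match('//u', $bytes) === 1;
var_dump($validUtf8);
```

Read this way, the recurring 00CE is the lead byte of two-byte UTF-8 sequences, and the intervening values are the trail bytes as Windows-1252 displays them.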
