
I'm trying to sort the characters of a UTF-8 string alphabetically. The result contains unknown characters, and I don't know why. The same thing happens with both usort() and sort().

setlocale(LC_COLLATE, 'ro_RO.UTF-8');

$word = 'ÎABAȚÂIEȘĂ';
$chars = str_split($word);

echo 'Word: ' . $word . "\n";

//sort($chars, SORT_LOCALE_STRING);

usort($chars, function($a, $b){
    echo 'Comparing: ' . $a . ' and ' . $b . "\n";
    return strcoll($a, $b);
});

echo 'Result: ' . implode($chars) . "\n";

Command line example: http://s18.postimg.org/avqfhetsp/test.gif

1 Answer

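The problem is not caused by the comparison or the sorting, but by the str_split() function. str_split() splits the string into single bytes, so every multibyte UTF-8 character (such as Î or Ț) is broken into separate bytes that are no longer valid characters on their own; these are the unknown characters in the output. PHP versions before 7.4 have no multibyte version of this function (mb_str_split() was added in PHP 7.4), so there you should use mb_split() or preg_split() with the u modifier for this purpose instead.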
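A minimal sketch of the preg_split() approach, assuming the source file is saved as UTF-8 and the ro_RO.UTF-8 locale is installed on the system (if it is not, setlocale() returns false and strcoll() compares using whatever locale is active). The variable name $sorted is only illustrative.

```php
<?php
$word = 'ÎABAȚÂIEȘĂ';

// Split on character boundaries instead of bytes: the empty pattern matches
// between every character, and the u modifier makes PCRE treat the subject
// as UTF-8. PREG_SPLIT_NO_EMPTY drops the empty strings at both ends.
$chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);

// Assumption: this locale exists on the machine running the script.
setlocale(LC_COLLATE, 'ro_RO.UTF-8');

// Sort a copy using locale-aware comparison, as in the question.
$sorted = $chars;
usort($sorted, function ($a, $b) {
    return strcoll($a, $b);
});

echo 'Word: ' . $word . "\n";
echo 'Result: ' . implode('', $sorted) . "\n";
```

On PHP 7.4 or later, mb_str_split($word) can replace the preg_split() call. The resulting order still depends on the collation rules of the locale that is actually in effect.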
