2

I'm trying to write a function that does the job of number_format() for non-ASCII numbers, specifically Perso-Arabic numerals.

First I have to convert the digits, which leaves me with a string of non-ASCII characters:

$n = 133;
$n = exchange($n);
echo $n ;
//result : ١٣٣

The problem is that when I add the commas to the number (or rather, the string), my final result contains some � characters.

Here is the function that I use to add the commas:

    static public function addcomma($number)
    {
        $i = strlen($number) - 1;
        $c = 0;

        for ($i; $i >= 0; $i--) {
            $c++;

            if ($c == 1)
                $y = mb_substr($number, $i, 1);
            else
                $y .= mb_substr($number, $i, 1);

            if ($c % 3 == 0 && $i != 0)
                $y .= ',';
        }
        $y = strrev($y);
        return $y;
    }

And this is the result for $n = ١٣٣:

٣,٣�,�١

1
  • What place do you need to put the comma? And what encoding is that string? Commented Apr 3, 2012 at 3:12

2 Answers

3

Some of your characters (likely all of them) are stored in more than one byte, unlike regular ASCII strings. So you have to use multibyte string functions to manipulate the strings. You can't use strlen, substr, or strrev (or any other regular string function), and you can't just treat the string as an array. You have to change some sections of your code, like this:

$i = mb_strlen($number)-1;
// (...)
$y = mb_substr($number, $i, 1);

There is no multibyte equivalent of strrev, so you can try this (suggested in a comment on the strrev manual page):

// strrev won't work
// $y = strrev($y); 
$y = join("", array_reverse(preg_split("//u", $y)));

The above will split the string into an array, respecting the multibyte boundaries (note the u at the end of the regex), reverse that array, and then join it back to a string.
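Putting these changes together, a minimal sketch of the whole function might look like the following. It assumes the input is a UTF-8 string of digits and passes the encoding explicitly to every mb_* call. The names mb_strrev_utf8 and add_comma are made up for this illustration and are not part of the original code.

```php
<?php
// Hypothetical helper: reverse a UTF-8 string character by character.
function mb_strrev_utf8($s)
{
    // The "u" modifier makes the split respect multibyte boundaries;
    // PREG_SPLIT_NO_EMPTY drops the empty pieces at both ends.
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    return implode('', array_reverse($chars));
}

// Hypothetical rewrite of addcomma() using only multibyte-safe calls.
// Assumes $number is a UTF-8 string of digits.
function add_comma($number)
{
    $y = '';
    $c = 0;
    // Walk the characters from right to left, as the original loop does.
    for ($i = mb_strlen($number, 'UTF-8') - 1; $i >= 0; $i--) {
        $c++;
        $y .= mb_substr($number, $i, 1, 'UTF-8');
        // Add a separator after every third character, except at the front.
        if ($c % 3 == 0 && $i != 0) {
            $y .= ',';
        }
    }
    // $y was built backwards, so reverse it by characters, not bytes.
    return mb_strrev_utf8($y);
}

echo add_comma('١٣٣'), "\n";      // ١٣٣
echo add_comma('١٢٣٤٥٦٧'), "\n";  // ١,٢٣٤,٥٦٧
```

Initializing $y to an empty string also removes the need for the special case on the first iteration.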


3 Comments

Sorry, I was wrong: it's not just the original string that breaks. When I add the commas, those characters appear in the result too. The string comes out fine if I don't add the commas. Any thoughts on that?
@max, see my updated answer. You can't use regular string functions anywhere when dealing with those multibyte strings.
Add the encoding ('UTF-8') to the code and it will work as expected. If you check $i from mb_strlen($number)-1, you'll see that it returns 5 instead of 2. So use mb_strlen($number, 'UTF-8') to get the correct length, and pass the encoding to every mb function, because PHP may guess wrong.

3

Your Arabic string (i.e., whatever you get from exchange()) is very likely encoded in UTF-8, or at any rate in some format that is not one byte per character. As soon as you begin twiddling with the string as an array of bytes (which is how PHP treats it), you break the UTF-8 sequences, and the result comes out with those funny question marks when it's printed to the screen (also make sure your document encoding is set to UTF-8).

Depending on the version of PHP, you'll need to use the mbstring (mb_*) functions to work with multibyte strings, which is what you have.
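To make the byte-versus-character distinction concrete, here is a small sketch. It assumes the string is UTF-8 and that the mbstring extension is loaded; the digits are written as explicit byte escapes so the encoding is unambiguous. It shows that each of these digits occupies two bytes and that a byte-wise strrev() leaves invalid UTF-8 behind.

```php
<?php
// UTF-8 bytes of U+0661 U+0663 U+0663 (Arabic-Indic digits 1, 3, 3)
$n = "\xD9\xA1\xD9\xA3\xD9\xA3";

echo strlen($n), "\n";              // 6: counts bytes
echo mb_strlen($n, 'UTF-8'), "\n";  // 3: counts characters
echo bin2hex($n), "\n";             // d9a1d9a3d9a3

// strrev() reverses bytes, so every two-byte sequence ends up backwards.
$broken = strrev($n);
echo bin2hex($broken), "\n";        // a3d9a3d9a1d9
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
```

The reversed string starts with a continuation byte, which is why the browser shows replacement characters instead of digits.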

4 Comments

Sorry, I was wrong: it's not just the original string that breaks. When I add the commas, those characters appear in the result too. The string comes out fine if I don't add the commas. Any thoughts on that?
Again, it's because UTF-8 is not just an 8-bit character string. Depending on the code point, a UTF-8 character takes anywhere from one to four bytes. Mixing multibyte UTF-8 characters with simple 8-bit characters, under a display encoding of UTF-8, could cause the browser to misprint your commas, thinking they are part of a UTF-8 character.
Thanks. I tried to force the commas to UTF-8 with utf8_encode before adding them to the string, but it didn't work. I think I have to use an array to hold each character and then convert it to a string at the end.
The problem is not encoding commas or anything - you cannot access the string as an array. You must use the mb_string functions, and only those functions for manipulating the string.
