
I'm trying to test a function in which my code converts Windows-1252 text to UTF-8. Example as follows:

function test($article){
    $result = mb_convert_encoding($article[0]['Description'], "UTF-8", "Windows-1252");
    return $result;
}

I'm trying to pass in a Windows-1252 string and assert on the converted result, but it's not working.

My unit test:

$convertedArray = array(array('Description' => "an example pain— if you’re"));
$someString = $this->getMockBuilder('\Client')
    ->setMethods(['getArticle'])
    ->getMock();
$someString->expects($this->once())
    ->method('getArticle')
    ->with('12345')
    ->will($this->returnValue($convertedArray));

\client::set($someString);

Or, put simply: I'm trying to input $str = "an example pain— if you’re"; and expect the function to convert it to UTF-8 and return "an example pain— if you’re". How can I do that?

I'm getting the following error:

--- Expected
+++ Actual
@@ @@
 Array (
-    'record' => 'an example pain— if you’re'
+    'record' => 'an example pain� if you’re'
 )
  • If you create a file/fixture with the text and save it with Win1252 encoding, then do a file_get_contents() and proceed with the unit test? Commented May 29, 2018 at 14:25
  • @Loek will try that! Thanks Commented May 29, 2018 at 16:51
  • I tried with a file, but I haven't tried reading it with file_get_contents(); I used a file read operation instead. Commented May 29, 2018 at 16:52
  • Cool! I remember file_get_contents() to be binary safe so it should work. If not, let me know and I'll search further :) Commented May 29, 2018 at 16:54
  • I feel the problem is caused by the Array() :( When we add the value to the array it gets converted to a string, so I'm not able to convert it back. If possible, try my code and see if it works for an array. Thanks. Commented May 29, 2018 at 17:10

2 Answers


If you want to guarantee an encoding for your test strings do the following:

  1. Make sure you know what encoding you're writing the code in, e.g. UTF-8.
    • This will be in your editor settings.
  2. Convert your test string from that encoding to your target.
    • $test_1252 = mb_convert_encoding($test_utf8, 'cp1252', 'utf-8');
  3. Encode the test string in something 7-bit-safe, like base64.
    • echo base64_encode($test_1252);

Now you have a string that you can safely copy/paste in/out of whatever file you want while maintaining its encoding.

For example:

$test_utf8 = "an example pain— if you’re";
$test_1252 = mb_convert_encoding($test_utf8, 'cp1252', 'utf-8');

var_dump(
    $test_utf8,
    $test_1252,
    bin2hex($test_utf8),
    bin2hex($test_1252),
    base64_encode($test_utf8),
    base64_encode($test_1252)
);

Output:

string(30) "an example pain— if you’re"
string(26) "an example pain� if you�re"
string(60) "616e206578616d706c65207061696ee2809420696620796f75e280997265"
string(52) "616e206578616d706c65207061696e9720696620796f75927265"
string(40) "YW4gZXhhbXBsZSBwYWlu4oCUIGlmIHlvdeKAmXJl"
string(36) "YW4gZXhhbXBsZSBwYWlulyBpZiB5b3WScmU="
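
As a rough sketch of how the base64 string above could be used in a test: decode it back to the raw Windows-1252 bytes, run the conversion, and compare against UTF-8 text written with explicit byte escapes, so the result does not depend on how the test file itself is saved. The function name convertDescription() is hypothetical and only stands in for the function under test; a plain comparison is used instead of PHPUnit to keep the sketch self-contained.

```php
<?php
// Hypothetical stand-in for the function under test (the name is an assumption).
function convertDescription(array $article): string
{
    return mb_convert_encoding($article[0]['Description'], 'UTF-8', 'Windows-1252');
}

// Windows-1252 fixture, stored as base64 (copied from the output above).
$fixture1252 = base64_decode('YW4gZXhhbXBsZSBwYWlulyBpZiB5b3WScmU=');

// Expected UTF-8 text written with byte escapes:
// \xE2\x80\x94 is the em dash, \xE2\x80\x99 is the right single quotation mark.
$expectedUtf8 = "an example pain\xE2\x80\x94 if you\xE2\x80\x99re";

// Same array shape as the mocked getArticle() return value in the question.
$actual = convertDescription([['Description' => $fixture1252]]);

var_dump($actual === $expectedUtf8);
```

In a PHPUnit test the final comparison would presumably become an assertSame() call, with the decoded fixture fed in through the mock's return value.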

1 Comment

Answers like this make me wish there was a 'Save for later reference' button on SO :)

Glad I could help! Answer for reference:

It seems the encoding arguments to mb_convert_encoding() were swapped, unfortunately.

// Change this
$result = mb_convert_encoding($article[0]['Description'], "UTF-8", "Windows-1252");

// To this
$result = mb_convert_encoding($article[0]['Description'], "Windows-1252", "UTF-8");

See your expected working code in action here.
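
For reference, the argument order is mb_convert_encoding($string, $to_encoding, $from_encoding), so which call is right depends on the encoding the input bytes are actually in. A minimal sketch showing both directions on the same text (the variable names are illustrative, and the UTF-8 input is written with byte escapes so it does not depend on the file's own encoding):

```php
<?php
// UTF-8 input: \xE2\x80\x94 is the em dash.
$utf8 = "pain\xE2\x80\x94";

// UTF-8 -> Windows-1252: the second argument is the target, the third is the source.
$cp1252 = mb_convert_encoding($utf8, 'Windows-1252', 'UTF-8');

// Windows-1252 -> UTF-8: converting back restores the original bytes.
$roundTrip = mb_convert_encoding($cp1252, 'UTF-8', 'Windows-1252');

var_dump(bin2hex($utf8));       // three bytes for the dash
var_dump(bin2hex($cp1252));     // one byte for the dash
var_dump($roundTrip === $utf8);
```

Passing text that is already UTF-8 through the Windows-1252 to UTF-8 direction encodes it a second time, which is one way to end up with unexpected characters in an assertion.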
