0

I'm getting an error executing the following code:

use JSON;
use Encode qw( encode decode encode_utf8 decode_utf8);
my $arr_features_json = '[{"family":"1","id":107000,"unit":"","parent_id":"0","cast":"2","search_values_range":"1,2,3,4,5,6,7,8,9,10,11,12","category_id":"29","type":"2","position":"3","name":"Número de habitaciones","code":"numberofrooms","locales":"4","flags":"1"}]';
$arr_features_json = decode_json( $arr_features_json );

The following is the error I get:

malformed UTF-8 character in JSON string, at character offset 169 (before "\x{fffd} de habitaci...") at test.pl line 13.

decode_json raises the error because of the ú character in the JSON, so I want to convert this character to \u00fa. How can I do that?

0

2 Answers

2

decode_json expects UTF-8, but the string you have isn't encoded using UTF-8. Decode the string if it isn't decoded already, then use from_json instead of decode_json, since from_json takes a string of decoded characters.

#!/usr/bin/perl

use strict;
use warnings;
use feature qw( say );

use utf8;                             # Perl code is encoded using UTF-8.
use open ':std', ':encoding(UTF-8)';  # Terminal provides/expects UTF-8.

use JSON qw( from_json );

my $features_json = '
  [
    {
      "family": "1",
      "id": 107000,
      "unit": "",
      "parent_id": "0",
      "cast": "2",
      "search_values_range": "1,2,3,4,5,6,7,8,9,10,11,12",
      "category_id": "29",
      "type": "2",
      "position": "3",
      "name": "Número de habitaciones",
      "code": "numberofrooms",
      "locales": "4",
      "flags": "1"
    }
  ]
';

my $features = from_json( $features_json );

say $features->[0]{name};
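
To make the distinction concrete, here is a minimal sketch (not part of the original answer) that parses the same JSON both ways. It assumes a shortened, hypothetical one-key document and writes ú as the escape "\x{FA}" so the result does not depend on how the source file is saved. The last step uses the JSON module's ascii option, which is one way to get the \u00fa escape the question asks about.

```perl
use strict;
use warnings;

use JSON   qw( decode_json from_json );
use Encode qw( encode_utf8 );

# Character string: "\x{FA}" is the single character U+00FA (ú).
my $chars = qq({"name":"N\x{FA}mero"});

# from_json takes decoded characters as they are.
my $from_chars = from_json($chars);

# decode_json wants UTF-8 octets, so encode first.
my $bytes      = encode_utf8($chars);
my $from_bytes = decode_json($bytes);

print "same result\n" if $from_chars->{name} eq $from_bytes->{name};

# The ascii option escapes every non-ASCII character on output.
my $escaped = JSON->new->ascii->encode($from_chars);
print "$escaped\n";
```

Either route yields the same Perl data; which one applies depends only on whether the input is already decoded.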

1

The error says that the string you are trying to process is not UTF-8, or is malformed UTF-8. So you need to encode it as UTF-8 with encode_utf8 before passing it to decode_json.

use JSON;
use Data::Dumper;
use Encode qw( encode decode encode_utf8 decode_utf8);

my $arr_features_json = '[{"family":"1","id":107000,"unit":"","parent_id":"0","cast":"2","search_values_range":"1,2,3,4,5,6,7,8,9,10,11,12","category_id":"29","type":"2","position":"3","name":"Número de habitaciones","code":"numberofrooms","locales":"4","flags":"1"}]';
my $arr_features = decode_json( encode_utf8($arr_features_json) );

print Dumper($arr_features);

You should probably read this article to understand the difference between UTF-8 strings and character strings.
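
As a rough illustration of that difference (a sketch, assuming a short sample word rather than the full JSON above), compare the lengths of a character string, its UTF-8 encoding, and an accidentally double-encoded copy:

```perl
use strict;
use warnings;

use Encode qw( encode_utf8 decode_utf8 );

my $chars = "N\x{FA}mero";          # 6 characters, ú is U+00FA
my $bytes = encode_utf8($chars);    # 7 octets, ú becomes \xC3\xBA
my $twice = encode_utf8($bytes);    # 9 octets, encoded a second time by mistake

printf "%d %d %d\n", length($chars), length($bytes), length($twice);

# Decoding once restores the original characters.
my $round_trip = decode_utf8($bytes);
print "round trip ok\n" if $round_trip eq $chars;
```

Calling encode_utf8 on a string that already holds UTF-8 octets produces the double-encoded form, so it helps to know which kind of string you hold before encoding.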

3 Comments

This is backwards. decode_json expects a UTF-8 string (octets) but the input contains a "wide character". encode_utf8 converts a string that is not UTF-8 encoded into a string that is. (Well, it can also convert a string that is already UTF-8 encoded into one that is doubly encoded)
The code is right, but both sentences of the explanation are backwards.
Thanks @ikegami, corrected. I didn't notice it before.
