I have a JSON array of objects in a .json file. I can read the content of the file using file_get_contents: $str = file_get_contents($jsonFile); However, when I call json_decode on the content, I just get null as the result. Below is some of the content from the .json file:

[
{
    "accreditation": false,
    "category.en": "Administration and Management",
    "category.fr": "Administration et gestion",
    "clientele.en": null,
    "clientele.fr": null,
    "courseid": 11749,
    "duree": "",
    "dureeminutes": 0,
    "establishmentaltname": "06-ciusss-cusm",
    "establishmentfullname": "Centre universitaire de santé McGill",
    "fcpresponsable": "",
    "idnumber": "",
    "idnumberalt": "",
    "imgurl": null,
    "ispartageable": false,
    "keywords": null,
    "lastupdate": 0,
    "modalite.en": "In Person",
    "modalite.fr": "En présentiel",
    "nombreinscriptions": 1,
    "parentestablishmentfullname": "Territoire CUSM",
    "parentestablishmentshortname": "CUSM-FCP",
    "partageable": "Locale",
    "shortname.en": "Formation Context 04072022 12h41",
    "shortname.fr": "Formation Context 04072022 12h41",
    "summary.en": "",
    "summary.fr": "",
    "title.en": "Formation Context 04072022 12h41",
    "title.fr": "Formation Context 04072022 12h41",
    "visible": false
},
{
    "accreditation": false,
    "category.en": "Administration and Management",
    "category.fr": "Administration et gestion",
    "clientele.en": null,
    "clientele.fr": null,
    "courseid": 11748,
    "duree": "",
    "dureeminutes": 0,
    "establishmentaltname": "06-ciusss-cusm",
    "establishmentfullname": "Centre universitaire de santé McGill",
    "fcpresponsable": "",
    "idnumber": "",
    "idnumberalt": "",
    "imgurl": null,
    "ispartageable": false,
    "keywords": null,
    "lastupdate": 0,
    "modalite.en": "In Person",
    "modalite.fr": "En présentiel",
    "nombreinscriptions": 1,
    "parentestablishmentfullname": "Territoire CUSM",
    "parentestablishmentshortname": "CUSM-FCP",
    "partageable": "Locale",
    "shortname.en": "Formation Contexte 040722 08h51m",
    "shortname.fr": "Formation Contexte 040722 08h51m",
    "summary.en": "",
    "summary.fr": "",
    "title.en": "Formation Contexte 040722 08h51m",
    "title.fr": "Formation Contexte 040722 08h51m",
    "visible": true
},
{
    "accreditation": false,
    "category.en": "Administration and Management",
    "category.fr": "Administration et gestion",
    "clientele.en": null,
    "clientele.fr": null,
    "courseid": 11747,
    "duree": "",
    "dureeminutes": 0,
    "establishmentaltname": "06-ciusss-cusm",
    "establishmentfullname": "Centre universitaire de santé McGill",
    "fcpresponsable": "",
    "idnumber": "",
    "idnumberalt": "",
    "imgurl": null,
    "ispartageable": false,
    "keywords": null,
    "lastupdate": 0,
    "modalite.en": "In Person",
    "modalite.fr": "En présentiel",
    "nombreinscriptions": 1,
    "parentestablishmentfullname": "Territoire CUSM",
    "parentestablishmentshortname": "CUSM-FCP",
    "partageable": "Locale",
    "shortname.en": "Formation Contexte 04072022",
    "shortname.fr": "Formation Contexte 04072022",
    "summary.en": "",
    "summary.fr": "",
    "title.en": "Formation Contexte 04072022",
    "title.fr": "Formation Contexte 04072022",
    "visible": false
}]

How can I convert it into a valid JSON string or an array in PHP? The JSON is valid, but after I read it with file_get_contents, line breaks and \n appear in it, as shown here: https://3v4l.org/5Zd7O. Below is a snippet of my code:

$str = file_get_contents('jsondump.json');
var_dump(gettype($str));
var_dump($str);

$jsonArr = json_decode($str, true); // decode the JSON into an associative array
var_dump($jsonArr);
echo json_last_error_msg();

I tried checking and converting the encoding with mb_detect_encoding() and mb_convert_encoding(), but the result is still the same. I did:

$str = file_get_contents($jsonFile); 

$encoding = mb_detect_encoding($str, 'UTF-8, ISO-8859-1', true);
$str2 =  mb_convert_encoding($str, 'UTF-8', $encoding);
var_dump(gettype($str));
var_dump($str);
var_dump($str2);
var_dump($encoding);

When I display the var_dump results, $encoding is string(5) "UTF-8". $str looks like this:

[{\n "accreditation": false,\n "category.en": "Template",\n "category.fr": "Gabarit",\n "clientele.en": n ull,\n "clientele.fr": null,\n "courseid": 816,\n "duree": "1h00m",\n "dureeminutes": 60,\n "establishmentaltname": "06-ciusss-cusm",\n "establishmentfullname": "Centre universitaire de sant \xc3\xa9 McGill",\n "fcpresponsable": "",\n "idnumber": "",\n "idnumberalt": "",\n "imgurl": null,\n "ispartageable": true,\n "keywords": null,\n "lastupdate": 1483246800,\n "m odalite.en": "Online",\n "modalite.fr": "En ligne",\n "nombreinscriptions": 6,\n "parentestablishmentfullname": "Territoire CUSM",\n "parentestablishmentshortname": "CUSM-FCP",\n "partageable": "Pa rtageable",\n "shortname.en": "E-learning Course Template",\n "shortname.fr": "gabarit d\'une formation en ligne",\n "summary.en": "This template is to be used when creating an e-learning course as part of the F CP program. It is important that we standardize the training structure to allow users a more user friendly experience. ",\n "summary.fr": "Ce gabarit devra \xc3\xaatre utilis\xc3\xa9 lors de la cr\xc3\xa9ation d\'un cours FCP en ligne. Il est important d\'uniformiser la structure de formation afin de permettre une exp\xc3\xa9rience plus conviviale aux apprenants.",\n "title.en": "FCP E-learning Course Template",\n "title.fr": "FCP Gabarit de formation en ligne",\n "visible": false\n }]

and $str2 is the same:

[{\n "accreditation": false,\n "category.en": "Template",\n "category.fr": "Gabarit",\n "clientele.en": n ull,\n "clientele.fr": null,\n "courseid": 816,\n "duree": "1h00m",\n "dureeminutes": 60,\n "establishmentaltname": "06-ciusss-cusm",\n "establishmentfullname": "Centre universitaire de sant \xc3\xa9 McGill",\n "fcpresponsable": "",\n "idnumber": "",\n "idnumberalt": "",\n "imgurl": null,\n "ispartageable": true,\n "keywords": null,\n "lastupdate": 1483246800,\n "m odalite.en": "Online",\n "modalite.fr": "En ligne",\n "nombreinscriptions": 6,\n "parentestablishmentfullname": "Territoire CUSM",\n "parentestablishmentshortname": "CUSM-FCP",\n "partageable": "Pa rtageable",\n "shortname.en": "E-learning Course Template",\n "shortname.fr": "gabarit d\'une formation en ligne",\n "summary.en": "This template is to be used when creating an e-learning course as part of the F CP program. It is important that we standardize the training structure to allow users a more user friendly experience. ",\n "summary.fr": "Ce gabarit devra \xc3\xaatre utilis\xc3\xa9 lors de la cr\xc3\xa9ation d\'un cours FCP en ligne. Il est important d\'uniformiser la structure de formation afin de permettre une exp\xc3\xa9rience plus conviviale aux apprenants.",\n "title.en": "FCP E-learning Course Template",\n "title.fr": "FCP Gabarit de formation en ligne",\n "visible": false\n }]

  • Use echo json_last_error_msg() to see the reason. Commented Sep 8, 2022 at 16:49
  • For us to help you, you should provide the code that is actually erroring. You have provided only a snippet of the JSON content, not the full file, so that's of no use. We also can't see how you're supposedly parsing it in PHP; we can only assume, and this is of no benefit to anyone. Commented Sep 8, 2022 at 16:51
  • The text you posted is valid JSON: jsonlint.com. SUGGESTIONS: 1) Show us your json_decode(). 2) Add json_last_error_msg() to determine the exact error. 3) Look at these examples: php.net/manual/en/function.json-decode.php Commented Sep 8, 2022 at 16:52
  • In addition, did you try using only the JSON snippet you shared? Did it work? If yes, there could be an error in another part of the full JSON file. Commented Sep 8, 2022 at 16:56
  • Please do the following: 1) Add these three lines: $text1 = file_get_contents(), $encoding = mb_detect_encoding($text1, 'UTF-8, ISO-8859-1', true); and $text2 = mb_convert_encoding($text1, 'UTF-8', $encoding); 2) Edit your post. Show us the code and the results. 3) Please confirm the original file is OK. 4) Please tell us the encoding of the original file (e.g. French/ISO 8859-1). Commented Sep 8, 2022 at 20:22
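
The json_last_error_msg() suggestion from the comments above can be sketched roughly as follows. The helper name decode_json_or_explain is hypothetical (it is not part of PHP or of the thread); it only wraps the standard json_decode(), json_last_error() and json_last_error_msg() calls so that a failed decode reports its reason instead of silently returning null.

```php
<?php
// Hypothetical helper (not from the thread): decode JSON into an associative
// array and report why decoding failed instead of silently returning null.
function decode_json_or_explain(string $json)
{
    $data = json_decode($json, true);

    // json_decode() also returns null for the literal JSON value "null",
    // so check json_last_error() rather than the return value alone.
    if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('json_decode failed: ' . json_last_error_msg());
    }

    return $data;
}
```

Calling it on a string containing a broken token such as n ull (as in the dump above) should raise an exception carrying the decoder's own message, which narrows down whether the problem is syntax or encoding.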

2 Answers

1
  1. The JSON text you posted is OK. Unfortunately, that's NOT the text you're passing to json_decode(). Hence the error.

  2. Assuming your original .json file is OK, it appears that file_get_contents() is corrupting the JSON text.

  3. SUGGESTION:

http://truelogic.org/wordpress/2018/08/19/php-file_get_contents-for-utf-encoded-content/

One of the problems of file_get_contents() is that it messes up the data if the file contains special characters outside the standard ASCII character set.

The solution is to convert the encoding of the contents to UTF-8, but only after it has detected the desired encoding. So for instance if we know the file contains European languages like Spanish or French then we specify the detection for ISO-8859-1. For Arabic it would be ISO-8859-6 and so on.

function file_get_contents_utf8($fn) {
    $content = file_get_contents($fn);
    return mb_convert_encoding($content, 'UTF-8',
        mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}

It sounds like your file is French/ISO-8859-1, and it sounds like all you have to do is use mb_convert_encoding() to convert it to UTF-8 before attempting json_decode().

See also mb_detect_encoding for more details.
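
As a rough usage sketch, the helper can be combined with json_decode() like this. It assumes the mbstring extension is available and that the file is either UTF-8 or ISO-8859-1; load_json_utf8 is a hypothetical wrapper name, and file_get_contents_utf8 is repeated only so the example runs on its own.

```php
<?php
// Helper from the answer, reproduced so this sketch is self-contained.
function file_get_contents_utf8($fn) {
    $content = file_get_contents($fn);
    return mb_convert_encoding($content, 'UTF-8',
        mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}

// Hypothetical wrapper: convert the file to UTF-8, decode it into an
// associative array, and surface any decode error.
function load_json_utf8($fn) {
    $data = json_decode(file_get_contents_utf8($fn), true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException(json_last_error_msg());
    }
    return $data;
}
```

For a file that is already valid UTF-8, the conversion should be a no-op, so the wrapper is only expected to change the outcome when the file really is ISO-8859-1.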


Per the OP, he's reading a perfectly legal JSON file like this:

[
{
    "accreditation": false,
    "category.en": "Administration and Management",
    "category.fr": "Administration et gestion",
    "clientele.en": null,
    "clientele.fr": null,
    "courseid": 11749,
    ...
    "lastupdate": 0,
    "modalite.en": "In Person",
    "modalite.fr": "En présentiel",
    "nombreinscriptions": 1,
    ...
    "partageable": "Locale",

... but file_get_contents() is corrupting the text, like this:

[{
        "accreditation": false,
        "category.en": "Template",
        "category.fr": "Gabarit",
        "clientele.en": n ull,
        ...
        "m odalite.en": "Online",
        "modalite.fr": "En ligne",
        "nombreinscriptions": 6,
        ...
        "partageable": "Pa rtageable",

file_get_contents() doesn't always "play nice" with non-ASCII, multi-byte text, per the link I cited above. A common solution is to call mb_convert_encoding() to convert the string to UTF-8. I gave an example above.

It appears, however, that the OP's input text is corrupted badly enough that mb_convert_encoding() doesn't work. I can't explain this.

SUGGESTED ALTERNATIVE: read the bytes directly (instead of using file_get_contents()), then call mb_convert_encoding() to ensure json_decode() gets UTF-8 text. See:

Is there an alternative to file_get_contents?

fwrite() and UTF8

https://stackoverflow.com/a/31214886/421195
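
A minimal sketch of that alternative, under stated assumptions: the file is read in binary mode with fopen()/fread(), and any content that is not valid UTF-8 is assumed to be ISO-8859-1. The function names read_bytes and read_json_file are hypothetical, and the mbstring extension is required.

```php
<?php
// Hypothetical helper: read a whole file in binary mode with fopen()/fread().
function read_bytes(string $path): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $bytes = '';
    while (!feof($handle)) {
        $chunk = fread($handle, 8192);
        if ($chunk === false) {
            fclose($handle);
            throw new RuntimeException("Read error on $path");
        }
        $bytes .= $chunk;
    }
    fclose($handle);

    return $bytes;
}

// Hypothetical helper: make sure json_decode() receives UTF-8.
// Assumption: anything that is not valid UTF-8 is ISO-8859-1.
function read_json_file(string $path)
{
    $bytes = read_bytes($path);
    if (!mb_check_encoding($bytes, 'UTF-8')) {
        $bytes = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
    }
    return json_decode($bytes, true);
}
```

Checking validity first avoids converting text that is already UTF-8 a second time, which would mangle the accented characters.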

@Karan -

Q: Are you SURE the input file is 100% OK? There seem to be a few minor discrepancies between the examples.

Q: Have you looked at one of the "bad" files in a hex editor? Perhaps the "mysterious spaces" might be due to "hidden characters" that would only show up if you viewed the file in hex?

Q: What's your PHP version? Perhaps upgrading might resolve the problem?
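
If a hex editor is not at hand, the "hidden characters" check can be approximated in PHP itself. The sketch below is only illustrative: find_suspicious_bytes is a hypothetical name, and it simply lists every byte that is neither printable ASCII nor a tab, line feed, or carriage return, together with its offset.

```php
<?php
// Hypothetical helper: return [offset => hex value] for every byte that is
// neither printable ASCII nor tab / line feed / carriage return.
// Note: legitimate UTF-8 multi-byte characters (e.g. accented letters) are
// listed too, so the output has to be read by a human, not treated as errors.
function find_suspicious_bytes(string $bytes): array
{
    $found = [];
    $length = strlen($bytes);

    for ($i = 0; $i < $length; $i++) {
        $code = ord($bytes[$i]);
        $isWhitespace = ($code === 0x09 || $code === 0x0A || $code === 0x0D);
        $isPrintable = ($code >= 0x20 && $code <= 0x7E);
        if (!$isWhitespace && !$isPrintable) {
            $found[$i] = sprintf('0x%02X', $code);
        }
    }

    return $found;
}
```

Running it over the string read from the file would show, for example, a byte-order mark at offsets 0 to 2 or stray control bytes inside a token. A quick look at the first bytes with bin2hex(substr($str, 0, 16)) serves a similar purpose.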


8 Comments

I tried this, but the UTF-8 characters are still incorrectly encoded. It didn't fix the issue.
Please: 1) Modify your code. Make separate statements for $text1 = file_get_contents(), $encoding = mb_detect_encoding($text1, 'UTF-8, ISO-8859-1', true); and $text2 = mb_convert_encoding($text1, 'UTF-8', $encoding); 2) Get the values for $text1, $encoding and $text2 (e.g. "echo", or whatever's convenient), 3) Edit your post. Show us exactly what you tried; copy/paste the results. Also confirm 1) the original file is OK, and 2) the original file is encoded as 8859-1 text.
@Karan: Q: Any updates? The problem seems to be file_get_contents() is "corrupting" the text in your JSON file. I believe the solution is probably mb_convert_encoding(). But we need to be methodical: take a small step at a time. We absolutely shouldn't "assume" at any step along the way. Please update your post with my suggestions above.
You may wish to note that this requires the mbstring extension to be installed and enabled; it does not come with a standard PHP install on most *nix-based OSes. Also, reading from remote sources this way is now considered bad practice and is blocked by default; you should use cURL instead, and since you're already expecting the mbstring extension, it's not too much to expect the cURL extension as well.
@paulsm4, I am accepting your answer. The issue was with file_get_contents. I used the file function instead, which reads the file contents into an array, and then used implode to turn it back into a string: $str = file($jsonFile); $str = implode("", $str);
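
The workaround from the last comment can be sketched as a small function. The name read_json_via_file is hypothetical; the body uses only file(), implode() and json_decode() as described by the OP. As far as I know, file() without flags keeps each line's line ending, so the rejoined string should contain the same bytes as the file, and why this helped in the OP's environment is not explained in the thread.

```php
<?php
// Sketch of the approach described in the comment: read the file into an
// array of lines with file(), rejoin them with implode(), then decode.
function read_json_via_file(string $path)
{
    $lines = file($path); // each element keeps its trailing line ending
    if ($lines === false) {
        throw new RuntimeException("Cannot read $path");
    }

    $str = implode('', $lines);

    return json_decode($str, true); // associative array, or null on failure
}
```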
-2

I tried the JSON data you provided and it works fine. Check that the path you pass to file_get_contents() is the correct location of your JSON file.

Below is the example: https://3v4l.org/bFS59

2 Comments

This is really a comment, not an answer. With a bit more rep, you will be able to post comments.
@MohammedJhosawa The problem seems to appear after I call file_get_contents($jsonFile), which inserts line breaks and empty spaces. I have updated my question.
