I have JSON-like strings, read from many files, that are valid JSON except that the commas separating the root key-value pairs are missing (e.g. "name": "Apple Beery" instead of "name": "Apple Beery",):

var data = `{
    "name": "Apple Beery"
    "phrases": [
        "a",
        "b",
        "It's"
    ]
    "reference": 19
    "friends": [
        {
            "name": "Dog",
            "reference": 14
        },
        {
            "name": "Markus Beery",
            "reference": 30
        }
    ]
    "last": "b"
}`;

Is there a way to dynamically add the commas back to the string, so that it can be parsed as JSON like this:

var data = JSON.parse(`{
    "name": "Apple Beery",
    "phrases": [
        "a",
        "b",
        "It's"
    ],
    "reference": 19,
    "friends": [
        {
            "name": "Dog",
            "reference": 14
        },
        {
            "name": "Markus Beery",
            "reference": 30
        }
    ],
    "last": "b"
}`);
  • The first JSON string is not valid AFAIK, and it would take a full parser to bring it up to snuff to be parsed later by a JSON parser. What is the source of that JSON string? I think you should fix the problem at its source. Commented Jan 1, 2019 at 2:23
  • The first JSON string is just missing the commas. These strings come from YAML config files that were written with valid JSON, except for the commas. I want to move everything to the proper JSON format. Commented Jan 1, 2019 at 2:26
  • Can you explain what you mean by "These strings come from YAML config files that were written with valid JSON, except for the commas."? Because YAML can be parsed and dumped to JSON. So, maybe you're going about this all wrong. Commented Jan 1, 2019 at 3:28
  • Thanks, I ended up doing that instead. Commented Jan 1, 2019 at 3:44

1 Answer

You can use a regular expression to match the last character on a line (anything but a comma, {, or [) and, if the next line starts with a {, [, or ", replace it with that character plus a comma. If the next line may start with a digit, add \d to the lookahead's character set too:

const data = `{
    "name": "Apple Beery"
    "phrases": [
        "a",
        "b",
        "It's"
    ]
    "reference": 19
    "friends": [
        {
            "name": "Dog",
            "reference": 14
        },
        {
            "name": "Markus Beery",
            "reference": 30
        }
    ]
    "last": "b"
}`;
const json = data.replace(/[^,{[](?=\n *["[{\d])/gm, '$&,');
console.log(JSON.parse(json));

The pattern

[^,{[](?=\n *["[{\d])

means:

  • [^,{[] - Any character that isn't a comma, a {, or [
  • (?=\n *["[{\d]) - Lookahead for:
    • \n * - A newline character, followed by any number of spaces, and
    • ["[{\d] - a ", [, {, or digit
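
The pattern above assumes Unix line endings and space indentation. As a minimal sketch, not part of the original answer, the same idea can be wrapped in a helper that also tolerates CRLF line endings, tabs, trailing whitespace, and blank lines. The name addMissingCommas and the sample string are hypothetical, chosen only for illustration:

```javascript
// Hypothetical helper: restore commas that are missing at line ends.
function addMissingCommas(text) {
  // [^,{[\s]   last non-whitespace character of a value; never a comma, {, or [
  // [ \t]*     optional trailing spaces or tabs on the same line
  // \r?\n      a Unix or Windows line ending
  // \s*        indentation (and any blank lines) before the next token
  // ["[{\d]    the next token starts with ", [, {, or a digit
  return text.replace(/[^,{[\s](?=[ \t]*\r?\n\s*["[{\d])/g, '$&,');
}

// Sample input with CRLF line endings and tab indentation (an assumption,
// the question only shows space-indented input).
const sample = '{\r\n\t"name": "Apple Beery"\r\n\t"phrases": [\r\n\t\t"a",\r\n\t\t"b"\r\n\t]\r\n\t"reference": 19\r\n}';
const parsed = JSON.parse(addMissingCommas(sample));
console.log(parsed.reference); // 19
```

Like the original pattern, this still relies on every value ending its own line, so it is a cleanup aid for this specific file layout rather than a general repair tool.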

1 Comment

Good answer, but I fear that it might break if the actual JSON input could happen to be all on a single line.
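
To make that concern concrete, here is a short sketch using the answer's pattern on a hypothetical single-line input. Because the lookahead requires a newline, nothing is replaced and the string still fails to parse:

```javascript
// The pattern from the answer, unchanged.
const pattern = /[^,{[](?=\n *["[{\d])/gm;

// Hypothetical single-line input with a missing comma between the two pairs.
const singleLine = '{ "name": "Apple Beery" "reference": 19 }';
const attempted = singleLine.replace(pattern, '$&,');

console.log(attempted === singleLine); // true: no newline, so nothing matched

let parseFailed = false;
try {
  JSON.parse(attempted);
} catch (err) {
  parseFailed = err instanceof SyntaxError;
}
console.log(parseFailed); // true
```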
