I have a file whose contents are formatted as follows:

{
  "title": "This is a test }",
  "date": "2017-11-16T20:47:16+00:00"
}

This is a test }

I'd like to extract just the JSON heading with a regex (which I think is the most sensible approach here). However, the JSON might change in the future (e.g. fields could be added, or the current ones could change), so I'd like to keep the regex flexible. I tried the solution suggested here, but it seems too simplistic for my use case: the extra } characters in the sample above trick the regex, as shown in this regex101.com example.

Since my knowledge of regex is not that advanced, I'd like to know whether there's a regex approach that can cover my use case.

Thank you!

  • Why wouldn't you want to parse the JSON? Commented Nov 16, 2017 at 21:12
  • What exactly do you want to extract? Can you give an example of the input and the expected output? Also, what programming language are you using? Commented Nov 16, 2017 at 21:16
  • Why do you think using regex is more sensible approach than using JSON parser? Commented Nov 16, 2017 at 21:21
  • No need for regex if it always starts with \n{ and ends with \n} Commented Nov 16, 2017 at 21:28
  • There is little chance of parsing it correctly with a regex. You may try to get it with regex101.com/r/4Ds3sO/2 though. But Slai's comment hints that a non-regex solution might be simpler. Commented Nov 16, 2017 at 21:32

2 Answers

1

You can find the first index of \n} (a closing brace at the start of a line) and take the substring up to and including it:

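const s = `{
  "title": "This is a test }",
  "date": "2017-11-16T20:47:16+00:00"
}
This is a test }
}`

// Index of the first closing brace that starts a line
const i = s.indexOf('\n}')

if (i > 0) {
  // i + 2 keeps the "\n}" itself in the slice
  const json = s.slice(0, i + 2)
  const o = JSON.parse(json)
  console.log(json); console.log(o)
}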

Or, a bit shorter, with a regex (the s flag lets . match newlines):

const s = `{
  "title": "This is a test }",
  "date": "2017-11-16T20:47:16+00:00"
}
This is a test }
}`

// Lazily match everything up to the first "}" that starts a line
const m = s.match(/.*?\n}/s)
if (m) {
  const o = JSON.parse(m[0])
  console.log(m[0]); console.log(o)
}
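
Both snippets assume that the first \n} really is the end of the JSON. As a hedged sketch of how that could be relaxed, the helper below (splitJsonHeading is a hypothetical name, not something from the question) tries each \n} in turn until JSON.parse accepts the candidate, and also returns the rest of the file. It still assumes the file starts with the JSON object and that its closing brace starts a line.

```javascript
// Hypothetical helper: split a document into its leading JSON heading
// and the remaining body. Returns null if no heading can be parsed.
function splitJsonHeading(text) {
  let from = 0;
  while (true) {
    const i = text.indexOf('\n}', from);
    if (i === -1) return null; // no "}" at the start of a line is left to try
    const candidate = text.slice(0, i + 2);
    try {
      return { heading: JSON.parse(candidate), body: text.slice(i + 2) };
    } catch (e) {
      from = i + 1; // candidate is not valid JSON: try the next "\n}"
    }
  }
}

const sample = `{
  "title": "This is a test }",
  "date": "2017-11-16T20:47:16+00:00"
}

This is a test }
`;

const result = splitJsonHeading(sample);
console.log(result.heading.title); // This is a test }
console.log(JSON.stringify(result.body));
```

Letting JSON.parse decide whether a candidate is complete means a nested object whose closing brace is not indented no longer cuts the heading short.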


1 Comment

This is actually a clever way to approach this problem, and I believe it’s even more foolproof than using a regex. Thanks for your input!
1

If the JSON always starts with { at the left margin and ends with } on a line of its own at the left margin, with everything else indented as you show, you can use the regular expression

/^{.*?^}$/ms

The m modifier makes ^ and $ match at the beginning and end of each line, not just of the whole string. The s modifier allows . to match newlines. Because .*? is lazy, the match stops at the first line that consists of just }.
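
To see why both modifiers are needed, here is a small check that runs the same pattern with each flag combination. The sample string is made up for illustration, and the braces are escaped here, which matches the same text as the unescaped form.

```javascript
// Made-up sample: a JSON heading followed by body text containing "}"
const text = '{\n  "title": "This is a test }"\n}\n\nThis is a test }\n';

const both = text.match(/^\{.*?^\}$/ms);  // m and s together
const onlyM = text.match(/^\{.*?^\}$/m);  // without s, "." cannot cross a newline
const onlyS = text.match(/^\{.*?^\}$/s);  // without m, the second "^" only matches at index 0

console.log(JSON.stringify(both[0]));
console.log(onlyM); // null
console.log(onlyS); // null
```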

var str = `{
  "title": "This is a test }",
  "date": "2017-11-16T20:47:16+00:00"
}

This is a test }
`;

// Lazily match from the opening { to the first line that is just }
var match = str.match(/^{.*?^}$/ms);
if (match) {
  var data = JSON.parse(match[0]);
  console.log(data);
}
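
If that layout assumption does not hold, for example when the JSON is minified onto one line, a small string-aware scanner is one possible alternative to a regex. The sketch below is only an illustration under stated assumptions: findJsonEnd is a hypothetical helper name, the input must start with {, and the scanner only tracks braces and double-quoted strings, leaving full validation to JSON.parse.

```javascript
// Hypothetical helper: return the index just past the "}" that closes the
// object starting at text[0], or -1 if no such brace is found.
function findJsonEnd(text) {
  if (text[0] !== '{') return -1;
  let depth = 0;
  let inString = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === '\\') i++;                 // skip the escaped character
      else if (ch === '"') inString = false; // closing quote
    } else if (ch === '"') {
      inString = true;
    } else if (ch === '{') {
      depth++;
    } else if (ch === '}') {
      depth--;
      if (depth === 0) return i + 1;        // braces inside strings never get here
    }
  }
  return -1;
}

const compact = '{"title":"This is a test }","meta":{"n":1}}\nThis is a test }';
const end = findJsonEnd(compact);
const heading = JSON.parse(compact.slice(0, end));
console.log(heading.title);  // This is a test }
console.log(heading.meta.n); // 1
```

Since it counts nesting depth instead of relying on indentation, the scanner keeps working if fields or nested objects are added later.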

1 Comment

Thanks for this solution; I believe it is the same one that @Wiktor Stribiżew suggested in a comment on my original question. I do appreciate the time you took to explain what each bit does. As I mentioned above, I'll run this regex through a few more scenarios and update this question then.
