I have an unstructured String and would like to extract the following JSON object, keyed by "restaurant", from it using a regex. The data below is only an example, but the format and the "restaurant" key are accurate.
{
"restaurant": {
"id": "abcd-efgh-ijkl",
"created_at": "2020-12-31",
"cashier_payments": []
}
}
I came up with the regex String findMe = "\"restaurant\": {(\\n.*?)+}";, but it matches everything up to the last }.
How do I correct the regex?
As requested, this is how I get the unstructured String using Jsoup:
String htmlString = contentBuilder.toString();
Document doc = Jsoup.parse(htmlString);
Elements elements = doc.getElementsByTag("script");
for (Element element : elements) {
    for (DataNode node : element.dataNodes()) {
        String s = node.getWholeData();
        if (s.contains("\"restaurant\":")) {
            System.out.println(s);
        }
    }
    System.out.println("-------------------");
}
So the String I would like to parse is s.
The . in your regex matches any character. Is there a character you could exclude to get the result you want? Have you looked at greedy vs. non-greedy matching?
You could use "\"restaurant\": \\{[^}]*\\}", which would work in your example, but it's still a bad regex because it cannot handle nested objects or closing-brace characters inside string values. Regex is the wrong tool for the job. Since the data is well-structured JSON, use a JSON parser instead.
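If the script content is not pure JSON (for example, the object is embedded in a JavaScript assignment), one option is to cut out the balanced object first and only then hand it to a JSON parser. The following is a minimal sketch using plain Java, not a full JSON parser: the class name RestaurantExtractor and the method extractObject are hypothetical, and it assumes the first occurrence of the quoted key is the real key and that its value is an object. Unlike the [^}]* regex, it counts nested braces and ignores braces inside string values.

```java
public class RestaurantExtractor {

    /**
     * Returns the balanced {...} block that follows the quoted key in text,
     * or null if the key or a complete object cannot be found.
     * Assumption: the key's value is an object, so the first '{' after
     * the key is the start of that value.
     */
    public static String extractObject(String text, String key) {
        int keyPos = text.indexOf("\"" + key + "\"");
        if (keyPos < 0) {
            return null;
        }
        int start = text.indexOf('{', keyPos);
        if (start < 0) {
            return null;
        }

        int depth = 0;            // current brace nesting level
        boolean inString = false; // true while inside a "..." literal
        for (int i = start; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;          // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return text.substring(start, i + 1);
                }
            }
        }
        return null; // braces never balanced
    }

    public static void main(String[] args) {
        // Hypothetical input: nested object plus a '}' inside a string value
        String s = "var data = {\"restaurant\": {\"id\": \"ab}cd\", "
                 + "\"meta\": {\"n\": 1}}, \"other\": {}};";
        System.out.println(extractObject(s, "restaurant"));
    }
}
```

The returned substring is itself a JSON object, so it can then be passed to whichever JSON library the project already uses to read fields such as id or created_at.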