The input string is a mix of arbitrary text and valid JSON:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<TITLE>Title</TITLE>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<META HTTP-EQUIV="Content-language" CONTENT="en">
<META HTTP-EQUIV="keywords" CONTENT="search words">
<META HTTP-EQUIV="Expires" CONTENT="0">
<script SRC="include/datepicker.js" LANGUAGE="JavaScript" TYPE="text/javascript"></script>
<script SRC="include/jsfunctions.js" LANGUAGE="JavaScript" TYPE="text/javascript"></script>
<link REL="stylesheet" TYPE="text/css" HREF="css/datepicker.css">
<script language="javascript" type="text/javascript">
function limitText(limitField, limitCount, limitNum) {
if (limitField.value.length > limitNum) {
limitField.value = limitField.value.substring(0, limitNum);
} else {
limitCount.value = limitNum - limitField.value.length;
}
}
</script>
{"List":[{"ID":"175114","Number":"28992"}]}
The task is to deserialize the JSON part of it into an object. The string can begin with arbitrary text, but it always contains valid JSON. I tried using a JSON-validation regex, but .NET had trouble parsing such a pattern.
In the end, I want to get only:
{
"List": [{
"ID": "175114",
"Number": "28992"
}]
}
Clarification 1:
There is only a single JSON object in the whole messy string, but the surrounding text can also contain { and } (it is actually HTML and can contain JavaScript such as <script> function(){..... )
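Given that constraint, one possible approach is to let a real JSON parser decide which `{` starts the object instead of matching braces by hand. The sketch below is in Python purely for illustration (the question itself targets .NET). The helper name `extract_json_object` is hypothetical, and it assumes the wanted object is non-empty, since a bare `{}` from a script body is also valid JSON.

```python
import json


def extract_json_object(text):
    """Return the first non-empty JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses exactly one JSON value starting at pos
            # and ignores whatever follows it in the string.
            value, _end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            value = None  # this brace belongs to script code, not JSON
        # Assumption: skip empty objects such as the {} in "function(){}".
        if isinstance(value, dict) and value:
            return value
        pos = text.find("{", pos + 1)
    return None


sample = (
    "<script>function f(a) { if (a) { return {}; } }</script>\n"
    '{"List":[{"ID":"175114","Number":"28992"}]}'
)
print(extract_json_object(sample))
# {'List': [{'ID': '175114', 'Number': '28992'}]}
```

Braces that open JavaScript blocks fail to parse and are skipped, so only the real object is returned. If a script happened to contain a non-empty object literal that is also valid JSON, this sketch would return that one instead.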
Can the surrounding text contain { and }? If not, the simple solution is to find the first and last braces and assume they mark the start and end of your JSON. Otherwise I'd say you are screwed, since you won't be able to tell where the actual JSON starts. {} is valid JSON, and chances are that your HTML-containing text includes something that could pass for JSON out of context. However, you might try a DOM parser to extract the JSON from the HTML, if you have a clue where it is placed. You might be even more screwed if your JSON is formatted with HTML :X
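The two suggestions in this comment can be combined. As a rough sketch, assuming the JSON sits in plain text outside any `<script>` or `<style>` element and that no other braces appear in that text, an HTML parser can drop the script bodies first, after which the first-and-last-brace rule becomes safe. It is written in Python for illustration only; the names `VisibleText` and `json_outside_scripts` are hypothetical.

```python
import json
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that is not inside <script> or <style> elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def json_outside_scripts(html):
    """Parse the text between the first '{' and last '}' outside scripts."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts)
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        return None
    # Assumption: everything between the two braces is one JSON object.
    return json.loads(text[start:end + 1])


page = (
    "<TITLE>Title</TITLE>\n"
    '<script type="text/javascript">\n'
    "function f(a) { if (a) { return 1; } }\n"
    "</script>\n"
    '{"List":[{"ID":"175114","Number":"28992"}]}'
)
print(json_outside_scripts(page))
```

If the JSON itself contained `<` or HTML entities, the parser could split or alter it, so this only fits input shaped like the sample.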