I have an XML file with malformed HTML in its content.
Since an XML parser cannot handle unclosed HTML tags like <br>, I have used CDATA for saving and parsing.
I used documentBuilder.setCoalescing(true) while parsing so that <![CDATA[<br>test<br>data<br>]]> is recovered without the CDATA wrapper.
But in the output, < and > are replaced by &lt; and &gt; respectively.
I am expecting this string in the parsed result:
<br>test<br>data<br>
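For reference, here is a minimal sketch of the parsing setup described above. It assumes the JDK's built-in DOM parser; the element name `content`, the class name `CdataParseSketch`, and the helper `readRootText` are placeholders for illustration and are not taken from my actual file. Note that in the JDK, `setCoalescing` is defined on `DocumentBuilderFactory`.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataParseSketch {

    // Parses the given XML string and returns the text content of the root element.
    // readRootText is a hypothetical helper used only for this illustration.
    static String readRootText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Ask the parser to convert CDATA sections into ordinary text nodes.
        factory.setCoalescing(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        // Read the text directly from the DOM instead of serializing the node.
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder input: a single element wrapping the CDATA from the question.
        String xml = "<content><![CDATA[<br>test<br>data<br>]]></content>";
        System.out.println(readRootText(xml));
    }
}
```

In this sketch the value is read with `getTextContent()`. If the escaped form appears only when the document is written back out (for example through a `Transformer`), the escaping may come from the serializer rather than from the parser.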
How can I do this? Any ideas? Thanks in advance!
UPDATE: I have two follow-up questions.
1. Is there any way to convert malformed HTML (e.g. <br>) into parsable XML (e.g. <br/>) via code? If so, will it handle also?
2. Is there any way to convert HTML text to plain text in Java (e.g. <div>test text</div> to test text)?