I am using the Java XPath API to extract some text from an XML String.
The text, however, often contains HTML formatting (<b>, <em>, <sub>, etc.). When I run my code, those tags are stripped from the result. Is there any way to preserve them?
Here is a sample input:
<document>
<summary>
The <b>dog</b> jumped over the fence.
</summary>
</document>
Here is a snippet of my code:
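import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;
XPathFactory factory = XPathFactory.newInstance();
XPath xPath = factory.newXPath();
InputSource source = new InputSource(new StringReader(xml));
String output = xPath.evaluate("/document/summary", source);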
Here is the current output:
The dog jumped over the fence.
Here is the output I want:
The <b>dog</b> jumped over the fence.
Thanks in advance for all your help.
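
One possible direction, offered as a sketch rather than a definitive answer: the two-argument xPath.evaluate(String, InputSource) returns the string value of the matched node, which in XPath is the concatenation of its descendant text nodes, so the markup is dropped. Asking for the node itself with XPathConstants.NODE and then serializing its children with an identity Transformer should keep the tags. The class name InnerXmlSketch and the method innerXml below are hypothetical names introduced only for illustration, and the sketch assumes the inline HTML is well-formed XML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are illustrative only.
final class InnerXmlSketch {

    // Serializes the children (text and elements) of the first node
    // matching the expression, leaving out the matched element's own tags.
    static String innerXml(String xml, String expression) throws Exception {
        XPath xPath = XPathFactory.newInstance().newXPath();
        InputSource source = new InputSource(new StringReader(xml));

        // Request the node itself instead of its string value.
        Node node = (Node) xPath.evaluate(expression, source, XPathConstants.NODE);
        if (node == null) {
            return "";
        }

        // Identity transformer: copies DOM nodes to text unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter writer = new StringWriter();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            transformer.transform(new DOMSource(children.item(i)), new StreamResult(writer));
        }
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<document>\n<summary>\nThe <b>dog</b> jumped over the fence.\n</summary>\n</document>";
        // Surrounding whitespace inside <summary> is preserved, so trim it here.
        System.out.println(innerXml(xml, "/document/summary").trim());
    }
}
```

Serializing each child separately, rather than the summary node as a whole, is a design choice that keeps the enclosing <summary> tags out of the result. Characters such as & or < inside text nodes would come back in escaped form, since the output is XML.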