I'm using XHTML Transitional doctype for displaying content in a browser. But, the content is displayed it is passed through a XML Parser (DOMDocument) for giving final touches before outputting to the browser.
I use a custom designed CMS for my website, that allows me to make changes to the site. I have a module that allows me to display HTML scripts on my website in a way similar to WordPress widgets.
The problem i am facing right now is that I need to make sure any code provided through this module should be in a valid XHTML format or else the module will need to convert the code to valid XHTML. Currently if a portion of the input code is not XHTML compliant then my XML parser breaks and throws warnings.
What I am looking for is a solution that encodes the entities present in the URLs and text portions of the input provided via TextArea control. For example the following string will break the parser giving entity reference error:
<script type="text/javascript" src="http://www.abcxyz.com/foo?bar=1&sumthing"></script>
Also the following line would cause same error:
<a href="http://www.somesite.com">Books & Cool stuff<a/>
P.S. If i use htmlentities or htmlspecialchars, they also convert the angle brackets of tags, which is not required. I just need the urls and text portions of the string to be escaped/encoded.
Any help would be greatly appreciated.
Thanks and regards, Waqar Mushtaq