I have some strings that are valid in my database but when I include them in an attribute of a UTF-8 XML output they give me the following error:

XML Parsing Error: not well-formed

My current code (simplified):

header('Content-Type: text/xml'); 
echo '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>';
echo '<root attribute="' . htmlentities($string_from_hell) . '">'; 

How should I format these strings before including them in XML attributes?

A possible value for $string_from_hell:  (don't know if it will show up properly)

  • I wouldn't use the word "sanitize" here. "Formatting" seems a more appropriate word to me. Commented Aug 12, 2010 at 11:54
  • @Col. Shrapnel You're right. Edited. Commented Aug 12, 2010 at 12:02

1 Answer

Try

htmlspecialchars($string_from_hell, ENT_QUOTES, "UTF-8")

htmlentities won't do because it produces named HTML entities (such as &eacute;) that XML does not recognize; XML predefines only &amp;, &lt;, &gt;, &quot; and &apos;. You should also specify the charset explicitly: before PHP 5.4 the default was ISO-8859-1, not UTF-8.
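
As a rough illustration of the difference, here is a minimal sketch. The sample value is hypothetical (the real $string_from_hell is not shown in the question), and the root element is self-closed only to keep the example well-formed on its own.

```php
<?php
// Hypothetical sample value; "\xC3\xA9" is the UTF-8 encoding of "é".
$string_from_hell = "Caf\xC3\xA9 \"quoted\" & <tagged> 'single'";

// htmlentities() turns "é" into the named entity &eacute;,
// which an XML parser rejects because it is not predefined in XML.
$html_only = htmlentities($string_from_hell, ENT_QUOTES, 'UTF-8');

// htmlspecialchars() escapes only &, <, >, " and (with ENT_QUOTES) ',
// and leaves the UTF-8 bytes of "é" untouched.
$escaped = htmlspecialchars($string_from_hell, ENT_QUOTES, 'UTF-8');

$xml = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
     . '<root attribute="' . $escaped . '"/>';

echo $xml, "\n";
```

Parsing $xml with any XML parser should give back the original string as the attribute value.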

You're also missing the quotes (") around the attribute value.

There are also better ways to create XML files that handle escaping for you. See e.g. XMLWriter.
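
A minimal XMLWriter sketch of the same output, assuming the xmlwriter extension is available; the sample value is again hypothetical. The writer escapes the attribute value itself, so no htmlspecialchars call is needed.

```php
<?php
// Hypothetical sample value; "\xC3\xA9" is the UTF-8 encoding of "é".
$string_from_hell = "Caf\xC3\xA9 \"quoted\" & <tagged> 'single'";

$writer = new XMLWriter();
$writer->openMemory();                          // build the document in memory
$writer->startDocument('1.0', 'UTF-8', 'yes');  // version, encoding, standalone
$writer->startElement('root');
$writer->writeAttribute('attribute', $string_from_hell); // escaped by XMLWriter
$writer->endElement();
$writer->endDocument();

$xml = $writer->outputMemory();

// In a web context, send header('Content-Type: text/xml') before this echo.
echo $xml;
```

For large documents, openURI('php://output') can be used instead of openMemory() so the XML is streamed rather than buffered.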

3 Comments

I think the real answer should be to use the appropriate DOM APIs to construct the XML instead of string concatenation. Also the OP's code misses the quotes around the attribute value as far as I can tell.
@Johan You're right, I missed the quotes. As to the DOM API, I think it's unnecessarily complicated (and inefficient) for XML building unless you need the complete DOM tree afterwards.
No idea how those APIs look in PHP. But something SAX-like might suffice too (which XMLWriter seems to be). I'm not doing that much in XML so pardon the inaccuracy :-)
