1

I have a valid XML document (this has been confirmed using multiple XML validators including online validators and the Sublime Text XML validator plugin).

I receive the following error when attempting to import the XML document into MSSQL 2008 using a stored procedure named ImportNXML (command: exec [dbo].[ImportNXML];)

Msg 9420, Level 16, State 1, Line 2
XML parsing: line 17, character 35, illegal xml character

I have confirmed no illegal characters are in the XML document and line 17, character 35 is just the number 1. I've tried modifying this line, replacing the entire line with letters, replacing the entire line with a single number, padding other lines in the document before this line with letters/numbers, but i receive exactly the same error complaining about the exact same location.

If i open the ImportNXML stored procedure and run the query contents, i receive no errors at all.

What could be causing the stored procedure to fail when being executed using the 'exec' command but succeed when the procedure contents are executed as an expanded query?

Mock data for the first 17 lines is as follows:

<?xml version="1.0" ?>
<ClientData>
<Policy><policyName>The Policy Name</policyName>
<Preferences><ServerPreferences><preference><name>Sessions</name>
<value>3</value>
</preference>
<preference><name>Detection</name>
<value>yes</value>
</preference>
<preference><name>Mac</name>
<value>no</value>
</preference>
<preference><name>Plugin</name>
<value>108478;84316;32809;93635;36080;87560;61117;35292;75260;83156;61271;103773;12899;82513;56376;77796;85655;60338;56763;79951;</value>
</preference>
<preference><name>TARGET</name>
<value>123.123.123.123,234.234.234.234</value>

The portion of the stored proc that imports the XML is as follows:

EXEC(' INSERT INTO XmlImportTest(xmlFileName, xml_data) SELECT ''' + @importPath + ''', xmlData FROM ( SELECT * FROM OPENROWSET (BULK ''' + @importPath + ''' , SINGLE_BLOB) AS XMLDATA ) AS FileImport (XMLDATA) ') 
15
  • 1
    Run this query in SSMS select cast('click here' as xml), then click the link in the results window, select all and delete, then paste your XML document into the empty window. See how SSMS parses your XML document and red-underlines what it thinks is illegal. Commented Apr 4, 2018 at 0:29
  • 1
    How exactly are you providing the XML to the proc? The example command in your question doesn't have parameters. Commented Apr 4, 2018 at 0:33
  • Is there an extended character? I can't remember exactly what extended character it was that caused me issues, but I think it was a non-breaking space (ASCII 160, Hex 0xA0). Commented Apr 4, 2018 at 0:33
  • 1
    If the XML isn't confidential data, it would really help if you added it to the question (at least the first 30 lines or so...). Commented Apr 4, 2018 at 1:12
  • 1
    Is it possible you've still got the data in binary? I created a dummy document consisting of <x></x> then ran select * from openrowset(bulk 'C:\test.xml', single_blob) a. The result was 0x3C783E0D0A3C2F783E Wondering if maybe it's not converting correctly. I know that doesn't quite explain various aspects of it, but it's a theory Commented Apr 4, 2018 at 2:12

1 Answer 1

2

Pure guessing:

  • The file is utf-8 encoded (or any other encoding, SQL-Server 2008 cannot read natively).
    • You must know, that SQL-Server is rather limited with file encodings. CHAR (or VARCHAR) is extended ASCII 1-byte encoding and NCHAR (or NVARCHAR) is UCS-2 2-byte encoding (which is almost identical with UTF-16).
    • With SQL-Server 2016 (and SP2 for v2014) some further support was introduced, especially for utf-8.
    • Try to open your XML with an appropriate editor (e.g. notepad++) and try to find out the file's encoding. Try to save this as "unicode / UCS-2 / utf-16" and retry the import.
    • Try to use your import with CLOB instead of BLOB. Reading the file as binary LargeObject will take the bytes one after the next. SQL-Server will try to read these bytes as string with fixed size per character. A character LOB might work under special circumstances.
    • Check the first two bytes for a BOM (byte order mark)
  • There is some dirt within your XML
    • Open the file with an HEX-editor and try to find strange codes
  • Your code processes the file's content within a dynamically created statement.
    • In such cases sometimes you run into truncation or string-breaking quotes
  • General hint:
    • If you import data and you expect issues it is highly recommended to use a 2-step-approach
    • Read your file into a tolerant staging table (with NVARCHAR(MAX) or even VARBIANRY(MAX) target columns) and try to continue with this.
    • It might be necessary to use another tool to change your file before the import.
Sign up to request clarification or add additional context in comments.

1 Comment

Thank-you very much for your reply! Changing to CLOB appears to have resolved the issue.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.