I'm trying to load and parse a simple UTF-8-encoded XML file in JavaScript using Node and the xpath and xmldom packages. No XML namespaces are used, and the same XML parses correctly when converted to ASCII. In the VS Code debugger I can see that the string has what look like spaces embedded between each character (surely due to loading the UTF-8 file incorrectly), but I can't find a way to properly load and parse the file.

Code:

var xpath = require('xpath')
  , dom = require('xmldom').DOMParser;

const fs = require('fs');

var myXml = "path_to_my_file.xml";

var xmlContents = fs.readFileSync(myXml, 'utf8').toString();

// this line causes errors parsing every single tag as the tag names have spaces in them from improper utf-8 decoding
var doc = new dom().parseFromString(xmlContents, 'application/xml');
var cvNode = xpath.select1("//MyTag", doc);

console.log(cvNode.textContent);

The code works fine if the file is ASCII (textContent has the proper data), but if it is UTF-8 then there are a number of parsing errors and cvNode is undefined.

Is there a proper way to parse UTF-8 XML in node/javascript? I can't for the life of me find a decent example.

  • Have you tried 'utf8' without the minus? That is the correct value to use for UTF-8 encoding in this API. On the other hand, seeing additional whitespace between each letter suggests that the file isn't actually encoded as UTF-8 but uses a 16-bit encoding. Have you tried 'utf16le'? Commented Nov 19, 2019 at 18:29
  • yes, sorry, typo. I have tried both Commented Nov 19, 2019 at 18:32
  • @NineBerry 'utf16le' did the trick. Thanks so much. If you want to add an official answer I will mark it as such. Commented Nov 19, 2019 at 18:35

1 Answer

When you see additional whitespace between each letter, this suggests that the file isn't actually encoded as UTF-8 but uses a 16-bit Unicode encoding such as UTF-16.
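To illustrate the symptom, here is a minimal sketch that needs no file on disk. It encodes a short tag as UTF-16LE and then decodes the same bytes both ways. The variable names are only for illustration. With the wrong decoding, every ASCII character is followed by a NUL character (U+0000), which a debugger may display as a gap or space.

```javascript
// Bytes of '<MyTag>' as they would appear in a UTF-16LE file (without a BOM).
const utf16Bytes = Buffer.from('<MyTag>', 'utf16le');

// Decoding those bytes as UTF-8 leaves a NUL character after each letter,
// because the high byte of every 16-bit code unit is 0x00.
const decodedAsUtf8 = utf16Bytes.toString('utf8');

// Decoding with the matching encoding restores the original text.
const decodedAsUtf16 = utf16Bytes.toString('utf16le');

console.log(JSON.stringify(decodedAsUtf8));
console.log(JSON.stringify(decodedAsUtf16));
```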

Try 'utf16le'.

For a list of supported encodings see Buffers and Character Encodings.
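If you don't know the encoding in advance, one possible approach is to read the file as a raw Buffer and choose the decoding from the byte-order mark (BOM). The sketch below is not part of Node, xmldom, or xpath: `decodeXmlBuffer` and `readXmlFile` are hypothetical helper names, and the fallback to UTF-8 when no BOM is present is an assumption. It only handles UTF-16LE and UTF-8, and it does not inspect the `encoding` attribute of the XML declaration.

```javascript
const fs = require('fs');

// Hypothetical helper: decode a Buffer based on its byte-order mark.
function decodeXmlBuffer(buf) {
  // UTF-16 little-endian BOM: FF FE. Skip the two BOM bytes when decoding.
  if (buf.length >= 2 && buf[0] === 0xff && buf[1] === 0xfe) {
    return buf.toString('utf16le', 2);
  }
  // UTF-8 BOM: EF BB BF. Skip the three BOM bytes when decoding.
  if (buf.length >= 3 && buf[0] === 0xef && buf[1] === 0xbb && buf[2] === 0xbf) {
    return buf.toString('utf8', 3);
  }
  // Assumption: a file without a recognized BOM is treated as UTF-8.
  return buf.toString('utf8');
}

// Hypothetical helper: read the raw bytes (no encoding argument) and decode.
function readXmlFile(filePath) {
  return decodeXmlBuffer(fs.readFileSync(filePath));
}
```

The string returned by `readXmlFile(myXml)` could then be passed to `new dom().parseFromString(...)` in place of `xmlContents` from the question.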
