
I am being passed HTML as a string. My goal is to create a new document from that HTML with all the appropriate nodes, so that I can do things like call doc.getElementsByTagName on the document I create and have it work as expected. Here is an example of my code:

var doc = window.document.implementation.createDocument
    ('http://www.w3.org/1999/xhtml', 'html',  null);
doc.getElementsByTagName('html')[0].innerHTML =
      '<head><script>somejs</script>' +
      '<script>var x = 5; var y = 2; var foo = x + y;</script>' +
      '</head><body></body>';
var scripts = doc.getElementsByTagName('script');
console.log(scripts[0] + " code = " + scripts[0].innerHTML);

I am having the following issues:

  1. If something inside a script tag contains a character like < (e.g. in the example above, change the + in the "var foo = x + y;" statement to a < symbol), I get an INVALID_STATE_ERR: DOM Exception 11.
  2. Even if nothing inside the script tag uses such characters, when I run the above I get the output "[object Element] code =undefined"

So my questions are:

A. How do I handle characters such as < that cause DOM Exception 11 when they appear in the string I assign to innerHTML?

B. How do I make the document parse the script tags properly and expose their code through innerHTML so that I can read it later?

EDIT: As Ryan P pointed out, this code actually works in FF. So if anyone could help me get it working in Chrome, that would be much appreciated!

2 Answers

5

Taken from https://github.com/rails/turbolinks: why don't you try creating the document this way?

doc = document.implementation.createHTMLDocument("");
doc.open("replace");
doc.write(html);
doc.close();

Here html should be your HTML contents. I haven't tested it and don't know whether you need to escape characters first.


2 Comments

I've confirmed this to work (although I don't think the empty "" is necessary).
The "" tells the document not to load the resources referenced in the HTML, such as images and scripts. In other words, with "" the HTTP requests inside are not made; if it is not specified, they are, and the JS even runs. You probably don't want that, so the quotes are quite necessary.
0

A. You need to convert any < to an HTML entity (&lt;). The rules don't cease to apply just because you're in a script tag.

B. You call your variable 'doc' but try to get the script tags from an undefined variable 'tempDoc'. When I run your code in my browser after changing that variable, it all seems to work fine.
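For point A, a minimal sketch of the escaping step might look like the following. It assumes the string is parsed as XML, as it is when assigned to innerHTML on a document made with createDocument and the XHTML namespace; under an HTML parser, script content is raw text and should not be escaped this way. The name escapeForXml is a hypothetical helper, not something from the question.

```javascript
// Hypothetical helper: replace characters that are special to an XML parser.
// A single pass over the string avoids double-escaping the & in entities.
function escapeForXml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
  };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// The script body from the question, with + changed to <.
const code = 'var x = 5; var y = 2; var foo = x < y;';

// Escape only the script text, not the surrounding tags.
const markup =
  '<head><script>' + escapeForXml(code) + '</script></head><body></body>';

console.log(markup);
```

The resulting markup string is what would be assigned to innerHTML in the question's code. An XML parser decodes the entities again, so the script text read back from the node should contain the original < character.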

2 Comments

Thanks for the reply! The tempDoc vs doc mix-up was my mistake when copying the code into Stack Overflow; I originally had two functions. Interestingly, after reading your response I tried running my code in FF and it works, but it does not work in Chrome. Any idea why that might be? Can you confirm that behavior? Also, is there a standard regexp for converting all the necessary characters to their HTML entity equivalents?
Yeah, I was using FF. Checked in Chrome and innerHTML doesn't work, but textContent works for both Chrome and FF. Dunno about IE.
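
Building on the textContent observation above, here is a minimal sketch that reads script sources from an HTML string. It swaps in DOMParser with the 'text/html' type, which neither answer used, and assumes a browser that supports it. The name scriptSources and its optional parser parameter are hypothetical.

```javascript
// Hypothetical helper: return the source text of every <script> element
// found in an HTML string.
// `parser` is optional; it is a parameter only so that a stand-in object
// can be supplied when no browser DOMParser is available.
function scriptSources(html, parser) {
  parser = parser || new DOMParser();

  // 'text/html' selects the HTML parser, which treats script content as
  // raw text, so a bare < inside a script does not need escaping.
  const doc = parser.parseFromString(html, 'text/html');

  // Read textContent rather than innerHTML, following the comment above.
  return Array.from(
    doc.getElementsByTagName('script'),
    (script) => script.textContent
  );
}

// Browser usage (not run here):
// scriptSources('<head><script>var foo = 1 < 2;</script></head><body></body>');
```

Because the HTML parser is used, this route should sidestep the escaping question from point A, at the cost of relying on DOMParser support for 'text/html'.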
