Escaping HTML entities in JavaScript string literals within the <script> block

Question

On the one hand, if I have

<script>
var s = 'Hello </script>';
console.log(s);
</script>

the browser terminates the <script> block early, at the </script> inside the string literal, and the rest of the page breaks.

On the other hand, the value of the string may come from a user (say, via a previously submitted form, with the string now being inserted into a <script> block as a literal), so you can expect anything in that string, including maliciously formed tags. Now, if I escape the string literal with htmlentities() when generating the page, the value of s will contain the escaped entities literally, i.e. logging s will output

Hello &lt;/script&gt;

which is not desired behavior in this case.

One way of properly escaping JS strings within a <script> block is escaping the slash if it follows the left angle bracket, or just always escaping the slash, i.e.

var s = 'Hello <\/script>';

This seems to be working fine.

Then comes the question of JS code within HTML event handlers, which can be easily broken too, e.g.

<div onClick="alert('Hello ">')"></div>

looks valid at first but breaks in most (or all?) browsers, because the " inside the string ends the attribute value. This obviously requires full HTML entity encoding.

My question is: what is the best/standard practice for properly covering all the situations above - i.e. JS within a script block, JS within event handlers - if your JS code can partly be generated on the server side and can potentially contain malicious data?

possible duplicate of JavaScript and error "end tag for element which is not open"
– Quentin, Commented Jan 5, 2012 at 21:47

ThinkingStiff · Accepted Answer · 2012-01-05 23:51:01Z

45

The following characters can interfere with an HTML or JavaScript parser and should be escaped in string literals: <, >, ", ', \, and &.

In a script block, using the escape character works, as you found out. The concatenation method ('</scr' + 'ipt>') can be hard to read.

var s = 'Hello <\/script>';
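
If the literal is generated on the server, the slash escape can be applied mechanically. A minimal sketch, assuming a JavaScript (Node.js) backend; toScriptSafeLiteral is a hypothetical name, not something from the question, and it only guards against the </script> case discussed here:

```javascript
// Hypothetical helper: quote a value as a JS string literal for a <script> block.
function toScriptSafeLiteral(value) {
    // JSON.stringify adds the quotes and escapes ", \ and control characters.
    // Escaping every "/" as "\/" keeps "</script>" from appearing verbatim.
    return JSON.stringify(String(value)).replace(/\//g, '\\/');
}

const literal = toScriptSafeLiteral('Hello </script>');
console.log('var s = ' + literal + ';');  // var s = "Hello <\/script>";
```

Inside a string literal "\/" is just "/", so the value of s is unchanged.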

For inline Javascript in HTML, you can use entities:

<div onClick="alert('Hello &quot;>')">click me</div>

Demo: http://jsfiddle.net/ThinkingStiff/67RZH/
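
For generated handlers this means two layers of escaping: first quote the data as a JS string, then entity-encode the whole handler for the attribute. A rough sketch, assuming a JavaScript backend; escapeHtmlAttribute is a hypothetical helper written for this example:

```javascript
// Hypothetical helper: encode text for use inside a quoted HTML attribute value.
function escapeHtmlAttribute(text) {
    const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
    return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Layer 1: build the JavaScript, quoting the data as a JS string literal.
const handlerCode = 'alert(' + JSON.stringify('Hello ">') + ')';

// Layer 2: entity-encode the whole handler for the HTML attribute.
const html = '<div onClick="' + escapeHtmlAttribute(handlerCode) + '">click me</div>';
console.log(html);
// <div onClick="alert(&quot;Hello \&quot;&gt;&quot;)">click me</div>
```

The browser decodes the entities before it runs the handler, so the JavaScript it sees is alert("Hello \">").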

The method that works in both <script> blocks and inline Javascript is \uxxxx, where xxxx is the hexadecimal character code.

< - \u003c
> - \u003e
" - \u0022
' - \u0027
\ - \u005c
& - \u0026

Demo: http://jsfiddle.net/ThinkingStiff/Vz8n7/

HTML:

<div onClick="alert('Hello \u0022>')">click me</div>

<script>
    var s = 'Hello \u003c/script\u003e';
    alert( s );
</script>
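
The table above can be applied by one small function. This is only a sketch: unicodeEscape is a hypothetical name, and it covers just the six characters listed, so line breaks in the input would still need their own escaping.

```javascript
// Hypothetical helper: replace <, >, ", ', \ and & with \uXXXX escapes.
function unicodeEscape(text) {
    return String(text).replace(/[<>"'\\&]/g, (ch) =>
        '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'));
}

console.log(unicodeEscape('Hello </script>'));  // Hello \u003c/script\u003e
```

The result is meant to be placed between quotes in generated JavaScript, where the engine turns each escape back into the original character.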

edited Jan 5, 2012 at 23:51

answered Jan 5, 2012 at 20:23

ThinkingStiff


2 Comments

mojuba Over a year ago

The hex escape method is the best so far: you don't have to worry where your string ends up in the code, just send everything through one basic server-side function. Great, I like it!

Magnus Over a year ago

Shouldn't newline - \u000a be on that list as well?

hugomg · Answer · 2012-01-05 20:21:09Z

2

I'd say the best practice would be avoiding inline JS in the first place.

Put the JS code in a separate file and include it with the src attribute

<script src="path/to/file.js"></script>

and set the event handlers from within that file instead of putting them in the HTML.

// jQuery example
$('div.something').on('click', function(){
    alert('Hello>');
});

answered Jan 5, 2012 at 20:21

hugomg


2 Comments

mojuba Over a year ago

And what if I have my reasons for using inline code? For efficiency, saving traffic, connections, etc. on a highly loaded web site.

hugomg Over a year ago

@mojuba: Well, by the time you get to this kind of performance tuning most best practices have already been thrown out the window :)

Jamie Treworgy · Answer · 2012-01-05 20:27:03Z

2

(edit - somehow didn't notice you mentioned slash-escape in your question already...)

OK so you know how to escape a slash.

In inline event handlers, you can't use the bounding character inside a literal, so use the other one:

<div onClick='alert("Hello \"")'>test</div>

But this is all in aid of making your life difficult. Just don't use inline event handlers! Or if you absolutely must, then have them call a function defined elsewhere.

Generally speaking, there are few reasons for your server-side code to be writing javascript. Don't generate scripts from the server - pass data to pre-written scripts instead.

(original)

You can escape anything in a JS string literal with a backslash (that is not otherwise a special escape character):

var s = 'Hello <\/script>';

This also has the positive effect of causing it to not be interpreted as html. So you could do a blanket replace of "/" with "\/" to no ill effect.

Generally, though, I am concerned that you would have user-submitted data embedded as a string literal in javascript. Are you generating javascript code on the server? Why not just pass data as JSON or an HTML "data" attribute or something instead?
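
To sketch what "pass data as JSON" could look like if the data still has to travel inside the page: serialize it once and escape only the "<". This assumes a JavaScript backend purely to match the other snippets (the question appears to use PHP), and toEmbeddedJson is a hypothetical name:

```javascript
// Hypothetical helper: serialize data for embedding inside a <script> block.
function toEmbeddedJson(data) {
    // "<" can only occur inside JSON strings, so replacing it with \u003c
    // keeps "</script>" out of the output without changing the parsed data.
    return JSON.stringify(data).replace(/</g, '\\u003c');
}

const pageData = { greeting: 'Hello </script>' };
const scriptBlock = '<script>var data = ' + toEmbeddedJson(pageData) + ';<\/script>';
console.log(scriptBlock);
// <script>var data = {"greeting":"Hello \u003c/script>"};</script>
```

The pre-written script then reads data.greeting and never has to be generated itself.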

edited Jan 5, 2012 at 20:27

answered Jan 5, 2012 at 20:13

Jamie Treworgy


5 Comments

mojuba Over a year ago

Re: passing strings to JS: it's a valid point to use, say, JSON instead, but I'm trying to save some traffic and connections by inserting data directly into HTML/JS. For small amounts of data I think it's OK.

Jamie Treworgy Over a year ago

This technique can only cost you in terms of bandwidth, since such scripts cannot be cached by the browser. Quick and dirty, stick it in a hidden element: <span style="display:none;" id="mule" data-text="... attribute-encoded text or JSON structure"></span> There's no rule against doing it however you want, but avoiding generated scripts saves a lot of headaches and makes for easier, more secure, more maintainable code.

mojuba Over a year ago

Re: your solution of swapping the bounding characters: it would require my server-side code to look for quotes within my JS snippet and decide whether it should be enclosed in single or double quotes. Getting too complicated. Far easier to just escape everything like any HTML literal text.

Jamie Treworgy Over a year ago

Except it won't work reliably because javascript isn't a literal. You need to combine the rules for escaping within javascript literals, and the rules for escaping within an HTML element, which is pretty darn complicated all of a sudden. A double-quote inside single-quotes becomes &quot; but what about a double-quote that's bounding a string literal? Answer is simple: avoid inline scripts. Pass data instead.

mojuba Over a year ago

To be honest, I already fixed my code and it works. Rule #1: when generating a JS string literal on the server, escape quotes, newlines and slash with backslash. Rule #2: when inserting anything into HTML other than JS code in the script block, escape as usual with htmlentities().

Dave Brown · Answer · 2015-07-26 14:27:50Z

2

Here's how I do it:

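// Replace &, newline, <, >, ' and " with numeric HTML entities.
// \x26 is the ampersand and \x0A is the newline. The regex makes a single
// pass, so the order of the characters inside the class does not matter.
function encode(text) {
    return text.replace(/[\x26\x0A<>'"]/g, function (ch) {
        return "&#" + ch.charCodeAt(0) + ";";
    });
}

var myString = 'Encode HTML entities!\n"Safe" escape <script></' + 'script> & other tags!';

// Put the encoded text into the textarea and render it in the div.
document.getElementById("test").value = encode(myString);
document.getElementById("testing").innerHTML = encode(myString);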

<textarea id=test rows="9" cols="55"></textarea>

<div id="testing">www.WHAK.com</div>

answered Jul 26, 2015 at 14:27

Dave Brown


Diodeus - James MacFarlane · Answer · 2012-01-05 20:08:02Z

-2

Most people use this trick:

var s = 'Hello </scr' + 'ipt>';

answered Jan 5, 2012 at 20:08

Diodeus - James MacFarlane


1 Comment

mojuba Over a year ago

So if the code is generated on the server side, I need to look for <script> and replace it with the broken one? Isn't it easier to just escape the slashes?
