10

Is there any function to do the following?

var specialStr = 'ipsum áá éé lore';
var encodedStr = someFunction(specialStr);
// then encodedStr should be like 'ipsum \u00E1\u00E1 \u00E9\u00E9 lore'

I need to escape every character outside the ASCII range, using exactly that \uXXXX format. I don't know what it is called. Is it Unicode, maybe?

5
  • @mplungjan this has nothing to do with URI encoding; neither of the linked questions do what the OP wants. Commented Sep 21, 2011 at 12:16
  • See javascripter.net/faq/escape.htm or, even better, see developer.mozilla.org/en/Core_JavaScript_1.5_Guide/…. Commented Sep 21, 2011 at 12:17
  • Or here Convert special characters to HTML in Javascript Commented Sep 21, 2011 at 12:17
  • 2
    @mplungjan you yet again seem to have failed to read the OP's question. Commented Sep 21, 2011 at 12:20
  • @Domenic - granted, I deleted the first links, but the last link is more relevant (not the accepted answer, but some of the other answers). I object to "yet again". Commented Sep 21, 2011 at 12:21

4 Answers

20

This should do the trick:

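// Left-pads a hex string to four digits.
// new Array(n).join("0") yields n - 1 zeros, hence the 5.
function padWithLeadingZeros(string) {
    return new Array(5 - string.length).join("0") + string;
}

// Builds the \uXXXX escape for a single UTF-16 code unit.
function unicodeCharEscape(charCode) {
    return "\\u" + padWithLeadingZeros(charCode.toString(16));
}

// Escapes every code unit above 127 and leaves ASCII untouched.
function unicodeEscape(string) {
    return string.split("")
                 .map(function (char) {
                     var charCode = char.charCodeAt(0);
                     return charCode > 127 ? unicodeCharEscape(charCode) : char;
                 })
                 .join("");
}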

For example:

var specialStr = 'ipsum áá éé lore';
var encodedStr = unicodeEscape(specialStr);

assert.equal("ipsum \\u00e1\\u00e1 \\u00e9\\u00e9 lore", encodedStr);
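
On newer runtimes the same idea can be written more compactly. This is only a sketch, assuming String.prototype.padStart (ES2017) is available; unicodeEscapeCompact is a hypothetical name, not part of the answer above.

```javascript
// Hypothetical compact variant; assumes String.prototype.padStart (ES2017).
function unicodeEscapeCompact(str) {
  // Without the "u" flag the class matches single UTF-16 code units,
  // so every code unit above U+007F is escaped on its own.
  return str.replace(/[\u0080-\uffff]/g, function (ch) {
    return "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0");
  });
}

console.log(unicodeEscapeCompact("ipsum áá éé lore"));
// ipsum \u00e1\u00e1 \u00e9\u00e9 lore
```

Characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair), which is what a JavaScript string literal expects.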

2 Comments

Thanks a lot, Domenic. I will use this solution ;)
To decode, just assign the value to a variable: let a = "\u00e1"; console.log(a); // prints á
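
The assignment in the comment above works because the parser decodes the escape inside a string literal in source code. For a string that holds literal \uXXXX sequences at runtime (for example, the output of unicodeEscape), a decoder along these lines may help. It is a minimal sketch; unicodeUnescape is a hypothetical name, and it does not handle an escaped backslash placed in front of a sequence.

```javascript
// Hypothetical helper: turns literal \uXXXX sequences in a runtime string
// back into the code units they name.
function unicodeUnescape(str) {
  return str.replace(/\\u([0-9a-fA-F]{4})/g, function (match, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
}

console.log(unicodeUnescape("ipsum \\u00e1\\u00e1 \\u00e9\\u00e9 lore"));
// ipsum áá éé lore
```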
3

If you need hex (\x) escapes rather than Unicode (\u) escapes, you can simplify @Domenic's answer to:

"aäßåfu".replace(/./g, function(c){return c.charCodeAt(0)<128?c:"\\x"+c.charCodeAt(0).toString(16)})

returns: "a\xe4\xdf\xe5fu"

1 Comment

Do you know that the charcode can be larger than 255? "ė".replace(/./g, function(c){return c.charCodeAt(0)<128?c:"\\x"+c.charCodeAt(0).toString(16)}) returns \x117 and that will lead to trouble.
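
The trouble is that \x takes exactly two hex digits, so "\x117" is read as "\x11" followed by "7". One possible way around it, sketched here with the hypothetical name hexEscape, is to fall back to \uXXXX whenever the code unit does not fit in two digits. It assumes String.prototype.padStart (ES2017).

```javascript
// Hypothetical helper: \xNN for code units up to 0xFF, \uNNNN above that.
function hexEscape(str) {
  // The character class also matches newlines, which /./ would skip.
  return str.replace(/[\u0080-\uffff]/g, function (c) {
    var code = c.charCodeAt(0);
    return code <= 0xff
      ? "\\x" + code.toString(16)
      : "\\u" + code.toString(16).padStart(4, "0");
  });
}

console.log(hexEscape("aäßåfu")); // a\xe4\xdf\xe5fu
console.log(hexEscape("ė"));      // \u0117
```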
1

For reference: you can do as Domenic said, or use the built-in escape function, which produces a different, percent-based format (more browser-friendly):

>>> escape("áéíóú");
"%E1%E9%ED%F3%FA"

2 Comments

Interestingly enough: escape("☃") === "%u2603" while escape("á") === "%E1". I wonder how they decide when to switch formats and add a "u" at the beginning...
Ah, well, MDN says "The escape and unescape functions do not work properly for non-ASCII characters and have been deprecated.": developer.mozilla.org/en/Core_JavaScript_1.5_Guide/… so maybe that's the source of the inconsistency.
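
On the format switch: as far as the ECMAScript specification (Annex B) describes it, escape leaves a small set of ASCII characters alone, writes code units below 256 as %XX, and writes everything else as %uXXXX. The sketch below mimics that rule for illustration only; escapeLike is a hypothetical name, and it assumes String.prototype.padStart (ES2017).

```javascript
// Characters that escape() leaves untouched.
var UNESCAPED = /[A-Za-z0-9@*_+\-./]/;

// Hypothetical re-implementation of the escape() rule, for illustration.
function escapeLike(str) {
  var out = "";
  for (var i = 0; i < str.length; i++) {
    var ch = str.charAt(i);
    var unit = str.charCodeAt(i);
    if (UNESCAPED.test(ch)) {
      out += ch;
    } else if (unit < 256) {
      // Fits in one byte: two uppercase hex digits.
      out += "%" + unit.toString(16).toUpperCase().padStart(2, "0");
    } else {
      // Larger code unit: "%u" plus four uppercase hex digits.
      out += "%u" + unit.toString(16).toUpperCase().padStart(4, "0");
    }
  }
  return out;
}

console.log(escapeLike("áéíóú")); // %E1%E9%ED%F3%FA
console.log(escapeLike("☃"));     // %u2603
```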
1

This works for me, specifically when using the Dropbox REST API:

function encodeNonAsciiCharacters(value: string): string {
    let out = "";
    for (let i = 0; i < value.length; i++) {
        const ch = value.charAt(i);
        const chn = ch.charCodeAt(0);
        if (chn <= 127) {
            // ASCII characters pass through unchanged.
            out += ch;
        } else {
            // Escape everything else as \uXXXX, left-padded to four hex digits.
            let hex = chn.toString(16);
            if (hex.length < 4)
                hex = "000".substring(hex.length - 1) + hex;
            out += "\\u" + hex;
        }
    }
    return out;
}
