Does anybody know of a script that can convert a string to an ArrayBuffer using Unicode encoding?

I'm creating a browser-side equivalent of node.js's "Buffer". The only encoding left to implement is Unicode; all the others are done.

Thanks for your help!

  • Which Unicode encoding: UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE? There are quite a few. Commented Jan 25, 2012 at 16:56
  • The node.js docs say it's the Unicode BMP (Basic Multilingual Plane) encoding. Commented Jan 25, 2012 at 16:58
  • The Basic Multilingual Plane is an abstraction within Unicode, not an encoding, and it applies to all of the encodings listed above. UTF-16LE is what JavaScript engines in browsers use, and judging by your answer, that is the one you mean. Commented Jan 25, 2012 at 17:50
  • Is your Buffer port open source? Commented Sep 19, 2012 at 17:39
  • @Janus Troelsen I haven't published it on GitHub, but I can if you wish. There are better ones, though, I think. Just search for "buffer browserify" on GitHub and you'll find very good code. One of those repos is also used by node-browserify. Hope it helps. Commented Sep 19, 2012 at 19:03

1 Answer

I figured it out myself.

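String to bytes (two bytes per UTF-16 code unit, high byte first):

// str is the input string
var b = new Uint8Array(str.length*2);
for(var i = 0; i < b.length; i+=2){
    var x = str.charCodeAt(i/2); // one UTF-16 code unit, 0..65535
    var a = x%256;               // low byte
    x -= a;
    x /= 256;                    // high byte
    b[i] = x;
    b[i+1] = a;
}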

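Bytes back to string:

// `this` is assumed to be the byte array (e.g. a Uint8Array) holding
// the code units as byte pairs, high byte first
var s = "";
for(var i = 0; i < this.length;){
    s += String.fromCharCode(this[i++]*256+this[i++]);
}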
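
For reference, the two loops can be wrapped into a pair of functions. This is only a sketch under a few assumptions: the names `stringToUtf16Bytes` and `utf16BytesToString` are made up for illustration, a `DataView` is used in place of the manual arithmetic, and the byte order defaults to big-endian as in the loops (pass `true` for little-endian). The decoder also batches code units before calling `String.fromCharCode`, which avoids building the string one character at a time.

```javascript
// Hypothetical helper names; not part of node.js or any library.
function stringToUtf16Bytes(str, littleEndian) {
    var buf = new ArrayBuffer(str.length * 2);
    var view = new DataView(buf);
    for (var i = 0; i < str.length; i++) {
        // charCodeAt returns one UTF-16 code unit (0..65535)
        view.setUint16(i * 2, str.charCodeAt(i), !!littleEndian);
    }
    return new Uint8Array(buf);
}

function utf16BytesToString(bytes, littleEndian) {
    var view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    var CHUNK = 8192; // keep the argument list passed to apply() small
    var parts = [];
    var units = [];
    // A trailing odd byte, if any, is ignored.
    for (var i = 0; i + 1 < bytes.length; i += 2) {
        units.push(view.getUint16(i, !!littleEndian));
        if (units.length === CHUNK) {
            parts.push(String.fromCharCode.apply(null, units));
            units = [];
        }
    }
    parts.push(String.fromCharCode.apply(null, units));
    return parts.join("");
}
```

Because the functions copy code units as-is, `utf16BytesToString(stringToUtf16Bytes(s))` gives back `s` for any string, including ones with surrogate pairs.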
5 Comments

s += String.fromCharCode(this[i++]*256+this[i++]); would be slow for long strings. Gather the char codes in an array arr and call String.fromCharCode.apply(arr).
Ouch, sorry. That should be String.fromCharCode.apply(null, arr).
ROFL. I just faced the same problem when transferring data from a Java applet into JavaScript.
Some Unicode characters use more than 2 bytes, so I'm not sure how you detect those. It's a long spec and it's been a while since I browsed it.
This doesn't seem to work. sha1sum the bytes of "hello world" (in your terminal), then convert the string with this method, and you'll get something completely different from the Web Crypto API. The result may contain the string, but it doesn't convert it. See gist.github.com/coolaj86/87d834cfe6ec07d2ee81. I still haven't figured it out for multi-byte characters, but I have gotten single-byte characters to match sha1sums as expected.
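
On the last two comments: the loops in the answer copy UTF-16 code units, so a character outside the BMP ends up as a surrogate pair (four bytes), and the resulting bytes differ from what a terminal tool like `sha1sum` hashes, which is typically UTF-8. A small sketch of the difference, assuming an environment that provides `TextEncoder` (modern browsers and recent Node.js versions); the variable names are only illustrative:

```javascript
var str = "hello world";

// TextEncoder always produces UTF-8: one byte per ASCII character.
var utf8 = new TextEncoder().encode(str);

// The answer's approach: two bytes per UTF-16 code unit, high byte first.
var utf16 = new Uint8Array(str.length * 2);
for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    utf16[i * 2] = unit >> 8;       // high byte
    utf16[i * 2 + 1] = unit & 0xff; // low byte
}

// Outside the BMP: one code point, but two code units (a surrogate pair).
var emoji = "\uD83D\uDE00"; // U+1F600
var emojiUnits = emoji.length;
var emojiUtf8Length = new TextEncoder().encode(emoji).length;
```

Here `utf8` holds 11 bytes and `utf16` holds 22, so a hash over one cannot match a hash over the other. The emoji has a `length` of 2 in JavaScript and takes 4 bytes in UTF-8.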
