
I receive UTF-16 code units one at a time, so I collect them in a list and later convert the list to an array.

That leaves me with a uint16[], but GLib.convert () needs a string instead:

int main () {
    var utf16data = new Gee.ArrayList<uint16> ();

    utf16data.add ('A');
    utf16data.add (0xD83C);
    utf16data.add (0xDC1C);

    var utf16array = utf16data.to_array ();

    try {
        // convert expects a string here
        var s = convert (utf16array, utf16data.size * 2, "UTF-8", "UTF-16LE");
        stdout.printf ("%s\n", s);
    } 
    catch (ConvertError e) {
        stderr.printf (@"error: $(e.message)\n");
    }

    return 0;
}

So how do I convert a UTF-16 array into a UTF-8 string?

Update:

I tried to just cast the array:

int main () {
    var utf16data = new Gee.ArrayList<uint16> ();

    utf16data.add ('A');
    utf16data.add (0xD83C);
    utf16data.add (0xDC1C);
    // utf16data.add (0);

    var utf16array = utf16data.to_array ();

    try {
        size_t bytes_read;
        size_t bytes_written;
        var s = convert ((string) utf16array, utf16data.size * 2, "UTF-8", "UTF-16LE", out bytes_read, out bytes_written);
        stdout.puts (@"bytes_read = $bytes_read\n");
        stdout.puts (@"bytes_written = $bytes_written\n");
        stdout.puts (@"s.length = $(s.length)\n");
        // Should print "A🀜", but the Unicode symbol is not printed
        stdout.puts (@"s = $s\n");
    } 
    catch (ConvertError e) {
        stderr.printf (@"error: $(e.message)\n");
    }

    return 0;
}

Now at least the "A" is written to stdout, but the Unicode symbol is not.

bytes_read = 6
bytes_written = 3
s.length = 1
s = A

Is it correct to just cast an array to a string in this context?

Why is the Unicode symbol not converted?

Update 2:

This is the code that I have now settled with:

int main () {
    var utf16data = new Gee.ArrayList<uint16> ();

    utf16data.add ('A');
    utf16data.add (0xD83C);
    utf16data.add (0xDC1C);

    // Replacement for
    // utf16array = utf16data.to_array ();
    uint16[] utf16array = new uint16[utf16data.size];
    for (int i = 0; i < utf16data.size; i++)
        utf16array[i] = utf16data[i];

    try {
        var s = convert ((string)utf16array, utf16array.length * 2, "UTF-8", "UTF-16LE");
        stdout.puts (@"$s\n");
    } 
    catch (ConvertError e) {
        stderr.puts (@"error: $(e.message)\n");
    }

    return 0;
}
  • Are you sure the values are bytes, and not something like characters? (Given Java's string handling, they may be stored 'conveniently' rather than as literal bytes. Disclaimer: this -> · <- is the full extent of my Java knowledge.) Also, your 2 Unicode code units are well within the surrogate range (meaning the two codes ought to represent one character). Does it work with an easier glyph? Commented Aug 24, 2015 at 20:55
  • @Jongware The utf16data array is fine; I think my cast breaks everything here. It doesn't even work with an 'A', 'B', 'C' input. I have to figure out how to pass the UTF-16 data to the convert () method. It's Vala, not Java, by the way (different languages). Commented Aug 24, 2015 at 21:16
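
For reference, the surrogate arithmetic the first comment alludes to can be sketched as below. This is only an illustration of how a high/low surrogate pair maps to one code point; the helper name combine_surrogates is hypothetical and not part of GLib or Gee.

```vala
// Hypothetical helper: combine a UTF-16 surrogate pair into one code point.
// Assumes `high` is in 0xD800..0xDBFF and `low` is in 0xDC00..0xDFFF.
uint32 combine_surrogates (uint16 high, uint16 low) {
    uint32 hi = (uint32) high - 0xD800;   // top 10 bits of the offset
    uint32 lo = (uint32) low - 0xDC00;    // bottom 10 bits of the offset
    return 0x10000 + (hi << 10) + lo;
}

int main () {
    // The pair from the question: 0xD83C, 0xDC1C
    uint32 cp = combine_surrogates (0xD83C, 0xDC1C);
    stdout.printf ("U+%X\n", cp);   // prints "U+1F01C"
    return 0;
}
```

So the three code units in the question encode two characters: 'A' and U+1F01C.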

1 Answer


The problem is with to_array (). It does not produce an array of uint16, but an array of pointers, with each pointer's value set to the uint16 value. This is the standard boxed representation. There seems to be a problem in Gee in that it isn't producing an array of the correct type. If you change the array to:

uint16[] utf16array = {'A', 0xD83C, 0xDC1C};

It works just fine.


3 Comments

Any idea how to dynamically add values, though? I just tried GLib.Array<uint32>, but that doesn't work either.
OK, I'm now using a simple loop, which is not efficient, but it works (see the update to my question).
You can dynamically expand an array in Vala using += so long as it is either a local variable or a private field.
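
A minimal sketch of that += approach, under some assumptions: it reuses the convert () call from the question's second update (that signature may differ in other Vala/GLib binding versions), it assumes a little-endian host so that the in-memory uint16 values match "UTF-16LE", and the helper name utf16_to_utf8 is made up for illustration.

```vala
// Hypothetical helper. Assumes the GLib.convert () signature used in the
// question (string input plus byte length) and a little-endian host.
string utf16_to_utf8 (uint16[] units) throws ConvertError {
    // Each UTF-16 code unit occupies two bytes.
    return convert ((string) units, units.length * 2, "UTF-8", "UTF-16LE");
}

int main () {
    // A plain uint16[] local variable can grow with +=, so no Gee
    // collection (and no to_array () call) is involved.
    uint16[] utf16array = {};
    utf16array += 0x0041;   // 'A'
    utf16array += 0xD83C;   // high surrogate
    utf16array += 0xDC1C;   // low surrogate

    try {
        stdout.puts (utf16_to_utf8 (utf16array) + "\n");
    } catch (ConvertError e) {
        stderr.puts (@"error: $(e.message)\n");
    }
    return 0;
}
```

Because the array is a real uint16[] from the start, the cast to string hands convert () the same tightly packed bytes as the literal array in the answer.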
