7

I use the code below to convert a CharSequence to a byte array, and then save the byte array as a BLOB in my SQLite database.

public static byte[] toByteArray(CharSequence charSequence) {
    if (charSequence == null) {
        return null;
    }
    byte[] barr = new byte[charSequence.length()];
    for (int i = 0; i < barr.length; i++) {
        barr[i] = (byte) charSequence.charAt(i);
    }
    return barr;
}

Now I would like to convert the byte array retrieved from SQLite back to a CharSequence, but I couldn't find any help on it.

How do I convert a byte array to a CharSequence?

Any help is much appreciated.

4
  • Is this ASCII only? Because if not, that conversion will lose data. Commented Aug 21, 2012 at 9:38
  • CharSequence is an interface, so you need an actual implementation to put your byte array into... Commented Aug 21, 2012 at 9:40
  • @Thilo No, my friend. It is in TSCII format. I am working on an Indic language app. Loss of data might affect my HTML sequence, I believe. Commented Aug 21, 2012 at 9:40
  • If you got a CharSequence in Android, it has already been transformed to Unicode (or is already broken). Why not use UTF-8 for everything in your system, and then (maybe, if really required) convert it to TSCII for import/export to whatever else you are running there? Commented Aug 21, 2012 at 9:56
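The data loss mentioned in the comments above can be seen in a small sketch. It repeats the question's per-char cast and pairs it with a naive inverse. The class and method names are hypothetical, and `StandardCharsets` assumes Java 7+ (or Android API 19+); on older runtimes `Charset.forName("UTF-8")` does the same job.

```java
import java.nio.charset.StandardCharsets;

class LossyCastDemo {
    // Same idea as the question's toByteArray: keeps only the low 8 bits of each char.
    static byte[] castToBytes(CharSequence cs) {
        byte[] out = new byte[cs.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) cs.charAt(i);
        }
        return out;
    }

    // Naive inverse: widen each byte back to a char (the mask avoids sign extension).
    static String castToString(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String tamil = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD"; // five Tamil code points

        // ASCII survives the cast round trip.
        System.out.println(castToString(castToBytes(ascii)).equals(ascii)); // true
        // Tamil does not: the high byte of every char was thrown away.
        System.out.println(castToString(castToBytes(tamil)).equals(tamil)); // false

        // Encoding with a real charset round-trips the same text.
        byte[] utf8 = tamil.getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(tamil)); // true
    }
}
```

Anything outside the 0..255 range is unrecoverable once it has been cast this way, so blobs already written with the cast cannot be repaired by picking a different charset when reading.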

3 Answers 3

21

To convert a CharSequence into a byte array:

CharSequence seq;
Charset charset;
...
byte[] bytes = seq.toString().getBytes(charset);

To convert back again:

CharSequence seq2 = new String(bytes, charset);

Just remember that CharSequence is an interface implemented by String, StringBuilder, StringBuffer, etc. All String instances are CharSequence instances, but not all CharSequence instances are Strings. The contract for CharSequence, however, is that its toString() method returns the equivalent String.
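As a minimal sketch of the two conversions above wrapped into null-safe helpers, the following round-trips a StringBuilder (a CharSequence that is not a String) through a byte array. The class and method names are hypothetical, and UTF-8 via `StandardCharsets` (Java 7+) is an assumed choice of charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharSequenceBytes {
    // Encode any CharSequence by going through its String form.
    static byte[] toBytes(CharSequence seq, Charset charset) {
        return seq == null ? null : seq.toString().getBytes(charset);
    }

    // Decode with the SAME charset that was used for encoding.
    static CharSequence fromBytes(byte[] bytes, Charset charset) {
        return bytes == null ? null : new String(bytes, charset);
    }

    public static void main(String[] args) {
        CharSequence original = new StringBuilder("<b>\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD</b>");

        byte[] blob = toBytes(original, StandardCharsets.UTF_8);      // what would go into the BLOB column
        CharSequence restored = fromBytes(blob, StandardCharsets.UTF_8);

        // contentEquals compares characters, so a String can be checked against a StringBuilder.
        System.out.println(restored.toString().contentEquals(original)); // true
    }
}
```

The returned CharSequence is a String; if a mutable sequence is needed, wrapping it in `new StringBuilder(restored)` is one option.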

Internally, all strings in Java are represented as Unicode (as UTF-16 code units), so as long as the consumer and producer are both Java, the safest charset to use is UTF-8 or UTF-16, depending on the likely encoded size of your data. Where Latin scripts predominate,

Charset charset = Charset.forName("UTF-8"); 

will 99.9% of the time give the most space-efficient encoding. For non-Latin scripts (e.g. Chinese), you may find UTF-16 more space-efficient, depending on the data set you are encoding. You would need measurements showing that it really is more space-efficient for your data, and since UTF-8 is more widely expected, I recommend UTF-8 as the default encoding in any case.
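To make the UTF-8 versus UTF-16 size trade-off concrete, here is a small sketch that measures encoded lengths for two sample strings. The class name, helper, and sample strings are hypothetical. `UTF_16BE` is used instead of `UTF_16` because encoding with the latter prepends a 2-byte byte-order mark, which would skew such short samples.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSize {
    // Number of bytes the string occupies in the given charset.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String latin = "Hello, world";                 // 12 ASCII characters
        String chinese = "\u4F60\u597D\u4E16\u754C";   // 4 CJK characters

        System.out.println(encodedLength(latin, StandardCharsets.UTF_8));      // 12
        System.out.println(encodedLength(latin, StandardCharsets.UTF_16BE));   // 24
        System.out.println(encodedLength(chinese, StandardCharsets.UTF_8));    // 12
        System.out.println(encodedLength(chinese, StandardCharsets.UTF_16BE)); // 8
    }
}
```

Real data such as HTML usually mixes ASCII markup with script text, so measuring on representative rows is more reliable than reasoning from pure samples like these.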


2 Comments

+1 for showing the proper way to encode a CharSequence in the first place.
If you are really caught in a TSCII encoding nightmare, you may find this: unicode.org/notes/tn15/Tscii2Unicode2.pdf helpful
8

It looks like you are using ASCII data (if not, your code is quite lossy).

To get a CharSequence from ASCII bytes, you can do

CharSequence x = new String(theBytes, "US-ASCII");

For other encodings, just specify the name of the character set.
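Whether a given name is usable depends on the runtime, so it may be worth checking before decoding. The sketch below uses `Charset.isSupported` and `Charset.forName`; the class and method names are hypothetical. Only a handful of charsets (US-ASCII, ISO-8859-1, UTF-8, UTF-16 and its BE/LE variants) are guaranteed on every Java platform, and whether "TSCII" is available is not something this sketch assumes either way.

```java
import java.nio.charset.Charset;

class NamedCharsetDecoder {
    // Decode bytes with a charset looked up by name, failing clearly if the
    // runtime does not provide it.
    static CharSequence decode(byte[] bytes, String charsetName) {
        if (!Charset.isSupported(charsetName)) {
            throw new IllegalArgumentException(
                    "Charset not available on this runtime: " + charsetName);
        }
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] ascii = {72, 105};
        System.out.println(decode(ascii, "US-ASCII")); // Hi

        // 0xE9 is 'é' in ISO-8859-1.
        byte[] latin1 = {(byte) 0xE9};
        System.out.println(decode(latin1, "ISO-8859-1").toString().equals("\u00E9")); // true

        // Availability of other charsets varies by platform, so just report it.
        System.out.println("TSCII supported: " + Charset.isSupported("TSCII"));
    }
}
```

Passing a `Charset` object to the `String` constructor also avoids the checked `UnsupportedEncodingException` that the overload taking a charset name declares.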

11 Comments

What if I want to use the TSCII encoding? Is that possible?
+1 I would use ISO-8859-1, which has 8-bit characters, whereas US-ASCII is technically 7-bit. docs.oracle.com/javase/7/docs/api/java/nio/charset/Charset.html
I am not sure if Java supports TSCII. Can't you use UTF-8? Where do you get the data from (the original CharSequence that you wrote into the DB)?
@PeterLawrey: But that won't match his "encoder" (which is just taking the first byte of every Unicode codepoint). Nothing will survive that except for 7-bit ASCII.
@AndroSelva You also have to make sure that your encoder (the code you posted above) converts the CharSequence to a byte[] using the encoding that you want to use. Now you're only casting chars to bytes.
1
CharSequence c = new String(bytes); // bytes is the byte[] read back from the database

2 Comments

Note that this will use the default character encoding of your system to interpret the bytes as characters. That may or may not be what you want.
Fair point - you will need to specify the charset in the constructor if you are not using the default. Upvote for you, sir! docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html
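The point about the default encoding can be illustrated with a short sketch: the same bytes decode to different text depending on which charset is applied, and the one-argument constructor uses whatever `Charset.defaultCharset()` reports on the machine at hand. The class and method names are hypothetical, and `StandardCharsets` assumes Java 7+.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultCharsetDemo {
    // Decode with an explicit charset instead of relying on the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // 'é' encoded as UTF-8 is the two bytes C3 A9.
        byte[] utf8 = "\u00E9".getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8, StandardCharsets.UTF_8);      // one char: é
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1); // two chars: Ã©

        System.out.println(right.length()); // 1
        System.out.println(wrong.length()); // 2

        // new String(utf8) would behave like one of the above (or something else
        // entirely), depending on this value, which differs between platforms.
        System.out.println(Charset.defaultCharset());
    }
}
```

Naming the charset on both the write and the read side keeps the stored blob independent of the device it was written on.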
