I am extracting a string from a byte array. The string is a SQL script.

String sql = System.Text.Encoding.GetEncoding(1200).GetString(script);

The first character comes out as junk (a square box in the debugger preview), which causes the whole script to fail. Any idea why this is happening?

I don't want to just strip the first character. I am more interested in knowing why this happens and how it can be avoided.

  • There is no difference between System.Text.Encoding.GetEncoding and System.Text.UTF32Encoding.GetEncoding. I've removed the reference to UTF32Encoding because it might confuse people. Commented Dec 6, 2010 at 16:55
  • Do you have the actual sequence of bytes? Commented Dec 6, 2010 at 16:56
  • @lganacio: Actual sequence? 'script' is a byte array. Commented Dec 6, 2010 at 17:03

1 Answer

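The first character is probably a byte order mark (BOM). Code page 1200 is UTF-16 little-endian, and Encoding.GetString does not strip a leading BOM: the bytes FF FE are decoded as the character U+FEFF, which shows up as a square box and is not valid at the start of a SQL batch.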
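A minimal sketch of what is likely happening, assuming the byte array starts with the UTF-16 LE BOM. The sample bytes and the class name `BomDemo` are made up for illustration; they are not taken from the question.

```csharp
using System;
using System.Text;

class BomDemo
{
    static void Main()
    {
        // Hypothetical input: UTF-16 LE BOM (FF FE) followed by the text "GO".
        byte[] script = { 0xFF, 0xFE, 0x47, 0x00, 0x4F, 0x00 };

        // Same call as in the question; GetString decodes the BOM as a character.
        string sql = Encoding.GetEncoding(1200).GetString(script);

        Console.WriteLine(sql.Length);          // 3, not 2
        Console.WriteLine(sql[0] == '\uFEFF');  // True
    }
}
```

If the first character of your own string compares equal to '\uFEFF', the BOM is the cause.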

You can use a StreamReader to automatically detect any BOM and select the appropriate encoding:

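// Requires: using System.IO;
// 'script' is the byte array from the question, already populated.
string sql;

// The second argument is detectEncodingFromByteOrderMarks: the reader inspects
// the leading bytes, selects the matching encoding, and skips the BOM.
// If no BOM is found, it falls back to UTF-8.
using (var reader = new StreamReader(new MemoryStream(script), true))
{
    sql = reader.ReadToEnd();
}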
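If the encoding is already known and you would rather keep using Encoding.GetString, one alternative is to skip the encoding's preamble yourself. This is only a sketch: `BomHelper` and `DecodeWithoutBom` are hypothetical names, not part of the framework.

```csharp
using System;
using System.Linq;
using System.Text;

static class BomHelper
{
    // Hypothetical helper: decodes 'bytes' with 'encoding', skipping the
    // encoding's preamble (BOM) when the array starts with it.
    public static string DecodeWithoutBom(byte[] bytes, Encoding encoding)
    {
        byte[] preamble = encoding.GetPreamble();

        bool hasBom = preamble.Length > 0
            && bytes.Length >= preamble.Length
            && bytes.Take(preamble.Length).SequenceEqual(preamble);

        int offset = hasBom ? preamble.Length : 0;
        return encoding.GetString(bytes, offset, bytes.Length - offset);
    }
}
```

Usage would look like `string sql = BomHelper.DecodeWithoutBom(script, Encoding.Unicode);`, where Encoding.Unicode is UTF-16 little-endian, the same encoding as code page 1200. Unlike the StreamReader approach, this does not detect a different encoding; it only removes a BOM that matches the encoding you pass in.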