2

I have a hexidecimal string that I need to convert to a byte array. The best way (ie efficient and least code) is:

string hexstr = "683A2134";
byte[] bytes = new byte[hexstr.Length/2];
for(int x = 0; x < bytes.Length; x++)
{
    bytes[x] = Convert.ToByte(hexstr.Substring(x * 2, 2), 16);
}

In the case where I have a 32bit value I can do the following:

string hexstr = "683A2134";
byte[] bytes = BitConverter.GetBytes(Convert.ToInt32(hexstr, 16)); 

However what about in the general case? Is there a better built in function, or a clearer (doesn't have to be faster, but still performant) way of doing this?

I would prefer a built in function as there seems to be one for everything (well common things) except this particular conversion.

3
  • Note that your solution and the accepted solution will fail if passed an odd-length string. In the case of "A", for example, the returned byte array will have nothing in it. Commented Mar 26, 2009 at 13:20
  • @Jim, I've just posted an answer addressing that concern, although it's easy enough to fix the other solutions too. Commented Mar 26, 2009 at 13:22
  • I realize that. I was assuming a validated string. Commented Mar 26, 2009 at 22:07

5 Answers 5

7

You get the best performance if you calculate the values from the character codes instead of creating substrings and parsing them.

Code in C#, that handles both upper and lower case hex (but no validation):

static byte[] ParseHexString(string hex) {
    byte[] bytes = new byte[hex.Length / 2];
    int shift = 4;
    int offset = 0;
    foreach (char c in hex) {
        int b = (c - '0') % 32;
        if (b > 9) b -= 7;
        bytes[offset] |= (byte)(b << shift);
        shift ^= 4;
        if (shift != 0) offset++;
    }
    return bytes;
}

Usage:

byte[] bytes = ParseHexString("1fAB44AbcDEf00");

As the code uses a few tricks, here a commented version:

static byte[] ParseHexString(string hex) {
    // array to put the result in
    byte[] bytes = new byte[hex.Length / 2];
    // variable to determine shift of high/low nibble
    int shift = 4;
    // offset of the current byte in the array
    int offset = 0;
    // loop the characters in the string
    foreach (char c in hex) {
        // get character code in range 0-9, 17-22
        // the % 32 handles lower case characters
        int b = (c - '0') % 32;
        // correction for a-f
        if (b > 9) b -= 7;
        // store nibble (4 bits) in byte array
        bytes[offset] |= (byte)(b << shift);
        // toggle the shift variable between 0 and 4
        shift ^= 4;
        // move to next byte
        if (shift != 0) offset++;
    }
    return bytes;
}
Sign up to request clarification or add additional context in comments.

1 Comment

Hard to decide between this one and Johns. Johns is a lot easier to understand, but I think this one is more 'elegant'
4

There's nothing built-in, unfortunately. (I really should have the code I've got here somewhere else - it's at least the 3rd or 4th time I've written it.)

You could certainly create a more efficient version which parsed a nybble from a char rather than taking a substring each time, but it's more code. If you're using this a lot, benchmark the original code to see whether or not it's adequate first.

private static int ParseNybble(char nybble)
{
    // Alternative implementations: use a lookup array
    // after doing some bounds checking, or use 
    // if (nybble >= '0' && nybble <= '9') return nybble-'0' etc
    switch (nybble)
    {
        case '0' : return 0;
        case '1' : return 1;
        case '2' : return 2;
        case '3' : return 3;
        case '4' : return 4;
        case '5' : return 5;
        case '6' : return 6;
        case '7' : return 7;
        case '8' : return 8;
        case '9' : return 9;
        case 'a': case 'A' : return 10;
        case 'b': case 'B' : return 11;
        case 'c': case 'C' : return 12;
        case 'd': case 'D' : return 13;
        case 'e': case 'E' : return 14;
        case 'f': case 'F' : return 15;
        default: throw new ArgumentOutOfRangeException();
    }
}

public static byte[] ParseHex(string hex)
{
    // Do error checking here - hex is null or odd length
    byte[] ret = new byte[hex.Length/2];
    for (int i=0; i < ret.Length; i++)
    {
        ret[i] = (byte) ((ParseNybble(hex[i*2]) << 4) |
                         (ParseNybble(hex[i*2+1])));
    }
    return ret;
}

1 Comment

Thanks for spotting the error (I was playing with 2 different versions)
3

Take a look at this - it's very short and is part of the .NET framework:

System.Runtime.Remoting.Metadata.W3cXsd2001.SoapHexBinary.Parse("C3B01051359947").Value

Comments

0

Here's a one-liner using LINQ. It's basically just a translation of your original version:

string hexstr = "683A2134";

byte[] bytes = Enumerable.Range(0, hexstr.Length / 2)
    .Select((x, i) => Convert.ToByte(hexstr.Substring(i * 2, 2), 16))
    .ToArray();

If you'll potentially need to convert strings of uneven length (ie, if they might have an implicit leading-zero) then the code becomes a bit more complicated:

string hexstr = "683A2134F";    // should be treated as "0683A2134F"

byte[] bytes = Enumerable.Range(0, (hexstr.Length / 2) + (hexstr.Length & 1))
    .Select((x, i) => Convert.ToByte(hexstr.Substring((i * 2) - (i == 0 ? 0 : hexstr.Length & 1), 2 - (i == 0 ? hexstr.Length & 1 : 0)), 16))
    .ToArray();

Comments

0
public class HexCodec {
  private static final char[] kDigits =
      { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        'a', 'b', 'c', 'd', 'e', 'f' };

  public static byte[] HexToBytes(char[] hex) {
    int length = hex.length / 2;
    byte[] raw = new byte[length];
    for (int i = 0; i < length; i++) {
      int high = Character.digit(hex[i * 2], 16);
      int low = Character.digit(hex[i * 2 + 1], 16);
      int value = (high << 4) | low;
      if (value > 127)
        value -= 256;
      raw[i] = (byte) value;
    }
    return raw;
  }

  public static byte[] HexToBytes(String hex) {
    return hexToBytes(hex.toCharArray());
  }
}

5 Comments

Isn't this the same as my example?
No - it swallows exceptions and returns bad data ;)
That's true. I'm working with validated data though, I know it's a valid hex string.
It's not C#, so what is it? J#?
@Robert: The "final" keyword, the case insensetive identifiers, the "Character" class...

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.