C# Bit Bashing – The BitConverter

One of the most common reasons for having to resort to bit manipulation is that you have a raw stream of bytes and need to turn them into structured data.

As the logical operations don’t work for non-integral types, this is something of a problem if you want to convert bytes into floats, say. It is not even particularly easy to convert bytes into integral types.

This is where the BitConverter class becomes important. It is intended to help you serialise an object to bytes and back again but this doesn’t stop us using it for other purposes.

The BitConverter has a GetBytes method that will convert any of the standard data types into a byte array, and a specific ToType method for each type that converts the byte array back into the appropriate data type. For example:

double a=1.234;
byte[] data;
data=BitConverter.GetBytes(a);

returns an eight-byte array containing the bits that were stored in the eight-byte double.

To reconstruct the double we use:

double b = BitConverter.ToDouble(data, 0);

The ToType methods accept a start index that lets you specify the location in the array of the data, which is very useful when processing raw byte streams.
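As a minimal sketch of reading values from different positions in one buffer, the following assumes a made-up record layout: a four-byte Int32 at offset 0 followed by an eight-byte double at offset 4. The class name RecordReader, its methods and the offsets are all hypothetical and only serve to illustrate the start index parameter.

```csharp
using System;

public static class RecordReader
{
    // Hypothetical layout: bytes 0-3 hold an Int32, bytes 4-11 hold a double.
    public const int IdOffset = 0;
    public const int ValueOffset = 4;

    // Packs the two values into a single 12-byte buffer.
    public static byte[] Write(int id, double value)
    {
        byte[] buffer = new byte[12];
        BitConverter.GetBytes(id).CopyTo(buffer, IdOffset);
        BitConverter.GetBytes(value).CopyTo(buffer, ValueOffset);
        return buffer;
    }

    // Each reader passes the offset of its field as the start index.
    public static int ReadId(byte[] buffer)
    {
        return BitConverter.ToInt32(buffer, IdOffset);
    }

    public static double ReadValue(byte[] buffer)
    {
        return BitConverter.ToDouble(buffer, ValueOffset);
    }
}
```

Writing a record with RecordReader.Write(42, 1.234) and reading it back with ReadId and ReadValue should return the original two values, because the same machine byte order is used in both directions.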

The BitConverter class allows you to both generate and reconstruct data from byte streams, but it also gives you the opportunity to perform logical operations on a range of data types as a byte array and then restore the result. It is important to realise that BitConverter uses the byte order of the machine it is running on, which you can check using BitConverter.IsLittleEndian. On Intel-based machines this is “little endian”, which is something that programmers used to other architectures can’t take for granted.
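To illustrate a logical operation on a non-integral type, here is a small sketch that clears the sign bit of a double by working on its byte array. It assumes the usual IEEE 754 layout for double; the class and method names are invented for the example.

```csharp
using System;

public static class DoubleBits
{
    // Returns the absolute value of x by clearing the sign bit
    // in the byte representation and converting back.
    public static double ClearSignBit(double x)
    {
        byte[] bytes = BitConverter.GetBytes(x);

        // The sign bit is the top bit of the most significant byte. That byte
        // is last in the array on a little endian machine and first otherwise.
        int msb = BitConverter.IsLittleEndian ? bytes.Length - 1 : 0;

        bytes[msb] &= 0x7F;   // AND with 0111 1111 to clear the top bit
        return BitConverter.ToDouble(bytes, 0);
    }
}
```

Calling DoubleBits.ClearSignBit(-1.234) should give back 1.234. Notice that the code has to ask which end of the array holds the most significant byte, and this is exactly where byte order starts to matter.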

For example, what value is stored in x after this:

byte[] IntVal={0x00,0x01} ;
Int16 x = BitConverter.ToInt16(IntVal, 0);

If you think the answer is 1 then you are working in big endian format, i.e. the high order byte comes first, but the actual answer on Intel-based machines is 256 because they are little endian and take the first byte as the low order byte.

Usually this isn’t a problem once you know about it, but it can result in some messy byte switching if you are trying to work with big endian data.
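One way to do the byte switching for data that arrives in big endian order is to copy the bytes, reverse them if the machine is little endian, and then convert. The following is a sketch only; the BigEndian class and its ToInt16 method are hypothetical helpers, not part of BitConverter.

```csharp
using System;

public static class BigEndian
{
    // Reads a 16-bit integer stored high order byte first,
    // starting at startIndex in data.
    public static short ToInt16(byte[] data, int startIndex)
    {
        // Work on a copy so the caller's array is left untouched.
        byte[] copy = new byte[2];
        Array.Copy(data, startIndex, copy, 0, 2);

        // BitConverter expects machine byte order, so swap if that is little endian.
        if (BitConverter.IsLittleEndian)
        {
            Array.Reverse(copy);
        }
        return BitConverter.ToInt16(copy, 0);
    }
}
```

With this helper, BigEndian.ToInt16(new byte[] { 0x00, 0x01 }, 0) gives 1 rather than 256, whatever the byte order of the machine.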

There is one other problem with using BitConverter and strings. What do you think is stored in s in this case?

byte[] data={(byte)'A',(byte)'B',(byte)'C'};
String s = BitConverter.ToString(data);

You might think that it would be “ABC” but the answer is “41-42-43”, i.e. a string representation of the hexadecimal values stored in the byte array.

In short, BitConverter isn’t much use for converting raw bytes into strings. One solution to this problem is to use one of the “unsafe” string constructors that takes a pointer to a byte array, although the System.Text.Encoding classes offer a safe alternative.

 

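// Builds a string from a null-terminated array of 8-bit characters.
unsafe String ToString(sbyte[] data)
{
 string s;
 // Pin the array so that it can't be moved while we hold a pointer to it.
 fixed(sbyte* pdata=data)
 {
  // This constructor reads characters up to the first zero byte.
  s = new string(pdata) ;
 }
 return s;
}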

To use this method you have to compile with “allow unsafe” (the /unsafe compiler option). If you now try:

sbyte[] data = { (sbyte)'A', (sbyte)'B', (sbyte)'C', 0 };
string s = ToString(data);

you will see that the string “ABC” has been constructed from the raw byte array. The trailing zero in the array is needed because the constructor reads up to the first zero byte.
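If you would rather avoid unsafe code, a safe alternative is to let an Encoding object do the conversion. The sketch below assumes the bytes are plain ASCII; the AsciiText class is a hypothetical wrapper, while Encoding.ASCII.GetString is the standard framework method.

```csharp
using System;
using System.Text;

public static class AsciiText
{
    // Interprets each byte as an ASCII character code.
    // No terminating zero is needed; the whole array is converted.
    public static string FromBytes(byte[] data)
    {
        return Encoding.ASCII.GetString(data);
    }
}
```

Passing the three bytes for 'A', 'B' and 'C' to AsciiText.FromBytes gives “ABC”, whereas BitConverter.ToString on the same array gives “41-42-43”. For other character sets a different Encoding, such as Encoding.UTF8, would be the appropriate choice.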

There is a lot more to say about bit manipulation in C# (there is a BitArray type, for example) but this will have to wait.