Hex Encode and Decode in C

Introduction

A very common task when working with binary data in C is converting it to and from hex. Unfortunately, the C standard library doesn’t provide a function that converts a whole buffer for us. However, it’s pretty easy to implement a set of functions to handle it.

A hex-encoded string is always twice as long as the binary data it represents. Since hex is base 16, any 8-bit `unsigned char` value 0-255 can be represented with two hex digits, 0x00-0xFF.

When dealing with hex encoding, always use two characters per byte, even if the numeric value fits within one hex digit (0-F). Consistent sizing is very important because, without it, 0FAB could be 0F AB or 00 0F 0A 0B. I can’t stress enough that always using the width of the largest value (FF) means you always know how many characters represent each value. Here, one `unsigned char` is always two hex characters, and going back to binary, two hex characters always convert back to one byte.
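To make the sizing problem concrete, here is a small sketch that formats the same bytes with and without padding. The helper names `hex_padded` and `hex_unpadded` are made up for this illustration and are not used elsewhere in this article; both assume the caller supplies an output buffer of at least `len*2+1` characters.

```c
#include <stdio.h>

/* Hypothetical helper: "%02X" always emits exactly two characters per
 * byte. `out` must have room for len*2+1 characters. */
void hex_padded(const unsigned char *bin, size_t len, char *out)
{
	size_t i;

	out[0] = '\0';
	for (i = 0; i < len; i++)
		sprintf(out + i*2, "%02X", (unsigned int)bin[i]);
}

/* Hypothetical counter-example: "%X" drops the leading zero, so the
 * number of characters per byte varies. Same buffer requirement. */
void hex_unpadded(const unsigned char *bin, size_t len, char *out)
{
	size_t i;
	size_t pos = 0;

	out[0] = '\0';
	for (i = 0; i < len; i++)
		pos += (size_t)sprintf(out + pos, "%X", (unsigned int)bin[i]);
}
```

With these helpers, the two different inputs `{0x0F, 0xAB}` and `{0xFA, 0x0B}` both come out of `hex_unpadded` as FAB, so the original bytes can’t be recovered. `hex_padded` produces 0FAB and FA0B, which stay distinct.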

Binary to hex

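#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, NUL-terminated hex string, or NULL on
 * error. The caller must free() the result. */
char *bin2hex(const unsigned char *bin, size_t len)
{
	char   *out;
	size_t  i;

	if (bin == NULL || len == 0)
		return NULL;

	out = malloc(len*2+1);
	if (out == NULL)
		return NULL;

	for (i=0; i<len; i++) {
		/* High 4 bits give the first digit, low 4 bits the second. */
		out[i*2]   = "0123456789ABCDEF"[bin[i] >> 4];
		out[i*2+1] = "0123456789ABCDEF"[bin[i] & 0x0F];
	}
	out[len*2] = '\0';

	return out;
}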

“0..F” is a string literal here, and we can index it as an array because it is an array: a read-only array of characters. When a string literal is used in an expression, it is treated as a pointer to the memory address of its first character, just as if it had been assigned to a `const char *` variable. Since it’s just a memory address, we can access it as an array.

There are a total of 16 hex characters. An unsigned char is 8 bits, which we split into two 4-bit parts. 4 bits can hold a value from 0 to 15, which gives 16 possible values, the same as the number of hex characters. The right shift by 4 isolates the high part, which becomes the first hex character, and the 0x0F mask isolates the low part, which becomes the second hex digit.
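As a worked example of that split, take the byte 0xAB, which is 1010 1011 in binary. The sketch below pulls the two halves apart and maps each to its character; `split_nibbles` and `nibble_to_hex` are hypothetical names used only for this illustration.

```c
/* Hypothetical helper: separates a byte into its high and low 4 bits. */
void split_nibbles(unsigned char byte, unsigned char *hi, unsigned char *lo)
{
	*hi = byte >> 4;    /* 1010 1011 >> 4   = 0000 1010 (10) */
	*lo = byte & 0x0F;  /* 1010 1011 & 0x0F = 0000 1011 (11) */
}

/* Hypothetical helper: maps a 4-bit value (0-15) to its hex character. */
char nibble_to_hex(unsigned char nibble)
{
	return "0123456789ABCDEF"[nibble & 0x0F];
}
```

For 0xAB the high part is 10 and the low part is 11, and indexing the digit string at those positions gives 'A' and 'B'.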

Hex to binary

int hexchr2bin(const char hex, char *out)
{
	if (out == NULL)
		return 0;

	if (hex >= '0' && hex <= '9') {
		*out = hex - '0';
	} else if (hex >= 'A' && hex <= 'F') {
		*out = hex - 'A' + 10;
	} else if (hex >= 'a' && hex <= 'f') {
		*out = hex - 'a' + 10;
	} else {
		return 0;
	}

	return 1;
}

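Every hex digit needs to be turned back into a 4-bit binary value, meaning 0 = 0, 1 = 1, … A = 10 … E = 14, F = 15. The base character of the digit’s range ('0', 'A' or 'a') is subtracted from the character, and for the alpha values 10 is added since they represent 10 and up. For example, 'C' - 'A' + 10 = 12. This calculation is based on the numeric values of each character in the ASCII text encoding table, where each of these ranges is contiguous. The function returns 1 on success and 0 if the character isn’t a valid hex digit.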
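If depending on the character encoding is a concern, the same mapping can be done by searching the digit string instead of doing arithmetic. This is only a sketch of an alternative, not a replacement for `hexchr2bin`; the name `hex_digit_value` and its return convention (the value 0-15, or -1 for an invalid character) are assumptions made for this example.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical alternative: the position of the character in the digit
 * string is its value. Returns 0-15, or -1 if c is not a hex digit. */
int hex_digit_value(char c)
{
	static const char digits[] = "0123456789ABCDEF";
	const char *p;

	if (c == '\0')
		return -1; /* strchr would otherwise match the terminator */

	/* Fold lowercase to uppercase so 'a'-'f' are accepted too. */
	p = strchr(digits, toupper((unsigned char)c));
	if (p == NULL)
		return -1;

	return (int)(p - digits);
}
```

The trade-off is a short linear search per character in place of a few comparisons, which rarely matters for small inputs.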

size_t hexs2bin(const char *hex, unsigned char **out)
{
	size_t len;
	char   b1;
	char   b2;
	size_t i;

	if (hex == NULL || *hex == '\0' || out == NULL)
		return 0;

	len = strlen(hex);
	if (len % 2 != 0)
		return 0;
	len /= 2;

	*out = malloc(len);
	if (*out == NULL)
		return 0;

	for (i=0; i<len; i++) {
		if (!hexchr2bin(hex[i*2], &b1) || !hexchr2bin(hex[i*2+1], &b2)) {
			/* Invalid digit: don't leak the partially filled buffer. */
			free(*out);
			*out = NULL;
			return 0;
		}
		(*out)[i] = (b1 << 4) | b2;
	}
	return len;
}

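The first thing we do is determine the size of the output buffer, which is half the length of the hex string, and allocate it. A string with an odd number of characters can’t be valid, so it’s rejected up front. Then we can move on to the main part, where we combine the two 4-bit values into one 8-bit unsigned character. The function returns the number of bytes decoded, or 0 on failure, and the caller is responsible for freeing the buffer.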
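Both steps can be looked at in isolation. The sketch below uses two hypothetical helpers, `decoded_len` and `join_nibbles`, that exist only for this illustration: the first computes the buffer size, the second combines two 4-bit values into one byte.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes a hex string of `hexlen`
 * characters decodes to, or 0 if the length is odd (invalid). */
size_t decoded_len(size_t hexlen)
{
	if (hexlen % 2 != 0)
		return 0;
	return hexlen / 2;
}

/* Hypothetical helper: the first digit's value moves into the high
 * 4 bits and the second digit's value fills the low 4 bits. */
unsigned char join_nibbles(unsigned char hi, unsigned char lo)
{
	return (unsigned char)(((hi & 0x0F) << 4) | (lo & 0x0F));
}
```

For the digits "AB", the values are 10 and 11: 10 shifted left by 4 is 0xA0, and OR-ing in 11 (0x0B) gives 0xAB.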

Testing

Finally, here is a simple test app to demonstrate the use of each function.

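int main(int argc, char **argv)
{
	const char    *a = "Test 123! - jklmn";
	char          *hex;
	unsigned char *bin = NULL;
	size_t         binlen;

	/* Encode the string's bytes and print the hex form. */
	hex = bin2hex((const unsigned char *)a, strlen(a));
	if (hex == NULL)
		return 1;
	printf("%s\n", hex);

	/* Decode it again; binlen is 0 if decoding failed. */
	binlen = hexs2bin(hex, &bin);
	if (binlen > 0)
		printf("%.*s\n", (int)binlen, (char *)bin);

	free(bin);
	free(hex);
	return 0;
}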

You might notice that the input variable `a` is a string. Well, it’s still data and we can treat it as binary. Using a string makes it easier to verify the decode, since we can print it out and see the result.