Looping Through Bytes to Check for Bits

Checking for bits in 1 byte is easy. Checking in 2 bytes is also easy. Checking a number of bits that doesn’t line up with byte boundaries, spread across a variable number of bytes, isn’t so easy. The hard part is the boundary between bytes, where we need to move from one byte to the next.

Let’s say we have 3 bytes. We need to count the number of bits set in the first 19 bits, counting from the right (the least significant end).

First we need the block of bytes we want to look at.

24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
 1  1  1  0  1  0  0  1  0  0  0  0  0  1  1 1 0 1 0 1 0 0 0 0

This breaks down to 3 bytes with the following values for each byte.

byte 1 = 11101001 = 233
byte 2 = 00000111 = 7
byte 3 = 01010000 = 80

Let’s see how we can check the bits.

#include <stdio.h>

static const size_t last_bit = 19;

int main(int argc, char **argv)
{
    unsigned char data[3] = { 233, 7, 80 };
    size_t total_blocks   = sizeof(data)/sizeof(*data);
    size_t block_size     = sizeof(*data)*8;
    size_t idx;
    size_t bit_num;
    size_t cnt            = 0;
    size_t i;

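    /* Walk from bit index last_bit-1 (the 19th bit) down to bit index 0. */
    for (i=last_bit; i-->0; ) {
        /* The last element holds the lowest bits, so count back from the end. */
        idx     = total_blocks - 1 - (i / block_size);
        /* Position of the bit inside that element. */
        bit_num = i - (i / block_size * block_size);

        if (data[idx] & (1 << bit_num)) {
            cnt++;
        }
    }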
    printf("%zu\n", cnt);

    return 0;
}

The count is 6 bits.

Now let’s look at the code in depth.

static const size_t last_bit = 19;

last_bit is a constant set to 19 because this is the number of bits that should be checked. Bits are numbered from right to left starting at 0, so the bits checked are indexes 0 - 18.

unsigned char data[3] = { 233, 7, 80 };
short data[2]         = { 233, 1872 };

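These are the bits broken out into three bytes. The second line isn’t in the code, but it is another way to represent the same bits. A short is typically 2 bytes (16 bits), so 2 shorts are needed to hold the 3 bytes of data: 233 sits in the low byte of the first short, and 7 * 256 + 80 = 1872 is the second. Either of these two could be used in the code. The code is generic enough that the element type doesn’t matter, because the number of blocks and the block size are both derived from the array with sizeof. The one limit is that an element can’t be wider than an int, since the mask 1 << bit_num is an int.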

size_t total_blocks = sizeof(data)/sizeof(*data);

We need to know how many elements are in data. With char or unsigned char this could be just sizeof(data), because each element is 1 byte. This won’t work with short (or any other type that’s larger than 1 byte) because sizeof returns the total number of bytes in the array. For the short version sizeof(data) returns 4, so we need to divide by the size of a single element in the array. That gives us the total number of elements in the array: 4/2 = 2 for the short data, which has 2 elements.

size_t block_size = sizeof(*data)*8;

This determines the number of bits in each element of the data array. sizeof returns the number of bytes in the element, and 8 is the number of bits in a byte. The standard header limits.h provides this value as CHAR_BIT, which is 8 on virtually every current platform.

for (i=last_bit; i-->0; ) {

Start with the number of bits we want to count and count down to 0. The loop starts with i set to last_bit (19). Before each pass it checks whether i is greater than 0 and then decrements i by 1, so the loop body sees the values 18 down to 0. A full explanation of this looping method is here.

idx = total_blocks - 1 - (i / block_size);

Determine which element we are looking at. Because data[0] holds the leftmost (highest) bits, the element index counts back from the last element. The loop goes from left to right, skipping over any bytes and bits that are not part of the segment we are looking for.

To clarify, picture the bits as 24 … 19 … 1. The loop starts at bit 19 and goes bit by bit from left to right. Even though bits are counted from right to left, we’ve already determined which bit is the 19th (the one we need to start on to have the first 19 bits checked).

bit_num = i - (i / block_size * block_size);

Now we need to know which bit within the block we need to check. This is based on the total number of bits in the data type: 8 for char, 16 for short. Again this goes left to right. As element boundaries are passed, the bit number restarts at the top bit of the next element. If the starting point skips some bits of the first element, the bit number still starts at the correct position within it (here, bit 2 of data[0]).

if (data[idx] & (1 << bit_num)) {

Check the bit at the given position in the given element of data. The code will then increment the count if it is set.

This code is an example of how to work with bits in an array. This is very useful when the data can’t simply be put into a larger data type. For example, 3 bytes fit in an int, so it is possible to simply use an int and shift by the number of bits necessary. But if you have 64 bytes, there isn’t a data type this would fit into, so you need to split the data into an array.