Introduction
Something that comes up surprisingly often is traversing an array backwards.
Maybe you’re emptying a queue. How about my personal favorite, reversing the
order of elements. Counting in a for
loop is so common you just don’t think
about it. But counting backwards can lead to issues if you don’t do it right.
The Wrong Way
When you count down to and include 0 with an unsigned integer you’d think you could do something like this:
size_t len;
size_t i;
for (i=len-1; i>=0; i--)
...
This leads to an infinite loop because i
will be decremented when i == 0
. Instead of
i
becoming -1 it will wrap around because it’s unsigned and an unsigned type can never
be negative. So i
becomes the largest value for the data type. With i
always being
greater than or equal to zero this loop will never exit.
The next possible solution is to case to a signed type like so
size_t len
int i;
for (i=(int)len-1; i>=0; i--)
...
This will work but now we get into the whole 64 bit indexes can exceed the size
of an int
issue which I’m sure you know all about. If len
is greater than INT_MAX
you
will undoubtedly end up with the int
wrapping to a negative value before the loop even start.
And it won’t run at that point because i
is already less than 0. I realize that signed wrapping
is undefined but if you’ve ever used C you’ll notice that pretty much every modern and common
compiler will wrap signed variables in the same manner as unsigned.
Assuming a 64 bit system and a 8 byte size_t
, kake a look at this to see what I mean:
int main(int argc, char **argv)
{
size_t len;
int i;
len = INT_MAX;
len += 7;
i = len;
printf("len=%zu, i=%d\n", len, i);
}
Output:
$ ./a.out
len=2147483654, i=-2147483642
Another idea is instead of using an int
use an ssize_t
. This will work because ssize_t
is
like size_t
right? They both have size_t
in their name… The s in ssize_t
stands for
signed so this won’t work either. Just like any other signed and unsigned types the signed
version has effectively half the positive range as the unsigned version. While the range
might be larger than an int
you still have the same problems.
The Right Way
We can “abuse” the comparison to do the decrement and solve all our problems.
size_t len;
size_t i;
for (i=len; i-->0; )
...
This will properly count down from size, and include 0. Realize the first value
of i
is not len
but will be len-1
. You don’t have to do the -1 yourself
because the loop will take care of that for you.
The key is i--
will take place after the comparison with 0. So i
in the loop
body will be one less than what was compared. When len = 2
, i
will be set
to 2, and compared to 0. 2 is larger than zero but after the comparison i
will be decremented to 1 and the loop will be entered. The next iteration i
will be 1 which is greater than 0, decremented to 0 and the loop will be
entered. The next iteration will compare i
> 0 which it is not because i
is
0 so the loop will stop. This will to set i
to -1 (which wraps to the
largest value for the type) but the loop will have already existed due to the
decrement happening after the check so it doesn’t matter what i
is set to at
that point.
Since i
wraps when the loop fully runs, we can tell if the loop exited early
by checking if i >= len
.
Testing
Lets look at an example:
int main(int argc, char **argv)
{
size_t len = 7;
size_t i;
for (i=len; i-->0; ) {
printf("On i=%zu\n", i);
}
printf("End i=%zu\n", i);
printf("---\n");
for (i=len; i-->0; ) {
printf("On i=%zu\n", i);
if (i == 4) {
break;
}
}
printf("End i=%zu\n", i);
return 0;
}
Output:
On i=6
On i=5
On i=4
On i=3
On i=2
On i=1
On i=0
End i=18446744073709551615
---
On i=6
On i=5
On i=4
End i=4
After the first loop exists we get i
as the max size of a size_t
. This is
exactly what we expect to see.
The second loop exits early and i
is still the value of that iteration.
by looking for i >= len
we can see if the loop exited early and because
i
is set to the last iteration value we know exactly when it exited.