# Booleans

A Boolean is a simple data type that has two states; TRUE and FALSE. However, the representations of these states can differ. One possibility is that any non-zero value is defined to be TRUE, and zero is false. Another is that TRUE always takes the value of 1, and yet another is that TRUE is always -1. Each of these possibilities has different strengths and weaknesses.

The flexibility of the non-zero TRUE case can be very helpful when writing code. However, the greater specificity of the other cases can lead to greater optimization. Different computer languages make different choices here. The C programming language uses the more flexible definition in `if` statements, and a less flexible one for `_Bool` types and the results from boolean expressions.

### C-like Booleans

The problem is how to efficiently evaluate boolean expressions when given the flexible inputs. (The more concrete cases are easy as the bitwise `and`, `or`, `not` and `xor` work correctly.) The first thing we may need to do is to work out how to go from a flexible input into just 1 for TRUE, and 0 for FALSE.

Such a "clamp" or "booleanize" operation looks like:

``````
int booleanize(int x)
{
return !!x;
}
``````
in C. Fortunately, this is relatively easy to optimize. We just compare with zero, and set the result to be 1 or 0 based on the comparison. The `setcc` instructions are perfect for this.

``````
booleanize:
xor %eax, %eax
test %edi, %edi
setne %al
retq
``````

Similarly, the Boolean `not` operation can be implement by inverting the sense of the `setcc` instruction:

``````
not:
xor %eax, %eax
test %edi, %edi
sete %al
retq
``````

By using the above two routines, it is easy to convert any Boolean expression in to ones where bitwise operations can compute the results. However, often we can do better than having to use multiple `setcc` instructions. The trick is to notice when the condition flags are set the way we want after an operation.

A Boolean `or` can be computed without any "Booleanization" because the result of a bitwise `or` will be non-zero if the result of a Boolean `or` is also TRUE. This means that the zero flag will be set if and only if the result of a bitwise/Boolean or is zero. Thus the result is very simple to implement:

``````
or:
xor %eax, %eax
or %edi, %esi
setne %al
retq
``````

Similarly, a `nor` operation can be calculated:

``````
nor:
xor %eax, %eax
or %edi, %esi
sete %al
retq
``````

Unfortunately, no other Boolean operation are as compact. The reason is that, for example, a bitwise `and` of two TRUE values 2 and 4 will be zero (FALSE), even though the Boolean `and` should be TRUE. Similar problems occur when the `xor` operation is used.

However, if we can cheaply convert one of the inputs to the `and` into the -1 TRUE, 0 FALSE scheme, then the bitwise `and` will work properly. This magic can be done via the `neg` instruction. It will set the carry flag if the input was non-zero. Finally, the `sbb` instruction can be used to copy the carry flag into all bits of a register. The resulting optimized code looks like:

``````
and:
neg %edi
sbb %edi, %edi
xor %eax, %eax
and %esi, %edi
setne %al
retq
``````

The above can be modified in a couple of ways. Firstly, we can obviously toggle the sense of the output `setcc` instruction to implement a `nand` operation:

``````
nand:
neg %edi
sbb %edi, %edi
xor %eax, %eax
and %esi, %edi
sete %al
retq
``````

Secondly, we can use a comparison or subtraction to set the carry flag on zero instead of non-zero. This allows us to implement the following operation:

``````
int other(int x, int y)
{
return !x && y; /* or !(x || !y) */
}

other:
sub \$1, %esi
sbb %esi, %esi
xor %eax, %eax
or %edi, %esi
sete %al
retq
``````

Finally, the above can have its output inverted to yield:

``````
int nother(int x, int y)
{
return x || !y; /* or !(!x && y) */
}

nother:
sub \$1, %esi
sbb %esi, %esi
xor %eax, %eax
or %edi, %esi
setne %al
retq
``````

This covers many Boolean operations on two variables. However, there are a couple more interesting ones left. The first is the comparison of two Booleans. In C, this can be expressed as:

``````
int cmp(int x, int y)
{
return !x == !y;
}
``````
The above can be computed by converting to Booleans, then using an `xor` or `cmp` instruction. However, be using a property of addtion, we can create something faster and more compact. (The idea is that addition acts like xor on the lowest bit... and xor acts like an inverted comparison operation.)

``````
cmp:
neg %edi
sbb %eax, %eax
neg %esi
and \$1, %eax
retq
``````

Similarly, the result of an inverted Boolean comparison can be obtained:

``````
int ncmp(int x, int y)
{
return !x != !y;
}

ncmp:
neg %edi
sbb %eax, %eax
neg %esi
and \$1, %eax
retq
``````

The above examples show how ordinary Boolean operations can be optimized in C-like languages. How well does your compiler do?

### Booleans with TRUE = -1

When TRUE is always -1, we have a different problem. We don't need to worry about doing operations on a Boolean as the bitwise ones will work just as well. We need to think about optimizing the initial comparisons which create Boolean values instead.

As shown above, the `sbb` instruction is very useful. It allows us to convert the value within the carry flag into -1 or 0 within a register. This can be much faster and more compact than the normal way of using a inverted `setcc` instruction followed by a subtraction of 1. We just need to get the result of the comparison within the carry flag. Fortunately, for many cases this isn't too hard.

Testing for (in)equality with zero is easy:

``````
int eq(int x)
{
return -(x == 0);
}

eq:
cmp \$1, %edi
sbb %eax, %eax
retq

int ne(int x)
{
return -(x != 0);
}

ne:
neg %edi
sbb %eax, %eax
retq
``````

Special cases also exist for signed comparison with zero.

``````
int less(int x)
{
return -(x < 0);
}

less:
sar \$31, %edi
mov %edi, %eax
retq

int le(int x)
{
return -(x <= 0);
}

le:
cmp \$0x80000001, %edi
sbb %eax, %eax
retq

int greater(int x)
{
return -(x > 0);
}

greater:
neg %edi
sar \$31, %edi
mov %edi, %eax
retq

int ge(int x)
{
return -(x >= 0);
}

ge:
mov \$-1, %eax
shr \$31, %edi
retq
``````

The above can be extended for comparison with arbitrary constants, rather than just zero. In what follows, we'll use the value 137, but almost any other will work. However, be careful in that some edge cases dealing with INT_MIN or INT_MAX may fail.

Firstly, equality and inequality:

``````
int eq137(int x)
{
return -(x == 137);
}

eq137:
sub \$137, %edi
sub \$1, %edi
sbb %eax, %eax
retq

int ne137(int x)
{
return -(x != 137);
}

ne137:
sub \$137, %edi
neg %edi
sbb %eax, %eax
retq
``````

Next signed inequalities. (Noting that you can easily convert <= and >= into < and > with slightly different constants.)

``````
int less137s(int x)
{
return -(x < 137);
}

less137s:
cmp \$137+0x7fffffff, %edi
sbb %eax, %eax
retq

int greater137s(int x)
{
return -(x > 137);
}

greater137s:
sbb %eax, %eax
retq
``````

Finally, unsigned comparison is even simpler:

``````
int less137u(unsigned x)
{
return -(x < 137);
}

less137u:
cmp \$137-1, %edi
sbb %eax, %eax
retq

int greater137u(unsigned x)
{
return -(x > 137);
}

greater137u:
sbb %eax, %eax
retq
``````

### The Carry Flag

The above makes use of the carry flag extensively. Due to the existence of the `sbb` and `adc` instructions, that single bit is the most important Boolean that exists within a computer. There are many optimizations that can be done if you are doing a conditional increment of decrement of a variable. You just need to set the carry flag with the result of the comparison, and then use the above instructions.

The only problem to worry about is the memory model for your computer language. The following identities assume that the number of writes to a given variable can be changed. This may be true for local automatic variables which have not had their address taken. Global variables are a different matter. Another thread might be reading/writing to them simultaneously with regards to this code.

Taking that limitation into account, we can show conditional increments or decrements if a variable is equal to zero:

``````
/* if (x) x++; */
sub \$1, %eax
sbb \$-2, %eax

/* if (x) x--; */
sub \$1, %eax
``````

The above can be slightly altered into the following identities:

``````
/* if (!x) x = 1; */
sub \$1, %eax

/* if (!x) x = -1; */
sub \$1, %eax
sbb \$-1, %eax
``````

Conditional increments or decrements as the result of unsigned comparisons are easy to implement. (Where again we default to testing with the constant 137.)

``````
/* if (x < 137) x++; */
cmp \$137, %eax

/* if (x < 137) x--; */
cmp \$137, %eax
sbb \$0, %eax

/* if (x > 137) x++; */

/* if (x > 137) x--; */
sbb \$-137, %eax
``````

Unfortunately, doing this with signed comparisons is much more complex. They set the overflow and sign flags instead of the carry flag. Thus, we need a trick to get the result where we want it. We fix things by changing the range of value so that INT_MAX corresponds to 0xffffffff, and INT_MIN to 0 by toggling the sign bit. This can be done by adding 0x80000000 to the integer we wish to compare. (Similarly, xoring or subtracting will also work to toggle the sign bit.)

Signed comparisons with zero can be simpler than other values:

``````
/* if (x < 0) x++; */

/* if (x < 0) x--; */
sbb \$0x80000000, %eax

/* if (x > 0) x++; */

/* if (x > 0) x--; */
sbb \$-1, %eax

/* if (x >= 0) x++; */
sbb \$0x7fffffff, %eax

/* if (x >= 0) x--; */

/* if (x <= 0) x++; */
sbb \$-2, %eax

/* if (x <= 0) x--; */
``````

Of course, we can also obtain instruction sequences containing comparisons with arbitrary constants. Again, be careful with INT_MIN or INT_MAX, where the following will fail.

``````
/* if (x < 137) x++; */
sbb \$-137 - 1, %eax

/* if (x < 137) x--; */

/* if (x > 137) x++; */
add \$0x80000000 - 137 - 1, %eax

/* if (x > 137) x--; */
add \$0x80000000 - 137 - 1, %eax
sbb \$-137 - 1, %eax
``````

### Summary

Booleans are a fundamental part of computation. From the carry flag through to the flexible Booleans of the C programming language, they are used everywhere. This article shows how common expressions can be optimized by using their properties. How many of these does your compiler know? 