Lockless Inc

Booleans

A Boolean is a simple data type that has two states; TRUE and FALSE. However, the representations of these states can differ. One possibility is that any non-zero value is defined to be TRUE, and zero is false. Another is that TRUE always takes the value of 1, and yet another is that TRUE is always -1. Each of these possibilities has different strengths and weaknesses.

The flexibility of the non-zero TRUE case can be very helpful when writing code. However, the greater specificity of the other cases can lead to greater optimization. Different computer languages make different choices here. The C programming language uses the more flexible definition in if statements, and a less flexible one for _Bool types and the results from boolean expressions.

C-like Booleans

The problem is how to efficiently evaluate boolean expressions when given the flexible inputs. (The more concrete cases are easy as the bitwise and, or, not and xor work correctly.) The first thing we may need to do is to work out how to go from a flexible input into just 1 for TRUE, and 0 for FALSE.

Such a "clamp" or "booleanize" operation looks like:


int booleanize(int x)
{
	return !!x;
}
in C. Fortunately, this is relatively easy to optimize. We just compare with zero, and set the result to be 1 or 0 based on the comparison. The setcc instructions are perfect for this.


booleanize:
	xor %eax, %eax
	test %edi, %edi
	setne %al
	retq

Similarly, the Boolean not operation can be implement by inverting the sense of the setcc instruction:


not:
	xor %eax, %eax
	test %edi, %edi
	sete %al
	retq

By using the above two routines, it is easy to convert any Boolean expression in to ones where bitwise operations can compute the results. However, often we can do better than having to use multiple setcc instructions. The trick is to notice when the condition flags are set the way we want after an operation.

A Boolean or can be computed without any "Booleanization" because the result of a bitwise or will be non-zero if the result of a Boolean or is also TRUE. This means that the zero flag will be set if and only if the result of a bitwise/Boolean or is zero. Thus the result is very simple to implement:


or:
	xor %eax, %eax
	or %edi, %esi
	setne %al
	retq

Similarly, a nor operation can be calculated:


nor:
	xor %eax, %eax
	or %edi, %esi
	sete %al
	retq

Unfortunately, no other Boolean operation are as compact. The reason is that, for example, a bitwise and of two TRUE values 2 and 4 will be zero (FALSE), even though the Boolean and should be TRUE. Similar problems occur when the xor operation is used.

However, if we can cheaply convert one of the inputs to the and into the -1 TRUE, 0 FALSE scheme, then the bitwise and will work properly. This magic can be done via the neg instruction. It will set the carry flag if the input was non-zero. Finally, the sbb instruction can be used to copy the carry flag into all bits of a register. The resulting optimized code looks like:


and:
	neg %edi
	sbb %edi, %edi
	xor %eax, %eax
	and %esi, %edi
	setne %al
	retq

The above can be modified in a couple of ways. Firstly, we can obviously toggle the sense of the output setcc instruction to implement a nand operation:


nand:
	neg %edi
	sbb %edi, %edi
	xor %eax, %eax
	and %esi, %edi
	sete %al
	retq

Secondly, we can use a comparison or subtraction to set the carry flag on zero instead of non-zero. This allows us to implement the following operation:


int other(int x, int y)
{
	return !x && y; /* or !(x || !y) */
}

other:
	sub $1, %esi
	sbb %esi, %esi
	xor %eax, %eax
	or %edi, %esi
	sete %al
	retq

Finally, the above can have its output inverted to yield:


int nother(int x, int y)
{
	return x || !y; /* or !(!x && y) */
}

nother:
	sub $1, %esi
	sbb %esi, %esi
	xor %eax, %eax
	or %edi, %esi
	setne %al
	retq

This covers many Boolean operations on two variables. However, there are a couple more interesting ones left. The first is the comparison of two Booleans. In C, this can be expressed as:


int cmp(int x, int y)
{
	return !x == !y;
}
The above can be computed by converting to Booleans, then using an xor or cmp instruction. However, be using a property of addtion, we can create something faster and more compact. (The idea is that addition acts like xor on the lowest bit... and xor acts like an inverted comparison operation.)


cmp:
	neg %edi
	sbb %eax, %eax
	neg %esi
	adc $1, %eax
	and $1, %eax
	retq

Similarly, the result of an inverted Boolean comparison can be obtained:


int ncmp(int x, int y)
{
	return !x != !y;
}

ncmp:
	neg %edi
	sbb %eax, %eax
	neg %esi
	adc $0, %eax
	and $1, %eax
	retq

The above examples show how ordinary Boolean operations can be optimized in C-like languages. How well does your compiler do?

Booleans with TRUE = -1

When TRUE is always -1, we have a different problem. We don't need to worry about doing operations on a Boolean as the bitwise ones will work just as well. We need to think about optimizing the initial comparisons which create Boolean values instead.

As shown above, the sbb instruction is very useful. It allows us to convert the value within the carry flag into -1 or 0 within a register. This can be much faster and more compact than the normal way of using a inverted setcc instruction followed by a subtraction of 1. We just need to get the result of the comparison within the carry flag. Fortunately, for many cases this isn't too hard.

Testing for (in)equality with zero is easy:


int eq(int x)
{
	return -(x == 0);
}

eq:
	cmp $1, %edi
	sbb %eax, %eax
	retq

int ne(int x)
{
	return -(x != 0);
}

ne:
	neg %edi
	sbb %eax, %eax
	retq

Special cases also exist for signed comparison with zero.


int less(int x)
{
	return -(x < 0);
}

less:
	sar $31, %edi
	mov %edi, %eax
	retq

int le(int x)
{
	return -(x <= 0);
}

le:
	add $0x80000000, %edi
	cmp $0x80000001, %edi
	sbb %eax, %eax
	retq

int greater(int x)
{
	return -(x > 0);
}

greater:
	neg %edi
	sar $31, %edi
	mov %edi, %eax
	retq

int ge(int x)
{
	return -(x >= 0);
}

ge:
	mov $-1, %eax
	shr $31, %edi
	add %edi, %eax
	retq

The above can be extended for comparison with arbitrary constants, rather than just zero. In what follows, we'll use the value 137, but almost any other will work. However, be careful in that some edge cases dealing with INT_MIN or INT_MAX may fail.

Firstly, equality and inequality:


int eq137(int x)
{
	return -(x == 137);
}

eq137:
	sub $137, %edi
	sub $1, %edi
	sbb %eax, %eax
	retq

int ne137(int x)
{
	return -(x != 137);
}

ne137:
	sub $137, %edi
	neg %edi
	sbb %eax, %eax
	retq

Next signed inequalities. (Noting that you can easily convert <= and >= into < and > with slightly different constants.)


int less137s(int x)
{
	return -(x < 137);
}

less137s:
	add $0x80000000, %edi
	cmp $137+0x7fffffff, %edi
	sbb %eax, %eax
	retq

int greater137s(int x)
{
	return -(x > 137);
}

greater137s:
	add $0x80000000, %edi
	add $-137+0x7fffffff, %edi
	sbb %eax, %eax
	retq

Finally, unsigned comparison is even simpler:


int less137u(unsigned x)
{
	return -(x < 137);
}

less137u:
	cmp $137-1, %edi
	sbb %eax, %eax
	retq

int greater137u(unsigned x)
{
	return -(x > 137);
}

greater137u:
	add $-137-1, %edi
	sbb %eax, %eax
	retq

The Carry Flag

The above makes use of the carry flag extensively. Due to the existence of the sbb and adc instructions, that single bit is the most important Boolean that exists within a computer. There are many optimizations that can be done if you are doing a conditional increment of decrement of a variable. You just need to set the carry flag with the result of the comparison, and then use the above instructions.

The only problem to worry about is the memory model for your computer language. The following identities assume that the number of writes to a given variable can be changed. This may be true for local automatic variables which have not had their address taken. Global variables are a different matter. Another thread might be reading/writing to them simultaneously with regards to this code.

Taking that limitation into account, we can show conditional increments or decrements if a variable is equal to zero:


/* if (x) x++; */
	sub $1, %eax
	sbb $-2, %eax

/* if (x) x--; */
	sub $1, %eax
	adc $0, %eax

The above can be slightly altered into the following identities:


/* if (!x) x = 1; */
	sub $1, %eax
	adc $1, %eax

/* if (!x) x = -1; */
	sub $1, %eax
	sbb $-1, %eax

Conditional increments or decrements as the result of unsigned comparisons are easy to implement. (Where again we default to testing with the constant 137.)


/* if (x < 137) x++; */
	cmp $137, %eax
	adc $0, %eax

/* if (x < 137) x--; */
	cmp $137, %eax
	sbb $0, %eax

/* if (x > 137) x++; */
	add $-137, %eax
	adc $137, %eax

/* if (x > 137) x--; */
	add $-137, %eax
	sbb $-137, %eax

Unfortunately, doing this with signed comparisons is much more complex. They set the overflow and sign flags instead of the carry flag. Thus, we need a trick to get the result where we want it. We fix things by changing the range of value so that INT_MAX corresponds to 0xffffffff, and INT_MIN to 0 by toggling the sign bit. This can be done by adding 0x80000000 to the integer we wish to compare. (Similarly, xoring or subtracting will also work to toggle the sign bit.)

Signed comparisons with zero can be simpler than other values:


/* if (x < 0) x++; */
	add $0x80000000, %eax
	adc $0x80000000, %eax

/* if (x < 0) x--; */
	add $0x80000000, %eax
	sbb $0x80000000, %eax

/* if (x > 0) x++; */
	add $0x80000000, %eax
	add $0x7fffffff, %eax
	adc $1, %eax

/* if (x > 0) x--; */
	add $0x80000000, %eax
	add $0x7fffffff, %eax
	sbb $-1, %eax

/* if (x >= 0) x++; */
	add $0x80000000, %eax
	sbb $0x7fffffff, %eax

/* if (x >= 0) x--; */
	add $0x80000000, %eax
	adc $0x7fffffff, %eax

/* if (x <= 0) x++; */
	add $0x80000000, %eax
	add $0x7fffffff, %eax
	sbb $-2, %eax

/* if (x <= 0) x--; */
	add $0x80000000, %eax
	add $0x7fffffff, %eax
	adc $0, %eax

Of course, we can also obtain instruction sequences containing comparisons with arbitrary constants. Again, be careful with INT_MIN or INT_MAX, where the following will fail.


/* if (x < 137) x++; */
	add $0x80000000, %eax
	add $0x80000000 - 137, %eax
	sbb $-137 - 1, %eax

/* if (x < 137) x--; */
	add $0x80000000, %eax
	add $0x80000000 - 137, %eax
	adc $137 - 1, %eax

/* if (x > 137) x++; */
	add $0x80000000, %eax
	add $0x80000000 - 137 - 1, %eax
	adc $137 + 1, %eax

/* if (x > 137) x--; */
	add $0x80000000, %eax
	add $0x80000000 - 137 - 1, %eax
	sbb $-137 - 1, %eax

Summary

Booleans are a fundamental part of computation. From the carry flag through to the flexible Booleans of the C programming language, they are used everywhere. This article shows how common expressions can be optimized by using their properties. How many of these does your compiler know?

Comments

said...
 <span class="notranslate" onmouseover="_tipon(this)" onmouseout="_tipoff()"><span class="google-src-text" style="direction: ltr; text-align: left">Enter your comments here</span> Inserisci i tuoi commenti qui</span>

Enter the 10 characters above here


Name
About Us Returns Policy Privacy Policy Send us Feedback
Company Info | Product Index | Category Index | Help | Terms of Use
Copyright © Lockless Inc All Rights Reserved.