Lockless Inc

Dealing with Complexity

Programming is a difficult task. You typically need to solve a large and complex problem, while minimizing the possibility of making errors along the way. Exacerbating this, the requirements for the task can change over time, possibly invalidating earlier work in the process.

A large amount of programming is thus dealing with complexity. Unfortunately, there is no way to decrease the intrinsic complexity of a problem. Any program that implements a task must be complex enough to do so. However, not all programs are the same. Some appear to be much simpler and easier to understand than others. The questions for today are: "Why can this happen?", and "How can we minimize the complexity in the programs we write?".

The source-code for a program has not one, but two tasks. The first is the most obvious: a program is a series of instructions to a compiler or interpreter, which will then create something that follows those instructions exactly as given. Any mistakes in the instructions will directly correspond to errors in the computer's behaviour as it tries to carry out the task we wish to give it.

The second duty of the source-code is to communicate with programmers: either another programmer trying to read the code to see what it does, or yourself a few months from now, when you've forgotten exactly what you wrote. This is where complexity matters the most. The smaller the apparent complexity of a program, the easier it is to debug. Most of the time, the critical path in creating a program is not writing it, but debugging it. Thus, if we can minimize this form of complexity, significant gains in productivity can be had.

Thus we have two completely different measures of complexity. The first is fixed: the complexity of the task we wish to solve. The second is variable: the complexity of how we solve it. The two complexities do not have to be the same. For example, in modular programs, someone working on one module can be completely oblivious to the inner workings of another well-separated module. Thus even though the total complexity obviously depends on the sum total of all the modules, the "debugging" or "local" complexity can be much less.

Modular Programming

Programming languages are typically designed so that you can create new abstractions. By using abstractions recursively, we can break the problem we wish to solve into smaller and smaller parts. Conversely, we can implement more and more complex tasks by putting together a recursive set of building blocks.

Different languages implement different things, but the underlying concept is the same. Whether you use functions, objects, operators, types, classes, meta-functions, macros, meta-types, or whatever, the basic idea is that you can conceptually treat each building block as somewhat separate from its peers. The modularization then results in decreased complexity for the programmer.

So how much decrease can we get? What is the optimal break-up of abstractions? To discuss this, we will first need a model of how we wish to construct a program. One of the simplest that follows this concept is to have a recursive construction. We will define each layer of recursion to have n components, and use m layers. Thus the total expressible complexity is Ce = n^m.

Using the above program layout, we can then estimate the local complexity for a programmer looking at it while trying to track down a bug. Assume we know our bug lies somewhere on a branch in the recursive tree. How many things affect that branch? Well... at any given level, there are n modular constructs, and there are m levels. Thus we need to perhaps hold n × m things in our head in order to understand what is going on. So, in this case, the local complexity Cl ∝ n × m.
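As a minimal sketch of this model (the function names and the particular n, m values here are purely illustrative): two layouts with identical expressible complexity can have very different local complexities.

```python
# Sketch of the recursive-construction model: a program built from m layers,
# each combining n components, can express Ce = n**m behaviours, while a
# programmer tracing one branch holds roughly n * m things in mind.
def expressible_complexity(n, m):
    return n ** m

def local_complexity(n, m):
    return n * m

# Two layouts with the same expressible complexity 2**12 = 16**3 = 4096:
# a deep binary layout versus a shallow 16-way layout.
deep = (2, 12)     # n=2,  m=12
shallow = (16, 3)  # n=16, m=3
assert expressible_complexity(*deep) == expressible_complexity(*shallow) == 4096
print(local_complexity(*deep), local_complexity(*shallow))  # 24 versus 48
```

The deep layout halves the local complexity for the same expressible complexity, which is exactly the trade-off the minimization below makes precise.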

Given that Ce is fixed, how should we choose n and m so that Cl is minimized? This is a fairly basic math problem. First, take the log of both sides of the definition of the expressible complexity. Doing this we get:

log(Ce) = m log(n)

Now, we can re-arrange to get m as a function of n and Ce:

m = log(Ce) / log(n).

We can then substitute this expression for m into our model for the local complexity we wish to minimize. The result is that we want to minimize:

n log(Ce) / log(n)

Now, since log(Ce) is a constant multiplicative factor here, it can be removed, leading us to want to minimize the function: n/log(n)

Minimizing this isn't too difficult. Using simple calculus, the derivative of n/log(n) is (log(n) − 1)/log(n)^2. Setting this to zero gives log(n) = 1, so in this model the optimal n = e, where e is Euler's number, 2.71828...
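This result is easy to check numerically; a small sketch that scans a grid of n values for the minimum of n/log(n):

```python
import math

# Numerical check that n / log(n) is minimized at n = e (Euler's number):
# scan a fine grid of n > 1 and pick the grid point with the smallest value.
def objective(n):
    return n / math.log(n)

grid = [1.0 + 0.0001 * i for i in range(1, 100000)]  # n in (1, 11)
best = min(grid, key=objective)
print(best)  # close to e = 2.71828...
assert abs(best - math.e) < 0.001
```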

So, with this model, the optimal program should break down into two or three sub-programs recursively until reaching the most primitive language constructs. (This is similar to why computers use the binary number system. Binary numbers are the simplest way to express a result. Ternary may be better, but the circuits are larger, and don't make up for the decreased complexity elsewhere.) In addition, we find that the number of abstraction layers, m, should scale logarithmically with the complexity of the problem. Large programs can have huge inherent complexities, but the local complexities can be much smaller due to modularization.

So do real programs look like this? The answer is: sometimes. Certain styles of programming certainly lead to the construction of general building-blocks that can then be attached to each other in highly flexible ways. Modern functional programming often uses a very large number of small functions that together solve the required problem. However, other programming styles are quite different. They can use many more than two or three building blocks per layer of abstraction. Are these other styles inefficient? Or is there something else we are missing?

Large Programs

The previous model of local complexity works fairly well at describing small programs. However, it falls flat with large ones. In a large program, it is very rare to have a bug that requires the knowledge of the full software stack. (Of course, this can happen... but not often.) Instead, most bugs are also local in scope. We may only need to understand a few layers of abstraction at a time to work out what is wrong, and then how to fix it.

Thus our previous model of local complexity should be modified. We don't need to know every detail of every layer of abstraction; we just need to know about the layers we are working with. Thus instead of using Cl ∝ n × m, we now have Cl ∝ a n + m, where a is a constant that sets the relative weighting of the branch-out factor against the number of layers. If a is greater than one, we penalize having a large number of abstractions per layer. If a is smaller than one, we penalize having too many layers.

So what are the ideal n and m now? We again can do the math, firstly by solving for m in terms of the expressible complexity, and substituting in our new expression for the local complexity:

Cl ∝ a n + log(Ce)/log(n)

This is quite a bit more difficult to minimize, but it still can be done. Setting the derivative with respect to n to zero gives log(Ce) = a n log(n)^2. The result is:

n^(a n log(n)) = Ce, from which we can read off that m = log(Ce)/log(n) = a n log(n).

The above implicit expression for n cannot be inverted in terms of elementary functions. However, if we use the Lambert W function, we can get an analytic result:

n = exp(2 W(sqrt(log(Ce)/a)/2)), which we can then insert into the formula for m.
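As a sanity check on this closed form, the following sketch implements the principal branch of the Lambert W function with Newton's method (written out only for the check; a library routine such as SciPy's scipy.special.lambertw could be used instead), and compares the analytic n against a direct numerical minimization. The values a = 1 and log(Ce) = 20 are purely illustrative.

```python
import math

def lambert_w(x, tol=1e-12):
    """Principal branch of the Lambert W function (w * e**w = x) for x >= 0,
    computed by Newton's method."""
    w = math.log(1.0 + x)  # reasonable starting guess for x >= 0
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

def optimal_n(log_ce, a):
    # n = exp(2 W(sqrt(log(Ce)/a) / 2)) from the closed form above
    return math.exp(2.0 * lambert_w(math.sqrt(log_ce / a) / 2.0))

# Cross-check against a direct numerical minimization of a*n + log(Ce)/log(n).
log_ce, a = 20.0, 1.0
def local(n):
    return a * n + log_ce / math.log(n)

grid = [1.5 + 0.001 * i for i in range(20000)]  # n in [1.5, 21.5)
numeric = min(grid, key=local)
analytic = optimal_n(log_ce, a)
print(analytic, numeric)  # both near 6.1
assert abs(analytic - numeric) < 0.01
```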

The above isn't too enlightening. However, in the regime we are interested in, the graph of n versus log(Ce) is close to being linear. Thus without losing too much information, we can use a series expansion followed by numerical evaluation to obtain:

If a=0.1, then n = 8.4 + 0.6 log(Ce).
If a=1, then n = 3.2 + 0.15 log(Ce).
If a=10, then n = 1.7 + 0.04 log(Ce).

If a=0.1, then m = 1.3 + 0.3 log(Ce).
If a=1, then m = 2.9 + 0.4 log(Ce).
If a=10, then m = 7.6 + 0.7 log(Ce).
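Assuming natural logarithms throughout, the a = 1 fits above can be sanity-checked against a direct numerical minimization of the local complexity (log(Ce) = 20 is just an illustrative value):

```python
import math

# Sanity check of the a = 1 linear fits (n ≈ 3.2 + 0.15 log(Ce),
# m ≈ 2.9 + 0.4 log(Ce)): minimize a*n + log(Ce)/log(n) on a grid
# and compare the minimizer against the fitted straight lines.
a, log_ce = 1.0, 20.0

def local(n):
    return a * n + log_ce / math.log(n)

grid = [1.5 + 0.001 * i for i in range(20000)]  # n in [1.5, 21.5)
n_opt = min(grid, key=local)
m_opt = log_ce / math.log(n_opt)

n_fit = 3.2 + 0.15 * log_ce   # 6.2
m_fit = 2.9 + 0.40 * log_ce   # 10.9
print(n_opt, n_fit)  # both near 6
print(m_opt, m_fit)  # both near 11
assert abs(n_opt - n_fit) < 0.5
assert abs(m_opt - m_fit) < 0.5
```

The fits track the exact minimizer to within a few percent in this regime, which is all the linear approximation claims.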

So now both n and m depend on the log of the expressible complexity. The lower you weight the cost of having more abstractions per level, the more of them are favoured, and the fewer levels of recursion are required. If you plug in values of a and log(Ce) that seem reasonable, you get a branch-out factor of a little more than ten or so. This is quite a bit more than what is expected from the small-program analysis done in the previous section.

So how do these results look? To me, at least, they seem to describe a wide range of well-written software projects. The branch-out factor isn't too large (or too small), and the number of layers of abstraction also looks right. Perhaps now the use of too little abstraction (spaghetti code), or too many layers (lasagna code) can be quantified in a less abstract way. Poorly written interfaces can also be quantified: "If we rewrite this in a more orthogonal manner, we can reduce the exposed complexity by a factor of..."


Complexity is inherent in programming. However, the complexity of the task can be divorced from the complexity of debugging it through the use of modularization. The local complexity of a good program will scale logarithmically with the size of the problem it seeks to solve. Thus a large program doesn't have to be orders of magnitude more difficult to work on than a small one.

Due to finite-horizon effects, programmers don't need to know the full software stack in order to work on a large project. Thus the total complexity of working on small programs doesn't scale in the same way as the complexity of large programs. Programming styles that are optimal on small scales aren't necessarily so on large ones. People who take experience from small test-programs, and then seek to apply it to large tasks will most likely need to alter their habits in a quantifiable way.
