Lockless Inc

Mocking Functions in C

Unit-testing is important. You want to know if your code works the way it is designed to, and the only way to be sure is to test it. However, that code may be quite complex. It might call many other routines, and deal with other interfaces. If so, it can be difficult to test.

Mocking interfaces can make the job of testing much easier. Instead of your module calling others, it can call a "mock" interface that is much simpler. Your testing code can then interpose itself on all sides of the module you wish to test. By examining all outputs, and handling all inputs, the scope of effects is greatly diminished. The problem is finding the best way to do this.

To be more concrete, we want the functions in a module to call our testing mock-functions instead of the ones they would otherwise use. For example:


int bar(int x)
{
	int result = 0;
	
	for (int i = 0; i < 10; i++)
	{
		result += foo(i + x);
	}
	
	return result;
}

Pretend we want to test the function bar(), above. To do so, we want to make it call a mock version of foo(). Our "testing" version of foo() can be much simpler than the original. The original foo() might inspect a database, do a cpu-intensive calculation, or something else unwieldy to test. Thus we need some way to replace the call.

One way to do that is to use a macro. With simple text-replacing we can make bar() do different things depending on whether or not we are unit-testing:


#ifdef TEST
#define FOO mock_foo
#else
#define FOO foo
#endif

int mock_foo(int x)
{
	return x;
}

int bar(int x)
{
	int result = 0;
	
	for (int i = 0; i < 10; i++)
	{
		result += FOO(i + x);
	}
	
	return result;
}

So now bar() will call mock_foo() when TEST is #defined. This works, but it isn't as good as it could be. The problem is that different tests of bar() might want to mock the functions it calls in different ways, and the above only allows a single mock per build. We would like something more flexible. One way to add flexibility is to use a function pointer for extra indirection:


#ifdef TEST
#define FOO test_foo
#else
#define FOO foo
#endif

int (*test_foo)(int x) = foo;

int bar(int x)
{
	int result = 0;
	
	for (int i = 0; i < 10; i++)
	{
		result += FOO(i + x);
	}
	
	return result;
}

int mock_foo1(int x)
{
	return 0;
}

int mock_foo2(int x)
{
	return x;
}

int unittest(void)
{
	test_foo = mock_foo1;
	ASSERT(bar(3) == 0);
	
	test_foo = mock_foo2;
	ASSERT(bar(3) == 75);
	
	test_foo = foo;
	
	return 0;
}

The above is much better. We can test now in more complex ways, and thus be more convinced that our implementation is correct. However, the technique is not without its flaws. The first is that the tested version of the module isn't the same as the non-tested version. We are calling via a function pointer instead of using a function. The difference is small, but could cause the compiler to use different register and stack-slot allocations. It is conceivable that some (rare) bugs might be visible in the normal code, but not trigger in the unit tests.

Another problem is that it requires the function bar() to be compiled twice: once with testing, once without. If the code-base is large, the doubling of compile time can be an issue. We would like to minimize the amount of extra compilation if at all possible.

The third problem is more political. In some corporate settings, macros are greatly discouraged in code-bases, so the above might be disallowed due to that. Another possibility is that macros are forced to be in all-upper-case (just like shown in the example). This isn't too much of a problem if only a few interface functions are mocked. However, as more and more unit testing is added, more and more code can be "uglified" by the requirement. We would ideally like the tested code to be unaffected by the external requirements of the test harness.

There is another way to mock the interface. We can add a mock function that wraps calls to foo(). The wrapper function can test a function pointer to see if it should call the original function or not. Since it is a function and not a macro, the coding style issues disappear:


int bar(int x)
{
	int result = 0;
	
	for (int i = 0; i < 10; i++)
	{
		result += foo(i + x);
	}
	
	return result;
}


#ifdef TEST
int (*test_foo)(int x) = NULL;
int foo__(int x)
#else
int foo(int x)
#endif
{
	int result;
	
	/* Do something complex to calculate the result */
	
	return result;
}

#ifdef TEST
int foo(int x)
{
	if (test_foo) return test_foo(x);
	return foo__(x);
}
#endif

int mock_foo1(int x)
{
	return 0;
}

int mock_foo2(int x)
{
	return x;
}

int unittest(void)
{
	test_foo = mock_foo1;
	ASSERT(bar(3) == 0);
	
	test_foo = mock_foo2;
	ASSERT(bar(3) == 75);
	
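	/* Remove the mock; pointing test_foo at foo would make the wrapper call itself forever */
	test_foo = NULL;
	
	return 0;
}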

The above has different pros and cons compared to the previous method. One massive advantage is that bar() is completely unaffected by unit testing. We only need to compile it once. The problem is that the code for foo() is now "contaminated" by the need to mock it. We use as simple a wrapper as possible... but the function header is split up in a confusing way. There are other ways of using the C pre-processor to do the above, but all are just as ugly.
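For readers who want to try the wrapper technique in isolation, the following is a minimal single-file sketch rather than the layout used above. It assumes ASSERT behaves like the standard assert(), and it uses x * x as a stand-in for the real body of foo__(). Both are illustrative choices, as are the mock names. The test-only wrapper is compiled unconditionally here to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the ASSERT used in the text behaves like assert(). */
#define ASSERT(cond) assert(cond)

/* Hook consulted by the foo() wrapper; NULL means "call the real code". */
int (*test_foo)(int x) = NULL;

/* Stand-in for the real implementation (hypothetical body). */
int foo__(int x)
{
	return x * x;
}

/* Wrapper: dispatch to the mock if one is installed. */
int foo(int x)
{
	if (test_foo) return test_foo(x);
	return foo__(x);
}

/* The function under test is written exactly as it would be without mocking. */
int bar(int x)
{
	int result = 0;

	for (int i = 0; i < 10; i++)
	{
		result += foo(i + x);
	}

	return result;
}

int mock_zero(int x)
{
	(void)x;
	return 0;
}

int mock_identity(int x)
{
	return x;
}

int unittest(void)
{
	test_foo = mock_zero;
	ASSERT(bar(3) == 0);

	test_foo = mock_identity;
	ASSERT(bar(3) == 75);	/* (0 + 3) + (1 + 3) + ... + (9 + 3) */

	/* Uninstall the mock so later tests see the real foo__() again. */
	test_foo = NULL;
	ASSERT(bar(0) == 285);	/* 0*0 + 1*1 + ... + 9*9 */

	return 0;
}
```

Note that the mock is removed by storing NULL in test_foo. Storing foo there would make the wrapper call itself.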

There is also another subtle issue with the above technique. It works in the case shown, but sometimes will not be as easy. The big problem occurs when foo is a variable-argument function (e.g. printf()). If a 'v' version of that function exists (vprintf()), then we are okay. Our wrapper can construct a va_list, and then use that to call the 'v' version of the function. However, if one doesn't exist, we are stuck, and may be forced to create one ourselves.
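As a sketch of the va_list forwarding just described, consider a hypothetical printf-like function log_msg(). None of these names come from the code above: log_msg(), vlog_msg__(), test_vlog_msg and mock_vlog_msg() are invented for illustration, with the standard vsnprintf() playing the role of the real work. The mock hook takes a va_list, so the variadic wrapper can hand its arguments to either the mock or the real 'v' function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Optional mock taking a va_list; NULL means "use the real code". */
int (*test_vlog_msg)(char *buf, size_t len, const char *fmt, va_list ap) = NULL;

/* The real 'v' implementation (hypothetical: it just formats into buf). */
int vlog_msg__(char *buf, size_t len, const char *fmt, va_list ap)
{
	return vsnprintf(buf, len, fmt, ap);
}

/* Variadic wrapper: pack the arguments into a va_list and forward them. */
int log_msg(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int result;

	va_start(ap, fmt);
	if (test_vlog_msg)
		result = test_vlog_msg(buf, len, fmt, ap);
	else
		result = vlog_msg__(buf, len, fmt, ap);
	va_end(ap);

	return result;
}

/* A mock that ignores the format and arguments and reports failure. */
int mock_vlog_msg(char *buf, size_t len, const char *fmt, va_list ap)
{
	(void)fmt;
	(void)ap;
	if (len > 0) buf[0] = '\0';
	return -1;
}

int log_msg_demo(void)
{
	char buf[32];

	/* No mock installed: the real formatter runs and writes "7-ok". */
	assert(log_msg(buf, sizeof buf, "%d-%s", 7, "ok") == 4);
	assert(buf[0] == '7' && buf[3] == 'k' && buf[4] == '\0');

	/* Mock installed: the wrapper forwards the va_list to it instead. */
	test_vlog_msg = mock_vlog_msg;
	assert(log_msg(buf, sizeof buf, "%d", 1) == -1);
	assert(buf[0] == '\0');

	test_vlog_msg = NULL;
	return 0;
}
```

This only works because a va_list-taking variant is available to forward to. A variadic function with no such variant cannot be wrapped this way in portable C.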

Ideally, what we would like is to create some kind of automated wrapping technique that handles all function types, including ones with variable arguments. We would also like it to be not quite as ugly as the method shown. Minimal changes to the mocked functions is the goal.

Obviously, such a wrapper cannot be done in pure C. Pure C will fail to handle the variable arguments. Fortunately, assembly language is powerful and expressive enough to do what we want. Basically, we want to tail-call foo__() or *test_foo depending on whether or not test_foo is NULL. The tail-call means that arguments passed on the stack are not corrupted. An asm wrapper that does this looks like:


movq	test_foo(%rip), %r11
testq	%r11, %r11
je	foo__
jmp	*%r11

So we load the pointer into %r11 and test to see if it is zero. If it is, we jump to foo__ directly. Otherwise, we jump to the mock through the register. We choose the %r11 register because it is a "scratch" register as defined by the ABI. Other registers are owned by the caller or callee, and we can't use them here. Note that %rax appears at first glance to be free. However, it is used by the ABI to record the number of floating point values passed in registers to variable-argument routines, so it isn't always free.

The 32-bit version of the above is a bit more tricky. We don't have any spare registers we can use. (The caller might be using a "regparm" method, passing values in registers.) The trick here is to work out how to branch to our mock version. We can't use a simple indirect jump, as the register used for it will be corrupted. However, there is one other way to dynamically jump somewhere: the return instruction. It pops a value off of the stack, and jumps there. Thus by manipulating the stack, we can get control flow altered the way we need. Some code that does this looks like:


push	%edx
push	%edx
push	%eax
movl	test_foo, %eax
leal	foo__, %edx
test	%eax, %eax
cmove	%edx, %eax
mov	%eax, 8(%esp)
pop	%eax
pop	%edx
ret

The above reserves a stack slot for the dynamic return address, and then saves %edx and %eax by pushing them onto the stack. We then decide where to jump, and store the target in the reserved slot. Finally, we restore %eax and %edx and make our tail-call via a return instruction.

So how do we introduce the above into our .c file? It is assembly language, not C. Normally, you can only use inline assembly from within a function. However, we would like to do so from file scope. This is basically impossible... but there is one trick. The section __attribute__ pastes its string input directly into the asm output gcc is generating. So by inserting asm directives into that string, we can do all sorts of crazy things. The following shows how we can dynamically wrap functions for mocking:


#ifdef TESTING

#ifdef __x86_64__
#define MOCK(F) __attribute__((section(\
	".bss\n\t"\
	".globl test_" #F "\n\t"\
	".align 8\n\t"\
	".type	test_" #F ", @object\n\t"\
	".size	test_" #F ", 8\n"\
	"test_" #F ":\n\t"\
	".zero	8\n\t"\
	".text\n\t"\
	".p2align 4,,15\n\t"\
	".globl	" #F "\n\t"\
	".type	" #F ", @function\n"\
	#F ":\n\t"\
	".cfi_startproc\n\t"\
	"movq	test_" #F "(%rip), %r11\n\t"\
	"testq	%r11, %r11\n\t"\
	"je	" #F "__\n\t"\
	"jmp	*%r11\n\t"\
	".cfi_endproc\n\t"\
	".size	" #F ", .-" #F "\n\t"\
	".section	.text"))) F ## __

#else
#define MOCK(F) __attribute__((section(\
	".bss\n\t"\
	".globl test_" #F "\n\t"\
	".align 4\n\t"\
	".type	test_" #F ", @object\n\t"\
	".size	test_" #F ", 4\n"\
	"test_" #F ":\n\t"\
	".zero	4\n\t"\
	".text\n\t"\
	".p2align 4,,15\n\t"\
	".globl	" #F "\n\t"\
	".type	" #F ", @function\n"\
	#F ":\n\t"\
	".cfi_startproc\n\t"\
	"push %edx\n\t"\
	"push %edx\n\t"\
	"push %eax\n\t"\
	"movl test_" #F ", %eax\n\t"\
	"leal " #F "__, %edx\n\t"\
	"test %eax, %eax\n\t"\
	"cmove %edx, %eax\n\t"\
	"mov %eax, 8(%esp)\n\t"\
	"pop %eax\n\t"\
	"pop %edx\n\t"\
	"ret\n\t"\
	".cfi_endproc\n\t"\
	".size	" #F ", .-" #F "\n\t"\
	".section	.text"))) F ## __

#endif

#define DECLARE_MOCK(F) \
	extern typeof(F) F ## __, *test_ ## F

#else

#define MOCK(F) F
#define DECLARE_MOCK(F)

#endif

We've added extra directives so debugging and exceptions work correctly. Using the above is simple: place it into a "mocking" header, and then use the macros in the module to be mocked:


/* In header for module for foo */
int foo(int x);
DECLARE_MOCK(foo);


/* In C file for module for foo */
int MOCK(foo)(int x)
{
	int result;
	
	/* Do something complex to calculate the result */
	
	return result;
}

Notice how little the code needs to be changed now. We just add one line into a header file, and edit the definition of the mocked functions a tiny bit. The magic macros will declare test_foo and foo__ for us. Since foo isn't a macro, functions calling it do not need to be altered. We can test the same object files as are used by the release version, and our source-code remains clean.

Comments

said...
$ gcc -o project sources*.o wrapit.o -Wl,--wrap,malloc

$ cat wrapit.c

#include <unistd.h>
void * __real_malloc(size_t size);
void * __wrap_malloc(size_t size)
{
    write(2, "Hi, I'm your malloc wrapper!\n", 29);
    return __real_malloc(size);
}
Don said...
so, if we are mocking foo, we still need to modify foo's header in foo.c and foo.h. everything that calls foo will remain the same but foo's source will have to be changed. do i have this correct?

int foo(int x);
DECLARE_MOCK(foo);


/* In C file for module for foo */
int MOCK(foo)(int x)


I am trying to unit test legacy code and there doesn't seem to be a way to mock a function without modifying the legacy code base in some way.
Gabor Marton said...
Hi,

Quite good article.
I am doing something very similar but with the help of the compiler (Clang), so we really can avoid any source code modification:
https://github.com/martong/finstrument_mock
It would be great to hear your opinion on my compiler instrumentation approach.

rpg said...
The article was really helpful.