Measuring Memory Usage
How do you work out the total memory used by a process at run-time? One way is to use the malloc_stats() function. If your allocator implements it, it prints the memory used, perhaps together with other information about the allocator's internal state, in a human-readable format. Unfortunately, the format of this information is not standardized. It is also written to stderr, so it is not directly accessible to the program.
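If the report is needed in a file rather than on the terminal, one workaround is to redirect stderr around the call. The following is only a sketch, assuming glibc's malloc_stats() from <malloc.h>; the helper name capture_malloc_stats and the output path passed to it are made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <malloc.h>     /* malloc_stats(): glibc-specific (assumption) */
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: run malloc_stats() with stderr temporarily
 * redirected to `path`. Returns the number of bytes written to the
 * file, or -1 on error. */
static long capture_malloc_stats(const char *path)
{
    struct stat st;
    int saved, fd;

    assert(path != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;

    fflush(stderr);
    saved = dup(STDERR_FILENO);             /* remember the real stderr */
    if (saved < 0 || dup2(fd, STDERR_FILENO) < 0)
    {
        if (saved >= 0) close(saved);
        close(fd);
        return -1;
    }

    malloc_stats();                         /* report goes to stderr */
    fflush(stderr);

    dup2(saved, STDERR_FILENO);             /* restore the real stderr */
    close(saved);

    if (fstat(fd, &st) < 0)
    {
        close(fd);
        return -1;
    }
    close(fd);
    return (long) st.st_size;
}
```

The captured text still has to be parsed by hand, and its layout may differ between allocators and versions, which is the limitation described above.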
Another possibility is the mallinfo() routine. If implemented, it returns a struct filled with data on the internal state of the allocator. Unfortunately, this routine is not that useful either. The fields in struct mallinfo were originally defined with respect to the internals of a rather old allocator. A newer allocator will use different algorithms, and may account for memory internally in vastly different ways. There may be very little correspondence in the meaning of individual fields from one allocator to another.
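For reference, a minimal sketch of calling mallinfo() is shown here. It assumes glibc, where uordblks is the space in use on the heap and hblkhd is the space in mmapped chunks; other allocators may fill these fields differently or not at all. The helper name allocator_usage is hypothetical. Recent glibc versions mark mallinfo() as deprecated, so the compiler may warn.

```c
#include <assert.h>
#include <malloc.h>     /* struct mallinfo, mallinfo(): glibc (assumption) */
#include <stdlib.h>

/* Hypothetical helper: report what the allocator says is in use.
 * heap_bytes: bytes in allocated heap chunks (uordblks)
 * mmap_bytes: bytes in chunks the allocator obtained via mmap (hblkhd)
 * Both fields are plain ints in struct mallinfo, so they can wrap
 * around for very large heaps. */
static void allocator_usage(long *heap_bytes, long *mmap_bytes)
{
    struct mallinfo mi;

    assert(heap_bytes != NULL && mmap_bytes != NULL);
    mi = mallinfo();
    *heap_bytes = (long) mi.uordblks;
    *mmap_bytes = (long) mi.hblkhd;
}
```

Calling it before and after a malloc() should show the totals growing by at least the requested size, but only for memory that went through this particular allocator.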
We are left with using operating-system-specific APIs to determine memory usage. This is, however, an advantage: we can determine the complete memory overhead of an application, including memory not allocated through the C library. For example, when a process loads a shared library, the dynamic linker uses the mmap() system call directly. Similarly, the thread stacks in the pthread implementation are mapped directly rather than malloced.
To access the total memory information about a process on Linux, we use the /proc virtual file system. Within it there is a directory full of information for each active process id (pid). By reading /proc/(pid)/status we can obtain information about memory. Amongst other things, in Linux version 2.6.39, this file includes:
VmPeak: | Peak virtual memory usage |
VmSize: | Current virtual memory usage |
VmLck: | Current mlocked memory |
VmHWM: | Peak resident set size |
VmRSS: | Resident set size |
VmData: | Size of "data" segment |
VmStk: | Size of stack |
VmExe: | Size of "text" segment |
VmLib: | Shared library usage |
VmPTE: | Pagetable entries size |
VmSwap: | Swap space used |
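To illustrate the earlier point about memory that bypasses malloc, here is a small sketch that reads one of these fields as a number and checks how VmSize changes when memory is mapped directly with mmap(). The helper names status_kb and vmsize_growth_after_mmap are hypothetical, and the code assumes a Linux /proc that provides the fields listed above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: return the value (in kB) that follows `key`,
 * e.g. "VmSize:", in the status file at `path`, or -1 if not found. */
static long status_kb(const char *path, const char *key)
{
    char line[256];
    long kb = -1;
    size_t klen;
    FILE *f;

    assert(path != NULL && key != NULL);
    klen = strlen(key);
    f = fopen(path, "r");
    if (!f) return -1;

    while (fgets(line, sizeof line, f))
    {
        if (!strncmp(line, key, klen))
        {
            /* sscanf skips the tab and padding before the number */
            if (sscanf(line + klen, "%ld", &kb) != 1) kb = -1;
            break;
        }
    }
    fclose(f);
    return kb;
}

/* Map `bytes` of anonymous memory without involving malloc, and
 * return how much VmSize grew (in kB), or -1 on error. */
static long vmsize_growth_after_mmap(size_t bytes)
{
    long before, after;
    void *p;

    before = status_kb("/proc/self/status", "VmSize:");
    if (before < 0) return -1;

    p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    after = status_kb("/proc/self/status", "VmSize:");
    munmap(p, bytes);
    return after < 0 ? -1 : after - before;
}
```

Mapping 64 MiB this way should raise VmSize by about 65536 kB even though the C library allocator was never asked for the memory.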
Given the required /proc filename, it is straightforward to parse the kernel data and output the information. The following code obtains the current and peak virtual and resident memory for a process:
static int main_loop(char *pidstatus)
{
char *line;
char *vmsize;
char *vmpeak;
char *vmrss;
char *vmhwm;
size_t len;
FILE *f;
vmsize = NULL;
vmpeak = NULL;
vmrss = NULL;
vmhwm = NULL;
/* Let getline() allocate and grow the buffer as needed */
line = NULL;
len = 0;
f = fopen(pidstatus, "r");
if (!f) return 1;
/* Read memory size data from /proc/pid/status */
while (!vmsize || !vmpeak || !vmrss || !vmhwm)
{
if (getline(&line, &len, f) == -1)
{
/* Some of the information isn't there: clean up and fail */
free(line);
free(vmsize);
free(vmpeak);
free(vmrss);
free(vmhwm);
fclose(f);
return 1;
}
/* Find VmPeak */
if (!strncmp(line, "VmPeak:", 7))
{
vmpeak = strdup(&line[7]);
}
/* Find VmSize */
else if (!strncmp(line, "VmSize:", 7))
{
vmsize = strdup(&line[7]);
}
/* Find VmRSS */
else if (!strncmp(line, "VmRSS:", 6))
{
vmrss = strdup(&line[6]);
}
/* Find VmHWM */
else if (!strncmp(line, "VmHWM:", 6))
{
vmhwm = strdup(&line[6]);
}
}
free(line);
fclose(f);
/* Get rid of " kB\n"*/
len = strlen(vmsize);
vmsize[len - 4] = 0;
len = strlen(vmpeak);
vmpeak[len - 4] = 0;
len = strlen(vmrss);
vmrss[len - 4] = 0;
len = strlen(vmhwm);
vmhwm[len - 4] = 0;
/* Output results to stderr */
fprintf(stderr, "%s\t%s\t%s\t%s\n", vmsize, vmpeak, vmrss, vmhwm);
free(vmpeak);
free(vmsize);
free(vmrss);
free(vmhwm);
/* Success */
return 0;
}
Using such a function, we can now create a simple utility that prints the memory used by a process to stderr. To make it easy to use, the utility itself launches the process to be benchmarked. With the fork() and exec() system calls, this isn't hard to do. The only trick is to catch the SIGCHLD signal so that we notice when the process has finished executing.
Code which does this is:
/* Utility to print running values of VmSize, VmPeak, VmRSS and VmHWM of a program */
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#define PATH_MAX 2048
int child_pid;
static int usage(char *me)
{
fprintf(stderr, "%s: filename args\n", me);
fprintf(stderr, "Run program, and print VmSize, VmPeak, VmRSS and VmHWM (in KiB) to stderr\n");
return 0;
}
static int child(int argc, char **argv)
{
char **newargs = malloc(sizeof(char *) * argc);
int i;
/* Build a NULL-terminated argument vector for the child, dropping our own argv[0] */
for (i = 0; i < argc - 1; i++)
{
newargs[i] = argv[i+1];
}
newargs[argc - 1] = NULL;
/* Launch the child */
/* execvp() only returns if it failed to start the program */
execvp(argv[1], newargs);
perror("execvp");
return 127;
}
static void sig_chld(int dummy)
{
int status, child_val;
int pid;
(void) dummy;
pid = waitpid(-1, &status, WNOHANG);
if (pid < 0)
{
fprintf(stderr, "waitpid failed\n");
return;
}
/* Only worry about direct child */
if (pid != child_pid) return;
/* Get child status value */
if (WIFEXITED(status))
{
child_val = WEXITSTATUS(status);
/* _exit() is async-signal-safe, unlike exit() */
_exit(child_val);
}
}
int main(int argc, char **argv)
{
char buf[PATH_MAX];
struct sigaction act;
if (argc < 2) return usage(argv[0]);
act.sa_handler = sig_chld;
/* We don't want to block any other signals */
sigemptyset(&act.sa_mask);
act.sa_flags = SA_NOCLDSTOP;
if (sigaction(SIGCHLD, &act, NULL) < 0)
{
fprintf(stderr, "sigaction failed\n");
return 1;
}
child_pid = fork();
if (!child_pid) return child(argc, argv);
snprintf(buf, PATH_MAX, "/proc/%d/status", child_pid);
/* Continual scan of proc */
while (!main_loop(buf))
{
/* Wait for 0.1 sec */
usleep(100000);
}
return 1;
}
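As an alternative to the SIGCHLD handler in the listing above, the parent can poll for the child's exit itself. This is only a sketch of that variation: the helper name run_and_sample and its sample callback are hypothetical, standing in for a function that reads /proc/(pid)/status.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program argv[0] with the NULL-terminated
 * argument vector argv, calling sample(pid) every 0.1 s while it runs.
 * Returns the child's exit status, 128 + signal number if it was killed
 * by a signal, or -1 on error. */
static int run_and_sample(char *const argv[], void (*sample)(pid_t))
{
    int status;
    pid_t pid, r;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0) return -1;
    if (pid == 0)
    {
        execvp(argv[0], argv);
        _exit(127);     /* only reached if execvp() failed */
    }

    /* waitpid() with WNOHANG returns 0 while the child is still running */
    while ((r = waitpid(pid, &status, WNOHANG)) == 0)
    {
        if (sample) sample(pid);
        usleep(100000);
    }
    if (r < 0) return -1;
    if (WIFEXITED(status)) return WEXITSTATUS(status);
    if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);
    return -1;
}
```

Polling keeps all of the work out of signal-handler context, at the cost of noticing the child's exit up to one sampling interval late.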
This program is downloadable from the downloads directory. Using it, we can profile Firefox. The following graph shows a short browsing session with many open tabs, some being opened and closed. The last half of the session involved watching a YouTube video. Notice how the total memory used changes over time. The differing user interaction is clearly visible on the graph.