Kenton Lee

X Window System Memory Leaks and Other Memory Bugs

by Kenton Lee

February, 1996

Copyright © 1996 Kenton Lee, All Rights Reserved.

Abstract

This column will be devoted to debugging memory problems in X applications. If this subject interests you, I encourage you to read some of my earlier columns. Many of them, published in The X Advisor and The X Journal, have covered X application bugs and debugging issues. My previous columns are available over the World Wide Web at http://www.rahul.net/kenton/bib.html.

This month's column will focus on one of the more important topics in X application debugging: memory bugs, including memory leaks. As you'll see, this subject is very important in X application programming as well as in many other types of modern programming, especially object-oriented programming and event-driven programming.

Because this is a large subject, I'm covering it in two columns. This month, I'll present a general tutorial (with an X orientation) on understanding, avoiding, and debugging dynamic memory errors. Next month, I'll discuss where the Motif library allocates memory, when the application is responsible for freeing this memory, and how the application should free it. Hopefully, you'll be able to use this material to avoid memory bugs in your work.

Contents

  1. Introduction
  2. Memory and Pointers in the X Libraries
    1. Problems with Simple Pointers
    2. Automatic Memory vs. Dynamic Memory
    3. Problems with Dynamic Memory
  3. Avoiding Dynamic Memory Problems
    1. When Is Memory Allocated and By Who?
    2. How Much Memory is Allocated and How is the Length Indicated?
    3. Who Frees the Memory and How?
    4. Object-Oriented Memory Allocation
  4. Debugging Dynamic Memory Problems
    1. Compile-Time Type Checking
    2. Run-Time Type Checking
    3. Manual Debugging Techniques
    4. Automatic Memory Analysis Tools
  5. Conclusion
  6. References

Introduction

Because of their efficiency and flexibility, C and C++ have become the programming languages of choice for most application developers, including X Window System application developers. Two of the more powerful features of these languages are pointers and dynamically allocated memory. The X Window System and X Toolkit rely heavily on both features. Unfortunately, pointers and dynamic memory are also two of the most common problem areas for X programmers. Even expert programmers usually create more bugs with pointers and memory than with any other C or C++ feature.

In this month's column, we'll cover the most common pointer and memory problems, with an emphasis on those that occur in X Window System application programming. We'll also discuss techniques for avoiding these problems and techniques for identifying and debugging them if they do occur.

Memory and Pointers in the X Libraries

If you're familiar with the Xlib[Scheifler] and X Toolkit[McCormack] application programming interfaces (APIs), you already know that both heavily use pointer data types. The pointers can, for example, be:

  1. pointers to simple data types (usually integers) that are assigned via function calls, e.g., the values returned by XGetGeometry()
  2. pointers to structures and arrays that are created by the application and passed to the X libraries, e.g., the XEvent structure passed to XNextEvent() or the resource arrays passed to XtAppInitialize()
  3. pointers to structures and arrays that are created by the X libraries, e.g., the data returned by XGetImage() or XGetVisualInfo()
  4. pointers to opaque data types that are manipulated internally by the X libraries (and should not be dereferenced directly by the application), e.g., the Display, Screen, GC, and Widget types

While these uses of pointers seem similar, the first two are actually quite different from the last two. In the first two cases, the memory associated with the pointers is typically allocated by the application via automatic declarations. In the last two cases, the X libraries allocate dynamic memory to hold the data structures.

As we'll see later in this paper, the use of dynamic memory adds layers of both flexibility and complexity to your programs. We'll cover all of the above types of pointers, but focus on the more complex issues of dynamic memory.

Problems with Simple Pointers

I'll use the term simple pointer for the first two cases mentioned in the previous section. In these cases, the variables are typically allocated through an automatic declaration and, thus, are automatically deallocated when the variable goes out-of-scope.

In Figure 1, function f1() declares an automatic variable foo and passes a pointer to foo to function f2(). Function f2(), in turn, modifies the value referenced by the pointer. Because the variable is declared automatic, it is automatically allocated when the program enters f1() and automatically deallocated when f1() returns.


/* automatic variable example */

#include <stdio.h>
#include <stdlib.h>

int *f1(void);
void f2(int *);

int main(void)
{
    int *ret = f1();
    printf("*ret = %d\n", *ret); /* undefined: foo no longer exists */
    exit(0);
}

int *f1(void)
{
    int foo = 0;
    f2(&foo);
    printf("foo = %d\n", foo);
    return &foo; /* a bug */
}

void f2(int *bar)
{
    *bar = 4;
    return;
}

Figure 1: Example of pointer to automatic variable


Even in this simple case, there is plenty of room for programmer error. Here are three common errors. You've probably seen these before:

There are more cases where simple pointers can cause problems. Most of these problems also occur with pointers to dynamic memory, which we'll cover in the next few sections.
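One way to repair the bug marked in Figure 1 is to let the caller own the storage, so that no pointer to an automatic variable ever escapes the function that declared it. The following is only a minimal sketch under that assumption; the name f1_fixed() is illustrative and does not appear in the original figure.

```c
/* sketch: the caller owns the storage, so no pointer to a dead
 * automatic variable is ever returned (names are illustrative) */
#include <assert.h>
#include <stddef.h>

/* writes through the pointer; the caller guarantees it is valid */
static void f2(int *bar)
{
    assert(bar != NULL);
    *bar = 4;
}

/* fills in the caller's variable instead of returning &foo */
void f1_fixed(int *result)
{
    int foo = 0;       /* automatic: gone when f1_fixed() returns */

    assert(result != NULL);
    f2(&foo);          /* safe: foo is still in scope here */
    *result = foo;     /* copy the value out, not the address */
}
```

A caller would declare its own int, pass its address to f1_fixed(), and read the value after the call returns.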

Automatic Memory vs. Dynamic Memory

In most cases where automatic variables are useful, you could instead use dynamic memory. For example, Figure 2 allocates a structure using both techniques.


/* automatic variable vs. dynamic memory example */

#include <stdlib.h>

struct FooRec {
    int a;
    char *b;
};

int main(void)
{
    /* both foo1 and foo2 are pointers to FooRec structures */
    struct FooRec foobar, *foo1 = &foobar;
    struct FooRec *foo2 = (struct FooRec *) malloc(sizeof(struct FooRec));

    /* remember to free dynamic memory */
    free(foo2);

    /* more stuff */
    ...
}

Figure 2: Automatic variable vs. dynamic memory


Despite the problems mentioned in the previous section, experienced programmers generally prefer automatic variables whenever possible. Automatic variables are usually simpler to use and more efficient. And, as we'll see in the next section, dynamic memory is subject to a much larger set of problems. I recommend that you only use dynamic memory when automatic variables are not appropriate, e.g., when the lifetime of the data does not match the scope of the function in which the data is created.
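To illustrate the lifetime mismatch mentioned above, here is a small sketch of a case where dynamic memory is the appropriate choice: the result must outlive the function that builds it. The function makeGreeting() is hypothetical and exists only for this example.

```c
/* sketch: data that must outlive its creating function */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* returns a malloc()'ed string "Hello, <name>", or NULL if the
 * allocation fails; the caller owns the result and must free() it */
char *makeGreeting(const char *name)
{
    static const char prefix[] = "Hello, ";
    size_t len;
    char *result;

    assert(name != NULL);
    len = strlen(prefix) + strlen(name) + 1;   /* +1 for the '\0' */
    result = (char *) malloc(len);
    if (result != NULL) {
        strcpy(result, prefix);
        strcat(result, name);
    }
    return result;
}
```

An automatic char array inside makeGreeting() could not be returned safely, for the same reason that &foo could not be returned in Figure 1.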

Problems with Dynamic Memory

As we noted in the previous section, application programmers can create many types of bugs with dynamic memory. Here are the major problem areas:

Unfortunately, dynamic memory problems are among the most difficult C and C++ problems to detect. A memory-related bug in one part of your program could easily cause a crash in a very different part of the program. The code that fails may be correctly using memory that is corrupted elsewhere in the program. Most programmers will start debugging the code that exhibits the failure, but that is not where the problem really exists.

In the next section, we'll look at some simple techniques you can use to (try to) avoid dynamic memory problems. The section after that will look at techniques for debugging dynamic memory bugs.

Avoiding Dynamic Memory Problems

As we discussed above, dynamic memory bugs are easy to create, but difficult to solve. A few good design and development techniques, fortunately, can help avoid many of these bugs. The next four subsections describe my basic recommendations:

  1. basic rules for allocating memory
  2. basic rules for maintaining memory
  3. basic rules for freeing memory
  4. object-oriented design for memory allocation

When Is Memory Allocated and By Who?

Whenever you pass pointers to or receive pointers from X library functions, you should ask yourself:

  1. should the application allocate memory to use as an argument to a function call?
  2. will the library function allocate memory and return it?

Many X functions return pointers to arrays or structures, but some allocate the memory for the results and others expect the application to allocate a buffer into which the function will write the results. If the application does not allocate memory when it should, the function may write to an uninitialized pointer. If the application allocates memory when it should not, it will usually cause a memory leak and may also cause logic errors if it thinks data is in one memory block when in fact it is in another.

For example, XNextEvent() and XGetWindowAttributes() expect the application to allocate the structure that will hold the results. Programmers generally use an automatic declaration for these structures, so they need not be explicitly freed. On the other hand, XCreatePixmap() and XmStringCreate() allocate dynamic memory for their results. The programmer must free the results (with XFreePixmap() and XmStringFree() respectively) to avoid a memory leak.
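The two conventions can be sketched side by side in plain C. The Demo* types and functions below are hypothetical stand-ins, not X library calls; they only mimic the shape of a caller-allocated interface (as with XGetWindowAttributes()) and a library-allocated interface with a matching free function (as with XmStringCreate() and XmStringFree()).

```c
/* sketch: caller-allocated results vs. library-allocated results */
#include <assert.h>
#include <stdlib.h>

typedef struct { int width, height; } DemoAttributes;   /* hypothetical */
typedef struct { int id; } DemoResource;                /* hypothetical */

/* convention 1: the caller supplies the storage, usually an
 * automatic variable, so there is nothing to free afterwards */
int demoGetAttributes(DemoAttributes *attrs_return)
{
    assert(attrs_return != NULL);
    attrs_return->width = 640;
    attrs_return->height = 480;
    return 1;
}

/* convention 2: the library allocates the result; the caller must
 * release it with the matching free function below */
DemoResource *demoCreateResource(int id)
{
    DemoResource *r = (DemoResource *) malloc(sizeof(DemoResource));

    if (r != NULL)
        r->id = id;
    return r;
}

/* the matching free function for demoCreateResource() */
void demoFreeResource(DemoResource *r)
{
    if (r != NULL)
        free(r);
}
```

Pairing every allocating function with a named free function, as Xlib and Motif do, makes it easier to see at the call site which convention applies.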

Also, an X library function may return a pointer to an array or structure without allocating new dynamic memory for it. X Toolkit widgets, for example, often return pointers to their internal memory structures. The application should treat these as read-only; modifying or freeing them will cause memory problems within the widget. On the other hand, widgets sometimes return copies of their data that the application should free. The application programmer must determine which is happening to avoid memory problems.

For example, XtGetValues() on a shell widget's XtNtitle resource returns a pointer to a string in the widget's internal memory. The application must not modify or free that string. XtGetValues() on the Motif label widget's XmNlabelString resource, on the other hand, returns a copy of the string. The application may modify this string and must free it to avoid a memory leak.

How Much Memory is Allocated and How is the Length Indicated?

Dynamic memory is generally allocated for either structures (including C++ classes) or arrays. The size of a structure is well defined by the structure definition, but this is not the case with arrays. A common bug is reading or writing past the end of an array.

Unfortunately, X (and UNIX in general) commonly uses two different techniques to specify the sizes of arrays. Some X functions, such as XtSetValues(), require that the caller supply the length of an array as a separate argument. Other functions, such as XtAppSetFallbackResources(), require that the array be NULL-terminated. A few functions, such as XtAppInitialize(), use one technique with some arguments and the other technique with other arguments.

You must be very careful in either case. If the function requires a count, you must be sure to use the correct units for the count. The units may be number of elements, number of bytes, or another unit.

If the function requires a NULL-terminated array, you should explicitly set the NULL terminator. Do not assume that the compiler will initialize the array elements to NULL. Your current compiler may do so, but others you may use in the future may not.

Of course, using the correct array length is irrelevant if you do not allocate enough memory for the array. I've seen more than one X program break because an X Toolkit Arg array was originally allocated (possibly via a global declaration) to hold a certain number of resources. When the program was later modified to add additional resource specifications, the array was not enlarged, causing the program to write past the end of the array.
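The two length conventions, and one way to keep a count in step with a declared array, can be sketched as follows. The macro ARRAY_COUNT() and both functions are illustrative names for this example; the X Toolkit provides a similar macro, XtNumber(), for counting the elements of a declared array.

```c
/* sketch: counted arrays vs. NULL-terminated arrays */
#include <assert.h>
#include <stddef.h>

/* number of elements in a true array (not a pointer); the count
 * follows automatically when elements are added to the declaration */
#define ARRAY_COUNT(arr) (sizeof(arr) / sizeof((arr)[0]))

/* counted convention: the caller passes the number of elements */
int sumCounted(const int *values, size_t count)
{
    size_t i;
    int sum = 0;

    assert(values != NULL || count == 0);
    for (i = 0; i < count; i++)
        sum += values[i];
    return sum;
}

/* terminated convention: the final element must be an explicit NULL */
size_t countTerminated(const char *const *strings)
{
    size_t n = 0;

    assert(strings != NULL);
    while (strings[n] != NULL)
        n++;
    return n;
}
```

Note that ARRAY_COUNT() counts elements, not bytes, and gives a wrong answer if applied to a pointer rather than to the array declaration itself.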

Who Frees the Memory and How?

Now that you've determined when dynamic memory is allocated by the X libraries, the next questions should be:

  1. does the application have to free the memory?
  2. if yes, when can it safely free the memory?
  3. what function should be used to free the memory?

Unfortunately, the X and, especially, the Motif documentation is not always clear about which case applies to a particular programming interface. In this section, I'll discuss some general guidelines in these areas. Next month's column will cover specific programming interfaces in more specific detail.

Freeing dynamic memory that shouldn't be freed can lead to many problems in the X libraries, including corrupting memory in use by X, freeing memory twice, and writing to invalid memory. Freeing memory with the wrong function can cause it to be improperly freed, including not freeing all the allocated memory.

Here are some rules of thumb to consider when freeing memory:

The last case is usually the most troublesome. For example, Figure 3 shows two memory leaks that Motif programmers commonly make:


/* copy string from text widget to label widget */

void copyTextToLabel(Widget wText, Widget wLabel)
{
    char *textString = XmTextGetString(wText);
    XmString xms = XmStringCreateLocalized(textString);
    XtVaSetValues(wLabel, XmNlabelString, xms, NULL);
}

Figure 3: Common Motif memory leaks


There are two memory leaks in this example. First, textString should be freed with XtFree(). Second, xms should be freed with XmStringFree(). If these functions are frequently called, the amount of memory leaked can be large.

Unfortunately, the Motif documentation is not always clear about when memory is allocated. The man page for XmStringCreateLocalized(), for example, does not specify that the string must be freed.

If your documentation is not clear in a particular case, here's a simple test that can identify many cases where memory is allocated by a library function: call the function twice and see if it returns the same pointer. If it returns different values each time, then the function is probably allocating new copies every time it is called and the application must free the copy.

Unfortunately, getting the same pointer back in this test does not guarantee that the pointer must not be freed. Some values are reference counted and must be unreferenced with the appropriate library function to avoid memory leaks. You should pay careful attention to your documentation to avoid these types of problems.
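The call-it-twice test can be sketched in plain C. Everything below is hypothetical: DemoObject stands in for a widget, demoGetTitle() for a getter that returns internal memory, and demoGetTitleCopy() for a getter that returns a fresh copy.

```c
/* sketch: probing whether a getter returns internal memory or a copy */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char title[32]; } DemoObject;   /* hypothetical */

/* returns a pointer into the object's own storage: do not free */
const char *demoGetTitle(const DemoObject *obj)
{
    assert(obj != NULL);
    return obj->title;
}

/* returns a fresh malloc()'ed copy on every call: caller must free() */
char *demoGetTitleCopy(const DemoObject *obj)
{
    char *copy;

    assert(obj != NULL);
    copy = (char *) malloc(strlen(obj->title) + 1);
    if (copy != NULL)
        strcpy(copy, obj->title);
    return copy;
}

/* the probe: nonzero if two calls returned different pointers,
 * which suggests that each call allocated a new copy */
int returnsFreshCopies(const void *first, const void *second)
{
    return first != second;
}
```

When running such a probe, keep both results alive until after the comparison. If the first result is freed before the second call, the allocator may hand back the same address and hide the allocation.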

Next month's column will cover specific cases where Motif allocates memory that the application must free. Many of those are not documented (but they should be). Hopefully that material will fill in some of the gaps in the standard documentation (and in your understanding).

Object-Oriented Memory Allocation

Now that we've examined when and how to allocate and free memory, let's look at some techniques for organizing these function calls. By carefully using the same memory allocation semantics throughout your application, you can avoid or catch many bugs.

I prefer an object-oriented memory use design. This design allows your application to allocate and free memory in objects that mirror the semantics of the program. While this design is more easily implemented with an object-oriented programming language, such as C++, you can also implement it with C. In fact, most expert C programmers were using these techniques long before C++ was developed.

There are a few major pieces to this design:

The first step is to organize your code into groups of modules or objects.[Lee9-94] Each object should contain functions and data (including dynamic memory) that are regularly used together, such as for the same application-level functionality. The C++ class is ideal for these modules, but you can do much the same thing with C structures or file-based modules.

Because the different pieces of dynamic memory within each object are generally used together, they are often allocated together and freed together. I recommend combining the allocations into a single constructor function for the module. Similarly, a single destructor function should free all the memory for which the object is responsible. By organizing your memory creation and destruction into these functions, you minimize problems caused by accidentally forgetting to allocate or free a particular piece of memory. This object-oriented framework is especially useful late in the development cycle since, like most frameworks, it helps you keep your code organized, even after the code is modified many times.

If you want to be fancy about your C-based constructors and destructors, you could use the structure-of-function-pointers design used by the X Toolkit for widgets and by Xlib for XImages. You could also just use collections of external convenience functions, such as the XmString*() functions used by Motif.
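As a minimal sketch of the convenience-function approach in C, the hypothetical Record module below gathers all of its allocations into one constructor and all of its frees into one destructor. The type and function names are assumptions made for this example.

```c
/* sketch: a C "object" with a single constructor and destructor */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *name;      /* owned by the object */
    int   *values;    /* owned by the object */
    size_t nValues;
} Record;

/* destructor: frees everything the object owns; accepts NULL */
void recordDestroy(Record *rec)
{
    if (rec == NULL)
        return;
    if (rec->name != NULL)
        free(rec->name);
    if (rec->values != NULL)
        free(rec->values);
    free(rec);
}

/* constructor: performs every allocation in one place and
 * returns NULL (leaking nothing) if any of them fails */
Record *recordCreate(const char *name, size_t nValues)
{
    Record *rec;

    assert(name != NULL);
    rec = (Record *) malloc(sizeof(Record));
    if (rec == NULL)
        return NULL;
    rec->nValues = nValues;
    rec->name = (char *) malloc(strlen(name) + 1);
    /* calloc() zero-fills; ask for at least one element */
    rec->values = (int *) calloc(nValues ? nValues : 1, sizeof(int));
    if (rec->name == NULL || rec->values == NULL) {
        recordDestroy(rec);     /* frees whichever part succeeded */
        return NULL;
    }
    strcpy(rec->name, name);
    return rec;
}
```

Because the destructor is also used on the constructor's failure path, there is only one place in the module that needs to know which pieces of memory the object owns.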

Note that in some cases, memory cannot be easily initialized in the constructor. Perhaps you need some user input before you know how much memory to allocate or you may not want to allocate memory that is sometimes not used by the module. In these cases, I still declare the pointers with the others used by the object, but just initialize them to NULL (or some other constant). When I later allocate memory for the pointer, I first check that it is NULL to avoid allocating it twice. For example:


/* allocate memory if pointer still NULL */

if (pFoo == NULL) {
    pFoo = (cast) malloc(...);
} else {
    /* use the memory, perhaps realloc()'ing first */
}

Figure 4: Safely allocating memory when needed


For memory allocated on-the-fly like this, to avoid freeing NULL pointers (a bug in some implementations of free()), you should check to see if it is NULL before freeing it in your destructor. You should then set the freed pointer to NULL to avoid using it until it is allocated again.
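The check-then-free-then-reset sequence described above is easy to get wrong when repeated by hand, so it may be worth wrapping. The macro FREE_AND_NULL() and the function ensureBuffer() below are hypothetical helpers written for this sketch, not part of any standard library.

```c
/* sketch: allocate on first use, free safely, reset to NULL */
#include <assert.h>
#include <stdlib.h>

/* frees p only if it is not NULL, then resets p to NULL so that
 * a later use or a second free is easy to detect */
#define FREE_AND_NULL(p)        \
    do {                        \
        if ((p) != NULL)        \
            free(p);            \
        (p) = NULL;             \
    } while (0)

/* allocates *pBuf only if it is still NULL, as in Figure 4;
 * returns the buffer, or NULL if the allocation failed */
char *ensureBuffer(char **pBuf, size_t size)
{
    assert(pBuf != NULL && size > 0);
    if (*pBuf == NULL)
        *pBuf = (char *) malloc(size);
    return *pBuf;
}
```

The macro evaluates its argument more than once, so it should only be given a plain pointer variable or structure member, never an expression with side effects.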

A final basic object technique is the accessor. As we mentioned above, dangling pointers can be caused by creating two pointers to the same piece of memory, then freeing one. This is usually a problem when one object's memory is used by other objects. You can avoid problems like this if you create accessor functions that return pointers to the object's internal memory. External objects can then use the accessors instead of pointing directly to the memory. When your object's pointers change or are freed, you can simply update the accessors, rather than trying to track down all objects that have made copies of the pointers.

Note that an additional benefit of accessor functions is that you can change the implementation of the accessor (as long as you preserve its API) without affecting external functions that use the accessor. This sort of information hiding or encapsulation is one of the major benefits of object-oriented design.
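Here is a small sketch of the accessor idea, using a hypothetical TextBuffer module. Callers never store the internal pointer; they ask the accessor each time, so the module is free to reallocate or release its storage.

```c
/* sketch: an accessor hides a pointer that may change or be freed */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *text;   /* private: may be replaced at any time */
} TextBuffer;

void textBufferInit(TextBuffer *tb)
{
    assert(tb != NULL);
    tb->text = NULL;
}

/* replaces the contents with a copy of s; returns 1 on success
 * or 0 (leaving the old contents intact) if allocation fails */
int textBufferSet(TextBuffer *tb, const char *s)
{
    char *copy;

    assert(tb != NULL && s != NULL);
    copy = (char *) malloc(strlen(s) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, s);
    if (tb->text != NULL)
        free(tb->text);
    tb->text = copy;
    return 1;
}

/* accessor: always reflects the current storage, and returns ""
 * rather than NULL when the buffer holds nothing */
const char *textBufferGet(const TextBuffer *tb)
{
    assert(tb != NULL);
    return tb->text != NULL ? tb->text : "";
}

/* releases the storage; the accessor remains safe to call */
void textBufferClear(TextBuffer *tb)
{
    assert(tb != NULL);
    if (tb->text != NULL)
        free(tb->text);
    tb->text = NULL;
}
```

The protection only holds if callers use the returned pointer immediately and do not keep it across a later call to textBufferSet() or textBufferClear().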

Debugging Dynamic Memory Problems

So far, we've covered the major dynamic memory problems and important design techniques to help avoid them. Unfortunately, some memory problems always seem to sneak through into our code. In this section, we'll cover techniques to debug these problems:

  1. compile-time type checking
  2. run-time type checking
  3. manual debugging techniques
  4. advanced memory analysis tools

The first two techniques are useful for detecting bugs early in the development process. They detect bugs automatically and are sometimes called self-debugging techniques. These techniques cannot find many common bugs, however, so the last two techniques are sometimes also needed. These are more time consuming and require greater expertise on the part of the programmer, but they are often necessary and should yield good results.

Compile-Time Type Checking

As with most other bug areas, the best debugging techniques are those that catch bugs at compile-time rather than at run-time.[Lee5-95] Your compiler touches all of your code, so it can find errors that may only rarely occur at run-time. I recommend that you, at least occasionally, set your compiler's warning output level to the most verbose setting and then track down and fix all the problems that it reports. Even if a report is not a critical problem, it may be worth fixing for portability reasons or to make real problems easier to find in the output.

Compile-time error messages that are especially important to pointer problems are those generated by function prototypes. As we mentioned above, using incorrect pointer types in X functions is a common and serious application programming problem. Fortunately, the standard X header files include function prototypes and most modern C and C++ compilers can use them (though sometimes not by default). I recommend that you enable this compiler feature all the time and immediately correct any problems it detects. Some problems almost always lead to program bugs, especially when the pointers are to data types of different sizes. Sometimes these problems aren't immediately apparent, since the data types are the same size on a particular machine, but they show up when you try to port to machines with other data type sizes.[Kilgard]

Run-Time Type Checking

If you can't catch a bug at compile-time, the next best thing is to automatically halt your program with a core dump when the bug occurs. While you never want your end users to experience core dumps, they identify the program state at the time of the crash and can help you easily identify and debug many types of bugs.

The assert() macro, available with most C and C++ implementations, is a simple way to force your program to exit with a core dump when unexpected results occur. In this section, we'll look at some easy, but powerful, ways to use assert() to find pointer and memory problems.

Earlier, I mentioned that a good programming technique is to initialize your pointers to NULL and to reset them to NULL whenever they are freed. If you do this, you can easily check that pointers are initialized before using them, as in Figure 5.


/* assert() example */

void setLabel(Widget w, char *string)
{
    XmString xms;

    /* check for initialized arguments */
    assert(w && XtIsSubclass(w, xmLabelWidgetClass));
    assert(string && *string);

    /* use the string as a Motif label */
    xms = XmStringCreateLocalized(string);
    XtVaSetValues(w, XmNlabelString, xms, NULL);
    XmStringFree(xms);
}

Figure 5: Using assert() to catch uninitialized pointers


In this example, a faulty algorithm could cause the widget or the string to be uninitialized. Our assert() statement catches these errors easily. Note that we test for two string error conditions:

  1. a NULL string pointer, indicating an uninitialized variable.
  2. a string whose first character is the terminating '\0', indicating an empty string (which we consider to be an error in this function).

A problem related to uninitialized pointers is valid pointers that are inappropriate in certain contexts. For example, all X Toolkit widget pointers have type Widget, even though many functions only work properly with certain types of widgets. You can check widget pointers with the X Toolkit's type checking functions. For example, if you want to make sure a particular widget is a TopLevelShell, you can use XtIsTopLevelShell(widget). Notice that in Figure 5 we used the XtIsSubclass() function to verify that the widget is a Motif XmLabel widget or a subclass of it. Since these functions return Boolean values, they work fine with assert().

Not all X structures support type checking macros. Instead, some use structure elements to identify the type of the structure. The most common example of this is the XEvent structure. You may have code that only works with one type of event. You can verify that your algorithm found this type of event with a simple assertion, as in Figure 6.


/* another assert() example */

int getButton(XEvent *xev)
{
    assert(xev);
    assert((xev->type == ButtonPress) || (xev->type == ButtonRelease));
    return xev->xbutton.button;
}

Figure 6: Using assert() to verify structure types


The type checking is especially useful in cases like the above, since the XEvent structure could hold a different type of event. If that were true, the fields of the structure could hold data unrelated to our function's semantics, causing unpredictable program behavior.

Manual Debugging Techniques

Unfortunately, the "self-debugging" techniques described in the previous two subsections won't find all your bugs. You'll occasionally have to do a more detailed analysis to find your memory leaks and other pointer problems. In fact, I'd recommend that you use the advanced techniques described in this or the next section at least once on your program before you ship it. Even the best programmers will often cause memory problems that are not easily detected during routine testing.

This section discusses manual debugging techniques. These techniques replace the system memory management functions (malloc(), free(), realloc(), etc.) with custom versions that are less efficient but more easily debugged. You will generally want to disable these functions before shipping your code. We'll give only some simple examples here, then refer you to other, more sophisticated, freely available, versions.

In the previous section, I discussed the simple NULL-initialization technique to catch uninitialized pointers. Unfortunately, that technique won't catch improperly initialized pointers. For example, you might accidentally allocate an array that is too short or accidentally use a pointer to the wrong part of a memory block. You can often catch these types of errors by performing some additional initialization on the block of memory allocated by malloc():

If your program exhibits behavior that indicates a memory problem, you can run it under a debugger to examine the memory blocks before they are used. If the memory is not initialized as you expected, you know that the memory is being improperly used. Some possible error conditions are:

Similarly, we could write a fancier version of free() to:

You can probably think of ways to improve the above functions and to write debugging versions of other memory-oriented functions (calloc(), realloc(), strcmp(), strcpy(), XtSetValues(), XtGetValues(), etc.) and the exit() function. Some useful features would be:

This library of functions would need to be fairly sophisticated (and well tested) to catch most of your dynamic memory bugs. Fortunately, several very nice libraries are freely available. These libraries use techniques such as those described above:

Remember that these libraries are significantly slower than the normal system libraries. Most programmers use the normal system libraries during most of their debugging and, of course, when they ship the product.
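To give a flavor of what such debugging replacements do, here is a minimal sketch of a malloc()/free() pair that fills new memory with a recognizable pattern, places guard bytes after each block, and checks them on free. The names, fill values, and layout are all assumptions chosen for this illustration; they are not taken from any particular library, and a real debugging allocator does considerably more.

```c
/* sketch: debugging wrappers around malloc() and free() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FILL_NEW   0xA5   /* marks allocated but unwritten bytes */
#define FILL_FREED 0xDD   /* marks the bytes of a freed block */
#define GUARD_BYTE 0xFD   /* written just past the user's block */
#define GUARD_LEN  4

/* header stored before each block; the union keeps the user's
 * data suitably aligned on common platforms */
typedef union {
    size_t size;
    long double alignLongDouble;
    void *alignPointer;
} DebugHeader;

/* like malloc(), but fills the block and appends guard bytes */
void *debugMalloc(size_t size)
{
    DebugHeader *hdr;
    unsigned char *user;

    hdr = (DebugHeader *) malloc(sizeof(DebugHeader) + size + GUARD_LEN);
    if (hdr == NULL)
        return NULL;
    hdr->size = size;
    user = (unsigned char *) (hdr + 1);
    memset(user, FILL_NEW, size);
    memset(user + size, GUARD_BYTE, GUARD_LEN);
    return user;
}

/* returns 1 if the guard bytes after the block are intact,
 * 0 if something has written past the end of the block */
int debugCheck(const void *ptr)
{
    const DebugHeader *hdr;
    const unsigned char *guard;
    size_t i;

    assert(ptr != NULL);
    hdr = (const DebugHeader *) ptr - 1;
    guard = (const unsigned char *) ptr + hdr->size;
    for (i = 0; i < GUARD_LEN; i++)
        if (guard[i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* like free(), but halts on an overrun and scribbles over the
 * block so that stale pointers show obviously bad data */
void debugFree(void *ptr)
{
    DebugHeader *hdr;

    if (ptr == NULL)
        return;
    assert(debugCheck(ptr));
    hdr = (DebugHeader *) ptr - 1;
    memset(ptr, FILL_FREED, hdr->size);
    free(hdr);
}
```

Blocks obtained from debugMalloc() must only be released with debugFree(), since the pointer handed to the caller is offset from the one returned by malloc().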

Automatic Memory Analysis Tools

The tools described in the previous section are freely available and are very powerful, but they are complex and can be time-consuming to learn and to use. Because of this, programmers may not use them as often as they should.

If you don't think you'll have the time to use them properly, you should consider some of the commercial alternatives. These tools are not cheap, but they are even more powerful than the free tools. They are also easy to install and come with easy-to-use graphical user interfaces, detailed documentation, and technical support.

Purify was the first tool on the market and is probably still the most popular. Others have matured quickly, however, and there is not much difference among them in their ability to detect your memory bugs.[Armstrong] You should choose the one that works the best in your development environment. Note that some tools are only available on some hardware platforms.

The major tools are listed below (in alphabetical order), with World Wide Web links for more information.

(The names of these products are trademarks of their manufacturers.)


Conclusion

In summary, pointer and dynamic memory problems are some of the most common and most serious bugs in C and C++ programs, including X application programs. We've reviewed the most common problems here. Fortunately, you can avoid many of the bugs with a few simple design and implementation techniques. I've described the techniques that I use in my application development work.

Unfortunately, these techniques won't save you from all dynamic memory bugs. I recommend that you, at least occasionally, use memory debugging tools to find and help fix these. I've described several tools that work well, some free and some commercial.

Next month, we'll cover specific cases where programmers must be careful with dynamic memory allocated by Motif. The standard documentation is often incomplete, but I've been studying the Motif source code and will hopefully be able to fill in many of these gaps.


References

[Armstrong]
James C. Armstrong, Jr., "Leak Detector Shoot-Out," Advanced Systems, October, 1994.
[Kilgard]
Mark Kilgard, "Is Your X Code Ready for 64-bit?," The X Journal, July, 1995.
[Lee9-94]
Kenton Lee, "X Software Modularity," The X Journal, September, 1994.
[Lee5-95]
Kenton Lee, "Debugging Widget Resource Syntax Errors," The X Journal, May, 1995.
[Maguire]
Steve Maguire, Writing Solid Code, Microsoft Press, 1993.
[McCormack]
Joel McCormack, et al., X Toolkit Intrinsics - C Language Reference, included with the X Consortium's X11R6 release.
[Scheifler]
Robert Scheifler and James Gettys, X Window System (third edition), Digital Press, 1992.


Ken Lee is an independent software consultant specializing in X Window System application software. He has been developing UNIX graphical user interface software since 1981. Ken may be reached by Internet electronic mail at kenton @ rahul.net or on the World Wide Web at http://www.rahul.net/kenton/.

Ken has published over two dozen technical papers on X software development. Most are available over the World Wide Web at http://www.rahul.net/kenton/bib.html.