Kenton Lee

Debugging X Window System Protocol Errors

by Kenton Lee

Published in The X Journal, March, 1993.
Copyright © 1993 Kenton Lee. All rights reserved.


Abstract

X protocol errors are common symptoms of X Window System application programming bugs. This article discusses simple, but usually sufficient, techniques for understanding X protocol errors and correcting the associated programming problems.

Contents

  1. Introduction
  2. The X Protocol Error Message
  3. Synchronization
  4. Stack Trace
  5. Combining Stack Trace And Error Message Data
  6. Other Error Types
  7. Other Tools
  8. Subroutine Libraries
  9. Conclusion
  10. References
  11. The Author

Introduction

The most common symptoms of Xlib programming problems are probably X protocol errors. These errors are less common with higher level X toolkits, but unfortunately they still occur there.

For example, Xlib's default X protocol error handler might report something like:

    X Error of failed request: BadValue (integer parameter out of range)
      Major opcode of failed request: 63 (X_CopyPlane)
      Value in failed request: 0x3
      Serial number of failed request: 37
      Current serial number in output stream: 38

This article discusses simple, but usually sufficient, techniques for understanding these error messages and correcting the associated programming problems. We will discuss the data presented in the error message, techniques for identifying the particular line of code in your program that is associated with the error, and how to use the X documentation to find the specific problem with that line of code.

The examples given in this article are based on UNIX implementations of X11R5. Similar data and tools should be available in other implementations.

Surprisingly, none of the popular tutorial books on X application programming provide any detailed material on debugging techniques. Hopefully, this article and others in this issue of The X Journal will fill some of this void. A companion article provides related information on common X programming errors, many of which generate X protocol errors.[Lee91]

The X Protocol Error Message

The default X protocol error message reports five pieces of information:
  1. error type
  2. major opcode
  3. invalid value/resource
  4. serial number of the failed request
  5. serial number of the current request
Only the first three are generally useful to application programmers.

The first line of the error message indicates the error type. In the above example, this is BadValue. BadValue means that one of your program's X requests accepts a range of integer values, but the value used was not within that range. Other error types are listed on pages 306-308 of [Scheifler] and some are discussed later in this article. By itself, the error type is rarely sufficient to debug your problem.

The second line of the error message gives the major opcode of the X request that caused the error. In this case it is opcode number 63. The descriptive text says this is the CopyPlane request. Some implementations do not give you this descriptive text, but you may look up the request name, on UNIX systems, in the file /usr/include/X11/Xproto.h, e.g.:

    % grep 63 /usr/include/X11/Xproto.h
    #define X_CopyPlane 63
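When an implementation omits the descriptive text, the same lookup can be done in code. The following is a minimal sketch in C and is not part of Xlib: the table holds only a handful of core opcodes copied from Xproto.h, and the names core_requests and core_request_name are made up for this illustration.

```c
#include <stddef.h>

/* One row of a hand-made opcode table (hypothetical, not an Xlib type). */
struct core_request {
    int opcode;
    const char *name;
};

/* A small subset of the core request opcodes defined in Xproto.h. */
static const struct core_request core_requests[] = {
    {  1, "X_CreateWindow" },
    {  4, "X_DestroyWindow" },
    {  8, "X_MapWindow" },
    { 53, "X_CreatePixmap" },
    { 55, "X_CreateGC" },
    { 62, "X_CopyArea" },
    { 63, "X_CopyPlane" },
    { 72, "X_PutImage" },
};

/* Return the request name for a major opcode, or NULL if it is not in the table. */
const char *core_request_name(int opcode)
{
    size_t i;

    for (i = 0; i < sizeof core_requests / sizeof core_requests[0]; i++) {
        if (core_requests[i].opcode == opcode)
            return core_requests[i].name;
    }
    return NULL;
}
```

Calling core_request_name(63) returns "X_CopyPlane". Major opcodes of 128 and above belong to protocol extensions, so a table of core requests like this one returns NULL for them.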

The error type and X request name are sometimes sufficient to debug the problem. The third line of the error message is often also useful. This line gives the invalid value (for BadValue errors) or the invalid resource identifier (for most other errors). In the above example, the invalid value is 0x3, a hexadecimal integer.

The last two lines of the error message give the serial number of the request that caused the error and the serial number of the current request. These are rarely useful to application programmers and will not be discussed in this article.

In some cases, the error message will also give a minor opcode. This is only useful for errors reported by X protocol extensions. The documentation for your protocol extension should provide information on interpreting the minor opcode, so we will not discuss these here.

Synchronization

With experience, you will often be able to use the above information to directly determine the particular line of code in your program that causes the X protocol error. For completeness, however, we will also discuss how to obtain more information by enabling synchronization and generating stack traces.

X protocol requests are normally buffered and processed asynchronously, so traditional techniques for identifying code problems often will not work. Print statements or debugger breakpoints will identify the point where the low-level X library (Xlib) flushes the problem request, not the desired point where the request is placed in the Xlib request buffer. To work around this, Xlib provides a synchronous mode in which each request is sent immediately to the X server and all errors are reported before new requests are sent. You usually want to use the synchronous mode when debugging X programs, but you should not use it all the time, as it can significantly slow your program.
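The effect of buffering can be illustrated with a toy model. This sketch does not use Xlib at all: struct toy_conn, toy_request() and toy_flush() are invented stand-ins for a buffered connection, source lines are passed in as plain integers, and an argument of zero or less plays the role of an invalid request.

```c
#define TOY_QUEUE_MAX 8

/* Toy stand-in for a buffered connection; not an Xlib type. */
struct toy_conn {
    int synchronous;               /* nonzero: process each request at once */
    int arg[TOY_QUEUE_MAX];        /* queued request arguments */
    int queued_at[TOY_QUEUE_MAX];  /* source line that queued each request */
    int count;                     /* number of requests currently queued */
    int bad_line;                  /* line that queued the failing request, 0 if none */
    int seen_line;                 /* line executing when the error surfaced, 0 if none */
};

/* Process every queued request; an argument <= 0 counts as an error. */
void toy_flush(struct toy_conn *c, int current_line)
{
    int i;

    for (i = 0; i < c->count; i++) {
        if (c->arg[i] <= 0 && c->bad_line == 0) {
            c->bad_line = c->queued_at[i];
            c->seen_line = current_line;
        }
    }
    c->count = 0;
}

/* Queue one request; flush when synchronous or when the buffer fills. */
void toy_request(struct toy_conn *c, int line, int arg)
{
    c->arg[c->count] = arg;
    c->queued_at[c->count] = line;
    c->count++;
    if (c->synchronous || c->count == TOY_QUEUE_MAX)
        toy_flush(c, line);
}
```

With synchronous left at zero, a bad request queued at line 10 and flushed at line 30 records bad_line 10 but seen_line 30, which is why a breakpoint lands in the wrong place. With synchronous set to 1, both fields are 10.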

The Xlib and X Toolkit (Xt) libraries provide different methods for enabling synchronization. The most useful methods are run-time methods, as they do not require modifying and recompiling your program.

On UNIX (and many other) systems, Xlib provides a global variable, _Xdebug, that, when set to a non-zero value before XOpenDisplay() is executed, enables synchronization. This variable is most useful when set via a source level debugger, such as the UNIX dbx debugger. Using dbx, simply stop in your main procedure before XOpenDisplay() is called and enter:

(dbx) assign _Xdebug = 1

The X Toolkit provides two simpler techniques for enabling run-time synchronization:
  1. the -synchronous command line option
  2. the synchronous application resource, set to true

Note that the _Xdebug method will not work with the X Toolkit because of the way the X Toolkit's techniques are implemented.

In rare cases, you may want to hard code the synchronization mode. This should be used only as a last resort because the above methods are simpler and less likely to be forgotten. If you must hard code the synchronization mode, use the Xlib XSynchronize() function.

Stack Trace

Using the synchronization mode, identifying the particular line of code causing the protocol error is straightforward. Liberal use of print statements in your program could help. An often simpler method is to generate a stack trace using a source level debugger. UNIX implementations of Xlib usually call the undocumented _XError() or _XDefaultError() functions when a protocol error has occurred, so you could set a breakpoint there. For example, again using the UNIX dbx debugger:

    (dbx) stop in _XError
    [2] stop in _XError
    (dbx) run -synchronous
    [2] stopped at [_XError:2058 ,0x4ea5e4]
    (dbx) where
    >  0 _XError ...
       1 _XReply ...
       2 XSync.XSync ...
       3 XSyncFunction ...
       4 XCopyPlane.XCopyPlane ...
       5 .block1 ["demo_program.c":59, 0x400610]
       etc.
    (dbx) cont
    X Error of failed request: BadValue (integer parameter out of range)
      Major opcode of failed request: 63 (X_CopyPlane)
      Value in failed request: 0x3
      Serial number of failed request: 37
      Current serial number in output stream: 38

The stack trace tells us that line 59 of file demo_program.c contains the offending statement and that the offending statement is a call to XCopyPlane(). Of course, at this point, you can also use other features of your debugger to get more information about the state of the program, such as printing the values of various variables used in the procedure.

Note that, unfortunately, while the above works with most standard implementations, the function _XError() is not officially supported. If this causes problems for you, a more portable alternative is to use the Xlib XSetErrorHandler() function to formally redefine the X protocol error handler and then set a breakpoint or force a core dump in your new error handler.

Combining Stack Trace And Error Message Data

Together, the protocol error message and the stack trace provide enough information to identify almost all programming problems that cause protocol errors.

Let's go back to our example error message:

    X Error of failed request: BadValue (integer parameter out of range)
      Major opcode of failed request: 63 (X_CopyPlane)
      Value in failed request: 0x3
      Serial number of failed request: 37
      Current serial number in output stream: 38

The major opcode of 63 identifies a CopyPlane protocol request. Many X protocol requests can be generated by several different Xlib functions. For example, the CreateWindow protocol request can be generated by either XCreateWindow() or XCreateSimpleWindow(). You should use the tables in Appendix A of [Scheifler] to map request types to Xlib functions. (These tables are also available on-line.) In our example, the CopyPlane request is generated only by the XCopyPlane() function, which corroborates the information in our stack trace.

Next, we look up the XCopyPlane() function in [Scheifler] to see what error conditions can cause a BadValue error. Page 195 of [Scheifler] says "If plane does not have exactly one bit set to 1 ... a BadValue error results." Since the invalid value of 0x3, when considered as a bit mask, has two bits set to 1, this is the cause of our protocol error.
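The "exactly one bit set" rule is easy to check in code before the request is made. The following is a small sketch; the function name is_single_plane is hypothetical and is not an Xlib call.

```c
/*
 * Return nonzero if exactly one bit of plane is set.
 * Subtracting 1 clears the lowest set bit and sets all bits below it,
 * so plane & (plane - 1) is zero only when no other bit was set.
 */
int is_single_plane(unsigned long plane)
{
    return plane != 0 && (plane & (plane - 1)) == 0;
}
```

For the value in our error message, is_single_plane(0x3) returns 0, while 0x1 and 0x2 each pass. A check like this could guard the plane argument in a debugging build.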

In this example, the problem may have been a simple typographical error in the program. Or, the invalid value could have been generated by a faulty algorithm that computed 0x3. In either case, identifying the exact location of the invalid value in your program is a major first step in fixing the problem.

Other Error Types

All X protocol error types may be diagnosed using the above techniques. BadValue errors, such as the above, are caused when a range of values is valid, but the supplied value is not among them. In our case, the supplied value had the wrong number of bits in a bit mask set. More commonly, a supplied value might be negative or zero, when only positive values are valid.
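As an illustration of the zero-or-negative case: the core protocol carries window and pixmap widths and heights as 16-bit unsigned values and rejects zero with a BadValue error. A range check before the call might look like the following sketch, where size_is_valid and MAX_CARD16 are names invented for this example.

```c
/* Largest value that fits in a 16-bit unsigned protocol field. */
#define MAX_CARD16 65535L

/* Return nonzero if both dimensions are in the range 1..65535. */
int size_is_valid(long width, long height)
{
    return width >= 1 && width <= MAX_CARD16 &&
           height >= 1 && height <= MAX_CARD16;
}
```

Taking the arguments as signed long values lets the check catch a negative result from a faulty size computation before it is converted to an unsigned parameter.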

BadWindow, BadDrawable, and other bad identifier errors occur when the supplied identifier is invalid or has a type other than that supported for the request. For example, an uninitialized variable might cause an identifier of 0x0 to be incorrectly supplied. This is easy to catch, as the protocol error message will show the value 0x0, which is invalid for identifiers. More complex is, for example, supplying a pixmap identifier when a window identifier is required. Or, a previously valid window identifier may refer to a window that has since been destroyed. In the latter two cases, a BadWindow error results, but the invalid identifier looks reasonable. Only tracing the error using the techniques provided earlier in this article will find the problem.
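A first look at the identifier in the error message can be sketched as follows. The enum and the function triage_id are hypothetical helpers, not Xlib calls. The check relies on the protocol rule that the top three bits of a resource identifier are always zero. It can only flag obviously bad values and cannot tell a destroyed window or a pixmap from a live window.

```c
/* Rough classification of an identifier reported in an error message. */
enum id_triage {
    ID_NONE,       /* 0x0: often an uninitialized variable */
    ID_MALFORMED,  /* top three bits set: not a resource identifier at all */
    ID_PLAUSIBLE   /* well formed: may still be the wrong type or destroyed */
};

enum id_triage triage_id(unsigned long id)
{
    if (id == 0)
        return ID_NONE;
    if (id & ~0x1FFFFFFFUL)   /* anything above the low 29 bits */
        return ID_MALFORMED;
    return ID_PLAUSIBLE;
}
```

An ID_PLAUSIBLE result is the case described above where the identifier looks reasonable, so the stack trace is still needed to find the faulty call.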

BadMatch errors occur when only specific values are acceptable, but another value is provided. The valid values may be a small set of enumerated integers or they may be a relation between other arguments, e.g., a graphics context in a drawing request must have the same depth as the drawing window. There is rarely more than one possible BadMatch error for any particular request type, so identifying the problem is usually straightforward. In my experience, most BadMatch errors are related to drawable depths. Make sure your windows, pixmaps, visual types, colormaps, etc. have the correct depths in your X requests.

The above are the most common error types. Others are usually specific enough that they are easily identified once the techniques described in this article are used. [Scheifler] is probably the only book on X programming that details all the error conditions. A companion article[Lee91] describes the most common X programming problems, many of which will cause X protocol errors.

Other Tools

The examples in this article used the commonly available UNIX dbx debugger. Other systems will have similar source level debuggers. These are adequate for most X protocol errors. At times, however, more sophisticated tools such as the xscope protocol analyzer[Lee95] or commercial source code interpreters may also be helpful. You must decide if the cost and complexity of these tools are worthwhile for your project.

Subroutine Libraries

Of course, source code techniques work best if you have access to the source code containing the call to the Xlib function causing the error. If the Xlib function is in a subroutine library of which you only have binaries or libraries, you can still use many of the above techniques, but you may have to be a little creative with the interpretation of the results. The problem may be in the subroutine library or it may be in the way your program provides data to the library. Hopefully, your experience will point you in the right direction.

Conclusion

This article describes simple, but powerful, techniques for debugging X protocol errors. Almost all X protocol errors can be successfully diagnosed by combining the data from the protocol error message and the data from a synchronized stack trace. Often the problem is solved immediately. Other times, this is a strong first step.

References

[Lee91]
Kenton Lee, "Behind Curtain X," UNIX Review, vol. 9, no. 6, June, 1991. An updated version of this paper is "The 40 Most Common X Programming Errors," The X Journal, vol. 2, no. 4, March, 1993.
[Lee95]
Kenton Lee, "X Input Debugging," The X Journal, January, 1995.
[Scheifler]
Robert W. Scheifler and James Gettys, X Window System (3rd Edition), Digital Press, 1992.


The Author

Kenton Lee is an independent software consultant specializing in X Window System and OSF/Motif software development. He has been developing UNIX graphical user interface software since 1981.

Ken has published over two dozen technical papers on the X Window System. Most are available over the World Wide Web at http://www.rahul.net/kenton/bib.html.

Ken may be reached by Internet electronic mail to kenton @ rahul.net or the World Wide Web at http://www.rahul.net/kenton/.

