Bibliography for “Practical Testing of a C99 Compiler Using Output Comparison”

This is an updated version of the bibliography from my article, “Practical Testing of a C99 Compiler Using Output Comparison,” published in Software: Practice and Experience; a pre-print is also available. The journal version of the bibliography is in reference order, and contains items discussed in the article but not directly relevant to compiler testing. The online version is sorted alphabetically, with items on topics other than compiler testing separated; it will be updated if more articles on compiler testing appear. The emphasis, as the title of the article indicates, is on practical testing of C/C++ compilers.

The literature on compiler testing is surprisingly scant. There is substantial literature on the theoretical design of compilers which would provably not need testing, but the audience for such work is largely disjoint from that for the testing of compilers for widely-used languages which will have a substantial user base. There are also a number of articles on the automated generation of test code, but given that there is now a substantial base of real Open Source software, this is less useful than formerly.

This article and PalmSource’s testing were firmly directed towards shipping a high-quality, but imperfect, compiler which would be of practical use to the developer community. Producing an inherently bug-free compiler for a theoretically desirable language was not an option. The goal was to catch as high a proportion of serious bugs as possible in a useful compiler for two widely-used languages, C99 and C++98.

The best available bibliography was over a decade old, by Dr. C.J. Burgess of the University of Bristol; it was a posting to the comp.compilers Usenet newsgroup, listed below as [Burgess]. (See now also the bibliography in [Chen, Patra, et al.].) Bailey & Davidson [Bailey & Davidson] is an academic article on the testing of function calls, somewhat similar to Lindig’s Quest [Lindig]; it contains the interesting observations that “the state-of-the-art in compiler testing is inadequate” (p. 1040), and that in their experience, the ratio of failed tests to bugs was approximately one thousand to one (p. 1041). The standard work on compiler theory is Compilers: Principles, Techniques and Tools [Aho et al.], commonly known as the Dragon book. It is a good general introduction, but had little direct relevance to our testing, except for some extra caution in including examples of spaghetti code; other standard compiler texts which were consulted, but did not have significant sections on testing, are omitted from the bibliography. A Retargetable C Compiler: Design and Implementation [Fraser & Hanson] contains a brief section on the authors’ experience with testing their compiler, with some practical advice on the importance of regression test cases; difficulties in using lcc’s regression tests for other compilers are discussed in the article, in the section on emulated-execution output correctness testing. An updated and alphabetized version of this bibliography will be made available online.

Compiler Testing

  1. Bailey, Mark W. and Davidson, Jack W., “Automatic Detection and Diagnosis of Faults in Generated Code for Procedure Calls”, IEEE Transactions on Software Engineering, volume 29, issue 11, 2003. An abstract is available online, as is an earlier version of the full paper.
  2. Bhattacharya, Soumyabrata, “ANSI C Test suites,” comp.compilers, 1994.
  3. Burgess, C.J., “Bibliography for Automatic Test Data Generation for Compilers,” comp.compilers, 1993.
  4. Junjie Chen, Yanwei Bai, Dan Hao, Yingfei Xiong, Hongyu Zhang, Bing Xie. “Learning to Prioritize Test Programs for Compiler Testing,” ICSE'17: 39th International Conference on Software Engineering, Buenos Aires, Argentina, May 2017.
  5. Junjie Chen, Jibesh Patra, Michael Pradel, Yingfei Xiong, Hongyu Zhang, Dan Hao, and Lu Zhang. 2019. “A Survey of Compiler Testing.” ACM Computing Surveys, to appear, 2020. (But note that the preprint refers to itself as “ACM Comput. Surv. 1, 1, Article 1 (January 2019),” with an invalid DOI.)
  6. Schloss Dagstuhl, Testing and Verification of Compilers
  7. DejaGnu, 1993-.
  8. Delta, a tool for test failure minimization, Wilkerson, Daniel and McPeak, Scott, 2003-5. Based on [Zeller]. See also [Open Source Quality Project].
  9. Dziubinski, Matt P. “C++ links: compilers - correctness”.
  10. Eide, Eric and Regehr, John, “Volatiles are miscompiled, and what to do about it,” in Proceedings of the 7th ACM international conference on Embedded software, ISBN 978-1-60558-468-3, Association for Computing Machinery 2008. A preprint is available online.
  11. Ellison, Chucky and Rosu, Grigore “Defining the Undefinedness of C,” University of Illinois technical report, 2012.
  12. Equivalent Modulo Input Compiler Validation Project, UC Davis
  13. Fernandez, Mary and Ramsey, Norman, “Automatic Checking of Instruction Specifications,” in Proceedings of the 19th International Conference on Software Engineering, ISBN 0-89791-914-9, Association for Computing Machinery 1997. A preprint is available online.
  14. Fraser, Christopher and Hanson, David, A Retargetable C Compiler: Design and Implementation, ISBN 0-8053-1670-1, Benjamin/Cummings Publishing, 1995, §19.5 pp. 531–3.
  15. Niranjan Hasabnis, Rui Qiao, and R. Sekar, 2015. “Checking correctness of code generator architecture specifications.” In Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO '15). IEEE Computer Society, Washington, DC, USA, 167-178.
  16. Jones, Derek, “Who Guards the Guardians?” (a study of the coverage of the Perennial Validation Suite), 1993.
  17. Kahan, William; Sumner, Thos; et al., Paranoia Floating Point Test, 1983-5.
  18. lcc, A Retargetable Compiler for ANSI C; described in A Retargetable C Compiler: Design and Implementation, Hanson, David R. and Fraser, Christopher W., ISBN 0-8053-1670-1, Benjamin/Cummings Publishing 1995.
  19. Lindig, Christian, “Random Testing of the Translation of C Function Calls”, Proceedings of the Sixth International Workshop on Automated Debugging, ISBN 1-59593-050-7, Association for Computing Machinery 2005.
  20. McKeeman, William M. “Differential Testing for Software,” Digital Technical Journal, Vol. 10 No. 1, 1998.
  21. MattPD: “C++ links: compilers - correctness”.
  22. Modena Test++ Suite,
  23. George C. Necula, “Translation Validation for an Optimizing Compiler”.
  24. Open Source Quality Project
  25. Perennial Validation Suites
  26. Plum Hall C and C++ Validation Test Suites
  27. Regehr, John “Embedded in Academia : A Critical Look at the SCADE Compiler Verification Kit,” blog posting, 2011.
  28. Regehr, John “Are Compilers Getting More or Less Reliable?” blog posting, 2013.
  29. Regehr, John “Guidelines for Research on Finding Bugs” blog posting, 2013.
  30. John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang, Test-Case Reduction for C Compiler Bugs (C-Reduce) in Proceedings of 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2012), Beijing, China, June 2012.
  31. Sheridan, Flash, “Practical Testing of a C99 Compiler Using Output Comparison,” Software: Practice and Experience, 2007. A pre-print is available online, as is a list of bugs discovered using the techniques in the article.
  32. Small Device C Compiler (SDCC), Dutta, Sandeep et al., 1999-.
  33. Zhendong Su et al., Equivalence Modulo Inputs Compiler Validation Project, University of California, Davis, 2014–.
  34. Chengnian Sun, Vu Le, Qirun Zhang, and Zhendong Su, “Toward Understanding Compiler Bugs in GCC and LLVM,” Proceedings of ISSTA 2016, Saarbrücken, Germany, 2016. The source code and dataset, and a pre-print, are available online.
  35. Tydeman, Fred, C99 FPCE Test Suite, 1995-2006.
  36. Vallat, Miod “Compilers in OpenBSD,” openbsd-misc posting 2013.
  37. Xi Wang, Nickolai Zeldovich, M. Frans Kaashoek, and Armando Solar-Lezama, “Towards optimization-safe systems,” SOSP'13: The 24th ACM Symposium on Operating Systems Principles, 2013.
  38. Xuejun Yang, Random Testing of Open Source C Compilers, doctoral thesis, The University of Utah, December 2014.
  39. Xuejun Yang, Yang Chen, Eric Eide, and John Regehr, “Finding and Understanding Bugs in C Compilers,” Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation. A preprint is available online.
  40. Zeller, A.: “Yesterday, my program worked. Today, it does not. Why?”, Software Engineering - ESEC/FSE'99: 7th European Software Engineering Conference, ISSN 0302-9743, volume 1687 of Lecture Notes in Computer Science, pp. 253-267, 1999.
  41. Qirun Zhang, Chengnian Sun, and Zhendong Su, “Skeletal Program Enumeration for Rigorous Compiler Testing,” in Proceedings of PLDI, Barcelona, Spain, June 2017.

Source Code Useful for Compiler Testing (Primarily C/C++)


Copyright © 2002-2007, Access Systems Americas, Inc. PalmSource, Palm OS and Palm Powered, and certain other trade names, trademarks and logos are trademarks which may be registered in the United States, France, Germany, Japan, the United Kingdom and other countries, and are either owned by PalmSource, Inc. or its affiliates, or are licensed exclusively to PalmSource, Inc. by Palm Trademark Holding Company, LLC. All other brands, trademarks and service marks used herein are or may be trademarks of, and are used to identify other products or services of, their respective owners. All rights reserved. Copyright © 2008-2017, Flash (K.J.) Sheridan.