Presented annually to the author(s) of a paper presented at the PLDI held 10 years prior to the award year. The award includes a prize of $1,000 to be split among the authors of the winning paper. The papers are judged by their influence over the past decade.
Selection Committee The award given in year N is for the most influential paper presented at the conference held in year N-10. The selection committee consists of the following members:
The SIGPLAN Chair shall adjudicate conflicts of interest, appointing substitutes to the committee as necessary.
(for 2012) Test-Case Reduction for C Compiler Bugs
Finding small, valid, C programs that trigger C compiler bugs is hard. Distilling a bug-causing compiler input to its essence was previously a largely manual process assisted by delta debugging. This work enhanced delta debugging with new techniques for generating valid (without undefined behavior) test variants, including pluggable program transformations that perform reduction operations to produce tests 25 times smaller than previous schemes. The paper influenced the broader testing and fuzzing communities beyond compiler testing by reinforcing the importance of domain-specific transformers for test case reduction and in seed generation for fuzzing.
(for 2011) Finding and understanding bugs in C compilers
The paper tackles head-on the difficult problem of automatically finding miscompilation bugs in compilers for unsafe languages such as C. Based on a clever combination of scalable static analysis at generation-time, and dynamic checks at run-time, the paper contributes a method for random generation of C programs that are guaranteed to be free from undefined behavior when they execute, yet are still interesting enough that they have a good chance of exposing compiler bugs that corrupt these results.
These programs can be used to find bugs via cross-checking results across multiple compilers, and are implemented in the Csmith tool which, at the time of the paper, was used to find hundreds of bugs in 11 different C compilers, including the widely-used open source GCC and LLVM compilers.
Csmith revolutionized the field of compiler testing, inspiring and influencing a large body of work that has seen broad uptake by industry, including many practical compiler testing tools for C and many other languages, including C++, OpenCL, CUDA and Verilog.
(for 2010) Green: A Framework for Supporting Energy-Conscious Programming using Controlled Approximation
The PLDI 2010 paper “Green: A Framework for Supporting Energy-Conscious Programming using Controlled Approximation” by Woongki Baek and Trishul Chilimbi demonstrated conclusively two important things: (1) that frameworks supporting approximations that trade energy/performance costs for quality of service could be generalized and automated, and (2) applying this approach in data center search workloads could result in significant energy reduction and cost, saving millions of dollars in practice. Green proved that systematic approaches to approximate computing could have enormous commercial impact and predated and informed many subsequent machine learning optimizations, such as using reduced precision arithmetic in deep learning models. Green also inspired subsequent efforts, to more deeply encode approximation semantics into the underlying programming language design.
(for 2009) FastTrack: Efficient and Precise Dynamic Race Detection
FastTrack had both an immediate and a permanent impact on how dynamic data-race detection is done. Prior work either used lock-set algorithms for performance despite false positives or were slow due to algorithms that required space proportional to the number of threads times the size of the heap and read/write operations that took time proportional to the number of threads. FastTrack contributed an algorithm and implementation that was truly the best of both worlds in most cases: a set of a natural and elegant choices for optimizing data structures and algorithms for dynamic data-race detection leads to a system that is as fast as lock-set algorithms whenever lock-set algorithms would not produce false positives but that gracefully falls back to slower approaches for objects where lock-set algorithms would lead to false positives. In both theory and practice, the FastTrack algorithm became the baseline against which all subsequent dynamic data-race detectors are judged.
(for 2008) A Practical Automatic Polyhedral Parallelizer and Locality Optimizer
The PLDI 2008 paper “A Practical Automatic Polyhedral Parallelizer and Locality Optimizer” significantly advanced the power and practical automatic application of polyhedral analysis to automatically determine parallelism and locality improving affine transforms of regular programs expressed as sequences of possibly imperfect loops. The work presented in this paper produced practical implementations that could automatically generate OpenMP parallel programs from sequential nested loops written in C. This inspired a large number of other researchers to further develop polyhedral optimizations based on this approach, like automatic code generation for GPUs and FPGA circuits by high level synthesis from nested sequential loops. The ideas in this paper have proved to be enduring and widely applicable and even to this day keep finding new applications, e.g., for mapping machine learning algorithms to specialized accelerators.
(for 2007) Valgrind: a framework for heavyweight dynamic binary instrumentation
Valgrind is a framework for writing dynamic binary instrumentation tools, which interleave the execution of program monitoring code with the execution of a program itself. The best known tool built with Valgrind is Memcheck, which detects uses of undefined values down to the bit level. Other tools track “tainted” or special (e.g., password) values. Valgrind facilitates the construction of these tools by abstracting the details of complex computer architectures and providing an expressive interface for writing instrumenting code. Valgrind pioneered dynamic binary recompilation for tight integration and optimization of instrumentation code. The paper contains a careful analysis of the requirements of dynamic binary instrumentation tools and shows that competing tools, unlike Valgrind, sacrifice completeness of features for simplicity of implementation, use, or performance. The paper is widely cited. Many open-source defect detection tools build on Valgrind and have improved a wide range of open-source and commercial software.
(for 2006) DieHard: probabilistic memory safety for unsafe languages
The PLDI 2006 “DieHard” paper by Emery Berger and Benjamin Zorn introduced the concept of probabilistic memory safety (for unsafe languages such as C/C++) through three main techniques: (1) randomization of object placement in the heap; (2) overprovisioning of the heap to provide expected gaps between objects; (3) optional replication of processes to enable even greater fault tolerance. This approach was shown to provide protection against a variety of memory errors, including dangling pointers and buffer overflows, as formalized mathematically and evaluated empirically. By fully randomizing heap object locations in memory – effectively pushing address-space layout randomization to the individual object level – DieHard also makes systems more resilient to attack. Variations of DieHard were implemented on Linux, Mac OS, and Windows. The Windows implementation directly influenced the Fault Tolerant Heap of Windows 7 and the Windows 8 user-mode heap allocator.
(for 2005) Pin: building customized program analysis tools with dynamic instrumentation
This paper introduced Pin, a dynamic binary instrumentation framework that enables the creation of dynamic program analysis tools. Pin uses dynamic compilation to instrument executables and dynamically-linked libraries while they are running, permitting the tool writer to study the behavior of an application at the instruction level without significant perturbation to application behavior. The PLDI 2005 paper is highly cited and the system it describes is in widespread use in academia and industry. Pin’s ease of use and relative efficiency have made it the tool of choice for dynamic binary instrumentation.
(for 2004) Scalable Lock-Free Dynamic Memory Allocation
Maged Michael’s PLDI’04 paper is considered a landmark in memory allocation for multithreaded programs, presenting the first completely lock-free general-purpose user-space allocator. It provides good performance with respect to scalability, speed and space efficiency, while at the same time only relying on common hardware and OS support. The work is highly regarded and frequently referenced, and is also the basis of multiple memory allocator implementations, both in IBM products and in follow-on research.
(for 2003) The nesC language: A holistic approach to networked embedded systems
This is the first publication to describe the design, implementation, and optimization of the language nesC, which is a variant of the C programming language especially well suited to the programming of embedded systems. Much as the C programming language made possible the development of UNIX, so has nesC made possible the development of TinyOS. A decade hence, nesC is still the language of choice for developing applications under TinyOS on “motes” and other embedded systems platforms. Central to the design of nesC and TinyOS is the notion of an asynchronous, or non-blocking, call. This form of software architecture nicely accommodates the timing uncertainties and failures experienced by embedded systems applications in the real world. The success and longevity of nesC can be attributed to its precise and useful formulations of concurrency, atomicity, and modularization with regard to its intended application audience. These formulations allow for efficient implementations while offering encapsulation mechanisms that encourage software sharing and reuse. The nesC paper has amassed hundreds of citations, and the language continues to be in widespread use for teaching, research, and industry in the embedded systems community.
(for 2002) Extended Static Checking for Java
This paper marks a turning point in the field of static checking, describing pragmatic design decisions that promote practicality over completeness. Pioneered in ESC/Modula-3, techniques from ESC/Java are now widely used in various forms in Microsoft’s development tools, notably as part of Code Contracts which ships with VisualStudio. Recent innovations strongly influenced by ESC/Java include refinement types for Haskell, and verification of Eiffel programs.
(for 2001) Automatic predicate abstraction of C programs
The paper, “Automatic Predicate Abstraction of C Programs” by Thomas Ball, Rupak Majumdar, Todd D. Millstein, and Sriram K. Rajamani presented the underlying predicate abstraction technology of the SLAM project for checking that software satisfies critical behavioral properties of the interfaces it uses and to aid software engineers in designing interfaces and software that ensure reliable and correct execution. The technology is now part of Microsoft’s Static Driver Verifier in the Windows Driver Development Kit. This is one of the earliest examples of automation of software verification on a large scale and the basis for numerous efforts to expand the domains that can be verified.
(for 2000) Dynamo: A Transparent Dynamic Optimization System
Dynamo pioneered the technique of monitoring, analyzing, and optimizing binary code on-the-fly while the program executes, without relying on any program modifications, compiler hints, profile data from prior runs, or special purpose hardware. Contrary to intuition, one could use Dynamo to substantially improve the performance of a binary’s execution, even when it was generated by a state-of-the-art optimizing compiler. By continuously monitoring its own overhead, Dynamo could suspend itself (and resume later) when the optimization gains were not sufficient to offset the cost of its own operation. The ability for a software-only system to speed up a program binary without any kind of modification or externally provided hints challenged much entrenched thinking at the time, and catalyzed a rethinking of the compiler-architecture interface. Hardware designs were shifting more of the performance optimization burden from the hardware to the compiler. At the same time software was moving towards greater use of dynamic binding, resulting in a shrinking optimization scope for the compiler. These two trends were in tension with one another. By operating at binary execution time, Dynamo helped bridge this growing gap by complementing the compiler’s traditional strength in static control flow based optimization with instruction-level runtime trace-based optimization. This paper was the first in a series of publications that continue to this day on similar trace-based systems for dynamic binary optimization, dynamic binary instrumentation, and dynamic binary translation.
(for 1999) A Fast Fourier Transform Compiler
The 1999 PLDI paper “A Fast Fourier Transform Compiler” by Matteo Frigo describes the implementation of genfft, a special-purpose compiler that produces the performance critical code for a library, called FFTW (the “Fastest Fourier Transform in the West”), that computes the discrete Fourier transform. FFTW is the predominant open fast Fourier transform package available today, as it has been since its introduction a decade ago. genfft demonstrated the power of domain-specific compilation – FFTW achieves the best or close to best performance on most machines, which is remarkable for a single package. By encapsulating expert knowledge from the FFT algorithm domain and the compiler domain, genfft and FFTW provide a tremendous service to the scientific and technical community by making highly efficient FFTs available to everyone on any machine. As well as being the fastest FFT in the West, FFTW may be the last FFT in the West as the quality of this package and the maturity of the field may mean that it will never be superseded, at least for computer architectures similar to past and current ones.
(for 1998) The Implementation of the Cilk-5 Multithreaded Language
The 1998 PLDI paper “Implementation of the Cilk-5 Multithreaded Language” by Matteo Frigo, Charles E. Leiserson, and Keith H. Randall introduced an efficient form of thread-local deques to control scheduling of multithreaded programs. This innovation not only opened the way to faster and simpler runtimes for fine-grained parallelism, but also provided a basis for simpler parallel recursive programming techniques that elegantly extend those of sequential programming. The stack-like side of a deque acts just like a standard procedure stack, while the queue side enables breadth-first work-stealing by other threads. The work-stealing techniques introduced in this paper are beginning to influence common practice, such as the Intel Threading Building Blocks project, an upcoming standardized fork-join framework for Java, and a variety of projects at Microsoft.
(for 1997) Exploiting Hardware Performance Counters with Flow and Context Sensitive Profiling
(for 1996) TIL: A Type-Directed Optimizing Compiler for ML
(for 1995) Selective Specialization for Object-Oriented Languages
(for 1994) ATOM: a system for building customized program analysis tools
(for 1993) Space Efficient Conservative Garbage Collection
(for 1992) Lazy Code Motion
(for 1991) A data locality optimizing algorithm
(for 1990) Profile guided code positioning