The formal verification of the seL4 microkernel started as a research project in 2004 and has achieved commercial scale now, in the number of properties proven, the supported features and platforms, the adoption and deployment by industry and government organisations. It is supported by an open-source Foundation and a growing ecosystem. In this talk, I will reflect on the seL4 verification journey, past, present and future, and the challenges to combine the art and craft of proof with the reality of meeting industry demand for verified software.

Program verification in the large is not only a matter of mechanizing a program logic to handle the semantics of your programming language. You must reason in the mathematics of your application domain--and there are many application domains, each with their own community of domain experts. So you will need to import mechanized proof theories from many domains, and they must all interoperate. Such an ecosystem is not only a matter of mathematics, it is a matter of software process engineering and social engineering. Coq's ecosystem has been maturing nicely in these senses.

A semantic embedding is a logical encoding of a formal language, namely the object language, into the specification language of a logical framework. In their seminal paper “Experience with embedding hardware description languages in HOL”, Boulton et al. coined the terms deep and shallow embeddings depending on whether or not the syntax of terms of the target language is represented by a data type in the specification language. Thus, a deep embedding enables reasoning about classes of terms, while a shallow embedding limits reasoning to concrete terms. Embeddings of programming languages are well-known applications of interactive theorem provers, specially of those based on higher-order logic. These embeddings are often intended to support the study of a programming language semantics or to enhance a programming language with the deductive capabilities of the logical framework. A different type of embeddings, here referred to as structural embeddings, are intended to augment specification languages with structural elements of the object language. In a structural embedding, the outermost elements of the object language, i.e., the structural parts, are encoded, either deeply or shallowly, but the internal elements, i.e., the basic expressions, are those of the specification language. Advances in automated reasoning and user interfaces have enabled structural embeddings to enhance usability of interactive theorem provers and to reduce the gap between verification tools and modeling tools used by practitioners. This talk presents an overview of several years of research on theorem proving in safety-critical aerospace systems through the lens of embeddings and, more particularly, structural embeddings. The talk focuses on lessons learned and provides examples of successful applications to automated reasoning, termination analysis, floating-point analysis, and verification of cyber-physical systems. Our main point, which is hardly original, is that interactive theorem provers will serve as intermediate systems that connect a cluster of components. Structural embeddings could then provide the frontend capabilities to access this cluster of components.

Cogent is a restricted functional language designed to reduce the cost of developing verified systems code. Because of its sometimes-onerous restrictions, such as the lack of support for recursion and its strict uniqueness type system, Cogent provides an escape hatch in the form of a foreign function interface (FFI) to C code. This poses a problem when verifying Cogent programs, as imported C components do not enjoy the same level of static guarantees that Cogent does. Previous verification of file systems implemented in Cogent merely assumed that their C components were correct and that they preserved the invariants of Cogent’s type system. In this paper, we instead prove such obligations. We demonstrate how they smoothly compose with existing Cogent theorems, and result in a correctness theorem of the overall Cogent-C system. The Cogent FFI constraints ensure that key invariants of Cogent’s type system are maintained even when calling C code. We verify reusable higher-order and polymorphic functions including a generic loop combinator and array iterators and demonstrate their application to several examples including binary search and the BilbyFs file system. We demonstrate the feasibility of verification of mixed Cogent-C systems, and provide some insight into verification of software comprised of code in multiple languages with differing levels of static guarantees.

Lexers and parsers are attractive targets for attackers because they often sit at the boundary between a software system's internals and the outside world. Formally verified lexers can reduce the attack surface of these systems, thus making them more secure.

One recent step in this direction is the development of Verbatim, a verified lexer based on the concept of Brzozowski derivatives. Two limitations restrict the tool's usefulness. First, its running time is quadratic in the length of its input string. Second, the lexer produces tokens with a simple "tag and string" representation, which limits the tool's ability to integrate with parsers that operate on more expressive token representations.

In this work, we present a suite of extensions to Verbatim that overcomes these limitations while preserving the tool's original correctness guarantees. The lexer achieves effectively linear performance on a JSON benchmark through a combination of optimizations that, to our knowledge, has not been previously verified. The enhanced version of Verbatim also enables users to augment their lexical specifications with custom semantic actions, and it uses these actions to produce semantically rich tokens---i.e., tokens that carry values with arbitrary, user-defined types. All extensions were implemented and verified with the Coq Proof Assistant.

On in-order processors, without dynamic instruction scheduling, program running times may be significantly reduced by compile-time instruction scheduling. We present here the first effective certified instruction scheduler that operates over superblocks (it may move instructions across branches), along with its performance evaluation. It is integrated within the CompCert C compiler, providing a complete machine-checked proof of semantic preservation from C to assembly.

Our optimizer composes several passes designed by translation validation: program transformations are proposed by untrusted oracles, which are then validated by certified and scalable checkers. Our main checker is an architecture-independent simulation-test over superblocks modulo register liveness, which relies on hash-consed symbolic execution.

Skeletal semantics is a framework to describe semantics of programming languages. We propose an automatic generation of a certified OCaml interpreter for any language written in skeletal semantics. To this end, we introduce two new interpretations, i.e., formal meanings, of skeletal semantics, in the form of non-deterministic and deterministic abstract machines. These machines are derived from the usual big-step interpretation of skeletal semantics using functional correspondence, a standard transformation from big-step evaluators to abstract machines. All these interpretations are formalized in the Coq proof assistant and we certify their soundness. We finally use the extraction from Coq to OCaml to obtain the certified interpreter.

Structured natural languages provide a trade space between ambiguous natural languages that make up most written requirements, and mathematical formal specifications such as Linear Temporal Logic. FRETish is a structured natural language for the elicitation of system requirements developed at NASA. The related open-source tool Fret provides support for translating FRETish requirements into temporal logic formulas that can be input to several verification and analysis tools. In the context of safety-critical systems, it is crucial to ensure that a generated formula captures the semantics of the corresponding FRETish requirement precisely. This paper presents a rigorous formalization of the FRETish language including a new denotational semantics and a proof of semantic equivalence between FRETish specifications and their temporal logic counterparts computed by Fret. The complete formalization and the proof have been developed in the Prototype Verification System (PVS) theorem prover.

A transient data structure is a package of an ephemeral data structure, a persistent data structure, and fast conversions between them. We describe the specification and proof of a transient stack and its iterators. This data structure is a scaled-down version of the general-purpose transient sequence data structure implemented in the OCaml library Sek. Internally, it relies on fixed-capacity arrays, or chunks, which can be shared between several ephemeral and persistent stacks. Dynamic tests are used to determine whether a chunk can be updated in place or must be copied: a chunk can be updated if it is uniquely owned or if the update is monotonic. Using CFML, which implements Separation Logic with Time Credits inside Coq, we verify the functional correctness and the amortized time complexity of this data structure. Our verification effort covers iterators, which involve direct pointers to internal chunks. The specification of iterators describes what the operations on iterators do, how much they cost, and under what circumstances an iterator is invalidated.

We present the first formal specification and verification of the fine-grained concurrent multi-producer-multi-consumer queue algorithm from Meta’s C++ library Folly of core infrastructure components. The queue is highly optimized, practical, and used by Meta in production where it scales to thousands of consumer and producer threads. We present an implementation of the algorithm in an ML-like language and formally prove that it is a contextual refinement of a simple coarse-grained queue (a property that implies that the MPMC queue is linearizable). We use the ReLoC relational logic and the Iris program logic to carry out the proof and to mechanize it in the Coq proof assistant. The MPMC queue is implemented using three modules, and our proof is similarly modular. By using ReLoC and Iris’s support for modular reasoning we verify each module in isolation and compose these together. A key challenge of the MPMC queue is that it has a so-called *external linearization point*, which ReLoC has no support for reasoning about. Thus we extend ReLoC, both on paper and in Coq, with novel support for reasoning about external linearization points.

We use Iris, an implementation of concurrent separation logic in the Coq proof assistant, to verify two queue data structures used for inter-process communication in an operating system under development. Our motivations are twofold. First, we wish to leverage formal verification to boost confidence in a delicate piece of industrial code that was subject to numerous revisions. Second, we aim to gain information on the cost-benefit tradeoff of applying a state-of-the-art formal verification tool in our industrial setting. On both fronts, our endeavor has been a success. The verification effort proved that the queue algorithms are correct and uncovered four algorithmic simplifications as well as bugs in client code. The simplifications involve the removal of two memory barriers, one atomic load, and one boolean check, all in a performance-sensitive part of the OS. Removing the redundant boolean check revealed unintended uses of uninitialized memory in multiple device drivers, which were fixed. The proof work was completed in person months, not years, by engineers with no prior familiarity with Iris. These findings are spurring further use of verification at Meta.

Build systems are a fundamental part of software construction, but their correctness has received comparatively little attention, relative to more prominent parts of the toolchain. In this paper, we address the correctness of forward build systems, which automatically determine the dependency structure of the build, rather than having it specified by the programmer.

We first define what it means for a forward build system to be correct---it must behave identically to simply executing the programmer-specified commands in order. Of course, realistic build systems avoid repeated work, stop early when possible, and run commands in parallel, and we prove that these optimizations, as embodied in the recent forward build system Rattle, preserve our definition of correctness. Along the way, we show that other forward build systems, such as Fabricate and Memoize, are also correct.

We carry out all of our work in Agda, and describe in detail the assumptions underlying both Rattle itself and our modeling of it.

We present a formal, machine checked TLA+ safety proof of *MongoRaftReconfig*, a distributed dynamic reconfiguration protocol. *MongoRaftReconfig* was designed for and implemented in MongoDB, a distributed database whose replication protocol is derived from the Raft consensus algorithm. We present an inductive invariant for *MongoRaftReconfig* that is formalized in TLA+ and formally proved using the TLA+ proof system (TLAPS). We also present a formal TLAPS proof of two key safety properties of *MongoRaftReconfig*, *LeaderCompleteness* and *StateMachineSafety*. To our knowledge, these are the first machine checked inductive invariant and safety proof of a dynamic reconfiguration protocol for a Raft based replication system.

Cryptographic interactive proof systems provide an efficient and scalable means of verifying the results of computation on blockchain. A prover constructs a proof, off-chain, that the execution of a program on a given input terminates with a certain result. The prover then publishes a certificate that can be verified efficiently and reliably modulo commonly accepted cryptographic assumptions. The method relies on an algebraic encoding of execution traces of programs. Here we report on a verification of the correctness of such an encoding of the Cairo model of computation with respect to the STARK interactive proof system, using the Lean 3 proof assistant.

In this paper we derive a suite of lemmas which allows users to internally reflect EasyCrypt programs into distributions which correspond to their denotational semantics (probabilistic reflection). Based on this we develop techniques for reasoning about rewinding of adversaries in EasyCrypt. (A widely used technique in cryptology.) We use our reflection and rewindability results to prove the security of a coin-toss protocol.

In their article titled *From Types to Sets by Local Type Definitions in Higher-Order Logic* and published in the proceedings of the conference *Interactive Theorem Proving* in 2016, Ondřej Kunčar and Andrei Popescu propose an extension of the logic *Isabelle/HOL* and an associated algorithm for the relativization of the *type-based theorems* to more flexible *set-based theorems*, collectively referred to as *Types-To-Sets*. One of the aims of their work was to open an opportunity for the development of a software tool for applied relativization in the implementation of the logic Isabelle/HOL of the proof assistant *Isabelle*. In this article, we provide a description of a software framework for the interactive automated relativization of definitions and theorems in Isabelle/HOL, developed as an extension of the proof language *Isabelle/Isar*, building upon some of the ideas for further work expressed in the original article on Types-To-Sets by Ondřej Kunčar and Andrei Popescu and the subsequent article *Smooth Manifolds and Types to Sets for Linear Algebra in Isabelle/HOL*, which was written by Fabian Immler and Bohua Zhan and published in the proceedings of the *International Conference on Certified Programs and Proofs* in 2019.

We explore the features of a user interface where formal proofs can be built through gestural actions. In particular, we show how proof construction steps can be associated to drag-and-drop actions. We argue that this can provide quick and intuitive proof construction steps. This work builds on theoretical tools coming from deep inference. It also resumes and integrates some ideas of the former proof-by-pointing project.

Theories over strings are among the most heavily researched logical theories in the SMT community in the past decade, owing to the error-prone nature of string manipulations, which often leads to security vulnerabilities (e.g. cross-site scripting and code injection). The majority of the existing decision procedures and solvers for these theories are themselves intricate; they are complicated algorithmically, and also have to deal with a very rich vocabulary of operations. This has led to a plethora of bugs in implementation, which have for instance been discovered through fuzzing.

In this paper, we present CertiStr, a certified implementation of a string constraint solver for the theory of strings with concatenation and regular constraints. CertiStr aims to solve string constraints using a forward-propagation algorithm based on symbolic representations of regular constraints as symbolic automata, which returns three results: *sat*, *unsat*, and *unknown*, and is guaranteed to terminate for the string constraints whose concatenation dependencies are acyclic. The implementation has been developed and proven correct in Isabelle/HOL, through which an effective solver in OCaml was generated. We demonstrate the effectiveness and efficiency of CertiStr against the standard Kaluza benchmark, in which 80.4% tests are in the string constraint fragment of CertiStr. Of these 80.4% tests, CertiStr can solve 83.5% (i.e. CertiStr returns *sat* or *unsat*) within 60s.

Several proof assistants, such as Isabelle or Coq, can concurrently check multiple proofs. In contrast, the vast majority of today's small proof checkers either does not support concurrency at all or only limited forms thereof, restricting the efficiency of proof checking on multi-core processors. This work shows the design of a small, memory- and thread-safe kernel that efficiently checks proofs both concurrently and sequentially. This design is implemented in a new proof checker called Kontroli for the lambda-Pi calculus modulo rewriting, which is an established framework to uniformly express a multitude of logical systems. Kontroli is faster than the reference proof checker for this calculus, Dedukti, on all of five evaluated datasets obtained from proof assistants and interactive theorem provers. Furthermore, Kontroli reduces the time of the most time-consuming part of proof checking using eight threads by up to 6.6x.

Lie algebras are an important class of algebras which arise throughout mathematics and physics. We report on the formalisation of Lie algebras in Lean's Mathlib library. Although basic knowledge of Lie theory will benefit the reader, none is assumed; the intention is that the overall themes will be accessible even to readers unfamiliar with Lie theory.

Particular attention is paid to the construction of the classical and exceptional Lie algebras. Thanks to these constructions, it is possible to state the classification theorem for finite-dimensional semisimple Lie algebras over an algebraically closed field of characteristic zero.

In addition to the focus on Lie theory, we also aim to highlight the unity of Mathlib. To this end, we include examples of achievements made possible only by leaning on several branches of the library simultaneously.

The two squares theorem of Fermat is a gem in number theory, with a spectacular one-sentence "proof from the Book". Here is a formalisation of this proof, with an interpretation using windmill patterns. The theory behind involves involutions on a finite set, especially the parity of the number of fixed points in the involutions. Starting as an existence proof that is non-constructive, there is an ingenious way to turn it into a constructive one. This gives an algorithm to compute the two squares by iterating the two involutions alternatively from a known fixed point.

A direct proof of the Steiner-Lehmus theorem has eluded geometers for over 170 years. The challenge has been that a proof is only considered direct if it does not rely on *reductio ad absurdum*. Thus, any proof that claims to be direct must show, going back to the axioms, that all of the auxiliary theorems used are also proved directly. In this paper, we give a proof of the Steiner-Lehmus theorem that is guaranteed to be direct. The evidence for this claim is derived from our methodology: we have formalized a constructive axiom set for Euclidean plane geometry in a proof assistant that implements a constructive logic and have built the proof of the Steiner-Lehmus theorem on this constructive foundation.

We mechanise central metatheoretic results about second-order logic (SOL) using the Coq proof assistant. Concretely, we consider undecidability via many-one reduction from Diophantine equations (Hilbert's tenth problem), incompleteness regarding full semantics via categoricity of second-order Peano arithmetic, and completeness regarding Henkin semantics via translation to mono-sorted first-order logic (FOL). Moreover, this translation is used to transport further characteristic properties of FOL to SOL, namely the compactness and Löwenheim-Skolem theorems.

The logic of bunched implications (BI) is a substructural logic that forms the backbone of separation logic, the much studied logic for reasoning about heap-manipulating programs. Although the proof theory and metatheory of BI are mathematically involved, the formalization of important metatheoretical results is still incipient. In this paper we present a self-contained formalized, in the Coq proof assistant, proof of a central metatheoretical property of BI: cut elimination for its sequent calculus.

The presented proof is *semantic*, in the sense that is obtained by interpreting sequents in a particular “universal” model. This results in a more modular and elegant proof than a standard Gentzen-style cut elimination argument, which can be subtle and error-prone in manual proofs for BI. In particular, our semantic approach avoids unnecessary inversions on proof derivations, or the uses of cut reductions and the multi-cut rule.

Besides modular, our approach is also robust: we demonstrate how our method scales, with minor modifications, to (i) an extension of BI with an arbitrary set of *simple structural rules*, and (ii) an extension with an S4-like □ modality.

In previous work ("From signatures to monads in UniMath"),we described a category-theoretic construction of abstract syntax from a signature, mechanized in the UniMath library based on the Coq proof assistant.

In the present work, we describe what was necessary to generalize that work to account for simply-typed languages. First, some definitions had to be generalized to account for the natural appearance of non-endofunctors in the simply-typed case. As it turns out, in many cases our mechanized results carried over to the generalized definitions without any code change. Second, an existing mechanized library on 𝜔-cocontinuous functors had to be extended by constructions and theorems necessary for constructing multi-sorted syntax. Third, the theoretical framework for the semantical signatures had to be generalized from a monoidal to a bicategorical setting, again to account for non-endofunctors arising in the typed case. This uses actions of endofunctors on functors with given source, and the corresponding notion of strong functors between actions, all formalized in UniMath using a recently developed library of bicategory theory. We explain what needed to be done to plug all of these ingredients together, modularly.

The main result of our work is a general construction that, when fed with a signature for a simply-typed language, returns an implementation of that language together with suitable boilerplate code, in particular, a certified monadic substitution operation.

Deep data types are those that are constructed from other data types, including, possibly, themselves. In this case, they are said to be truly nested. Deep induction is an extension of structural induction that traverses *all* of the structure in a deep data type, propagating predicates on its primitive data throughout the entire structure. Deep induction can be used to prove properties of nested types, including truly nested types, that cannot be proved via structural induction. In this paper we show how to extend deep induction to GADTs that are not truly nested GADTs. This opens the way to incorporating automatic generation of (deep) induction rules for them into proof assistants. We also show that the techniques developed in this paper do not suffice for extending deep induction to truly nested GADTs, so more sophisticated techniques are needed to derive deep induction rules for them.

We work with combinatorial maps to represent graph embeddings into surfaces up to isotopy. The surface in which the graph is embedded is left implicit in this approach. The constructions herein are proof-relevant and stated with a subset of the language of homotopy type theory.

This article presents a refinement of one characterisation of embeddings in the sphere, called spherical maps, of connected and directed multigraphs with discrete node sets. A combinatorial notion of homotopy for walks and the normal form of walks under a reduction relation is introduced. The first characterisation of spherical maps states that a graph can be embedded in the sphere if any pair of walks with the same endpoints are merely walk-homotopic. The refinement of this definition filters out any walk with inner cycles. As we prove in one of the lemmas, if a spherical map is given for a graph with a discrete node set, then any walk in the graph is merely walk-homotopic to a normal form.

The proof assistant Agda contributed to formalising the results recorded in this article.