We study Satisfiability Modulo Theories (SMT) enriched with the so-called Ramsey quantifiers, which assert the existence of cliques (complete graphs) in the graph induced by some formulas. The extended framework is known to have applications in proving program termination (in particular, whether a transitive binary predicate is well-founded), and monadic decomposability of SMT formulas. Our main result is a new algorithm for eliminating Ramsey quantifiers from three common SMT theories: Linear Integer Arithmetic (LIA), Linear Real Arithmetic (LRA), and Linear Integer Real Arithmetic (LIRA). In particular, if we work only with existentially quantified formulas, then our algorithm runs in polynomial time and produces a formula of linear size. One immediate consequence is that checking well-foundedness of a given formula in the aforementioned theory defining a transitive predicate can be straightforwardly handled by highly optimized SMT-solvers. We show also how this provides a uniform semi-algorithm for verifying termination and liveness with completeness guarantee (in fact, with an optimal computational complexity) for several well-known classes of infinite-state systems, which include succinct timed systems, one-counter systems, and monotonic counter systems. Another immediate consequence is a solution to an open problem on checking monadic decomposability of a given relation in quantifier-free fragments of LRA and LIRA, which is an important problem in automated reasoning and constraint databases. Our result immediately implies decidability of this problem with an optimal complexity (coNP-complete) and enables exploitation of SMT-solvers. It also provides a termination guarantee for the generic monadic decomposition algorithm of Veanes et al. for LIA, LRA, and LIRA. We report encouraging experimental results on a prototype implementation of our algorithms on micro-benchmarks.

We introduce a novel logic for asynchronous hyperproperties with a new mechanism to identify relevant positions on traces. While the new logic is more expressive than a related logic presented recently by Bozzelli et al., we obtain the same complexity of the model checking problem for finite state models. Beyond this, we study the model checking problem of our logic for pushdown models. We argue that the combination of asynchronicity and a non-regular model class studied in this paper constitutes the first suitable approach for hyperproperty model checking against recursive programs.

Rewriting is a versatile and powerful technique used in many domains. *Strategic rewriting* allows programmers to control the application of rewrite rules by composing individual rewrite rules into complex rewrite strategies. These strategies are semantically complex, as they may be nondeterministic, they may raise errors that trigger backtracking, and they may not terminate.

Given such semantic complexity, it is necessary to establish a formal understanding of rewrite strategies and to enable reasoning about them in order to answer questions like: How do we know that a rewrite strategy terminates? How do we know that a rewrite strategy does not fail because we compose two incompatible rewrites? How do we know that a desired property holds after applying a rewrite strategy?

In this paper, we introduce Shoggoth: a formal foundation for understanding, analysing and reasoning about strategic rewriting that is capable of answering these questions. We provide a denotational semantics of System S, a core language for strategic rewriting, and prove its equivalence to our big-step operational semantics, which extends existing work by explicitly accounting for divergence. We further define a *location-based weakest precondition calculus* to enable formal reasoning about rewriting strategies, and we prove this calculus sound with respect to the denotational semantics. We show how this calculus can be used in practice to reason about properties of rewriting strategies, including termination, that they are well-composed, and that desired postconditions hold. The semantics and calculus are formalised in Isabelle/HOL and all proofs are mechanised.

Pushdown Vector Addition Systems with States (PVASS) consist of finitely many control states, a pushdown stack, and a set of counters that can be incremented and decremented, but not tested for zero. Whether the reachability problem is decidable for PVASS is a long-standing open problem.
We consider *continuous PVASS*, which are PVASS with a continuous semantics. This means, the counter values are rational numbers and whenever a vector is added to the current counter values, this vector is first scaled with an arbitrarily chosen rational factor between zero and one.
We show that reachability in continuous PVASS is NEXPTIME-complete. Our result is unusually robust: Reachability can be decided in NEXPTIME even if all numbers are specified in binary. On the other hand, NEXPTIME-hardness already holds for coverability, in fixed dimension, for bounded stack, and even if all numbers are specified in unary.

Algebraic effects and handlers are a mechanism to structure programs with computational effects in a modular way. They are recently gaining popularity and being adopted in practical languages, such as OCaml. Meanwhile, there has been substantial progress in program verification via *refinement type systems*. While a variety of refinement type systems have been proposed, thus far there has not been a satisfactory refinement type system for algebraic effects and handlers. In this paper, we fill the void by proposing a novel refinement type system for languages with algebraic effects and handlers. The expressivity and usefulness of algebraic effects and handlers come from their ability to manipulate *delimited continuations*, but delimited continuations also complicate programs’ control flow and make their verification harder. To address the complexity, we introduce a novel concept that we call *answer refinement modification* (ARM for short), which allows the refinement type system to precisely track what effects occur and in what order when a program is executed, and reflect such information as modifications to the refinements in the types of delimited continuations. We formalize our type system that supports ARM (as well as answer *type* modification, or ATM) and prove its soundness. Additionally, as a proof of concept, we have extended the refinement type system to a subset of OCaml 5 which comes with a built-in support for effect handlers, implemented a type checking and inference algorithm for the extension, and evaluated it on a number of benchmark programs that use algebraic effects and handlers. The evaluation demonstrates that ARM is conceptually simple and practically useful. Finally, a natural alternative to directly reasoning about a program with delimited continuations is to apply a *continuation passing style* (CPS) transformation that transforms the program to a pure program without delimited continuations. We investigate this alternative in the paper, and show that the approach is indeed possible by proposing a novel CPS transformation for algebraic effects and handlers that enjoys bidirectional (refinement-)type-preservation. We show that there are pros and cons with this approach, namely, while one can use an existing refinement type checking and inference algorithm that can only (directly) handle pure programs, there are issues such as needing type annotations in source programs and making the inferred types less informative to a user.

Iris is a generic separation logic framework that has been instantiated to reason about a wide range of programming languages and language features. Most Iris instances are defined on simple core calculi, but by connecting Iris to new or existing formal semantics for practical languages, we can also use it to reason about real programs. In this paper we develop an Iris instance based on CompCert, the verified C compiler, allowing us to prove correctness of C programs under the same semantics we use to compile and run them. We take inspiration from the Verified Software Toolchain (VST), a prior separation logic for CompCert C, and reimplement the program logic of VST in Iris. Unlike most Iris instances, this involves both a new model of resources for CompCert memories, and a new definition of weakest preconditions/Hoare triples, as the Iris defaults for both of these cannot be applied to CompCert as is. Ultimately, we obtain a complete program logic for CompCert C within Iris, and we reconstruct enough of VST's top-level automation to prove correctness of simple C programs.

We study transformational program logics for correctness and incorrectness that we extend to explicitly handle both termination and nontermination. We show that the logics are abstract interpretations of the right image transformer for a natural relational semantics covering both finite and infinite executions. This understanding of logics as abstractions of a semantics facilitates their comparisons through their respective abstractions of the semantics (rather that the much more difficult comparison through their formal proof systems). More importantly, the formalization provides a calculational method for constructively designing the sound and complete formal proof system by abstraction of the semantics. As an example, we extend Hoare logic to cover all possible behaviors of nondeterministic programs and design a new precondition (in)correctness logic.

Two approaches exist to incorporate parametricity into proof assistants based on dependent type theory. On the one hand, parametricity translations conveniently compute parametricity statements and their proofs solely based on individual well-typed polymorphic programs. But they do not offer internal parametricity: formal proofs that any polymorphic program of a certain type satisfies its parametricity statement. On the other hand, internally parametric type theories augment plain type theory with additional primitives out of which internal parametricity can be derived. But those type theories lack mature proof assistant implementations and deriving parametricity in them involves low-level intractable proofs. In this paper, we contribute Agda --bridges: the first practical internally parametric proof assistant. We provide the first mechanized proofs of crucial theorems for internal parametricity, like the relativity theorem. We identify a high-level sufficient condition for proving internal parametricity which we call the structure relatedness principle (SRP) by analogy with the structure identity principle (SIP) of HoTT/UF. We state and prove a general parametricity theorem for types that satisfy the SRP. Our parametricity theorem lets us obtain one-liner proofs of standard internal free theorems. We observe that the SRP is harder to prove than the SIP and provide in Agda --bridges a shallowly embedded type theory to compose types that satisfy the SRP. This type theory is an observational type theory of logical relations and our parametricity theorem ought to be one of its inference rules.

Expressive state-of-the-art separation logics rely on step-indexing to model semantically complex features and to support modular reasoning about imperative higher-order concurrent and distributed programs. Step- indexing comes, however, with an inherent cost: it restricts the adequacy theorem of program logics to a fairly simple class of safety properties. In this paper, we explore if and how intensional refinement is a viable methodology for strengthening higher-order concurrent (and distributed) separation logic to prove non-trivial safety and liveness properties. Specifically, we introduce Trillium, a language-agnostic separation logic framework for showing intensional refinement relations between traces of a program and a model. We instantiate Trillium with a concurrent language and develop Fairis, a concurrent separation logic, that we use to show liveness properties of concurrent programs under fair scheduling assumptions through a fair liveness-preserving refinement of a model. We also instantiate Trillium with a distributed language and obtain an extension of Aneris, a distributed separation logic, which we use to show refinement relations between distributed systems and TLA+ models.

We present decalf, a directed, effectful cost-aware logical framework for studying quantitative aspects of functional programs with effects. Like calf, the language is based on a formal *phase distinction* between the *extension* and the *intension* of a program, its pure *behavior* as distinct from its *cost* measured by an effectful step-counting primitive. The type theory ensures that the behavior is unaffected by the cost accounting. Unlike calf, the present language takes account of *effects*, such as probabilistic choice and mutable state. This extension requires a reformulation of calf’s approach to cost accounting: rather than rely on a ”separable” notion of cost, here *a cost bound is simply another program*. To make this formal, we equip every type with an intrinsic preorder, relaxing the precise cost accounting intrinsic to a program to a looser but nevertheless informative estimate. For example, the cost bound of a probabilistic program is itself a probabilistic program that specifies the distribution of costs. This approach serves as a streamlined alternative to the standard method of isolating a cost recurrence and readily extends to higher-order, effectful programs.

The development proceeds by first introducing the decalf type system, which is based on an intrinsic ordering among terms that restricts in the extensional phase to extensional equality, but in the intensional phase reflects an approximation of the cost of a program of interest. This formulation is then applied to a number of illustrative examples, including pure and effectful sorting algorithms, simple probabilistic programs, and higher-order functions. Finally, we justify decalf via a model in the topos of augmented simplicial sets.

Disentanglement is a run-time property of parallel programs that facilitates task-local reasoning about the memory footprint of parallel tasks. In particular, it ensures that a task does not access any memory locations allocated by another concurrently executing task. Disentanglement can be exploited, for example, to implement a high-performance parallel memory manager, such as in the MPL (MaPLe) compiler for Parallel ML. Prior research on disentanglement has focused on the design of optimizations, either trusting the programmer to provide a disentangled program or relying on runtime instrumentation for detecting and managing entanglement. This paper provides the first static approach to verify that a program is disentangled: it contributes DisLog, a concurrent separation logic for disentanglement. DisLog enriches concurrent separation logic with the notions necessary for reasoning about the fork-join structure of parallel programs, allowing the verification that memory accesses are effectively disentangled. A large class of programs, including race-free programs, exhibit memory access patterns that are disentangled "by construction". To reason about these patterns, the paper distills from DisLog an almost standard concurrent separation logic, called DisLog+. In this high-level logic, no specific reasoning about memory accesses is needed: functional correctness proofs entail disentanglement. The paper illustrates the use of DisLog and DisLog+ on a range of case studies, including two different implementations of parallel deduplication via concurrent hashing. All our results are mechanized in the Coq proof assistant using Iris.

We present *guarded interaction trees* — a structure and a fully formalized framework for representing higher-order computations with higher-order effects in Coq, inspired by domain theory and the recently proposed interaction trees. We also present an accompanying separation logic for reasoning about guarded interaction trees. To demonstrate that guarded interaction trees provide a convenient domain for interpreting higher-order languages with effects, we define an interpretation of a PCF-like language with effects and show that this interpretation is sound and computationally adequate; we prove the latter using a logical relation defined using the separation logic. Guarded interaction trees also allow us to combine different effects and reason about them modularly. To illustrate this point, we give a modular proof of type soundness of cross-language interactions for safe interoperability of different higher-order languages with different effects. All results in the paper are formalized in Coq using the Iris logic over guarded type theory.

Selinger gave a superoperator model of a first-order quantum programming language and proved that it is fully definable and hence fully abstract. This paper proposes an extension of the superoperator model to higher-order programs based on modules over superoperators or, equivalently, enriched presheaves over the category of superoperators. The enriched presheaf category can be easily proved to be a model of intuitionistic linear logic with cofree exponential, from which one can cave out a model of classical linear logic by a kind of bi-orthogonality construction. Although the structures of an enriched presheaf category are usually rather complex, a morphism in the classical model can be expressed simply as a matrix of completely positive maps. The model inherits many desirable properties from the superoperator model. A conceptually interesting property is that our model has only a state whose “total probability” is bounded by 1, i.e. does not have a state where true and false each occur with probability 2/3. Another convenient property inherited from the superoperator model is a ωCPO-enrichment. Remarkably, our model has a sufficient structure to interpret arbitrary recursive types by the standard domain theoretic technique. We introduce *Quantum FPC*, a quantum λ-calculus with recursive types, and prove that our model is a fully abstract model of Quantum FPC.

Fueled by the success of Rust, many programming languages are adding substructural features to their type systems. The promise of tracking properties such as lifetimes and sharing is tremendous, not just for low-level memory management, but also for controlling higher-level resources and capabilities. But so are the difficulties in adapting successful techniques from Rust to higher-level languages, where they need to interact with other advanced features, especially various flavors of functional and type-level abstraction. What would it take to bring full-fidelity reasoning about lifetimes and sharing to mainstream languages? Reachability types are a recent proposal that has shown promise in scaling to higher-order but monomorphic settings, tracking aliasing and separation on top of a substrate inspired by separation logic. However, naive extensions on top of the prior reachability type system λ^{*} with type polymorphism and/or precise reachability polymorphism are unsound, making λ^{*} unsuitable for adoption in real languages. Combining reachability and type polymorphism that is precise, sound, and parametric remains an open challenge.

This paper presents a rethinking of the design of reachability tracking and proposes new polymorphic reachability type systems. We introduce a new freshness qualifier to indicate variables whose reachability sets may grow during evaluation steps. The new system tracks variables reachable in a single step and computes transitive closures only when necessary, thus preserving chains of reachability over known variables that can be refined using substitution. These ideas yield the simply-typed λ^{✦}-calculus with precise lightweight, i.e., quantifier-free, reachability polymorphism, and the *F*_{<:}^{✦}-calculus with bounded parametric polymorphism over types and reachability qualifiers, paving the way for making true tracking of lifetimes and sharing practical for mainstream languages. We prove type soundness and the preservation of separation property in Coq. We discuss various applications (e.g., safe capability programming), possible effect system extensions, and compare our system with Scala’s capture types.

We study nominal recursors from the literature on syntax with bindings and compare them with respect to expressiveness. The term "nominal" refers to the fact that these recursors operate on a syntax representation where the names of bound variables appear explicitly, as in nominal logic. We argue that nominal recursors can be viewed as epi-recursors, a concept that captures abstractly the distinction between the constructors on which one actually recurses, and other operators and properties that further underpin recursion. We develop an abstract framework for comparing epi-recursors and instantiate it to the existing nominal recursors, and also to several recursors obtained from them by cross-pollination. The resulted expressiveness hierarchies depend on how strictly we perform this comparison, and bring insight into the relative merits of different axiomatizations of syntax. We also apply our methodology to produce an expressiveness hierarchy of nominal corecursors, which are principles for defining functions targeting infinitary non-well-founded terms (which underlie lambda-calculus semantics concepts such as Böhm trees). Our results are validated with the Isabelle/HOL theorem prover.

We consider the problem of synthesizing programs with numerical constants that optimize a quantitative objective, such as accuracy, over a set of input-output examples. We propose a general framework for optimal synthesis of such programs in a given domain specific language (DSL), with provable optimality guarantees. Our framework enumerates programs in a general search graph, where nodes represent subsets of concrete programs. To improve scalability, it uses *A*^{*} search in conjunction with a search heuristic based on abstract interpretation; intuitively, this heuristic establishes upper bounds on the value of subtrees in the search graph, enabling the synthesizer to identify and prune subtrees that are provably suboptimal. In addition, we propose a natural strategy for constructing abstract transformers for monotonic semantics, which is a common property for components in DSLs for data classification. Finally, we implement our approach in the context of two such existing DSLs, demonstrating that our algorithm is more scalable than existing optimal synthesizers.

Parallel programs are frequently modeled as *dependency* or *cost* graphs, which can be used to detect various bugs, or simply to visualize the parallel structure of the code. However, such graphs reflect just one particular execution and are typically constructed in a *post-hoc* manner. *Graph types*, which were introduced recently to mitigate this problem, can be assigned statically to a program by a type system and compactly represent the family of all graphs that could result from the program.

Unfortunately, prior work is restricted in its treatment of *futures*, an increasingly common and especially dynamic form of parallelism. In short, each instance of a future must be statically paired with a vertex name. Previously, this led to the restriction that futures could not be placed in collections or be used to construct data structures. Doing so is not a niche exercise: such structures form the basis of numerous algorithms that use forms of pipelining to achieve performance not attainable without futures. All but the most limited of these examples are out of reach of prior graph type systems.

In this paper, we propose a graph type system that allows for almost arbitrary combinations of futures and recursive data types. We do so by indexing datatypes with a type-level *vertex structure*, a codata structure that supplies unique vertex names to the futures in a data structure. We prove the soundness of the system in a parallel core calculus annotated with vertex structures and associated operations. Although the calculus is annotated, this is merely for convenience in defining the type system. We prove that it is possible to annotate arbitrary recursive types with vertex structures, and show using a prototype inference engine that these annotations can be inferred from OCaml-like source code for several complex parallel algorithms.

The goal of programmatic Learning from Demonstration (LfD) is to learn a policy in a programming language that can be used to control a robot’s behavior from a set of user demonstrations. This paper presents a new programmatic LfD algorithm that targets long-horizon robot tasks which require synthesizing programs with complex control flow structures, including nested loops with multiple conditionals. Our proposed method first learns a program sketch that captures the target program’s control flow and then completes this sketch using an LLM-guided search procedure that incorporates a novel technique for proving unrealizability of programming-by-demonstration problems. We have implemented our approach in a new tool called PROLEX and present the results of a comprehensive experimental evaluation on 120 benchmarks involving complex tasks and environments. We show that, given a 120 second time limit, PROLEX can find a program consistent with the demonstrations in 80% of the cases. Furthermore, for 81% of the tasks for which a solution is returned, PROLEX is able to find the ground truth program with just one demonstration. In comparison, CVC5, a syntax-guided synthesis tool, is only able to solve 25% of the cases even when given the ground truth program sketch, and an LLM-based approach, GPT-Synth, is unable to solve any of the tasks due to the environment complexity.

Rig groupoids provide a semantic model of Π, a universal classical reversible programming language over finite types. We prove that extending rig groupoids with just two maps and three equations about them results in a model of quantum computing that is computationally universal and equationally sound and complete for a variety of gate sets. The first map corresponds to an 8th root of the identity morphism on the unit 1. The second map corresponds to a square root of the symmetry on 1+1. As square roots are generally not unique and can sometimes even be trivial, the maps are constrained to satisfy a nondegeneracy axiom, which we relate to the Euler decomposition of the Hadamard gate. The semantic construction is turned into an extension of Π, called √Π, that is a computationally universal quantum programming language equipped with an equational theory that is sound and complete with respect to the Clifford gate set, the standard gate set of Clifford+T restricted to ≤2 qubits, and the computationally universal Gaussian Clifford+T gate set.

A program is said to be well-bracketed if every called function must return before its caller can resume execution. This is often the case. Well-bracketedness has been captured semantically as a condition on strategies in fully abstract games models and multiple prior works have studied well-bracketedness by showing correctness/security properties of programs where such properties depend on the well-bracketed nature of control flow. The latter category of prior works have all used involved relational models with explicit state-transition systems capturing the relevant parts of the control flow of the program. In this paper we present the first Hoare-style *program logic* based on separation logic for reasoning about well-bracketedness and use it to show correctness of well-bracketed programs both directly and also through defining unary and binary logical relations models based on this program logic. All results presented in this paper are formalized on top of the Iris framework and mechanized in the Coq proof assistant.

Very relaxed concurrency memory models, like those of the Arm-A, RISC-V, and IBM Power hardware architectures, underpin much of computing but break a fundamental intuition about programs, namely that syntactic program order and the reads-from relation always both induce order in the execution. Instead, out-of-order execution is allowed except where prevented by certain pairwise dependencies, barriers, or other synchronisation. This means that there is no notion of the 'current' state of the program, making it challenging to design (and prove sound) syntax-directed, modular reasoning methods like Hoare logics, as usable resources cannot implicitly flow from one program point to the next.

We present AxSL, a separation logic for the relaxed memory model of Arm-A, that captures the fine-grained reasoning underpinning the low-overhead synchronisation mechanisms used by high-performance systems code. In particular, AxSL allows transferring arbitrary resources using relaxed reads and writes when they induce inter-thread ordering. We mechanise AxSL in the Iris separation logic framework, illustrate it on key examples, and prove it sound with respect to the axiomatic memory model of Arm-A. Our approach is largely generic in the axiomatic model and in the instruction-set semantics, offering a potential way forward for compositional reasoning for other similar models, and for the combination of production concurrency models and full-scale ISAs.

Verifying safety and liveness over array systems is a highly challenging problem. Array systems naturally capture parameterized systems such as distributed protocols with an unbounded number of processes. Such distributed protocols often exploit process IDs during their computation, resulting in array systems whose element values range over an infinite domain. In this paper, we develop a novel framework for proving safety and liveness over array systems. The crux of the framework is to overapproximate an array system as a string rewriting system (i.e. over a finite alphabet) by means of a new predicate abstraction that exploits the so-called indexed predicates. This allows us to tap into powerful verification methods for string rewriting systems that have been heavily developed in the last two decades or so (e.g. regular model checking). We demonstrate how our method yields simple, automatically verifiable proofs of safety and liveness properties for challenging examples, including Dijkstra's self-stabilizing protocol and the Chang-Roberts leader election protocol.

Passive documents and active programs now widely comingle. Document languages include Turing-complete programming elements, and programming languages include sophisticated document notations. However, there are no formal foundations that model these languages. This matters because the interaction between document and program can be subtle and error-prone. In this paper we describe several such problems, then taxonomize and formalize document languages as levels of a document calculus. We employ the calculus as a foundation for implementing complex features such as reactivity, as well as for proving theorems about the boundary of content and computation. We intend for the document calculus to provide a theoretical basis for new document languages, and to assist designers in cleaning up the unsavory corners of existing languages.

This paper considers direct encodings of generalized algebraic data types (GADTs) in a minimal suitable lambda-calculus. To this end, we develop an extension of System *F*_{ω} with recursive types and internalized type equalities with injective constant type constructors. We show how GADTs and associated pattern-matching constructs can be directly expressed in the calculus, thus showing that it may be treated as a highly idealized modern functional programming language. We prove that the internalized type equalities in conjunction with injectivity rules increase the expressive power of the calculus by establishing a non-macro-expressibility result in *F*_{ω}, and prove the system type-sound via a syntactic argument. Finally, we build two relational models of our calculus: a simple, unary model that illustrates a novel, two-stage interpretation technique, necessary to account for the equational constraints; and a more sophisticated, binary model that relaxes the construction to allow, for the first time, formal reasoning about data-abstraction in a calculus equipped with GADTs.

This paper presents a program analysis method that generates program summaries involving polynomial arithmetic. Our approach builds on prior techniques that use solvable polynomial maps for summarizing loops. These techniques are able to generate all polynomial invariants for a restricted class of programs, but cannot be applied to programs outside of this class---for instance, programs with nested loops, conditional branching, unstructured control flow, etc. There currently lacks approaches to apply these prior methods to the case of general programs. This paper bridges that gap. Instead of restricting the kinds of programs we can handle, our method abstracts every loop into a model that can be solved with prior techniques, bringing to bear prior work on solvable polynomial maps to general programs. While no method can generate all polynomial invariants for arbitrary programs, our method establishes its merit through a monotonicty result. We have implemented our techniques, and tested them on a suite of benchmarks from the literature. Our experiments indicate our techniques show promise on challenging verification tasks requiring non-linear reasoning.

Probabilistic couplings are the foundation for many probabilistic relational program logics and arise when relating random sampling statements across two programs. In relational program logics, this manifests as dedicated coupling rules that, e.g., say we may reason as if two sampling statements return the same value. However, this approach fundamentally requires aligning or "synchronizing" the sampling statements of the two programs which is not always possible.

In this paper, we develop Clutch, a higher-order probabilistic relational separation logic that addresses this issue by supporting asynchronous probabilistic couplings. We use Clutch to develop a logical step-indexed logical relation to reason about contextual refinement and equivalence of higher-order programs written in a rich language with a probabilistic choice operator, higher-order local state, and impredicative polymorphism. Finally, we demonstrate our approach on a number of case studies.

All the results that appear in the paper have been formalized in the Coq proof assistant using the Coquelicot library and the Iris separation logic framework.

Subtypes and quotient types are dual type abstractions. However, while subtypes are widely used both explicitly and implicitly, quotient types have not seen much practical use outside of proof assistants. A key difficulty to wider adoption of quotient types lies in the significant burden of proof-obligations that arises from their use. In this article, we address this issue by introducing a class of quotient types for which the proof-obligations are decidable by an SMT solver. We demonstrate this idea in practice by presenting *Quotient Haskell*, an extension of Liquid Haskell with support for quotient types.

Modifications to the data representation of an abstract data type (ADT) can require significant semantic refactoring of the code. Motivated by this observation, this paper presents a new method to automate semantic code refactoring tasks. Our method takes as input the original ADT implementation, a new data representation, and a so-called *relational representation invariant* (relating the old and new data representations), and automatically generates a new ADT implementation that is semantically equivalent to the original version. Our method is based on counterexample-guided inductive synthesis (CEGIS) but leverages three key ideas that allow it to handle real-world refactoring tasks. First, our approach reduces the underlying relational synthesis problem to a set of (simpler) programming-by-example problems, one for each method in the ADT. Second, it leverages symbolic reasoning techniques, based on logical abduction, to deduce code snippets that should occur in the refactored version. Finally, it utilizes a notion of *partial equivalence* to make inductive synthesis much more effective in this setting. We have implemented the proposed approach in a new tool called Revamp for automatically refactoring Java classes and evaluated it on 30 Java class mined from Github. Our evaluation shows that Revamp can correctly refactor the entire ADT in 97% of the cases and that it can successfully re-implement 144 out of the 146 methods that require modifications.

Differential cryptanalysis is a powerful algorithmic-level attack, playing a central role in evaluating the security of symmetric cryptographic primitives. In general, the resistance against differential cryptanalysis can be characterized by the maximum expected differential characteristic probability. In this paper, we present generic and extensible approaches based on mixed integer linear programming (MILP) to bound such probability. We design a high-level cryptography-specific language EasyBC tailored for block ciphers and provide various rigorous procedures, as differential denotational semantics, to automate the generation of MILP from block ciphers written in EasyBC. We implement an open-sourced tool that provides support for fully automated resistance evaluation of block ciphers against differential cryptanalysis. The tool is extensively evaluated on 23 real-life cryptographic primitives including all the 10 finalists of the NIST lightweight cryptography standardization process. The experiments confirm the expressivity of EasyBC and show that the tool can effectively prove the resistance against differential cryptanalysis for all block ciphers under consideration. EasyBC makes resistance evaluation against differential cryptanalysis easily accessible to cryptographers.

We show that computing the strongest polynomial invariant for single-path loops with polynomial assignments is at least as hard as the Skolem problem, a famous problem whose decidability has been open for almost a century. While the strongest polynomial invariants are computable for *affine loops*, for polynomial loops the problem remained wide open. As an intermediate result of independent interest, we prove that reachability for discrete polynomial dynamical systems is Skolem-hard as well. Furthermore, we generalize the notion of invariant ideals and introduce *moment invariant ideals* for probabilistic programs. With this tool, we further show that the strongest polynomial moment invariant is (i) uncomputable, for probabilistic loops with branching statements, and (ii) Skolem-hard to compute for polynomial probabilistic loops without branching statements. Finally, we identify a class of probabilistic loops for which the strongest polynomial moment invariant is computable and provide an algorithm for it.

*Trace theory* (formulated by Mazurkiewicz in 1987) is a principled framework for defining equivalence relations for concurrent program runs based on a commutativity relation over the set of atomic steps taken by individual program threads. Its simplicity, elegance, and algorithmic efficiency makes it useful in many different contexts including program verification and testing. It is well-understood that the larger the equivalence classes are, the more benefits they would bring to the algorithms and applications that use them. In this paper, we study relaxations of trace equivalence with the goal of maintaining its algorithmic advantages.

We first prove that the largest appropriate relaxation of trace equivalence, an equivalence relation that preserves the order of steps taken by each thread *and* what write operation each read operation observes, does not yield efficient algorithms. Specifically, we prove a *linear space lower bound* for the problem of checking, in a streaming setting, if two arbitrary steps of a concurrent program run are *causally concurrent* (i.e. they can be reordered in an equivalent run) or *causally ordered *(i.e. they always appear in the same order in all equivalent runs). The same problem can be decided in *constant space* for trace equivalence. Next, we propose a new commutativity-based notion of equivalence called *grain equivalence* that is strictly more relaxed than trace equivalence, and yet yields a constant space algorithm for the same problem. This notion of equivalence uses commutativity of *grains*, which are sequences of atomic steps, in addition to the standard commutativity from trace theory. We study the two distinct cases when the grains are contiguous subwords of the input program run and when they are not, formulate the precise definition of causal concurrency in each case, and show that they can be decided in *constant space*, despite being strict relaxations of the notion of causal concurrency based on trace equivalence.

Achieving speed and accuracy for math library functions like exp, sin, and log is difficult. This is because low-level implementation languages like C do not help math library developers catch mathematical errors, build implementations incrementally, or separate high-level and low-level decision making. This ultimately puts development of such functions out of reach for all but the most experienced experts. To address this, we introduce MegaLibm, a domain-specific language for implementing, testing, and tuning math library implementations. MegaLibm is safe, modular, and tunable. Implementations in MegaLibm can automatically detect mathematical mistakes like sign flips via semantic wellformedness checks, and components like range reductions can be implemented in a modular, composable way, simplifying implementations. Once the high-level algorithm is done, tuning parameters like working precisions and evaluation schemes can be adjusted through orthogonal tuning parameters to achieve the desired speed and accuracy. MegaLibm also enables math library developers to work interactively, compiling, testing, and tuning their implementations and invoking tools like Sollya and type-directed synthesis to complete components and synthesize entire implementations. MegaLibm can express 8 state-of-the-art math library implementations with comparable speed and accuracy to the original C code, and can synthesize 5 variations and 3 from-scratch implementations with minimal guidance.

First-order logic, and quantifiers in particular, are widely used in deductive verification of programs and systems. Quantifiers are essential for describing systems with unbounded domains, but prove difficult for automated solvers. Significant effort has been dedicated to finding quantifier instantiations that establish unsatisfiability of quantified formulas, thus ensuring validity of a system’s verification conditions. However, in many cases the formulas are satisfiable—this is often the case in intermediate steps of the verification process, e.g., when an invariant is not yet inductive. For such cases, existing tools are limited to finding finite models as counterexamples. Yet, some quantified formulas are satisfiable but only have infinite models, which current solvers are unable to find. Such infinite counter-models are especially typical when first-order logic is used to approximate the natural numbers, the integers, or other inductive definitions such as linked lists, which is common in deductive verification. The inability of solvers to find infinite models makes them diverge in these cases, providing little feedback to the user as they try to make progress in their verification attempts. In this paper, we tackle the problem of finding such infinite models, specifically, finite representations thereof that can be presented to the user of a deductive verification tool. These models give insight into the verification failure, and allow the user to identify and fix bugs in the modeling of the system and its properties. Our approach consists of three parts. First, we introduce symbolic structures as a way to represent certain infinite models, and show they admit an efficient model checking procedure. Second, we describe an effective model finding procedure that symbolically explores a given (possibly infinite) family of symbolic structures in search of an infinite model for a given formula. Finally, we identify a new decidable fragment of first-order logic that extends and subsumes the many-sorted variant of EPR, where satisfiable formulas always have a model representable by a symbolic structure within a known family, making our model finding procedure a decision procedure for that fragment. We evaluate our approach on examples from the domains of distributed consensus protocols and of heap-manipulating programs (specifically, linked lists). Our implementation quickly finds infinite counter-models that demonstrate the source of verification failures in a simple way, while state-of-the-art SMT solvers and theorem provers such as Z3, cvc5, and Vampire diverge or return “unknown”.

We introduce the class of P-finite automata. These are a generalisation of weighted automata, in which the weights of transitions can depend polynomially on the length of the input word. P-finite automata can also be viewed as simple tail-recursive programs in which the arguments of recursive calls can non-linearly refer to a variable that counts the number of recursive calls. The nomenclature is motivated by the fact that over a unary alphabet P-finite automata compute so-called P-finite sequences, that is, sequences that satisfy a linear recurrence with polynomial coefficients. Our main result shows that P-finite automata can be learned in polynomial time in Angluin's MAT exact learning model. This generalises the classical results that deterministic finite automata and weighted automata over a field are respectively polynomial-time learnable in the MAT model.

Distributed protocols have long been formulated in terms of their safety and liveness properties. Much recent work has focused on automatically verifying the safety properties of distributed protocols, but doing so for liveness properties has remained a challenging, unsolved problem. We present LVR, the first framework that can mostly automatically verify liveness properties for distributed protocols. Our key insight is that most liveness properties for distributed protocols can be reduced to a set of safety properties with the help of ranking functions. Such ranking functions for practical distributed protocols have certain properties that make them straightforward to synthesize, contrary to conventional wisdom. We prove that verifying a liveness property can then be reduced to a simpler problem of verifying a set of safety properties, namely that the ranking function is strictly decreasing and nonnegative for any protocol state transition, and there is no deadlock. LVR automatically synthesizes ranking functions by formulating a parameterized function of integer protocol variables, statically analyzing the lower and upper bounds of the variables as well as how much they can change on each state transition, then feeding the constraints to an SMT solver to determine the coefficients of the ranking function. It then uses an off-the-shelf verification tool to find inductive invariants to verify safety properties for both ranking functions and deadlock freedom. We show that LVR can mostly automatically verify the liveness properties of several distributed protocols, including various versions of Paxos, with limited user guidance.

We show how the basic Combinatory Homomorphic Automatic Differentiation (CHAD) algorithm can be optimised, using well-known methods, to yield a simple, composable, and generally applicable reverse-mode automatic differentiation (AD) technique that has the correct computational complexity that we would expect of reverse-mode AD. Specifically, we show that the standard optimisations of sparse vectors and state-passing style code (as well as defunctionalisation/closure conversion, for higher-order languages) give us a purely functional algorithm that is most of the way to the correct complexity, with (functional) mutable updates taking care of the final log-factors. We provide an Agda formalisation of our complexity proof. Finally, we discuss how the techniques apply to differentiating parallel functional array programs: the key observations are 1) that all required mutability is (commutative, associative) accumulation, which lets us preserve task-parallelism and 2) that we can write down data-parallel derivatives for most data-parallel array primitives.

We study the recursion-theoretic complexity of Positive Almost-Sure Termination (*PAST*) in an imperative programming language with rational variables, bounded nondeterministic choice, and discrete probabilistic choice. A program terminates positive almost-surely if, for every scheduler, the program terminates almost-surely and the expected runtime to termination is finite. We show that *PAST* for our language is complete for the (lightface) co-analytic sets (Π_{1}^{1}-complete). This is in contrast to the related notions of Almost-Sure Termination (*AST*) and Bounded Termination (*BAST*), both of which are arithmetical (Π_{2}^{0}- and Σ_{2}^{0}-complete respectively).

Our upper bound implies an effective procedure to reduce reasoning about probabilistic termination to non-probabilistic fair termination in a model with bounded nondeterminism, and to simple program termination in models with unbounded nondeterminism. Our lower bound shows the opposite: for every program with unbounded nondeterministic choice, there is an effectively computable probabilistic program with bounded choice such that the original program is terminating if, and only if, the transformed program is *PAST*.

We show that every program has an effectively computable normal form, in which each probabilistic choice either continues or terminates execution immediately, each with probability 1/2. For normal form programs, we provide a sound and complete proof rule for *PAST*. Our proof rule uses transfinite ordinals. We show that reasoning about *PAST* requires transfinite ordinals up to ω_{1}^{CK}; thus, existing techniques for probabilistic termination based on ranking supermartingales that map program states to reals do not suffice to reason about *PAST*.

On any modern computer architecture today, parallelism comes with a modest cost, born from the creation and management of threads or tasks. Today, programmers battle this cost by manually optimizing/tuning their codes to minimize the cost of parallelism without harming its benefit, performance. This is a difficult battle: programmers must reason about architectural constant factors hidden behind layers of software abstractions, including thread schedulers and memory managers, and their impact on performance, also at scale. In languages that support higher-order functions, the battle hardens: higher order functions can make it difficult, if not impossible, to reason about the cost and benefits of parallelism.

Motivated by these challenges and the numerous advantages of high-level languages, we believe that it has become essential to manage parallelism automatically so as to minimize its cost and maximize its benefit. This is a challenging problem, even when considered on a case-by-case, application-specific basis. But if a solution were possible, then it could combine the many correctness benefits of high-level languages with performance by managing parallelism without the programmer effort needed to ensure performance. This paper proposes techniques for such automatic management of parallelism by combining static (compilation) and run-time techniques. Specifically, we consider the Parallel ML language with task parallelism, and describe a compiler pipeline that embeds "potential parallelism" directly into the call-stack and avoids the cost of task creation by default. We then pair this compilation pipeline with a run-time system that dynamically converts potential parallelism into actual parallel tasks. Together, the compiler and run-time system guarantee that the cost of parallelism remains low without losing its benefit. We prove that our techniques have no asymptotic impact on the work and span of parallel programs and thus preserve their asymptotic properties. We implement the proposed techniques by extending the MPL compiler for Parallel ML and show that it can eliminate the burden of manual optimization while delivering good practical performance.

We study the proof theory and algorithms for orthologic, a logical system based on ortholattices, which have shown practical relevance in simplification and normalization of verification conditions. Ortholattices weaken Boolean algebras while having polynomial-time equivalence checking that is sound with respect to Boolean algebra semantics. We generalize ortholattice reasoning and obtain an algorithm for proving a larger class of classically valid formulas.

As the key result, we analyze a proof system for orthologic augmented with axioms. An important feature of the system is that it limits the number of formulas in a sequent to at most two, which makes the extension with axioms non-trivial. We show a generalized form of cut elimination for this system, which implies a sub-formula property. From there we derive a cubic-time algorithm for provability from axioms, or equivalently, for validity in finitely presented ortholattices. We further show that propositional resolution of width 5 proves all formulas provable in orthologic with axioms. We show that orthologic system subsumes resolution of width 2 and arbitrarily wide unit resolution and is complete for reasoning about generalizations of propositional Horn clauses.

Moving beyond ground axioms, we introduce effectively propositional orthologic (by analogy with EPR for classical logic), presenting its semantics as well as a sound and complete proof system. Our proof system implies the decidability of effectively propositional orthologic, as well as its fixed-parameter tractability for a bounded maximal number of variables in each axiom. As a special case, we obtain a generalization of Datalog with negation and disjunction.

We present a type system that combines, in a controlled way, first-order polymorphism with intersection types, union types, and subtyping, and prove its safety. We then define a type reconstruction algorithm that is sound and terminating. This yields a system in which unannotated functions are given polymorphic types (thanks to Hindley-Milner) that can express the overloaded behavior of the functions they type (thanks to the intersection introduction rule) and that are deduced by applying advanced techniques of type narrowing (thanks to the union elimination rule). This makes the system a prime candidate to type dynamic languages.

Bidirectional live programming systems (BLP) enable developers to modify a program by directly manipulating the program output, so that the updated program can produce the manipulated output. One state-of-the-art approach to BLP systems is operation-based, which captures the developer's intention of program modifications by taking how the developer manipulates the output into account. The program modifications are usually hard coded for each direct manipulation in these BLP systems, which are difficult to extend. Moreover, to reflect the manipulations to the source program, these BLP systems trace the modified output to appropriate code fragments and perform corresponding code transformations. Accordingly, they require direct manipulation users be aware of the source code and how it is changed, making "direct" manipulation (on output) be "indirect".

In this paper, we resolve this problem by presenting a novel operation-based framework for bidirectional live programming, which can automatically fuse direct manipulations into the source code, thus supporting code-insensitive direct manipulations. Firstly, we design a simple but expressive delta language DM capable of expressing common direct manipulations for output values. Secondly, we present a fusion algorithm that propagates direct manipulations into the source functional programs and applies them to the constants whenever possible; otherwise, the algorithm embeds manipulations into the "proper positions" of programs. We prove the correctness of the fusion algorithm that the updated program executes to get the manipulated output. To demonstrate the expressiveness of DM and the effectiveness of our fusion algorithm, we have implemented FuseDM, a prototype SVG editor that supports GUI-based operations for direct manipulation, and successfully designed 14 benchmark examples starting from blank code using FuseDM.

Dyck reachability is a principled, graph-based formulation of a plethora of static analyses. Bidirected graphs are used for capturing dataflow through mutable heap data, and are usual formalisms of demand-driven points-to and alias analyses. The best (offline) algorithm runs in *O*(*m*+*n*· α(*n*)) time, where *n* is the number of nodes and *m* is the number of edges in the flow graph, which becomes *O*(*n*^{2}) in the worst case.

In the everyday practice of program analysis, the analyzed code is subject to continuous change, with source code being added and removed. On-the-fly static analysis under such continuous updates gives rise to dynamic Dyck reachability, where reachability queries run on a dynamically changing graph, following program updates. Naturally, executing the offline algorithm in this online setting is inadequate, as the time required to process a single update is prohibitively large.

In this work we develop a novel dynamic algorithm for bidirected Dyck reachability that has *O*(*n*· α(*n*)) worst-case performance per update, thus beating the *O*(*n*^{2}) bound, and is also optimal in certain settings. We also implement our algorithm and evaluate its performance on on-the-fly data-dependence and alias analyses, and compare it with two best known alternatives, namely (i) the optimal offline algorithm, and (ii) a fully dynamic Datalog solver. Our experiments show that our dynamic algorithm is consistently, and by far, the top performing algorithm, exhibiting speedups in the order of 1000X. The running time of each update is almost always unnoticeable to the human eye, making it ideal for the on-the-fly analysis setting.

Past years have seen the development of a few proposals for quantum extensions of process calculi. The rationale is clear: with the development of quantum communication protocols, there is a need to abstract and focus on the basic features of quantum concurrent systems, like CCS and CSP have done for their classical counterparts. So far, though, no accepted standard has emerged, neither for the syntax nor for the behavioural semantics. Indeed, the various proposals do not agree on what should be the observational properties of quantum values, and as a matter of fact, the soundness of such properties has never been validated against the prescriptions of quantum theory.

To this aim, we introduce a new calculus, Linear Quantum CCS (lqCCS), and investigate the features of behavioural equivalences based on barbs and contexts. Our calculus can be thought of as an asynchronous, linear version of qCCS, which is in turn based on value-passing CCS. The combination of linearity and asynchronous communication fits well with the properties of quantum systems (e.g. the no-cloning theorem), since it ensures that each qubit is sent exactly once, precisely specifying which qubits of a process interact with the context.

We exploit contexts to examine how bisimilarities relate to quantum theory. We show that the observational power of general contexts is incompatible with quantum theory: roughly, they can perform non-deterministic moves depending on quantum values without measuring (hence perturbing) them.

Therefore, we refine the operational semantics in order to prevent contexts from performing unfeasible non-deterministic choices. This induces a coarser bisimilarity that better fits the quantum setting: (i) it lifts the indistinguishability of quantum states to the distributions of processes and, despite the additional constraints, (ii) it preserves the expressiveness of non-deterministic choices based on classical information. To the best of our knowledge, our semantics is the first one that satisfies the two properties above.

In type systems with dependency tracking, programmers can assign an ordered set of levels to computations and prevent information flow from high-level computations to the low-level ones. The key notion in such systems is *indistinguishability*: a definition of program equivalence that takes into account the parts of the program that an observer may depend on. In this paper, we investigate the use of dependency tracking in the context of dependently-typed languages. We present the Dependent Calculus of Indistinguishability (DCOI), a system that adopts indistinguishability as the definition of equality used by the type checker. DCOI also internalizes that relation as an observer-indexed propositional equality type, so that programmers may reason about indistinguishability within the language. Our design generalizes and extends prior systems that combine dependency tracking with dependent types and is the first to support conversion and propositional equality at arbitrary observer levels. We have proven type soundness and noninterference theorems for DCOI and have developed a prototype implementation of its type checker.

We consider injective first-order interpretations that input and output trees of bounded height. The corresponding functions have polynomial output size, since a first-order interpretation can use a *k*-tuple of input nodes to represent a single output node. We prove that the equivalence problem for such functions is decidable, i.e. given two such interpretations, one can decide whether, for every input tree, the two output trees are isomorphic.

We also give a calculus of typed functions and combinators which derives exactly injective first-order interpretations for unordered trees of bounded height. The calculus is based on a type system, where the type constructors are products, coproducts and a monad of multisets. Thanks to our results about tree-to-tree interpretations, the equivalence problem is decidable for this calculus.

As an application, we show that the equivalence problem is decidable for first-order interpretations between classes of graphs that have bounded tree-depth. In all cases studied in this paper, first-order logic and MSO have the same expressive power, and hence all results apply also to MSO interpretations.

Cyclic proof systems, in which induction is managed implicitly, are a promising approach to automatic verification. The soundness of cyclic proof graphs is ensured by checking them against a trace-based Infinite Descent property. Although the problem of checking Infinite Descent is known to be PSPACE-complete, this leaves much room for variation in practice. Indeed, a number of different approaches are employed across the various cyclic proof systems described in the literature. In this paper, we study criteria for Infinite Descent in an abstract, logic-independent setting. We look at criteria based on Büchi automata encodings and relational abstractions, and determine their parameterized time complexities in terms of natural dimensions of cyclic proofs: the numbers of vertices of the proof-tree graphs, and the vertex width—an upper bound on the number of components (e.g., formulas) of a sequent that can be simultaneously tracked for descent. We identify novel algorithms that improve upon the parameterised complexity of the existing algorithms. We implement the studied criteria and compare their performance on various benchmarks.

We introduce a linear concurrent separation logic, called LinearActris, designed to guarantee deadlock and leak freedom for message-passing concurrency. LinearActris combines the strengths of session types and concurrent separation logic, allowing for the verification of challenging higher-order programs with mutable state through dependent protocols. The key challenge is to prove the adequacy theorem of LinearActris, which says that the logic indeed gives deadlock and leak freedom “for free” from linearity. We prove this theorem by defining a step-indexed model of separation logic, based on *connectivity graphs*. To demonstrate the expressive power of LinearActris, we prove soundness of a higher-order (GV-style) session type system using the technique of logical relations. All our results and examples have been mechanized in Coq.

Type inference in the presence of first-class or “impredicative” second-order polymorphism à la System F has been an active research area for several decades, with original works dating back to the end of the 80s. Yet, until now many basic problems remain open, such as how to type check expressions like (𝜆𝑥. (𝑥 123, 𝑥 True)) id reliably. We show that a type inference approach based on multi-bounded polymorphism, a form of implicit polymorphic subtyping with multiple lower and upper bounds, can help us resolve most of these problems in a uniquely simple and regular way. We define F_{{≤}}, a declarative type system derived from the existing theory of implicit coercions by Cretin and Rémy (LICS 2014), and we introduce SuperF, a novel algorithm to infer polymorphic multi-bounded F_{{≤}} types while checking user type annotations written in the syntax of System F. We use a recursion-avoiding heuristic to guarantee termination of type inference at the cost of rejecting some valid programs, which thankfully rarely triggers in practice. We show that SuperF is vastly more powerful than all first-class-polymorphic type inference systems proposed so far, significantly advancing the state of the art in type inference for general-purpose programming languages.

JSON Schema is the de-facto standard schema language for JSON data. The language went through many minor revisions, but the most recent versions of the language, starting from Draft 2019-09, added two novel features, dynamic references and annotation-dependent validation, that change the evaluation model. Modern JSON Schema is the name used to indicate all versions from Draft 2019-09, which are characterized by these new features, while Classical JSON Schema is used to indicate the previous versions.

These new “modern” features make the schema language quite difficult to understand and have generated many discussions about the correct interpretation of their official specifications; for this reason, we undertook the task of their formalization. During this process, we also analyzed the complexity of data validation in Modern JSON Schema, with the idea of confirming the polynomial complexity of Classical JSON Schema validation, and we were surprised to discover a completely different truth: data validation, which is expected to be an extremely efficient process, acquires, with Modern JSON Schema features, a PSPACE complexity.

In this paper, we give the first formal description of Modern JSON Schema, which we have discussed with the community of JSON Schema tool developers, and which we consider a central contribution of this work. We then prove that its data validation problem is PSPACE-complete. We prove that the origin of the problem lies in the Draft 2020-12 version of dynamic references, and not in annotation-dependent validation. We study the schema and data complexities, showing that the problem is PSPACE-complete with respect to the schema size even with a fixed instance but is in P when the schema is fixed and only the instance size is allowed to vary. Finally, we run experiments that show that there are families of schemas where the difference in asymptotic complexity between dynamic and static references is extremely visible, even with small schemas.

A thunk is a mutable data structure that offers a simple memoization service:
it stores either a suspended computation or the result of this computation.
Okasaki [1999] presents many data structures that exploit thunks to achieve
good amortized time complexity. He analyzes their complexity by associating a
debit with every thunk. A debit can be paid off in several increments; a thunk
whose debit has been fully paid off can be forced. Quite strikingly, a debit is
associated also with future thunks, which do not yet exist in memory. Some of
the debit of a faraway future thunk can be transferred to a nearer future
thunk. We present a complete machine-checked reconstruction of Okasaki's
reasoning rules in Iris^{$}, a rich separation logic with time credits. We
demonstrate the applicability of the rules by verifying a few operations on
streams as well as several of Okasaki's data structures, namely the
physicist's queue, implicit queues, and the banker's queue.

We propose a new language feature for ML-family languages, the ability to selectively *unbox* certain data constructors, so that their runtime representation gets compiled away to just the identity on their argument. Unboxing must be statically rejected when it could introduce confusion, that is, distinct values with the same representation.

We discuss the use-case of big numbers, where unboxing allows to write code that is both efficient and safe, replacing either a safe but slow version or a fast but unsafe version. We explain the static analysis necessary to reject incorrect unboxing requests. We present our prototype implementation of this feature for the OCaml programming language, discuss several design choices and the interaction with advanced features such as Guarded Algebraic Datatypes.

Our static analysis requires expanding type definitions in type expressions, which is not necessarily normalizing in presence of recursive type definitions. In other words, we must decide normalization of terms in the first-order λ-calculus with recursion. We provide an algorithm to detect non-termination on-the-fly during reduction, with proofs of correctness and completeness. Our algorithm turns out to be closely related to the normalization strategy for macro expansion in the cpp preprocessor.

We propose a new synthesis algorithm that can efficiently search programs with local variables (e.g., those introduced by lambdas). Prior bottom-up synthesis algorithms are not able to evaluate programs with free local variables, and therefore cannot effectively reduce the search space of such programs (e.g., using standard observational equivalence reduction techniques), making synthesis slow. Our algorithm can reduce the space of programs with local variables. The key idea, dubbed lifted interpretation, is to lift up the program interpretation process, from evaluating one program at a time to simultaneously evaluating all programs from a grammar. Lifted interpretation provides a mechanism to systematically enumerate all binding contexts for local variables, thereby enabling us to evaluate and reduce the space of programs with local variables. Our ideas are instantiated in the domain of web automation. The resulting tool, Arborist, can automate a significantly broader range of challenging tasks more efficiently than state-of-the-art techniques including WebRobot and Helena.

Recent work has proposed a memory property for parallel programs, called disentanglement, and showed that it is pervasive in a variety of programs, written in different languages, ranging from C/C++ to Parallel ML, and showed that it can be exploited to improve the performance of parallel functional programs. All existing work on disentanglement, however, considers the "fork/join" model for parallelism and does not apply to "futures", the more powerful approach to parallelism. This is not surprising: fork/join parallel programs exhibit a reasonably strict dependency structure (e.g., series-parallel DAGs), which disentanglement exploits. In contrast, with futures, parallel computations become first-class values of the language, and thus can be created, and passed between functions calls or stored in memory, just like other ordinary values, resulting in complex dependency structures, especially in the presence of mutable state. For example, parallel programs with futures can have deadlocks, which is impossible with fork-join parallelism.

In this paper, we are interested in the theoretical question of whether disentanglement may be extended beyond fork/join parallelism, and specifically to futures. We consider a functional language with futures, Input/Output (I/O), and mutable state (references) and show that a broad range of programs written in this language are disentangled. We start by formalizing disentanglement for futures and proving that purely functional programs written in this language are disentangled. We then generalize this result in three directions. First, we consider state (effects) and prove that stateful programs are disentangled if they are race free. Second, we show that race freedom is sufficient but not a necessary condition and non-deterministic programs, e.g. those that use atomic read-modify-operations and some non-deterministic combinators, may also be disentangled. Third, we prove that disentangled task-parallel programs written with futures are free of deadlocks, which arise due to interactions between state and the rich dependencies that can be expressed with futures. Taken together, these results show that disentanglement generalizes to parallel programs with futures and, thus, the benefits of disentanglement may go well beyond fork-join parallelism.

We propose a novel approach to soundly combining linear types with multi-shot effect handlers. circear type systems statically ensure that resources such as file handles and communication channels are used exactly once. Effect handlers provide a rich modular programming abstraction for implementing features ranging from exceptions to concurrency to backtracking. Whereas conventional linear type systems bake in the assumption that continuations are invoked exactly once, effect handlers allow continuations to be discarded (e.g. for exceptions) or invoked more than once (e.g. for backtracking). This mismatch leads to soundness bugs in existing systems such as the programming language Links, which combines linearity (for session types) with effect handlers. We introduce control-flow linearity as a means to ensure that continuations are used in accordance with the linearity of any resources they capture, ruling out such soundness bugs.

We formalise the notion of control-flow linearity in a System F-style core calculus *F*_{eff}^{∘} equipped with linear types, an effect type system, and effect handlers. We define a linearity-aware semantics in order to formally prove that *F*_{eff}^{∘} preserves the integrity of linear values in the sense that no linear value is discarded or duplicated. In order to show that control-flow linearity can be made practical, we adapt circks based on the design of *F*_{eff}^{∘}, in doing so fixing a long-standing soundness bug.

Finally, to better expose the potential of control-flow linearity, we define an ML-style core calculus *Q*_{eff}^{∘}, based on qualified types, which requires no programmer provided annotations, and instead relies entirely on type inference to infer control-flow linearity. Both linearity and effects are captured by qualified types. *Q*_{eff}^{∘} overcomes a number of practical limitations of *F*_{eff}^{∘}, supporting abstraction over linearity, linearity dependencies between type variables, and a much more fine-grained notion of control-flow linearity.

It is widely known that the precision of a program analyzer is closely related to intensional program properties, namely, properties concerning how the program is written. This explains, for instance, the interest in code obfuscation techniques, namely, tools explicitly designed to degrade the results of program analysis by operating syntactic program transformations. Less is known about a possible relation between what the program extensionally computes, namely, its input-output relation, and the precision of a program analyzer. In this paper we explore this potential connection in an effort to isolate program fragments that can be precisely analyzed by abstract interpretation, namely, programs for which there exists a complete abstract interpretation. In the field of static inference of numeric invariants, this happens for programs, or parts of programs, that manifest a monotone (either non-decreasing or non-increasing) behavior. We first formalize the notion of program monotonicity with respect to a given input and a set of numerical variables of interest. A sound proof system is then introduced with judgments specifying whether a program is monotone relatively to a set of variables and a set of inputs. The interest in monotonicity is justified because we prove that the family of monotone programs admits a complete abstract interpretation over a specific class of non-trivial numerical abstractions and inputs. This class includes all non-relational abstract domains that refine interval analysis (i.e., at least as precise as the intervals abstraction) and that satisfy a topological convexity hypothesis.

This paper presents a novel formalisation of algebraic effects with equations in Cubical Agda. Unlike previous work in the literature that employed *setoids* to deal with equations, the library presented here uses quotient types to faithfully encode the type of terms quotiented by laws. Apart from tools for equational reasoning, the library also provides an *effect-generic Hoare logic* for algebraic effects, which enables reasoning about effectful programs in terms of their pre- and post- conditions. A particularly novel aspect is that equational reasoning and Hoare-style reasoning are related by an elimination principle of Hoare logic.

Two-player graph games have found numerous applications, most notably in the synthesis of reactive systems from temporal specifications, but also in verification. The relevance of infinite-state systems in these areas has lead to significant attention towards developing techniques for solving infinite-state games.

We propose novel symbolic semi-algorithms for solving infinite-state games with temporal winning conditions. The novelty of our approach lies in the introduction of an acceleration technique that enhances fixpoint-based game-solving methods and helps to avoid divergence. Classical fixpoint-based algorithms, when applied to infinite-state games, are bound to diverge in many cases, since they iteratively compute the set of states from which one player has a winning strategy. Our proposed approach can lead to convergence in cases where existing algorithms require an infinite number of iterations. This is achieved by acceleration: computing an infinite set of states from which a simpler sub-strategy can be iterated an unbounded number of times in order to win the game. Ours is the first method for solving infinite-state games to employ acceleration. Thanks to this, it is able to outperform state-of-the-art techniques on a range of benchmarks, as evidenced by our evaluation of a prototype implementation.

Rewriting is a principled term transformation technique with uses across theorem
proving and compilation.
In theorem proving, each rewrite is a proof step; in compilation, rewrites optimize a program term.
While developing rewrite sequences manually is possible, this process does not
scale to larger rewrite sequences. Automated rewriting
techniques, like greedy simplification or equality saturation, work well
without requiring human input.
Yet, they do not scale to large search spaces, limiting the complexity of tasks where automated rewriting is effective,
and meaning that just a small increase in term size or rewrite length may result in failure.
This paper proposes a semi-automatic rewriting technique as a means to scale rewriting by allowing
human insight at key decision points.
Specifically, we propose *guided equality saturation* that embraces human guidance when fully automated equality saturation does not scale.
The rewriting is split into two simpler automatic equality saturation steps: from the original term to a human-provided intermediate *guide*, and from the guide to the target. Complex rewriting tasks may require multiple guides, resulting in a sequence of equality saturation steps.
A guide can be a complete term, or a *sketch* containing undefined elements that are instantiated by the equality saturation search.
Such sketches may be far more concise than complete terms.
We demonstrate the generality and effectiveness of guided equality saturation using two case studies.
First, we integrate guided equality saturation in the Lean 4 proof assistant.
Proofs are written in the style of textbook proof sketches, as a series of calculations omitting details and skipping steps.
These proofs conclude in less than a second instead of minutes when compared to unguided equality saturation, and can
find complex proofs that previously had to be done manually.
Second, in the compiler of the Rise array language, where unguided equality saturation fails to perform optimizations within an hour and using 60 GB of memory, guided equality saturation performs the same optimizations with at most 3 guides, within seconds using less than 1 GB memory.

Quantum programs are notoriously difficult to code and verify due to unintuitive quantum knowledge associated with quantum programming. Automated tools relieving the tedium and errors associated with low-level quantum details would hence be highly desirable. In this paper, we initiate the study of program synthesis for quantum unitary programs that recursively define a family of unitary circuits for different input sizes, which are widely used in existing quantum programming languages. Specifically, we present QSynth, the first quantum program synthesis framework, including a new inductive quantum programming language, its specification, a sound logic for reasoning, and an encoding of the reasoning procedure into SMT instances. By leveraging existing SMT solvers, QSynth successfully synthesizes ten quantum unitary programs including quantum adder circuits, quantum eigenvalue inversion circuits and Quantum Fourier Transformation, which can be readily transpiled to executable programs on major quantum platforms, e.g., Q#, IBM Qiskit, and AWS Braket.

Intermediate verification languages like Why3 and Boogie have made it much easier to build program verifiers, transforming the process into a logic compilation problem rather than a proof automation one. Why3 in particular implements a rich logic for program specification with polymorphism, algebraic data types, recursive functions and predicates, and inductive predicates; it translates this logic to over a dozen solvers and proof assistants. Accordingly, it serves as a backend for many tools, including Frama-C, EasyCrypt, and GNATProve for Ada SPARK. But how can we be sure that these tools are correct? The alternate foundational approach, taken by tools like VST and CakeML, provides strong guarantees by implementing the entire toolchain in a proof assistant, but these tools are harder to build and cannot directly take advantage of SMT solver automation. As a first step toward enabling automated tools with similar foundational guarantees, we give a formal semantics in Coq for the logic fragment of Why3. We show that our semantics are useful by giving a correct-by-construction natural deduction proof system for this logic, using this proof system to verify parts of Why3's standard library, and proving sound two of Why3's transformations used to convert terms and formulas into the simpler logics supported by the backend solvers.

We study semantic models of probabilistic programming languages over graphs, and establish a connection to graphons from graph theory and combinatorics. We show that every well-behaved equational theory for our graph probabilistic programming language corresponds to a graphon, and conversely, every graphon arises in this way.

We provide three constructions for showing that every graphon arises from an equational theory. The first is an abstract construction, using Markov categories and monoidal indeterminates. The second and third are more concrete. The second is in terms of traditional measure theoretic probability, which covers 'black-and-white' graphons. The third is in terms of probability monads on the nominal sets of Gabbay and Pitts. Specifically, we use a variation of nominal sets induced by the theory of graphs, which covers Erdős-Rényi graphons. In this way, we build new models of graph probabilistic programming from graphons.

We introduce a novel approach for testing static typing implementations based on the concept of API-driven program synthesis. The idea is to synthesize type-intensive but small and well-typed programs by leveraging and combining application programming interfaces (APIs) derived from existing software libraries. Our primary insight is backed up by real-world evidence: a significant number of compiler typing bugs are caused by small test cases that employ APIs from the standard library of the language under test. This is attributed to the inherent complexity of the majority of these APIs, which often exercise a wide range of sophisticated type-related features. The main contribution of our approach is the ability to produce small client programs with increased feature coverage, without bearing the burden of generating the corresponding well-formed API definitions from scratch. To validate diverse aspects of static typing procedures (i.e., soundness, precision of type inference), we also enrich our API-driven approach with fault-injection and semantics-preserving modes, along with their corresponding test oracles.

We evaluate our implemented tool, Thalia on testing the static typing implementations of the compilers for three popular languages, namely, Scala, Kotlin, and Groovy. Thalia has uncovered 84 typing bugs (77 confirmed and 22 fixed), most of which are triggered by test cases featuring APIs that rely on parametric polymorphism, overloading, and higher-order functions. Our comparison with state-of-the-art shows that Thalia yields test programs with distinct characteristics, offering additional and complementary benefits.

Computing the posterior distribution of a probabilistic program is a hard task for which no one-fit-for-all solution exists. We propose Gaussian Semantics, which approximates the exact probabilistic semantics of a bounded program by means of Gaussian mixtures. It is parametrized by a map that associates each program location with the moment order to be matched in the approximation. We provide two main contributions. The first is a universal approximation theorem stating that, under mild conditions, Gaussian Semantics can approximate the exact semantics arbitrarily closely. The second is an approximation that matches up to second-order moments analytically in face of the generally difficult problem of matching moments of Gaussian mixtures with arbitrary moment order. We test our second-order Gaussian approximation (SOGA) on a number of case studies from the literature. We show that it can provide accurate estimates in models not supported by other approximation methods or when exact symbolic techniques fail because of complex expressions or non-simplified integrals. On two notable classes of problems, namely collaborative filtering and programs involving mixtures of continuous and discrete distributions, we show that SOGA significantly outperforms alternative techniques in terms of accuracy and computational time.

The formal analysis of cryptographic protocols traditionally focuses on trace and equivalence properties, for which decision procedures in the symbolic (or Dolev-Yao, or DY) model are known. However, many relevant security properties are expressed as DY hyperproperties that involve quantifications over both execution paths and attacker computations (which are constrained by the attacker's knowledge in the underlying model of computation). DY hyperproperties generalise hyperproperties, for which many decision procedures exist, to the setting of DY models. Unfortunately, the subtle interactions between both forms of quantifications have been an obstacle to lifting decision procedures from hyperproperties to DY hyperproperties.

The central contribution of the paper is the first procedure for deciding DY hyperproperties, in the usual setting where the number of protocol sessions is bounded and where the equational theory modelling cryptography is subterm-convergent. We prove that our decision procedure can decide the validity of any hyperproperty in which quantifications over messages are guarded and quantifications over attacker computations are limited to expressing the attacker's knowledge. We also establish the complexity of the decision problem for several important fragments of the hyperlogic. Further, we illustrate the techniques and scope of our contributions through examples of related hyperproperties.

Parikh’s Theorem is a fundamental result in automata theory with numerous applications in computer science. These include software verification (e.g. infinite-state verification, string constraints, and theory of arrays), verification of cryptographic protocols (e.g. using Horn clauses modulo equational theories) and database querying (e.g. evaluating path-queries in graph databases), among others. Parikh’s Theorem states that the letter-counting abstraction of a language recognized by finite automata or context-free grammars is definable in Linear Integer Arithmetic (a.k.a. Presburger Arithmetic). In fact, there is a linear-time algorithm computing existential Presburger formulas capturing such abstractions, which enables an efficient analysis via SMT-solvers. Unfortunately, real-world applications typically require large alphabets (e.g. Unicode, containing a million of characters) — which are well-known to be not amenable to explicit treatment of the alphabets — or even worse infinite alphabets.

Symbolic automata have proven in the last decade to be an effective algorithmic framework for handling large finite or even infinite alphabets. A symbolic automaton employs an effective boolean algebra, which offers a symbolic representation of character sets (i.e. in terms of predicates) and often lends itself to an exponentially more succinct representation of a language. Instead of letter-counting, Parikh’s Theorem for symbolic automata amounts to counting the number of times different predicates are satisfied by an input sequence. Unfortunately, naively applying Parikh’s Theorem from classical automata theory to symbolic automata yields existential Presburger formulas of exponential size. In this paper, we provide a new construction for Parikh’s Theorem for symbolic automata and grammars, which avoids this exponential blowup: our algorithm computes an existential formula in polynomial-time over (quantifier-free) Presburger and the base theory. In fact, our algorithm extends to the model of parametric symbolic grammars, which are one of the most expressive models of languages over infinite alphabets. We have implemented our algorithm and show it can be used to solve string constraints that are difficult to solve by existing solvers.

Weak-memory models are standard formal specifications of concurrency across hardware, programming languages, and distributed systems. A fundamental computational problem is *consistency testing*: is the observed execution of a concurrent program in alignment with the specification of the underlying system? The problem has been studied extensively across Sequential Consistency (SC) and weak memory, and proven to be NP-complete when some aspect of the input (e.g., number of threads/memory locations) is unbounded. This unboundedness has left a natural question open: are there efficient *parameterized* algorithms for testing?

The main contribution of this paper is a deep hardness result for consistency testing under many popular weak-memory models: the problem remains NP-complete even in its *bounded* setting, where candidate executions contain a bounded number of threads, memory locations, and values. This hardness spreads across several Release-Acquire variants of C11, a popular variant of its Relaxed fragment, popular Causal Consistency models, and the POWER architecture. To our knowledge, this is the first result that fully exposes the hardness of weak-memory testing and proves that the problem *admits no parameterization* under standard input parameters. It also yields a computational separation of these models from SC, x86-TSO, PSO, and Relaxed, for which bounded consistency testing is either known (for SC), or shown here (for the rest), to be in polynomial time.

We introduce two-sided type systems, which are sequent calculi for typing formulas. Two-sided type systems allow for hypothetical reasoning over the typing of compound program expressions, and the refutation of typing formulas. By incorporating a type of all values, these type systems support more refined notions of well-typing and ill-typing, guaranteeing both that well-typed programs don't go wrong and that ill-typed programs don't evaluate - that is, reach a value. This makes two-sided type systems suitable for incorrectness reasoning in higher-order program verification, which we illustrate through an application to precise data-flow typing in a language with constructors and pattern matching. Finally, we investigate the internalisation of the meta-level negation in the system as a complement operator on types. This motivates an alternative semantics for the typing judgement, which guarantees that ill-typed programs don't evaluate, but in which well-typed programs may yet go wrong.

Type systems typically only define the conditions under which an expression is well-typed, leaving ill-typed expressions formally meaningless. This approach is insufficient as the basis for language servers driving modern programming environments, which are expected to recover from simultaneously localized errors and continue to provide a variety of downstream semantic services. This paper addresses this problem, contributing the first comprehensive formal account of total type error localization and recovery: the marked lambda calculus. In particular, we define a gradual type system for expressions with marked errors, which operate as non-empty holes, together with a total procedure for marking arbitrary unmarked expressions. We mechanize the metatheory of the marked lambda calculus in Agda and implement it, scaled up, as the new basis for Hazel, a full-scale live functional programming environment with, uniquely, no meaningless editor states.

The marked lambda calculus is bidirectionally typed, so localization decisions are systematically predictable based on a local flow of typing information. Constraint-based type inference can bring more distant information to bear in discovering inconsistencies but this notoriously complicates error localization. We approach this problem by deploying constraint solving as a type-hole-filling layer atop this gradual bidirectionally typed core. Errors arising from inconsistent unification constraints are localized exclusively to type and expression holes, i.e. the system identifies unfillable holes using a system of traced provenances, rather than localized in an ad hoc manner to particular expressions. The user can then interactively shift these errors to particular downstream expressions by selecting from suggested partially consistent type hole fillings, which returns control back to the bidirectional system. We implement this type hole inference system in Hazel.

Program verifiers for imperative languages such as C may be annotation-based, in which assertions and invariants are put into source files and then checked, or tactic-based, where proof scripts separate from programs are interactively developed in a proof assistant such as Coq. Annotation verifiers have been more automated and convenient, but some interactive verifiers have richer assertion languages and formal proofs of soundness. We present VST-A, an annotation verifier that uses the rich assertion language of VST, leverages the formal soundness proof of VST, but allows users to describe functional correctness proofs intuitively by inserting assertions.

VST-A analyzes control flow graphs, decomposes every C function into control flow paths between assertions, and reduces program verification problems into corresponding straightline Hoare triples. Compared to existing foundational program verification tools like VST and Iris, in VST-A such decompositions and reductions can nonstructural, which makes VST-A more flexible to use.

VST-A's decomposition and reduction is defined in Coq, proved sound in Coq, and computed call-by-value in Coq. The soundness proof for reduction is totally logical, independent of the complicated semantic model (and soundness proof) of VST's Hoare triple. Because of the rich assertion language, not all reduced proof goals can be automatically checked, but the system allows users to prove residual proof goals using the full power of the Coq proof assistant.

Practical checkers based on refinement types use the combination of implicit semantic subtyping and parametric polymorphism to simplify the specification and automate the verification of sophisticated properties of programs. However, a formal metatheoretic accounting of the *soundness* of refinement type systems using this combination has proved elusive. We present λ_{RF}, a core refinement calculus that combines semantic subtyping and parametric polymorphism. We develop a metatheory for this calculus and prove soundness of the type system. Finally, we give two mechanizations of our metatheory. First, we introduce *data propositions*, a novel feature that enables encoding derivation trees for inductively defined judgments as refined data types, and use them to show that LiquidHaskell’s refinement types can be used *for* mechanization. Second, we mechanize our results in Coq, which comes with stronger soundness guarantees than LiquidHaskell, thereby laying the foundations for mechanizing the metatheory *of* LiquidHaskell.

Syntax-guided synthesis has been a prevalent theme in various computer-aided programming systems. However, the domain of bit-vector synthesis poses several unique challenges that have not yet been sufficiently addressed and resolved. In this paper, we propose a novel synthesis approach that incorporates a distinct enumeration strategy based on various factors. Technically, this approach weighs in subexpression recurrence by term-graph-based enumeration, avoids useless candidates by example-guided filtration, prioritizes valuable components identified by large language models. This approach also incorporates a bottom-up deduction step to enhance the enumeration algorithm by considering subproblems that contribute to the deductive resolution. We implement all the enhanced enumeration techniques in our SyGuS solver DryadSynth, which outperforms state-of-the-art solvers in terms of the number of solved problems, execution time, and solution size. Notably, DryadSynth successfully solved 31 synthesis problems for the first time, including 5 renowned Hacker's Delight problems.

Verified compilation of open modules (i.e., modules whose functionality depends on other modules) provides a foundation for end-to-end verification of modular programs ubiquitous in contemporary software. However, despite intensive investigation in this topic for decades,the proposed approaches are still difficult to use in practice as they rely on assumptions about the internal working of compilers which make it difficult for external users to apply the verification results. We propose an approach to verified compositional compilation without such assumptions in the setting of verifying compilation of heterogeneous modules written in first-order languages supporting global memory and pointers. Our approach is based on the memory model of CompCert and a new discovery that a Kripke relation with a notion of memory protection can serve as a uniform and composable semantic interface for the compiler passes. By absorbing the rely-guarantee conditions on memory evolution for all compiler passes into this Kripke Memory Relation and by piggybacking requirements on compiler optimizations onto it, we get compositional correctness theorems for realistic optimizing compilers as refinements that directly relate native semantics of open modules and that are ignorant of intermediate compilation processes. Such direct refinements support all the compositionality and adequacy properties essential for verified compilation of open modules. We have applied this approach to the full compilation chain of CompCert with its Clight source language and demonstrated that our compiler correctness theorem is open to composition and intuitive to use with reduced verification complexity through end-to-end verification of non-trivial heterogeneous modules that may freely invoke each other (e.g.,mutually recursively).

While current bug detection techniques for concurrent software focus on unearthing low-level issues such as data races or deadlocks, they often fall short of discovering more intricate temporal behaviours that can arise even in the absence of such low-level issues. In this paper, we focus on the problem of dynamically analysing concurrent software against high-level temporal specifications such as LTL. Existing techniques for runtime monitoring against such specifications are primarily designed for sequential software and remain inadequate in the presence of concurrency — violations may be observed only in intricate thread interleavings, requiring many re-runs of the underlying software in conjunction with the analysis. Towards this, we study the problem of *predictive runtime monitoring*, inspired by the analogous problem of *predictive data race detection* studied extensively recently. The predictive runtime monitoring question asks, given an execution σ, if it can be soundly *reordered* to expose violations of a specification. In general, this problem may become easily intractable when either the specifications or the notion of reorderings used is complex.

In this paper, we focus on specifications that are given in regular languages. Our notion of reorderings is *trace equivalence*, where an execution is considered a reordering of another if it can be obtained from the latter by successively commuting adjacent independent actions. We first show that, even in this simplistic setting, the problem of predictive monitoring admits a super-linear lower bound of *O*(*n*^{α}), where *n* is the number of events in the execution, and α is a parameter describing the degree of commutativity, and typically corresponds to the number of threads in the execution. As a result, predictive runtime monitoring even in this setting is unlikely to be efficiently solvable, unlike in the non-predictive setting where the problem can be checked using a deterministic finite automaton (and thus, a constant-space streaming linear-time algorithm).

Towards this, we identify a sub-class of regular languages, called *pattern languages* (and their extension *generalized pattern languages*). Pattern languages can naturally express specific ordering of some number of (labelled) events, and have been inspired by popular empirical hypotheses underlying many concurrency bug detection approaches such as the “small bug depth” hypothesis. More importantly, we show that for pattern (and generalized pattern) languages, the predictive monitoring problem can be solved using a constant-space streaming linear-time algorithm. We implement and evaluate our algorithm PatternTrack on benchmarks from the literature and show that it is effective in monitoring large-scale applications.

We introduce SCIO*, a formally secure compilation framework for statically verified programs performing input-output (IO). The source language is an F* subset in which a verified program interacts with its IO-performing context via a higher-order interface that includes refinement types as well as pre- and post-conditions about past IO events. The target language is a smaller F* subset in which the compiled program is linked with an adversarial context that has an interface without refinement types, pre-conditions, or concrete post-conditions. To bridge this interface gap and make compilation and linking secure we propose a formally verified combination of higher-order contracts and reference monitoring for recording and controlling IO operations. Compilation uses contracts to convert the logical assumptions the program makes about the context into dynamic checks on each context-program boundary crossing. These boundary checks can depend on information about past IO events stored in the state of the monitor. But these checks cannot stop the adversarial target context before it performs dangerous IO operations. Therefore linking in SCIO* additionally forces the context to perform all IO actions via a secure IO library, which uses reference monitoring to dynamically enforce an access control policy before each IO operation. We prove in F* that SCIO* soundly enforces a global trace property for the compiled verified program linked with the untrusted context. Moreover, we prove in F* that SCIO* satisfies by construction Robust Relational Hyperproperty Preservation, a very strong secure compilation criterion. Finally, we illustrate SCIO* at work on a simple web server example.

Convex hulls are commonly used to tackle the non-linearity of activation functions in the verification of neural networks. Computing the exact convex hull is a costly task though. In this work, we propose a fast and precise approach to over-approximating the convex hull of the ReLU function (referred to as the ReLU hull), one of the most used activation functions. Our key insight is to formulate a convex polytope that ”wraps” the ReLU hull, by reusing the linear pieces of the ReLU function as the lower faces and constructing upper faces that are adjacent to the lower faces. The upper faces can be efficiently constructed based on the edges and vertices of the lower faces, given that an *n*-dimensional (or simply *n*d hereafter) hyperplane can be determined by an (*n*−1)d hyperplane and a point outside of it. We implement our approach as WraLU, and evaluate its performance in terms of precision, efficiency, constraint complexity, and scalability. WraLU outperforms existing advanced methods by generating fewer constraints to achieve tighter approximation in less time. It exhibits versatility by effectively addressing arbitrary input polytopes and higher-dimensional cases, which are beyond the capabilities of existing methods. We integrate WraLU into PRIMA, a state-of-the-art neural network verifier, and apply it to verify large-scale ReLU-based neural networks. Our experimental results demonstrate that WraLU achieves a high efficiency without compromising precision. It reduces the number of constraints that need to be solved by the linear programming solver by up to half, while delivering comparable or even superior results compared to the state-of-the-art verifiers.

We combine dependent types with linear type systems that soundly and completely capture polynomial time computation. We explore two systems for capturing polynomial time: one system that disallows construction of iterable data, and one, based on the LFPL system of Martin Hofmann, that controls construction via a payment method. Both of these are extended to full dependent types via Quantitative Type Theory, allowing for arbitrary computation in types alongside guaranteed polynomial time computation in terms. We prove the soundness of the systems using a realisability technique due to Dal Lago and Hofmann.

Our long-term goal is to combine the extensional reasoning of type theory with intensional reasoning about the resources intrinsically consumed by programs. This paper is a step along this path, which we hope will lead both to practical systems for reasoning about programs’ resource usage, and to theoretical use as a form of *synthetic computational complexity theory*.

Random generation of well-typed terms lies at the core of effective random testing of compilers for functional languages. Existing techniques have had success following a top-down type-oriented approach to generation that makes choices locally, which suffers from an inherent limitation: the type of an expression is often generated independently from the expression itself. Such generation frequently yields functions with argument types that cannot be used to produce a result in a meaningful way, leaving those arguments unused. Such "use-less" functions can hinder both performance, as the argument generation code is dead but still needs to be compiled, and effectiveness, as a lot of interesting optimizations are tested less frequently.

In this paper, we introduce a novel algorithm that is significantly more effective at generating functions that use their arguments. We formalize both the "local" and the "nonlocal" algorithms as step-relations in an extension of the simply-typed lambda calculus with type and arguments holes, showing how delaying the generation of types for subexpressions by allowing nonlocal generation steps leads to "useful" functions.

Parametricity is a property of the syntax of type theory implying, e.g., that there is only one function having the type of the polymorphic identity function. Parametricity is usually proven externally, and does not hold internally. Internalising it is difficult because once there is a term witnessing parametricity, it also has to be parametric itself and this results in the appearance of higher dimensional cubes. In previous theories with internal parametricity, either an explicit syntax for higher cubes is present or the theory is extended with a new sort for the interval. In this paper we present a type theory with internal parametricity which is a simple extension of Martin-Löf type theory: there are a few new type formers, term formers and equations. Geometry is not explicit in this syntax, but emergent: the new operations and equations only refer to objects up to dimension 3. We show that this theory is modelled by presheaves over the BCH cube category. Fibrancy conditions are not needed because we use span-based rather than relational parametricity. We define a gluing model for this theory implying that external parametricity and canonicity hold. The theory can be seen as a special case of a new kind of modal type theory, and it is the simplest setting in which the computational properties of higher observational type theory can be demonstrated.

An important aspect of building robust systems that execute on dedicated hardware and perhaps in constrained environments is to control and manage the effects performed by program code.

We present ReML, a higher-order statically-typed functional language, which allows programmers to be explicit about the effects performed by program code and in particular effects related to memory management. Allowing programmers to be explicit about effects, the regions in which values reside, and the constraints under which code execute, makes programs robust to changes in the program source code and to compiler updates, including compiler optimisations.

ReML is integrated with a polymorphic inference system that builds on top of region-inference, as it is implemented in the MLKit, a Standard ML compiler that uses region-based memory management as its primary memory management scheme.

We present Wasm-prechk, a superset of WebAssembly (Wasm) that uses indexed types to express and check simple constraints over program values. This additional static reasoning enables safely removing dynamic safety checks from Wasm, such as memory bounds checks. We implement Wasm-prechk as an extension of the Wasmtime compiler and runtime, evaluate the run-time and compile-time performance of Wasm-prechk vs WebAssembly configurations with explicit dynamic checks, and find an average run-time performance gain of 1.71x faster in the widely used PolyBenchC benchmark suite, for a small overhead in binary size (7.18% larger) and type-checking time (1.4% slower). We also prove type and memory safety of Wasm-prechk, prove Wasm safely embeds into Wasm-prechk ensuring backwards compatibility, prove Wasm-prechk type-erases to Wasm, and discuss design and implementation trade-offs.

Quantum Hamiltonian simulation, which simulates the evolution of quantum systems and probes quantum phenomena, is one of the most promising applications of quantum computing. Recent experimental results suggest that Hamiltonian-oriented analog quantum simulation would be advantageous over circuit-oriented digital quantum simulation in the Noisy Intermediate-Scale Quantum (NISQ) machine era. However, programming analog quantum simulators is much more challenging due to the lack of a unified interface between hardware and software. In this paper, we design and implement SimuQ, the first framework for quantum Hamiltonian simulation that supports Hamiltonian programming and pulse-level compilation to heterogeneous analog quantum simulators. Specifically, in SimuQ, front-end users specify the target quantum system with Hamiltonian Modeling Language, and the Hamiltonian-level programmability of analog quantum simulators is specified through a new abstraction called the abstract analog instruction set (AAIS) and programmed in AAIS Specification Language by hardware providers. Through a solver-based compilation, SimuQ generates executable pulse schedules for real devices to simulate the evolution of desired quantum systems, which is demonstrated on superconducting (IBM), neutral-atom (QuEra), and trapped-ion (IonQ) quantum devices. Moreover, we demonstrate the advantages of exposing the Hamiltonian-level programmability of devices with native operations or interaction-based gates and establish a small benchmark of quantum simulation to evaluate SimuQ's compiler with the above analog quantum simulators.

We introduce simple, *universal*, *sound*, and *complete* proof methods for producing machine-verifiable proofs of linearizability and strong linearizability. Universality means that our method works for any object type; soundness means that an algorithm can be proved correct by our method only if it is linearizable (resp. strong linearizable); and completeness means that any linearizable (resp. strong linearizable) implementation can be proved so using our method. We demonstrate the simplicity and power of our method by producing proofs of linearizability for the Herlihy-Wing queue and Jayanti’s single-scanner snapshot, as well as a proof of strong linearizability of the Jayanti-Tarjan union-find object. All three of these proofs are machine-verified by TLAPS (the TLA+ Proof System).

*Commutativity* has proven to be a powerful tool in reasoning about concurrent programs. Recent work has shown that a commutativity-based *reduction* of a program may admit simpler proofs than the program itself. The framework of lexicographical program reductions was introduced to formalize a broad class of reductions which accommodate sequential (thread-local) reasoning as well as synchronous programs. Approaches based on this framework, however, were fundamentally limited to program models with a *fixed/bounded* number of threads. In this paper, we show that it is possible to define an effective parametric family of program reductions that can be used to find simple proofs for *parameterized programs*, i.e., for programs with an unbounded number of threads. We show that reductions are indeed useful for the simplification of proofs for parameterized programs, in a sense that can be made precise: A reduction of a parameterized program may admit a proof which uses *fewer* or *less sophisticated ghost variables*. The reduction may therefore be within reach of an automated verification technique, even when the original parameterized program is not. As our first technical contribution, we introduce a notion of reductions for parameterized programs such that the reduction *R* of a parameterized program *P* is again a parameterized program (the thread template of *R* is obtained by source-to-source transformation of the thread template of *P*). Consequently, existing techniques for the verification of parameterized programs can be directly applied to *R* instead of *P*. Our second technical contribution is that we define an appropriate family of *pairwise preference orders* which can be effectively used as a parameter to produce different lexicographical reductions. To determine whether this theoretical foundation amounts to a usable solution in practice, we have implemented the approach, based on a recently proposed framework for parameterized program verification. The results of our preliminary experiments on a representative set of examples are encouraging.

Bayesian networks are graphical first-order probabilistic models that allow for a compact representation of large probability distributions, and for efficient inference, both exact and approximate. We introduce a higher-order programming language, in the idealized form of a lambda-calculus, which we prove sound and complete w.r.t. Bayesian networks: each Bayesian network can be encoded as a term, and conversely each (possibly higher-order and recursive) program of ground type compiles into a Bayesian network. The language allows for the specification of recursive probability models and hierarchical structures. Moreover, we provide a compositional and cost-aware semantics which is based on factors, the standard mathematical tool used in Bayesian inference. Our results rely on advanced techniques rooted into linear logic, intersection types, rewriting theory, and Girard's geometry of interaction, which are here combined in a novel way.

Gradual verification, which supports explicitly partial specifications and verifies them with a combination of static and dynamic checks, makes verification more incremental and provides earlier feedback to developers. While an abstract, weakest precondition-based approach to gradual verification was previously proven sound, the approach did not provide sufficient guidance for implementation and optimization of the required run-time checks. More recently, gradual verification was implemented using symbolic execution techniques, but the soundness of the approach (as with related static checkers based on implicit dynamic frames) was an open question. This paper puts practical gradual verification on a sound footing with a formalization of symbolic execution, optimized run-time check generation, and run time execution. We prove our approach is sound; our proof also covers a core subset of the Viper tool, for which we are aware of no previous soundness result. Our formalization enabled us to find a soundness bug in an implemented gradual verification tool and describe the fix necessary to make it sound.

Datalog has gained prominence in program analysis due to its expressiveness and ease of use. Its generic fixpoint resolution algorithm over relational domains simplifies the expression of many complex analyses. The performance and scalability issues of early Datalog approaches have been addressed by tools such as Soufflé through specialized code generation. Still, while pure Datalog is expressive enough to support a wide range of analyses, there is a growing need for extensions to accommodate increasingly complex analyses. This has led to the development of various extensions, such as Flix, Datafun, and Formulog, which enhance Datalog with features like arbitrary lattices and SMT constraints.

Most of these extensions recognize the need for full interoperability between Datalog and a full-fledged programming language, a functionality that high-performance systems like Soufflé lack. Specifically, in most cases, they construct languages from scratch with first-class Datalog support, allowing greater flexibility. However, this flexibility often comes at the cost of performance due to the conflicting requirements of prioritizing modularity and abstraction over efficiency. Consequently, achieving both flexibility and compilation to highly-performant specialized code poses a significant challenge.

In this work, we reconcile the competing demands of expressiveness and performance with Flan, a Datalog compiler fully embedded in Scala that leverages multi-stage programming to generate specialized code for enhanced performance. Our approach combines the flexibility of Flix with Soufflé’s performance, offering seamless integration with the host language that enables the addition of powerful extensions while generating specialized code for the entire computation. Flan’s simple operator interface allows the addition of an extensive set of features, including arbitrary aggregates, user-defined functions, and lattices, with multiple execution strategies such as binary and multi-way joins, supported by different indexing structures like specialized trees and hash tables, with minimal effort. We evaluate our system on a variety of benchmarks and compare it to established Datalog engines. Our results demonstrate competitive performance and speedups in the range of 1.4× to 12.5× compared to state-of-the-art systems for workloads of practical importance.

Model-checking is one of the most powerful techniques for verifying systems and programs, which since the pioneering results by Knapik et al., Ong, and Kobayashi, is known to be applicable to functional programs with higher-order types against properties expressed by formulas of monadic second-order logic. What happens when the program in question, in addition to higher-order functions, also exhibits algebraic effects such as probabilistic choice or global store? The results in the literature range from those, mostly positive, about nondeterministic effects, to those about probabilistic effects, in the presence of which even mere reachability becomes undecidable. This work takes a fresh and general look at the problem, first of all showing that there is an elegant and natural way of viewing higher-order programs producing algebraic effects as ordinary higher-order recursion schemes. We then move on to consider effect handlers, showing that in their presence the model checking problem is bound to be undecidable in the general case, while it stays decidable when handlers have a simple syntactic form, still sufficient to capture so-called generic effects. Along the way, we hint at how a general specification language could look like, this way justifying some of the results in the literature, and deriving new ones.

Software contracts empower programmers to describe functional properties of components. When it comes to constraining effects, though, the literature offers only one-off solutions for various effects. It lacks a universal principle. This paper presents the design of an effectful contract system in the context of effect handlers. A key metatheorem shows that contracts cannot unduly interfere with a program's execution. An implementation of this design, along with an evaluation of its generality, demonstrates that the theory can guide practice.

Gradual typing has emerged as a popular design point in programming languages, attracting significant interests from both academia and industry. Programmers in gradually typed languages are free to utilize static and dynamic typing as needed. To make such languages sound, runtime checks mediate the boundary of typed and untyped code. Unfortunately, such checks can incur significant runtime overhead on programs that heavily mix static and dynamic typing. To combat this overhead without necessitating changes to the underlying implementations of languages, we present discriminative typing. Discriminative typing works by optimistically inferring types for functions and implementing an optimized version of the function based on this type. To preserve safety it also implements an un-optimized version of the function based purely on the provided annotations. With two versions of each function in hand, discriminative typing translates programs so that the optimized functions are called as frequently as possible while also preserving program behaviors.

We have implemented discriminative typing in Reticulated Python and have evaluated its performance compared to guarded Reticulated Python. Our results show that discriminative typing improves the performance across 95% of tested programs, when compared to Reticulated, and achieves more than 4× speedup in more than 56% of these programs. We also compare its performance against a previous optimization approach and find that discriminative typing improved performance across 93% of tested programs, with 30% of these programs receiving speedups between 4 to 25 times. Finally, our evaluation shows that discriminative typing remarkably reduces the overhead of gradual typing on many mixed type configurations of programs.

In addition, we have implemented discriminative typing in Grift and evaluated its performance. Our evaluation demonstrations that DT significantly improves performance of Grift

We study the interaction of structural subtyping with parametric polymorphism and recursively defined type constructors. Although structural subtyping is undecidable in this setting, we describe a notion of parametricity for type constructors and then exploit it to define *parametric subtyping*, a conceptually simple, decidable, and expressive fragment of structural subtyping that strictly generalizes *rigid subtyping*. We present and prove correct an effective saturation-based decision procedure for parametric subtyping, demonstrating its applicability using a variety of examples. We also provide an implementation of this decision procedure as an artifact.

Characterization of bugs and attack vectors is in many practical scenarios as important as their finding. Recently, Girol et. al. have introduced the concept of robust reachability, which ensures a perfect reproducibility of the reported violations by distinguishing inputs that are under the control of the attacker (controlled inputs) from those that are not (uncontrolled inputs), and proposed first automated analysis for it. While it is a step toward distinguishing severe bugs from benign ones, it fails for example to describe violations that are mostly reproducible, i.e., when triggering conditions are likely to happen, meaning that they happen for all uncontrolled inputs but a few corner cases. To address this issue, we propose to leverage theory-agnostic abduction techniques to generate constraints on the uncontrolled program inputs that ensure that a target property is robustly satisfied. Our proposal comes with an extension of robust reachability that is generic on the type of trace property and on the technology used to verify the properties. We show that our approach is complete w.r.t its inference language, and we additionally discuss strategies for the efficient exploration of the inference space. We demonstrate the feasibility of the method and its practical ability to refine the notion of robust reachability with an implementation that uses robust reachability oracles to generate constraints on standard benchmarks from software verification and security analysis. We illustrate the use of our implementation to a vulnerability characterization problem in the context of fault injection attacks. Our method overcomes a major limitation of the initial proposal of robust reachability, without complicating its definition. From a practical view, this is a step toward new verification tools that are able to characterize program violations through high-level feedback.

Regular expressions have been extended with lookaround assertions, which are subdivided into lookahead and lookbehind assertions. These constructs are used to refine when a match for a pattern occurs in the input text based on the surrounding context. Current implementation techniques for lookaround involve backtracking search, which can give rise to running time that is super-linear in the length of input text. In this paper, we first consider a formal mathematical semantics for lookaround, which complements the commonly used operational understanding of lookaround in terms of a backtracking implementation. Our formal semantics allows us to establish several equational properties for simplifying lookaround assertions. Additionally, we propose a new algorithm for matching regular expressions with lookaround that has time complexity *O*(*m* · *n*), where *m* is the size of the regular expression and *n* is the length of the input text. The algorithm works by evaluating lookaround assertions in a bottom-up manner. Our algorithm makes use of a new notion of nondeterministic finite automata (NFAs), which we call oracle-NFAs. These automata are augmented with epsilon-transitions that are guarded by oracle queries that provide the truth values of lookaround assertions at every position in the text. We provide an implementation of our algorithm that incorporates three performance optimizations for reducing the work performed and memory used. We present an experimental comparison against PCRE and Java’s regex library, which are state-of-the-art regex engines that support lookaround assertions. Our experimental results show that, in contrast to PCRE and Java, our implementation does not suffer from super-linear running time and is several times faster.

We consider imperative programs that involve both randomization and pure nondeterminism. The central question is how to find a strategy resolving the pure nondeterminism such that the so-obtained determinized program satisfies a given quantitative specification, i.e., bounds on expected outcomes such as the expected final value of a program variable or the probability to terminate in a given set of states. We show how memoryless and deterministic (MD) strategies can be obtained in a semi-automatic fashion using deductive verification techniques. For loop-free programs, the MD strategies resulting from our weakest precondition-style framework are correct by construction. This extends to loopy programs, provided the loops are equipped with suitable loop invariants - just like in program verification. We show how our technique relates to the well-studied problem of obtaining strategies in countably infinite Markov decision processes with reachability-reward objectives. Finally, we apply our technique to several case studies.