SOAP 2022: Proceedings of the 11th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis

Full Citation in the ACM Digital Library

SESSION: Papers

BinFPE: accurate floating-point exception detection for GPU applications

When modern heterogeneous HPC systems perform numerical computations, floating-point exceptional quantities such as NaN and infinity in the GPU context, remain insufficiently handled. This is because commonly used GPUs and the CUDA language have no inherent exception detection capabilities. Existing compiler-based approaches for this problem are tied to a given compiler and cannot detect exceptions generated by binaries and precompiled libraries. This paper contributes BinFPE, a unique tool that addresses these challenges. BinFPE uses the NVBit dynamic binary instrumentation framework to check the machine registers after each calculation to recognize exceptions, and conveys this information to the CPU for final reporting. We demonstrate the effectiveness of BinFPE on 42 CUDA programs, reporting previously unreported exceptions. We also present the limitations of BinFPE and our perspective on building GPU tools via binary instrumentation.

Modeling code manipulation in JIT compilers

Just-in-Time (JIT) compilers are widely used to improve the performance of interpreter-based language implementations by creating optimized code at runtime. However, bugs in the JIT compiler’s code manipulation and optimization can result in the generation of incorrect code. Such bugs can be difficult to diagnose and fix, and can result in exploitable vulnerabilities. Unfortunately, existing approaches to automatic bug localization do not carry over well to such bugs. This paper discusses a different approach to analyzing JIT compiler optimization behaviors, based on using dynamic analysis to construct abstract models of the JIT compiler’s optimizer and back end. By comparing the models obtained for buggy and non-buggy executions of the JIT compiler, we can pinpoint the components of the JIT compiler’s internal representation that have been affected by the bug; this can then be mapped back to identify the buggy code. Our experiments with two real bugs for Google V8 JIT compiler, TurboFan, show the utility and practicality of our approach.

Statically detecting data leakages in data science code

Data leakage is a well-known problem in machine learning. Data leakage occurs when information from outside the training dataset is used to create a model. This phenomenon renders a model excessively optimistic or even useless in the real world since the model tends to leverage greatly on the unfairly acquired information. To date, detection of data leakages occurs post-mortem using runtime methods. In this paper, we develop a static data leakage analysis to detect several instances of data leakages during development time. Our analysis is constructed to be lightweight so that it can be performed within interactive data science notebooks. We have integrated our analysis into the NBLyzer static analyzer framework and show its utility on real world benchmarks. To the best of our knowledge, we propose the first static detection of data science data leakages.

Ensuring determinism in blockchain software with GoLiSA: an industrial experience report

Ensuring determinism is mandatory when writing blockchain software. When determinism is not met it can lead to serious implications in the blockchain network while compromising the software development, release, and patching processes. In the industrial context, it is widespread to adopt general-purpose languages, such as Go, for developing blockchain solutions. However, it is not surprising that non-deterministic behaviors may arise, being these programming languages not originally designed for blockchain purposes. In this paper, we present an experience report on ensuring determinism in blockchain software with GoLiSA, a static analyzer based on abstract interpretation for Go applications, in an industrial context. In particular, we ran GoLiSA on Commercio.network, a blockchain-based solution for exchanging electronic documents in a legally binding way. Thanks to GoLiSA, non-trivial bugs got detected and the analysis performed made it possible to identify the critical points where to apply the fixes.

ADA: a tool for visualizing the architectural overview of open-source repositories

Writing highly maintainable and efficient software code is becoming increasingly difficult, especially while following the rapid, agile development process and working in a distributed team. One of the key indicators of that inefficient software design is a high degree of code coupling, which leads to unwanted side-effects during refactoring and acts as a burden during future development. To alleviate these problems, we developed a visualization tool, ADA, that statically analyzes an open-source repository and seeks to address the issue of code coupling by providing developers with a powerful graphic representation. ADA showcases the relationships and the degree of inter-connectivity between the classes. ADA will ultimately guide developers to instantly locate the coupled area and assist them in decoupling it.

Abstract interpretation of Michelson smart-contracts

Static analysis of smart-contracts is becoming more widespread on blockchain platforms. Analyzers rely on techniques like symbolic execution or model checking, but few of them can provide strong soundness properties and guarantee the analysis termination at the same time. As smart-contracts often manipulate economic assets, proving numerical properties beyond the absence of runtime errors is also desirable. Smart-contract execution models differ considerably from mainstream programming languages and vary from one blockchain to another, making state-of-the-art analyses hard to adapt. For instance, smart-contract calls may modify a persistent storage impacting subsequent calls. This makes it difficult for tools to infer invariants %and high-level security properties required to formally ensure the absence of exploitable vulnerabilities.

The Michelson smart-contract language, used in the Tezos blockchain, is strongly typed, stack-based, and has a strict execution model leaving few opportunities for implicit runtime errors. We present a work in progress static analyzer for Michelson based on Abstract Interpretation and implemented within MOPSA, a modular static analyzer. Our tool supports the Michelson semantic features, including inner calls to external contracts. It can prove the absence of runtime errors and infer invariants on the persistent storage over an unbounded number of calls. It is also being extended to prove high-level numerical and security properties.

Towards an implementation of differential dynamic logic in PVS

This paper describes an ongoing effort to embed and verify differential dynamic logic (dL) in the Prototype Verification System (PVS). dL is a logic for specifying and formally reasoning about hybrid systems, i.e., systems that employ both continuous and discrete dynamics. There are several benefits of this effort. First, the embedding of dL in PVS offers an independent formal verification of the semantics and inference rules of dL. Second, the embedding is fully operational within PVS, giving PVS practitioners the ability to use dL in the formal specification and verification of hybrid systems. Third, the rich specification language, type system, and powerful interactive prover of PVS can be used on dL objects. In addition to the embedding and verification of dL, a custom extension for Visual Studio Code has been developed, so that a stylized dL syntax can be used to specify hybrid programs and their properties.