Computer Science - Programming Languages Publications (50)


Computer Science - Programming Languages Publications

The need for modern data analytics to combine relational, procedural, and map-reduce-style functional processing is widely recognized. State-of-the-art systems like Spark have added SQL front-ends and relational query optimization, which promise an increase in expressiveness and performance. But how good are these extensions at extracting high performance from modern hardware platforms? While Spark has made impressive progress, we show that for relational workloads, there is still a significant gap compared with best-of-breed query engines. Read More

In model-driven engineering, models abstract the relevant features of software artefacts and model transformations act on them automating complex tasks of the development process. It is, thus, crucially important to provide pragmatic, reliable methods to verify that model transformations guarantee the correctness of generated models in order to ensure the quality of the final end product. In this paper, we build on an object-oriented algebraic encoding of metamodels and models as defined in the standard Meta-Object Facility and in tools, such as the Eclipse Modeling Framework, to specify a domain-specific language for representing the action part of model transformations. Read More

We present a weakest-precondition-style calculus for reasoning about the expected values (pre-expectations) of \emph{mixed-sign unbounded} random variables after execution of a probabilistic program. The semantics of a while-loop is well-defined as the limit of iteratively applying a functional to a zero-element just as in the traditional weakest pre-expectation calculus, even though a standard least fixed point argument is not applicable in this context. A striking feature of our semantics is that it is always well-defined, even if the expected values do not exist. Read More

Multiphase ranking functions ($\mathit{M{\Phi}RFs}$) were proposed as a means to prove the termination of a loop in which the computation progresses through a number of "phases", and the progress of each phase is described by a different linear ranking function. Our work provides new insights regarding such functions for loops described by a conjunction of linear constraints (single-path loops). We provide a complete polynomial-time solution to the problem of existence and of synthesis of $\mathit{M{\Phi}RF}$ of bounded depth (number of phases), when variables range over rational or real numbers; a complete solution for the (harder) case that variables are integer, with a matching lower-bound proof, showing that the problem is coNP-complete; and a new theorem which bounds the number of iterations for loops with $\mathit{M{\Phi}RFs}$. Read More

In process algebras such as ACP, parallel processes are considered to be interleaved in an arbitrary way. In the case of multi-threading as found in contemporary programming languages, parallel processes are actually interleaved according to some interleaving strategy. Interleaving strategies are also known as process-scheduling policies. Read More

The multiway rendezvous introduced in Theoretical CSP is a powerful paradigm to achieve synchronization and communication among a group of (possibly more than two) processes. We illustrate the advantages of this paradigm on the production cell benchmark, a model of a real metal processing plant, for which we propose a compositional software controller, which is written in LNT and LOTOS, and makes intensive use of the multiway rendezvous. Read More

Model-based verification allows to express behavioral correctness conditions like the validity of execution states, boundaries of variables or timing at a high level of abstraction and affirm that they are satisfied by a software system. However, this requires expressive models which are difficult and cumbersome to create and maintain by hand. This paper presents a framework that automatically derives behavioral models from real-sized Java programs. Read More

We identify multirole logic as a new form of logic in which conjunction/disjunction is interpreted as an ultrafilter on the power set of some underlying set (of roles) and the notion of negation is generalized to endomorphisms on this underlying set. We formalize both multirole logic (MRL) and linear multirole logic (LMRL) as natural generalizations of classical logic (CL) and classical linear logic (CLL), respectively, and also present a filter-based interpretation for intuitionism in multirole logic. Among various meta-properties established for MRL and LMRL, we obtain one named multiparty cut-elimination stating that every cut involving one or more sequents (as a generalization of a (binary) cut involving exactly two sequents) can be eliminated, thus extending the celebrated result of cut-elimination by Gentzen. Read More

Writing correct programs for weak memory models such as the C11 memory model is challenging because of the weak consistency guarantees these models provide. The first program logics for the verification of such programs have recently been proposed, but their usage has been limited thus far to manual proofs. Automating proofs in these logics via first-order solvers is non-trivial, due to reasoning features such as higher-order assertions, modalities and rich permission resources. Read More

We present a data-driven approach to the problem of inductive computer program synthesis. Our method learns a probabilistic model for real-world programs from a corpus of existing code. It uses this model during synthesis to automatically infer a posterior distribution over sketches, or syntactic models of the problem to be synthesized. Read More

Programming language-design and run-time-implementation require detailed knowledge about the programs that users want to implement. Acquiring this knowledge is hard, and there is little tool support to effectively estimate whether a proposed tradeoff actually makes sense in the context of real world applications. Ideally, knowledge about behaviour of "typical" programs is 1) easily obtainable, 2) easily reproducible, and 3) easily sharable. Read More

Formal models of games help us account for and predict behavior, leading to more robust and innovative designs. While the games research community has proposed many formalisms for both the "game half" (game models, game description languages) and the "human half" (player modeling) of a game experience, little attention has been paid to the interface between the two, particularly where it concerns the player expressing her intent toward the game. We describe an analytical and computational toolbox based on programming language theory to examine the phenomenon sitting between control schemes and game rules, which we identify as a distinct player intent language for each game. Read More

We present a marriage of functional and structured imperative programming that embeds in pure lambda calculus. We describe how we implement the core of this language in a monadic DSL which is structurally equivalent to our intended source language and which, when evaluated, generates pure lambda terms in continuation-passing-style. Read More

Jolie is a service-oriented programming language which comes with the formal specification of its type system. However, there is no tool to ensure that programs in Jolie are well-typed. In this paper we provide the results of building a type checker for Jolie as a part of its syntax and semantics formal model. Read More

Interoperability is the ability of a programming language to work with systems based on different languages and paradigms. These days, many widely used high-level language impementations provide access to external functionalities. In this paper, we present some ideas on CLR interoperability focusing on the kind of constructs desirable by a programmer to this regard. Read More

Relational program verification is a variant of program verification where one can reason about two programs and as a special case about two executions of a single program on different inputs. Relational program verification can be used for reasoning about a broad range of properties, including equivalence and refinement, and specialized notions such as continuity, information flow security or relative cost. In a higher-order setting, relational program verification can be achieved using relational refinement type systems, a form of refinement types where assertions have a relational interpretation. Read More

Program synthesis from incomplete specifications (e.g. input-output examples) has gained popularity and found real-world applications, primarily due to its ease-of-use. Read More

Software tracing techniques are well-established and used by instrumentation tools to extract run-time information for program analysis and debugging. Dynamic binary instrumentation as one tool instruments program binaries to extract information. Unfortunately, instrumentation causes perturbation that is unacceptable for time-sensitive applications. Read More

An important step toward adoption of formal methods in software development is support for mainstream programming languages. Unfortunately, these languages are often rather complex and come with substantial standard libraries. However, by choosing a suitable intermediate language, most of the complexity can be delegated to existing execution-oriented (as opposed to verification-oriented) compiler frontends and standard library implementations. Read More

Rascal is a programming language that aims to simplify software language engineering such as program analysis and transformation. In this report, I present a formal syntax and semantics for core Rascal constructs, and propose key theorems to prove for the formal semantics. Read More

JavaScript systems are becoming increasingly complex and large. To tackle the challenges involved in implementing these systems, the language is evolving to include several constructions for programming- in-the-large. For example, although the language is prototype-based, the latest JavaScript standard, named ECMAScript 6 (ES6), provides native support for implementing classes. Read More

As of today the programming language of the vast majority of the published source code is manually specified or programmatically assigned based on the sole file extension. In this paper we show that the source code programming language identification task can be fully automated using machine learning techniques. We first define the criteria that a production-level automatic programming language identification solution should meet. Read More

The use of a necessity-like modality in a typed $\lambda$-calculus can be used as a device for separating the calculus in two separate regions. These can be thought of as intensional vs. extensional data: data in the first region, the modal one, are available as code, and their description can be examined, whereas data in the second region are only available as values up to ordinary equality. Read More

A new minimal notation is presented for encoding tree data structures. Tree Notation may be useful as a base document format for domain specific languages in places where JSON or XML is currently used. Read More

Session types are behavioural types for guaranteeing that some programs are free from basic communication errors. Recent work has shown that the notion of asynchronous subtyping for session types is undecidable. However, it is not clear what the possible alternatives for making such relation decidable are. Read More

Relational properties relate multiple runs of one or more programs. They characterize many useful notions of security, program refinement, and equivalence for programs with diverse computational effects, and have received much attention in the recent literature. Rather than designing and developing tools for special classes of relational properties, as typically proposed in the literature, we advocate the relational verification of effectful programs within general purpose proof assistants. Read More

We present Low*, a language for low-level programming and verification, and its application to high-assurance optimized cryptographic libraries. Low* is a shallow embedding of a small, sequential, well-behaved subset of C in F*, a dependently- typed variant of ML aimed at program verification. Departing from ML, Low* does not involve any garbage collection or implicit heap allocation; instead, it has a structured memory model \`a la CompCert, and it provides the control required for writing efficient low-level security-critical code. Read More

Data sharing among partners---users, organizations, companies---is crucial for the advancement of data analytics in many domains. Sharing through secure computation and differential privacy allows these partners to perform private computations on their sensitive data in controlled ways. However, in reality, there exist complex relationships among members. Read More

We present an extension for regular typestates, called Beyond- Regular Typestate(BR-Typestate), which is expressive enough to model non-regular properties of programs and protocols over data. We model the BR-Typestate system over a dependently typed, state based, impera- tive core language, and we prove its soundness and tractability. We have implemented a prototype typechecker for the language, and we show how several important, real world non-regular properties of programs and protocols can be verified. Read More

This paper proposes Monte Carlo Action Programming, a programming language framework for autonomous systems that act in large probabilistic state spaces with high branching factors. It comprises formal syntax and semantics of a nondeterministic action programming language. The language is interpreted stochastically via Monte Carlo Tree Search. Read More

This guide is intended to knit together, and extend, the existing PP and C documentation on PDL internals. It draws heavily from prior work by the authors of the code. Special thanks go to Christian Soeller, and Tuomas Lukka, who together with Glazebrook conceived and implemented PDL and PP; and to Chris Marshall, who has led the PDL development team through several groundbreaking releases and to new levels of usability. Read More

Static verification of source code correctness is a major milestone towards software reliability. The dynamic type system of the Jolie programming language, at the moment, allows avoidable run-time errors. A static type system for the language has been exhaustively and formally defined on paper, but still lacks an implementation. Read More

What properties about the internals of a program explain the possible differences in its overall running time for different inputs? In this paper, we propose a formal framework for considering this question we dub trace-set discrimination. We show that even though the algorithmic problem of computing maximum likelihood discriminants is NP-hard, approaches based on integer linear programming (ILP) and decision tree learning can be useful in zeroing-in on the program internals. On a set of Java benchmarks, we find that compactly-represented decision trees scalably discriminate with high accuracy---more scalably than maximum likelihood discriminants and with comparable accuracy. Read More

We present PORTHOS, the first tool that discovers porting bugs in performance-critical code. PORTHOS takes as input a program, the memory model of the source architecture for which the program has been developed, and the memory model of the targeted architecture. If the code is not portable, PORTHOS finds a porting bug in the form of an unexpected execution - an execution that is consistent with the target but inconsistent with the source memory model. Read More

Software tends to be highly configurable, but most applications are hardly context aware. For example, a web browser provides many settings to configure printers and proxies, but nevertheless it is unable to dynamically adapt to a new workplace. In this paper we aim to empirically demonstrate that by dynamic and automatic reconfiguration of unmodified software we can systematically introduce context awareness. Read More

In this paper, we import tensor index notation including Einstein summation notation into programming by introducing two kinds of functions, tensor functions and scalar functions. Tensor functions are functions that contract the tensors given as an argument, and scalar functions are the others. As with ordinary functions, when a tensor function obtains a tensor as an argument, the tensor function treats the tensor as it is as a tensor. Read More

We present a novel algorithm that synthesizes imperative programs for introductory programming courses. Given a set of input-output examples and a partial program, our algorithm generates a complete program that is consistent with every example. Our key idea is to combine enumerative program synthesis and static analysis, which aggressively prunes out a large search space while guaranteeing to find, if any, a correct solution. Read More

Precise analysis of pointer information plays an important role in many static analysis techniques and tools today. The precision, however, must be balanced against the scalability of the analysis. This paper focusses on improving the precision of standard context and flow insensitive alias analysis algorithms at a low scalability cost. Read More

In this paper, we propose the Templet -- a runtime system for actor programming of high performance computing in C++. We provide a compact source code of the runtime system, which uses standard library of C++11 only. We demonstrate how it differs from the classic implementations of the actor model. Read More

With the range and sensitivity of algorithmic decisions expanding at a break-neck speed, it is imperative that we aggressively investigate whether programs are biased. We propose a novel probabilistic program analysis technique and apply it to quantifying bias in decision-making programs. Specifically, we (i) present a sound and complete automated verification technique for proving quantitative properties of probabilistic programs; (ii) show that certain notions of bias, recently proposed in the fairness literature, can be phrased as quantitative correctness properties; and (iii) present FairSquare, the first verification tool for quantifying program bias, and evaluate it on a range of decision-making programs. Read More

We present a denotational account of dynamic allocation of potentially cyclic memory cells using a monad on a functor category. We identify the collection of heaps as an object in a different functor category equipped with a monad for adding hiding/encapsulation capabilities to the heaps. We derive a monad for full ground references supporting effect masking by applying a state monad transformer to the encapsulation monad. Read More

In program algebra, an algebraic theory of single-pass instruction sequences, three congruences on instruction sequences are paid attention to: instruction sequence congruence, structural congruence, and behavioural congruence. Sound and complete axiom systems for the first two congruences were already given in early papers on program algebra. The current paper is the first one that is concerned with an axiom system for the third congruence. Read More

We argue that the implementation and verification of compilers for functional programming languages are greatly simplified by employing a higher-order representation of syntax known as Higher-Order Abstract Syntax or HOAS. The underlying idea of HOAS is to use a meta-language that provides a built-in and logical treatment of binding related notions. By embedding the meta-language within a larger programming or reasoning framework, it is possible to absorb the treatment of binding structure in the object language into the meta-theory of the system, thereby greatly simplifying the overall implementation and reasoning processes. Read More

Affine $\lambda$-terms are $\lambda$-terms in which each bound variable occurs at most once and linear $\lambda$-terms are $\lambda$-terms in which each bound variables occurs once. and only once. In this paper we count the number of closed affine $\lambda$-terms of size $n$, closed linear $\lambda$-terms of size $n$, affine $\beta$-normal forms of size $n$ and linear $\beta$-normal forms of ise $n$, for different ways of measuring the size of $\lambda$-terms. Read More

We propose a formal approach for relating abstract separation logic library specifications with the trace properties they enforce on interactions between a client and a library. Separation logic with abstract predicates enforces a resource discipline that constrains when and how calls may be made between a client and a library. Intuitively, this can enforce a protocol on the interaction trace. Read More

The astrophysics community uses different tools for computational tasks such as complex systems simulations, radiative transfer calculations or big data. Programming languages like Fortran, C or C++ are commonly present in these tools and, generally, the language choice was made based on the need for performance. However, this comes at a cost: safety. Read More

Linearizability is the standard correctness criterion concurrent data structures such as stacks and queues. It allows to establish observational refinement between a concurrent implementation and an atomic reference implementation.Proving linearizability requires identifying linearization points for each method invocation along all possible computations, leading to valid sequential executions, or alternatively, establishing forward and backward simulations. Read More

Dynamic languages often employ reflection primitives to turn dynamically generated text into executable code at run-time. These features make standard static analysis extremely hard if not impossible because its essential data structures, i.e. Read More

Affiliations: 1Graduate School of Information Sciences, Tohoku University, Sendai, Japan, 2Graduate School of Information Sciences, Tohoku University, Sendai, Japan, 3Graduate School of Information Sciences, Tohoku University, Sendai, Japan

We have implemented an optimization that specializes type-generic array accesses after inlining of polymorphic functions in the native-code OCaml compiler. Polymorphic array operations (read and write) in OCaml require runtime type dispatch because of ad hoc memory representations of integer and float arrays. It cannot be removed even after being monomorphized by inlining because the intermediate language is mostly untyped. Read More


Synchronous programming languages emerged in the 1980s as tools for implementing reactive systems, which interact with events from physical environments and often must do so under strict timing constraints. In this report, we encode inside ATS various real-time primitives in an experimental synchronous language called Prelude, where ATS is a statically typed language with an ML-like functional core that supports both dependent types (of DML-style) and linear types. We show that the verification requirements imposed on these primitives can be formally expressed in terms of dependent types in ATS. Read More