This short course will introduce students to LLVM and run through the main steps of implementing a just-in-time compiler for a small dynamic language like, say, R. We will discuss the choice of intermediate representation, compilation to LLVM IR, static analysis and optimization, runtime services and garbage collection. The lecture will include hands-on components, so install LLVM before coming to Bertinoro!
Many of today’s programming languages are broken. Poor performance, lack of features and hard-to-reason-about semantics can cost dearly in software maintenance and inefficient execution. An important reason for this brokenness is that much of language design is implementation-driven. The difficulties in implementation and insufficient understanding of concepts bake bad designs into the language itself. Concurrency, architectural details and garbage collection are three fundamental concerns that contribute much to the complexities of implementing managed languages. In this lecture I will describe the micro virtual machine, a thin abstraction designed specifically to relieve implementers of managed languages of the most fundamental implementation challenges that currently impede good design. The micro virtual machine targets abstractions over memory (garbage collection), architecture (compiler backend), and concurrency. I will motivate the micro virtual machine and give an account of the design and initial experience of a concrete instance, which we call Mu, built over a four-year period. Our goal is to remove an important barrier to performant and semantically sound managed language design and implementation.
Writing JIT compilers for dynamic languages by hand is an arduous and heroic process, particularly for dynamic languages with “interesting” semantics. Meta-tracing is an approach for reducing that effort; VM authors only write an interpreter and some hints. In this hands-on session students will implement a small dynamic language from scratch, as well as a JIT, using the RPython meta-tracing language that underlies the PyPy project.
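To give a flavor of what "an interpreter and some hints" looks like, here is a toy bytecode interpreter in plain Python, structured the way an RPython meta-tracing interpreter would be. In the real RPython version one would import `JitDriver` from `rpython.rlib.jit` and place `jit_merge_point` / `can_enter_jit` calls in the dispatch loop; here those hints appear only as comments so the sketch runs under ordinary Python. The bytecode set and all names are invented for this sketch.

```python
# Toy bytecodes for a counting loop (invented for this sketch).
PUSH, JUMP_IF_LE, HALT = 0, 1, 2

def interpret(code, limit):
    """Dispatch loop of a tiny interpreter: `acc` accumulates values and
    JUMP_IF_LE loops back while acc <= limit."""
    pc, acc = 0, 0
    while True:
        # RPython hint would go here, telling the tracer which variables
        # identify a loop position (greens) and which are runtime state (reds):
        #   jitdriver.jit_merge_point(pc=pc, code=code, acc=acc)
        op = code[pc]
        if op == PUSH:
            acc += code[pc + 1]
            pc += 2
        elif op == JUMP_IF_LE:
            if acc <= limit:
                pc = code[pc + 1]
                # Back-edge: in RPython, jitdriver.can_enter_jit(...) here
                # lets the tracer start compiling this hot loop.
            else:
                pc += 2
        elif op == HALT:
            return acc

# Program: add 2 to acc, loop while acc <= limit, then halt.
program = [PUSH, 2, JUMP_IF_LE, 0, HALT]
```

With `limit=10` the loop runs until the accumulator first exceeds the limit, so `interpret(program, 10)` returns 12. The meta-tracing point is that only this interpreter (plus the hints) is written by hand; the JIT itself is derived by RPython.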
Whenever the need to compile a new dynamically typed language arises, an appealing option is to repurpose an existing statically typed language Just-In-Time (JIT) compiler. This lecture will examine the benefits and limitations of the repurposed JIT approach. How far can JITs be stretched?
Building any kind of high-performance garbage collector for a complete language implementation is hard. High performance requires careful attention to detail, and bugs are notoriously difficult to discover. Adding copying and full concurrency (on-the-fly collection) to the mix makes the task even harder. In this talk, I'll explain how we built a high-performance, replication copying, fully concurrent collector for Java, focusing in particular on how we obtained confidence that the collector was correct.
This lecture will cover topics related to static analysis, abstract interpretation and control flow analysis.
In this lecture I will first give a gentle introduction to abstract interpretation, compare it with other formal methods, and explain why abstract interpretation is so successful in industry (spoiler alert: it focuses on the properties of interest and it scales up). I will then illustrate the applications of abstract interpretation at Facebook to analyze tens of millions of lines of code with a very low false-positive ratio.
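As a small taste of abstract interpretation, here is a sketch of an analyzer over the classic sign domain {NEG, ZERO, POS, TOP}, evaluating arithmetic expressions on abstract values instead of numbers. The expression encoding and all names are invented for this sketch; real industrial analyzers use far richer domains.

```python
# Sign domain: the abstract values a number can take, plus TOP ("any sign").
NEG, ZERO, POS, TOP = "neg", "zero", "pos", "top"

def alpha(n):
    """Abstraction: map a concrete number to its sign."""
    if n < 0:
        return NEG
    return ZERO if n == 0 else POS

def abs_add(a, b):
    """Sound abstract addition: zero is the identity; equal nonzero signs
    are preserved; anything else (e.g. pos + neg) could be any sign."""
    if ZERO in (a, b):
        return b if a == ZERO else a
    if a == b and a in (NEG, POS):
        return a
    return TOP

def abs_mul(a, b):
    """Sound abstract multiplication via the rule of signs."""
    if ZERO in (a, b):
        return ZERO
    if TOP in (a, b):
        return TOP
    return POS if a == b else NEG

def analyze(expr, env):
    """Abstractly evaluate a nested-tuple expression under an abstract env."""
    tag = expr[0]
    if tag == 'num':
        return alpha(expr[1])
    if tag == 'var':
        return env[expr[1]]
    left, right = analyze(expr[1], env), analyze(expr[2], env)
    return abs_add(left, right) if tag == 'add' else abs_mul(left, right)
```

For example, the analyzer proves `x*x + 1` is positive when `x` is known negative, without ever running the program; conversely `1 + (-1)` yields TOP, an over-approximation that is imprecise but still sound.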
This talk will focus on methods for testing compilers to find defects early; these methods are readily applicable to rapidly evolving compiler code bases, such as the gcc and LLVM open-source compiler projects. I'll cover the major techniques investigated in the recent research literature: differential testing, equivalence modulo inputs testing, and program enumeration, explaining what each is, what its benefits and drawbacks are, and how they complement each other. I will finish by giving an overview of an ongoing project, GLFuzz, a technique for testing compilers associated with OpenGL, focusing both on technical challenges and on experiences from talking to GPU vendors.
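The core of differential testing can be sketched in a few lines: generate random programs and cross-check two independent implementations against each other, flagging any input on which they disagree. The toy "compilers" below are two evaluators for randomly generated arithmetic expressions (one is Python's own `eval`); everything here is invented for the sketch, not taken from any of the projects the talk covers.

```python
import random

def gen_expr(rng, depth):
    """Random fully parenthesized arithmetic expression over digits 0-9."""
    if depth == 0:
        return str(rng.randint(0, 9))
    op = rng.choice(['+', '-', '*'])
    return '(%s %s %s)' % (gen_expr(rng, depth - 1), op, gen_expr(rng, depth - 1))

def eval_ref(src):
    """Reference implementation: Python's own evaluator."""
    return eval(src)

def eval_alt(src):
    """Implementation under test: a tiny recursive-descent evaluator."""
    def parse(tokens, i):
        t = tokens[i]
        if t == '(':
            left, i = parse(tokens, i + 1)
            op = tokens[i]
            right, i = parse(tokens, i + 1)
            assert tokens[i] == ')'
            val = {'+': left + right, '-': left - right, '*': left * right}[op]
            return val, i + 1
        return int(t), i + 1
    tokens = src.replace('(', ' ( ').replace(')', ' ) ').split()
    val, _ = parse(tokens, 0)
    return val

def differential_test(trials=200, seed=0):
    """Cross-check the two evaluators on random inputs; any disagreement
    is a bug in (at least) one of them."""
    rng = random.Random(seed)
    for _ in range(trials):
        src = gen_expr(rng, rng.randint(1, 4))
        assert eval_ref(src) == eval_alt(src), "disagreement on: " + src
    return trials
```

Real compiler fuzzers must also generate only well-defined programs (so that disagreement implies a bug rather than undefined behavior) and reduce failing inputs to small test cases; both are hard problems in their own right.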
Julia is a multimethod-based language with an unusually powerful dispatch system that has been useful for organizing large numerical libraries. Although Julia is dynamically typed, it is also designed to support static analysis and generation of efficient code. In the first part of this talk, I'll discuss our design considerations and the techniques needed to make the design work in practice. In the second part, I'll cover lower-level implementation issues, including our experience with LLVM, and general gotchas encountered when interfacing with real-world systems software.
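To illustrate what multimethod dispatch means, here is a toy multiple-dispatch table in Python: the implementation is chosen by the runtime types of all arguments, preferring the most specific applicable signature. This is only a sketch of the idea (with a deliberately naive specificity rule); Julia's dispatch is far more sophisticated, and all names here are invented.

```python
class MultiMethod:
    """A callable that dispatches on the types of all its arguments."""

    def __init__(self):
        self.table = []  # list of (signature, implementation) pairs

    def register(self, *sig):
        def deco(fn):
            self.table.append((sig, fn))
            return fn
        return deco

    def __call__(self, *args):
        # A method is applicable if every argument matches its signature.
        applicable = [(sig, fn) for sig, fn in self.table
                      if len(sig) == len(args)
                      and all(isinstance(a, t) for a, t in zip(args, sig))]
        if not applicable:
            raise TypeError("no matching method")
        # Naive specificity: prefer a signature whose types are subclasses
        # of the current best's types (real systems need a proper order).
        best = applicable[0]
        for sig, fn in applicable[1:]:
            if all(issubclass(t1, t2) for t1, t2 in zip(sig, best[0])):
                best = (sig, fn)
        return best[1](*args)

combine = MultiMethod()

@combine.register(int, int)
def _(a, b):
    return a + b

@combine.register(str, object)
def _(a, b):
    return a + str(b)
```

So `combine(2, 3)` selects the `(int, int)` method and returns 5, while `combine("x", 7)` selects `(str, object)` and returns `"x7"`. The static-analysis challenge the talk addresses is precisely to resolve such dispatches at compile time often enough to generate efficient code.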
It's no secret that Scala is a powerful language, whether you want to use it as a host language for embedding your own DSLs or to lift as much of your computation to the type level as possible. What many people don't know, however, is that some of these language features can be combined to let you make modest extensions to the compiler as libraries. In this talk, I'll walk you through some of Scala's more advanced features that enable you to build your own compiler extensions, mostly as libraries!
Encore is a new, open source, experimental object-oriented programming language developed in the UPSCALE EU project. Encore's starting point is the actor model with its inherent support for concurrent programming. Encore goes beyond the actor model, adding futures and active object-style asynchronous structured programming. This lecture will focus on how Encore extends the actor model with so-called hot objects, and how that integration hits all parts of language design: surface syntax, type system, scheduling and garbage collection.
Pony is a new, actor-based, strongly typed programming language designed to make it easy to write efficient, safe concurrent programs. We will discuss the rationale for the development of Pony, give a short introduction to the actor paradigm, introduce the type system, and show how it supports the development of data-race-free programs in which actors share mutable state and pass messages without copying. We will develop some small demonstrator examples. We will also discuss causality: what it is and what it buys the programmer. We will outline salient parts of Pony’s implementation: the runtime organization, message queues, the scheduler, and garbage collection, and will show how the type system and causality were employed to allow these to be implemented efficiently.
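To make the runtime pieces mentioned above concrete (mailboxes, message passing without copying, a scheduler), here is a minimal cooperative actor runtime sketched in Python: each actor owns a FIFO mailbox, and a round-robin scheduler delivers one message at a time per actor until quiescence. Pony's real runtime uses work-stealing schedulers, per-actor heaps and a concurrent GC, none of which is captured here; all names are invented for this sketch.

```python
from collections import deque

class Actor:
    """An actor owns a mailbox; sending only enqueues a message reference."""

    def __init__(self):
        self.mailbox = deque()

    def send(self, msg):
        self.mailbox.append(msg)

    def receive(self, msg, runtime):
        raise NotImplementedError

class Runtime:
    """A toy round-robin scheduler: sweep the actors, delivering at most
    one message each, until every mailbox is empty (quiescence)."""

    def __init__(self, actors):
        self.actors = actors

    def run(self):
        progress = True
        while progress:
            progress = False
            for actor in self.actors:
                if actor.mailbox:
                    actor.receive(actor.mailbox.popleft(), self)
                    progress = True

class Relay(Actor):
    """Forwards each message to a target actor -- the message object itself
    is passed along, not copied."""

    def __init__(self, target):
        super().__init__()
        self.target = target

    def receive(self, msg, runtime):
        self.target.send(msg)

class Counter(Actor):
    def __init__(self):
        super().__init__()
        self.count = 0

    def receive(self, msg, runtime):
        if msg == "inc":
            self.count += 1

counter = Counter()
relay = Relay(counter)
for _ in range(3):
    relay.send("inc")
Runtime([relay, counter]).run()
```

Because each actor processes its mailbox sequentially, an actor's own state needs no locks; Pony's type system is what additionally guarantees that the messages themselves can be shared or transferred without data races.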
Web services replicate application state and logic across multiple replicas within and across data centers. Replication is intended not only to improve application throughput and reduce user-perceived latency, but also to tolerate partial failures. Strong consistency models such as serializability, while easy to understand and specify, mask the realities underlying large-scale distributed systems with respect to non-uniform latency, availability, and network partitions. Web services, which aim to provide an "always-on" experience, overwhelmingly favor availability and partition tolerance over strong consistency. To this end, several weak consistency models such as eventual consistency have been deployed. Weakly consistent systems complicate program reasoning by allowing concurrent conflicting updates and by exposing inconsistencies that may violate high-level program invariants. We will study the semantics of such systems, and explore abstractions and formalisms that can be brought to bear to tame their complexity.
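One well-known abstraction for taming weak consistency is the conflict-free replicated data type (CRDT). The sketch below shows a grow-only counter (G-counter): each replica increments only its own slot, and merging takes the pointwise maximum, so replicas converge to the same value regardless of the order in which updates and merges are delivered. This is a standard textbook construction, written here as a minimal Python sketch with invented names.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica; merge is pointwise max."""

    def __init__(self, replica_id, n_replicas):
        self.replica_id = replica_id
        self.slots = [0] * n_replicas

    def increment(self):
        # A replica only ever increments its own slot.
        self.slots[self.replica_id] += 1

    def value(self):
        return sum(self.slots)

    def merge(self, other):
        # Pointwise max is commutative, associative and idempotent --
        # exactly the properties that make merge order-insensitive.
        self.slots = [max(a, b) for a, b in zip(self.slots, other.slots)]

a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); a.increment()  # two updates at replica A
b.increment()                 # one concurrent update at replica B
a.merge(b)                    # anti-entropy exchange, in either direction
b.merge(a)
```

After the exchange both replicas report the value 3 with identical state, despite the concurrent conflicting updates. The harder question the lecture addresses is which high-level program invariants survive under such merge-based semantics, and which require stronger coordination.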