Jan Vitek
Professor · Northeastern University · Software Engineering
Active 1975–2026
About
Jan Vitek is a professor in the Khoury College of Computer Sciences at Northeastern University, based in Boston. His research has led to advances in the theory and practice of modern programming systems, including languages such as Objective-C, JavaScript, and data analytics languages like R, with applications in information security, memory management, and real-time safety-critical systems. Vitek led the team that developed the first real-time Java virtual machine to be deployed on a drone designed by Boeing. Prior to joining Northeastern University, he was a professor and faculty scholar at Purdue University, as well as a co-founder of Fiji Systems and 0xdata. He holds a leadership role in the programming language community as the former chair of the ACM Special Interest Group on Programming Languages, and has served as vice-president of AITO and of the IFIP WG 2.4 on Software Technology. Vitek also chairs the steering committee of the PLDI conference and has been involved in the steering committees of several other prominent conferences.
Research topics
- Computer Science
- Parallel computing
- Operating system
- Programming language
Selected publications
A Typed Intermediate Representation for Dynamic Languages
ACM Transactions on Programming Languages and Systems · 2026-04-25
Article · Senior author
Dynamic programming languages pose significant challenges for optimizing compilers due to features such as dynamic typing, late binding, reflection, copy-on-write, and delayed evaluation. To generate efficient code, compilers must speculate on which dynamic features will be exercised and produce specialized code based on these assumptions. This paper presents the design of a statically typed, high-level intermediate representation that makes dynamic behaviors explicit and amenable to static analysis. Our IR combines gradual typing with ownership tracking, and explicitly represents promises, multiple function versions, and contextual dispatch. Together, these features directly support optimizations such as specialization, inlining, scope elision, and copy elimination. We formalize a core calculus, called FIŘ, that captures the essential features required for these optimizations. We provide an operational semantics, a type system, and flow and reflection analyses, and we prove the soundness of the type system.
Characterizing Type Feedback in Just-in-Time Compilation Artifact
Zenodo (CERN European Organization for Nuclear Research) · 2026-04-13
Other · Open access · Senior author
Type Inference for Functional and Imperative Dynamic Languages
Proceedings of the ACM on Programming Languages · 2026-04-10
Article · Open access · Senior author
In this paper, we formalize a type system based on set-theoretic types for dynamic languages that support both functional and imperative programming paradigms. We adapt prior work in the typing of overloaded and generic functions to support an impure lambda-calculus, focusing on imperative features commonly found in dynamic languages such as JavaScript, Python, and Julia. We introduce a general notion of parametric opaque data types using set-theoretic types, enabling precise modeling of mutable data structures while promoting modularity, clarity, and readability. Finally, we compare our approach to existing work and evaluate our prototype implementation on a range of examples.
2025-07-29
Article · Open access · Senior author
Copy-and-Patch Just-in-Time Compiler for R
2025-10-09
Article · Open access · Senior author
Copy-and-patch is a technique for building baseline just-in-time compilers from existing interpreters. It has been successfully applied to languages such as Lua and Python. This paper reports on our experience using this technique to implement a compiler for the R programming language. We describe how this new compiler integrates with the GNU R virtual machine, present the key optimizations we implemented, and evaluate the feasibility of this approach for R. Copy-and-patch also allows extensions such as integration of the feedback recording required by multi-tier compilation. Our evaluation on 57 programs demonstrates very fast compilation times (980 bytecode instructions per millisecond), reasonable performance gains (1.15x–1.91x speedup over GNU R), and manageable implementation complexity.
Decidable Subtyping of Existential Types for Julia
Proceedings of the ACM on Programming Languages · 2024-06-20 · 2 citations
Article · Open access · Senior author
Julia is a modern scientific-computing language that relies on multiple dispatch to implement generic libraries. While the language does not have a static type system, method declarations are decorated with expressive type annotations to determine when they are applicable. To find applicable methods, the implementation uses subtyping at run-time. We show that Julia’s subtyping is undecidable, and we propose a restriction on types to recover decidability by stratifying types into method signatures over value types—where the former can freely use bounded existential types but the latter are restricted to use-site variance. A corpus analysis suggests that nearly all Julia programs written in practice already conform to this restriction.
2024-10-17 · 1 citation
Article · Senior author
Just-in-time compilers enhance the performance of future invocations of a function by generating code tailored to past behavior. To achieve this, compilers use a data structure, often referred to as a feedback vector, to record information about each function’s invocations. However, over time, feedback vectors tend to become less precise, leading to lower-quality code – a phenomenon known as feedback vector pollution. This paper examines feedback vector pollution within the context of a compiler for the R language. We provide data, discuss an approach to reduce pollution in practice, and implement a proof-of-concept implementation of this approach. The preliminary results of the implementation indicate ∼30% decrease in polluted compilations and ∼37% decrease in function pollution throughout our corpus.
Reusing Just-in-Time Compiled Code
Proceedings of the ACM on Programming Languages · 2023-10-16 · 8 citations
Article · Open access · Senior author
Most code is executed more than once. If not entire programs then libraries remain unchanged from one run to the next. Just-in-time compilers expend considerable effort gathering insights about code they compiled many times, and often end up generating the same binary over and over again. We explore how to reuse compiled code across runs of different programs to reduce warm-up costs of dynamic languages. We propose to use speculative contextual dispatch to select versions of functions from an off-line curated code repository. That repository is a persistent database of previously compiled functions indexed by the context under which they were compiled. The repository is curated to remove redundant code and to optimize dispatch. We assess practicality by extending Ř, a compiler for the R language, and evaluating its performance. Our results suggest that the approach improves warmup times while preserving peak performance.
Deoptless: speculation with dispatched on-stack replacement and specialized continuations
2022-06-02
Article · Open access · Senior author
Just-in-time compilation provides significant performance improvements for programs written in dynamic languages. These benefits come from the ability of the compiler to speculate about likely cases and generate optimized code for these. Unavoidably, speculations sometimes fail and the optimizations must be reverted. In some pathological cases, this can leave the program stuck with suboptimal code. In this paper we propose deoptless, a technique that replaces deoptimization points with dispatched specialized continuations. The goal of deoptless is to take a step towards providing users with a more transparent performance model in which mysterious slowdowns are less frequent and grave.
Recent grants
CSR: CC: Small: Collaborative Research: Language and Runtime Support for Large-Scale Data Analytics
NSF · $137k · 2014–2015
CT-ER: Controlled Declassification with Software Transactional Memory
NSF · $250k · 2007–2010
SHF: Small: Program Analysis for Data Science
NSF · $500k · 2019–2025
SI2-SSE: A Tracing Virtual Machine for Statistical Computing
NSF · $489k · 2010–2013
SHF: PROJECT DARWIN: Towards Principled Language Evolution
NSF · $1.1M · 2016–2022
Frequent coauthors
- Filip Pizlo · Apple (United States) · 27 shared
- Christian Damsgaard Jensen · Technical University of Denmark · 26 shared
- Gerhard Goos · Lancaster University · 25 shared
- Jan Van Leeuwen · Netherlands Institute for Radio Astronomy · 25 shared
- Suresh Jagannathan · 22 shared
- David Pichardie · 22 shared
- Olivier Flückiger · Northeastern University · 21 shared
- Francesco Zappa Nardelli · 21 shared
Labs
Khoury College of Computer Sciences · PI