Michael Franz
· Distinguished Professor and Director of UCI's Secure Systems and Software Laboratory · University of California, Irvine · Computer Science
Active 1975–2025
About
Professor Michael Franz is a Distinguished Professor and the Director of UCI's Secure Systems and Software Laboratory at the UC Irvine Donald Bren School of Information & Computer Sciences. He is an early pioneer in the areas of mobile code and dynamic compilation, having created an early just-in-time compilation system and contributed to the theory and practice of continuous compilation and optimization. Franz co-invented the trace compilation technology that eventually became the JavaScript engine in Mozilla's Firefox browser. His current research emphasizes Software Systems, focusing on compiler, virtual machine, and related system-level techniques aimed at making software safer, faster, or both. His work also encompasses areas such as Computer Security, Trustworthy Computing, and Software Engineering. Dr. Franz has graduated 35 Ph.D. students as their primary advisor and has published more than 140 peer-reviewed research papers. He has secured over $24 million in federal grants, with more than $15 million as the sole principal investigator, and has received significant industry funding in recognition of his research innovations. He holds degrees from the Swiss Federal Institute of Technology, ETH Zurich, including a Doctor of Technical Sciences in Computer Science and a Dipl. Informatik-Ingenieur. Franz is a Fellow of the AAAS, ACM, IEEE, and an Inaugural Fellow of IFIP.
Research topics
- Computer Science
- Operating system
- Computer Security
- Engineering
- Arithmetic
- Programming language
- Embedded system
- Computer network
- Mathematics
Selected publications
2025-05-04 · 1 citations
article · Senior author
We present an optimized implementation of GPT-2 training (fine-tuning) that harnesses AMD's Neural Processing Unit (NPU) for improved throughput and power-efficiency. We use a low-level programming framework, enabling a close mapping of our application to the hardware.<sup>1</sup>
<sup>1</sup> Thank you to Joseph Melber, Kristof Denolf, and Phil James-Roxby from Advanced Micro Devices Inc., for their guidance, help and contributions.
ArXiv.org · 2025-04-03
preprint · Open access · Senior author
There has been a growing interest in executing machine learning (ML) workloads on the client side for reasons of customizability, privacy, performance, and availability. In response, hardware manufacturers have begun to incorporate so-called Neural Processing Units (NPUs) into their processors for consumer devices. Such dedicated hardware optimizes both power efficiency and throughput for common machine learning tasks. AMD's NPU, part of their Ryzen AI processors, is one of the first such accelerators integrated into a chip with an x86 processor. AMD supports bare-metal programming of their NPU rather than limiting programmers to pre-configured libraries. In this paper, we explore the potential of using a bare-metal toolchain to accelerate the weight fine-tuning of a large language model, GPT-2, entirely on the client side using the AMD NPU. Fine-tuning on the edge allows for private customization of a model to a specific use case. To the best of our knowledge, this is the first time such an accelerator has been used to perform training on the client side. We offload time-intensive matrix multiplication operations from the CPU onto the NPU, achieving a speedup of over 2.8x for these operations. This improves end-to-end performance of the model in terms of throughput (1.7x and 1.2x speedup in FLOPS/s on mains and battery power, respectively) and energy efficiency (1.4x improvement in FLOPS/Ws on battery power). We detail our implementation approach and present an in-depth exploration of the NPU hardware and bare-metal tool-flow.
I’ll Be There for You! Perpetual Availability in the A<sup>8</sup> MVX System
2024-12-09 · 2 citations
article
Multi-variant execution (MVX) is a low-friction approach to increase the security of critical software applications. MVX systems execute multiple diversified implementations of the same software in lockstep on the same inputs, while monitoring each variant’s behavior. MVX systems can detect attacks quickly and with high probability, because low-level vulnerabilities are unlikely to manifest in precisely the same manner across sufficiently diversified variants. Existing MVX systems terminate execution when they detect a divergence in behavior between variants. In this paper, we present A<sup>8</sup>, which we believe is the first full-scale survivable MVX system that not only detects attacks as they happen, but is also able to recover from them. Our implementation is comprised of two parts, an MVX portion that leverages the natural heterogeneity of variants running on diverse platforms (ARM64 and x86_64), and a checkpoint/restore portion that periodically creates snapshots of the variants’ states and forces variants to roll back to those snapshots upon detection of any irregular behavior. In this way, A<sup>8</sup> achieves availability even in the face of continuous remote attacks. We consider several design choices and evaluate their security and performance trade-offs using microbenchmarks. Chiefly among these, we devise a system call interposition and monitor implementation approach that provides secure isolation of the MVX monitor, minimal kernel changes (small privileged TCB), and low overheads – a combination not before seen in the context of MVX.
We also perform a real-world evaluation of our system on two popular web servers, lighttpd and nginx, and the database server redis, which are able to maintain 53%-71% of their throughput compared to native execution.
IEEE Security & Privacy · 2024-06-14 · 3 citations
article · Senior author
Probabilistic memory safety combines randomization and replication in the hope that attacks will lead to observable differences across the replicas and hence be detected. It has evolved from simple heap-data protection to full-fledged survivability, harnessing checkpoint/restore facilities and hardware heterogeneity.
What You Trace is What You Get: Dynamic Stack-Layout Recovery for Binary Recompilation
2024-04-22 · 4 citations
article · Open access · Senior author
Users of proprietary and/or legacy programs without vendor support are denied the significant advances in compiler technologies of the past decades. Adapting these technologies to operate directly on binaries without source code is often infeasible. Binary recompilers attempt to bridge this gap by "lifting" binary executables to compiler-level intermediate representations (IR) and "lowering" them back down to executable form, enabling application of the full range of analyses and transformations available in modern compiler infrastructures. Past approaches could not recover local variables in lifted programs with sufficient precision, which is a necessary prerequisite for many compiler-related applications, including performance optimization. They have relied on heuristics failing on certain input programs, or on conservative over-approximations yielding imprecise results.
Polynima: Practical Hybrid Recompilation for Multithreaded Binaries
2024-04-18 · 3 citations
article · Open access · Senior author
The maintenance of software distributed in its binary form can become challenging over time, due to the lack of vendor support or obsolete build environments. This can be costly when dealing with critical security vulnerabilities that are difficult to fix on a binary level. Moreover, advances in compiler technologies of the past decades remain unavailable to the users of such legacy binaries for performing optimizations and transformations. Binary recompilers aim to bridge this divide by "lifting" binary executables to compiler-level intermediate representations (IR) and "lowering" them back again. But, current recompilers fail on that promise as they rely on unsound heuristics or impose high tracing overheads. Crucially, no existing recompiler addresses the specific challenges imposed by multithreaded programs that are ubiquitous in the modern software space.
The Ticket Price Matters in Sharding Blockchain
Lecture notes in computer science · 2023-01-01
book-chapter
2023-11-30
article · Open access · Senior author
Differential throughput estimation, i.e., predicting the performance impact of software changes, is critical when developing applications that rely on accurate timing bounds, such as automotive, avionic, or industrial control systems. However, developers often lack access to the target hardware to perform on-device measurements, and hence rely on instruction throughput estimation tools to evaluate performance impacts.
DFI: An Interprocedural Value-Flow Analysis Framework that Scales to Large Codebases
arXiv (Cornell University) · 2022-09-06
preprint · Open access · Senior author
Context- and flow-sensitive value-flow information is an important building block for many static analysis tools. Unfortunately, current approaches to compute value-flows do not scale to large codebases, due to high memory and runtime requirements. This paper proposes a new scalable approach to compute value-flows via graph reachability. To this end, we develop a new graph structure as an extension of LLVM IR that contains two additional operations which significantly simplify the modeling of pointer aliasing. Further, by processing nodes in the opposite direction of SSA def-use chains, we are able to minimize the tree width of the resulting graph. This allows us to employ efficient tree traversal algorithms in order to resolve graph reachability. We present a value-flow analysis framework, DFI, implementing our approach. We compare DFI against two state-of-the-art value-flow analysis frameworks, Phasar and SVF, to extract value-flows from 4 real-world software projects. Given 32GB of memory, Phasar and SVF are unable to complete analysis of larger projects such as OpenSSL or FFmpeg, while DFI is able to complete all evaluations. For the subset of benchmarks that Phasar and SVF do handle, DFI requires significantly less memory (1.5% of Phasar's, 6.4% of SVF's memory footprint on average) and runs significantly faster (23x speedup over Phasar, 57x compared to SVF). Our analysis shows that, in contrast to previous approaches, DFI's memory and runtime requirements scale almost linearly with the number of analyzed instructions.
Improving cross-platform binary analysis using representation learning via graph alignment
2022-07-15 · 25 citations
article
Cross-platform binary analysis requires a common representation of binaries across platforms, on which a specific analysis can be performed. Recent work proposed to learn low-dimensional, numeric vector representations (i.e., embeddings) of disassembled binary code, and perform binary analysis in the embedding space. Unfortunately, however, existing techniques fall short in that they are either (i) specific to a single platform producing embeddings not aligned across platforms, or (ii) not designed to capture the rich contextual information available in a disassembled binary.
Recent grants
NSF · $406k · 2006–2010
NSF · $619k · 2015–2018
NSF · $500k · 2016–2020
NSF · $2.0M · 2002–2006
Practical Language-Based Security, From The Ground Up
NSF · $300k · 2002–2005
Frequent coauthors
- Per Larsen · 47 shared
- Stefan Brunthaler · 37 shared
- Andreas Gal · 33 shared
- Stijn Volckaert · 22 shared
- Christian Wimmer · 19 shared
- Ahmad‐Reza Sadeghi · Technical University of Darmstadt · 14 shared
- Lucas Davi · 13 shared
- Jeffery von Ronne · Google (United States) · 13 shared
Awards & honors
- Fellow of the American Association for the Advancement of Sc…
- Fellow of the Association for Computing Machinery (ACM)
- Fellow of the Institute of Electrical and Electronics Engine…
- Inaugural Fellow of the International Federation for Informa…
- 2020 ACM Thacker Breakthrough in Computing Award