Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to start · No credit card · Cancel anytime
Henry Hoffmann

Associate Professor of Computer Science · Verified

University of Chicago · Computer Science

Active 1930–2026

h-index: 42
Citations: 7.2k
Papers: 201 (52 in the last 5 years)
Funding: $1.9M

About

Henry Hoffmann is the Liew Family Chair for the Department of Computer Science at the University of Chicago. He was granted early tenure in 2018 and received the Presidential Early Career Award for Scientists and Engineers (PECASE) in 2019. He is a member of the ASPLOS Hall of Fame and leads the Self-aware Computing group (SEEC project), which conducts research on adaptive techniques for managing power, energy, accuracy, and performance in computing systems.

His work focuses on building self-aware computing systems that understand high-level goals and automatically adapt their behavior to meet those goals optimally, integrating disciplines such as operating systems, computer architecture, control theory, and machine learning. His research areas include self-aware and adaptive computing, computer systems, and computer architecture, with an emphasis on modern computer systems that must balance multiple, often competing, goals such as high performance and low energy consumption.

Hoffmann has spent the last 18 years working on multicore architectures and system software in both academia and industry. This includes contributions to the Raw processor at MIT and at Tilera Corporation, where his implementation of the BDTI Communications Benchmark on Tilera's 64-core TILE64 processor achieved the highest certified performance of any programmable processor.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Machine Learning
  • Computer Security
  • Programming language
  • Software engineering
  • Real-time computing
  • Data science
  • Distributed computing
  • Multimedia
  • Simulation
  • Human–computer interaction
  • Operating system
  • Computer network

Selected publications

  • WASL: Harmonizing Uncoordinated Adaptive Modules in Multi-Tenant Cloud Systems

    2026-04-23

    article · Open access · Senior author

    Modern cloud applications increasingly rely on adaptive control modules, such as dynamic resource tuning or system reconfiguration, to meet strict quality-of-service (QoS) objectives. However, when multiple independently developed adaptation modules are colocated on a shared infrastructure, their uncoordinated behavior causes interference leading to QoS violations. Existing approaches require centralized control or inter-module communication, violating modularity and limiting adoption in multi-tenant environments.

  • Keeper: Automated Testing and Fixing of Machine Learning Software—RCR Report

    ACM Transactions on Software Engineering and Methodology · 2025-06-05

    article

    This artifact aims to provide source code, benchmark suite, results, and materials used in our study “Keeper: Automated Testing and Fixing of Machine Learning Software” [3]. We developed an automated testing and fixing tool Keeper and its IDE plugin for ML software. It automatically detects software defects and attempts to change how ML APIs are used to alleviate software misbehavior. This artifact provides guidelines to set up and execute Keeper and also guidelines to interpret our evaluation results. We hope this artifact can motivate and help future research to further tackle ML API misuses. All related data are available online.

  • A Deep Probabilistic Framework for Continuous Time Dynamic Graph Generation

    Proceedings of the AAAI Conference on Artificial Intelligence · 2025-04-11

    article · Open access · Senior author

    Recent advancements in graph representation learning have shifted attention towards dynamic graphs, which exhibit evolving topologies and features over time. The increased use of such graphs creates a paramount need for generative models suitable for applications such as data augmentation, obfuscation, and anomaly detection. However, there are few generative techniques that handle continuously changing temporal graph data; existing work largely relies on augmenting static graphs with additional temporal information to model dynamic interactions between nodes. In this work, we propose a fundamentally different approach: We instead directly model interactions as a joint probability of an edge forming between two nodes at a given time. This allows us to autoregressively generate new synthetic dynamic graphs in a largely assumption free, scalable, and inductive manner. We formalize this approach as DG-Gen, a generative framework for continuous time dynamic graphs, and demonstrate its effectiveness over five datasets. Our experiments demonstrate that DG-Gen not only generates higher fidelity graphs compared to traditional methods but also significantly advances link prediction tasks.

  • Quality Measures for Dynamic Graph Generative Models

    ArXiv.org · 2025-03-03

    preprint · Open access · Senior author

    Deep generative models have recently achieved significant success in modeling graph data, including dynamic graphs, where topology and features evolve over time. However, unlike in vision and natural language domains, evaluating generative models for dynamic graphs is challenging due to the difficulty of visualizing their output, making quantitative metrics essential. In this work, we develop a new quality metric for evaluating generative models of dynamic graphs. Current metrics for dynamic graphs typically involve discretizing the continuous evolution of graphs into static snapshots and then applying conventional graph similarity measures. This approach has several limitations: (a) it models temporally related events as i.i.d. samples, failing to capture the non-uniform evolution of dynamic graphs; (b) it lacks a unified measure that is sensitive to both features and topology; (c) it fails to provide a scalar metric, requiring multiple metrics without clear superiority; and (d) it requires explicitly instantiating each static snapshot, leading to impractical runtime demands that hinder evaluation at scale. We propose a novel metric based on the Johnson-Lindenstrauss lemma, applying random projections directly to dynamic graph data. This results in an expressive, scalar, and application-agnostic measure of dynamic graph similarity that overcomes the limitations of traditional methods. We also provide a comprehensive empirical evaluation of metrics for continuous-time dynamic graphs, demonstrating the effectiveness of our approach compared to existing methods. Our implementation is available at https://github.com/ryienh/jl-metric.

  • A 22nm Coarse-Grained Reconfigurable Array with Novel Features for Machine Learning and Digital Signal Processing

    2025-10-28

    article

    This article presents a highly compact Coarse-Grained Reconfigurable Array (CGRA) specialized for processing Digital Signal Processing (DSP) and Machine Learning (ML) operations with an outstanding micro-architectural efficiency. The CGRA consists of high functionality Processing Elements (PEs) supported by strategically placed interconnections and bidirectional data buffers made of programmable cyclic registers. These novel features accelerate large length correlations, Fast Fourier Transforms and other DSP/ML related functions. It is a resource compact CGRA with very small dimensions, i.e., 4 × 4 PEs and synthesized using a 22nm CMOS technology. The design of CGRA has an AMBA interface making it an industry standard coprocessor for a system-on-chip. The novelty presented in this paper is an accepted United States patent.

  • Lupe: Integrating the Top-down Approach with DNN Execution on Ultra-Low-Power Devices

    2025-05-04 · 1 citation

    article · Open access · Senior author

    Executing deep neural networks (DNNs) on ultra-low-power (ULP) microcontrollers creates enormous opportunities for new intelligent edge applications. However, manually writing optimized DNN programs for ULP devices is time consuming and error prone due to the difficulty of managing on-device accelerators. Many prior works address this problem by creating special libraries that tailor common DNN building blocks for unique accelerators of ULP devices. This is a bottom-up approach, as developers build DNNs by assembling library calls. Unfortunately, the encapsulation overhead inherent in this approach greatly reduces accelerator utilization and overall performance. Instead, we advocate for a top-down approach. We present Lupe, a code generation framework, that converts high-level DNN algorithm descriptions to ULP-optimized code. Lupe provides top-down intermittent support that significantly reduces overhead while maintaining intermittent safety. We demonstrate Lupe's benefits on an MSP430 [54], achieving 12.36× and 2.22× average speedup over two prior works across a variety of DNN models in continuous power. Moreover, Lupe reduces the average intermittent runtime costs of prior works by 96.65% and 71.15%, respectively.

  • Position Paper: Voltage Change is not Energy Consumption

    2025-05-06

    article · Open access · Senior author

    Energy-harvesting sensors utilize local, ambient energy resources to operate and thus eliminate the need for batteries. A key challenge for such systems is avoiding power failures during application execution. Energy-aware runtimes avoid such failures by reasoning about the task's energy consumption and the current energy available to the system. However, the energy consumption estimates profiled by prior work fail to account for incoming energy, producing incorrect energy consumption estimates which could lead to power failures and missed deadlines. This work analyzes the impact of incoming energy on the profiled energy consumption and argues that future energy-aware runtimes must be mindful of harvested energy when profiling a task's energy consumption.

  • SwiftSpec: Ultra-Low Latency LLM Decoding by Scaling Asynchronous Speculative Decoding

    ArXiv.org · 2025-06-12

    preprint · Open access

    Low-latency decoding for large language models (LLMs) is crucial for applications like chatbots and code assistants, yet generating long outputs remains slow in single-query settings. Prior work on speculative decoding (which combines a small draft model with a larger target model) and tensor parallelism has each accelerated decoding. However, conventional approaches fail to apply both simultaneously due to imbalanced compute requirements (between draft and target models), KV-cache inconsistencies, and communication overheads under small-batch tensor-parallelism. This paper introduces SwiftSpec, a system that targets ultra-low latency for LLM decoding. SwiftSpec redesigns the speculative decoding pipeline in an asynchronous and disaggregated manner, so that each component can be scaled flexibly and remove draft overhead from the critical path. To realize this design, SwiftSpec proposes parallel tree generation, tree-aware KV cache management, and fused, latency-optimized kernels to overcome the challenges listed above. Across 5 model families and 6 datasets, SwiftSpec achieves an average of 1.75x speedup over state-of-the-art speculative decoding systems and, as a highlight, serves Llama3-70B at 348 tokens/s on 8 Nvidia Hopper GPUs, making it the fastest known system for low-latency LLM serving at this scale.

  • WatchHAR: Real-time On-device Human Activity Recognition System for Smartwatches

    2025-10-11 · 1 citation

    preprint · Open access

    Despite advances in practical and multimodal fine-grained Human Activity Recognition (HAR), a system that runs entirely on smartwatches in unconstrained environments remains elusive. We present WatchHAR, an audio and inertial-based HAR system that operates fully on smartwatches, addressing privacy and latency issues associated with external data processing. By optimizing each component of the pipeline, WatchHAR achieves compounding performance gains. We introduce a novel architecture that unifies sensor data preprocessing and inference into an end-to-end trainable module, achieving 5x faster processing while maintaining over 90% accuracy across more than 25 activity classes. WatchHAR outperforms state-of-the-art models for event detection and activity classification while running directly on the smartwatch, achieving 9.3 ms processing time for activity event detection and 11.8 ms for multimodal activity classification. This research advances on-device activity recognition, realizing smartwatches' potential as standalone, privacy-aware, and minimally-invasive continuous activity tracking devices.

  • KVMSR+UDWeave: Extreme-Scaling with Fine-grained Parallelism on the UpDown Graph Supercomputer

    2025-11-07 · 1 citation

    article

    Programming irregular graph applications is challenging on today’s scalable supercomputers. We describe a novel programming model, KVMSR+UDWeave, that supports extreme scaling by exposing fine-grained parallelism. By enabling the expression of maximum parallelism, it opens the door for extreme scaling, even on both small and large graph problems.

Education

  • Ph.D., Electrical Engineering and Computer Science

    Massachusetts Institute of Technology (MIT)

    2011
  • Other

    Massachusetts Institute of Technology (MIT)

Awards & honors

  • Presidential Early Career Award for Scientists and Engineers…
  • ASPLOS Hall of Fame
  • DOE Early Career Award (2015)
  • Most Influential Paper Award, SEAMS (2025)
  • Test of Time Award Honorable Mention, IEEE Micro (2021)