Swarat Chaudhuri

Professor · Verified

University of Texas at Austin · Computer Science

Active 2003–2026

h-index: 33

Citations: 5.0k

Papers: 195 (72 in the last 5 years)

Funding: $2.1M

About

Swarat Chaudhuri is a professor specializing in Artificial Intelligence, Formal Methods, Programming Languages, and Compilers. His research focuses on advancing the theoretical foundations and practical applications of these areas, contributing to the development of more reliable, efficient, and secure computing systems. As a faculty member at UT Austin, he is involved in teaching and mentoring students, as well as leading research initiatives that push the frontiers of computer science in these domains.

Research topics

Computer Science
Programming language
Artificial Intelligence
Machine Learning
Embedded system
Theoretical computer science
Algorithm
Engineering

Selected publications

Formal Reasoning Meets LLMs: Toward AI for Mathematics and Verification
Communications of the ACM · 2026-02-10
article · Open access
Formal reasoning has long been fundamental to various fields of computer science, including AI in its early days. In contrast, large language models (LLMs)—the cornerstone of modern AI—perform reasoning through autoregressive next-word prediction, without grounding their outputs in formal systems. This “informal” approach can learn world knowledge and reasoning patterns from large-scale data without rigid formalization. It offers significantly greater flexibility than traditional formal reasoning and has achieved promising results on many benchmarks. However, LLMs heavily rely on data and do not guarantee the soundness of reasoning. In this article, we highlight recent efforts to integrate modern LLMs with formal methods, an approach that seeks to harness the strengths of both paradigms. Such integration has the potential to lead to major advancements in AI-driven mathematics, formal verification, and the verifiable generation of computer systems.
FormulaCode: Evaluating Agentic Optimization on Large Codebases
ArXiv.org · 2026-03-16
article · Open access
Large language model (LLM) coding agents increasingly operate at the repository level, motivating benchmarks that evaluate their ability to optimize entire codebases under realistic constraints. Existing code benchmarks largely rely on synthetic tasks, binary correctness signals, or single-objective evaluation, limiting their ability to assess holistic optimization behavior. We introduce FormulaCode, a benchmark for evaluating agentic optimization on large, real-world codebases with fine-grained, multi-objective performance metrics. FormulaCode comprises 957 performance bottlenecks mined from scientific Python repositories on GitHub, each paired with expert-authored patches and, on average, 264.6 community-maintained performance workloads per task, enabling holistic evaluation of LLM agents' ability to optimize codebases under realistic correctness and performance constraints. Our evaluations reveal that repository-scale, multi-objective optimization remains a major challenge for frontier LLM agents. Project website: https://formula-code.github.io
An Improved Last-Iterate Convergence Rate for Anchored Gradient Descent Ascent
arXiv (Cornell University) · 2026-04-04
preprint · Open access · Senior author
We analyze the last-iterate convergence of the Anchored Gradient Descent Ascent algorithm for smooth convex-concave min-max problems. While previous work established a last-iterate rate of $\mathcal{O}(1/t^{2-2p})$ for the squared gradient norm, where $p \in (1/2, 1)$, it remained an open problem whether the improved exact $\mathcal{O}(1/t)$ rate is achievable. In this work, we resolve this question in the affirmative. This result was discovered autonomously by an AI system capable of writing formal proofs in Lean. The Lean proof can be accessed at https://github.com/google-deepmind/formal-conjectures/pull/3675/commits/a13226b49fd3b897f4c409194f3bcbeb96a08515
Coadaptive Value Alignment
2026-05-24
article
The integration of autonomous agents into human society is a grand challenge for AI. In order to achieve widespread acceptance, agents must conform to the values of people with whom they interact. Current approaches treat the value alignment problem as a unidirectional interaction where the aim is to imbue an agent's actions with human values. Our Coadaptive Value Alignment paradigm acknowledges that human perceptions, expectations, and values continuously evolve in response to agent actions. We conceptualize human-agent interaction as an adaptive loop where the agent actively models and intentionally influences the human's perception, rather than just acting according to static human values. For instance, unlike a traditional agent that simply maximizes speed, an adaptive agent could detect a drop in user trust and strategically sacrifice short-term efficiency to repair the relationship. This perspective transforms value alignment into a multi-agent challenge where all actors must identify and adhere to a shared, implicit social contract. The opportunity to create a virtuous cycle of self-improvement is accompanied by the risk of negative reinforcement, which could result in undesired behaviors. We outline the core framework components, present a research road map for the MAS community, and propose that this dynamic perspective is critical for creating truly collaborative social partners.
Canopy: Property-Driven Learning for Congestion Control
2026-04-24 · 2 citations
preprint · Open access
Learning-based congestion controllers offer better adaptability compared to traditional heuristics. However, the unreliability of learning techniques can cause learning-based controllers to behave poorly, creating a need for formal guarantees. While methods for formally verifying learned congestion controllers exist, these methods offer binary feedback that cannot optimize the controller toward better behavior. We improve this state-of-the-art via Canopy, a new property-driven framework that integrates learning with formal reasoning in the learning loop. Canopy uses novel quantitative certification with an abstract interpreter to guide the training process, rewarding models, and evaluating robust and safe model performance on worst-case inputs. Our evaluation demonstrates that unlike state-of-the-art learned controllers, Canopy-trained controllers provide both adaptability and worst-case reliability across a range of network conditions.
Do CFLOBDDs Actually Make Use of Linear Structure?
ArXiv.org · 2026-05-15
article · Open access
Binary Decision Diagrams (BDDs) are a widely used data structure for efficient Boolean function representation. Context-Free-Language Ordered Binary Decision Diagrams (CFLOBDDs) are a recently introduced hierarchical data structure that can, in the best case, exhibit exponential compression over BDDs and double-exponential compression over decision trees. Roughly speaking, a CFLOBDD is a finite, acyclic, non-recursive hierarchical finite-state machine (HFSM) (with some additional restrictions). In this paper, we investigate the role of linear structure in CFLOBDDs, a property that connects them to Nested-Word Automata (NWAs) and Visibly Pushdown Automata (VPAs), and examine whether CFLOBDDs actively exploit this structure beyond their well-studied hierarchical properties. We demonstrate that linear structure, in conjunction with hierarchical structure, plays a crucial role in enabling CFLOBDDs to achieve efficient function compression. Furthermore, we show that removing linearity from CFLOBDDs leads to a significant blowup in representation size, resulting in degraded performance in the domain of quantum-circuit simulation.
CSLib: The Lean Computer Science Library
ArXiv.org · 2026-02-04
article · Open access
We introduce CSLib, an open-source framework for proving computer-science-related theorems and writing formally verified code in the Lean proof assistant. CSLib aims to be for computer science what Lean's Mathlib is for mathematics. Mathlib has been tremendously impactful: it is a key reason for Lean's popularity within the mathematics research community, and it has also played a critical role in the training of AI systems for mathematical reasoning. However, the base of computer science knowledge in Lean is currently quite limited. CSLib will vastly enhance this knowledge base and provide infrastructure for using this knowledge in real-world verification projects. By doing so, CSLib will (1) enable the broad use of Lean in computer science education and research, and (2) facilitate the manual and AI-aided engineering of large-scale formally verified systems.
Advancing Mathematics Research with AI-Driven Formal Proof Search
ArXiv.org · 2026-05-21
article · Open access · Senior author
Large language models (LLMs) increasingly excel at mathematical reasoning, but their unreliability limits their utility in mathematics research. A mitigation is using LLMs to generate formal proofs in languages like Lean. We perform the first large-scale evaluation of this method's ability to solve open problems. Our most capable agent autonomously resolved 9 of 353 open Erdős problems at the per-problem cost of a few hundred dollars, proved 44/492 OEIS conjectures, and is being deployed in combinatorics, optimization, graph theory, algebraic geometry, and quantum optics research. A basic agent alternating LLM-based generation with Lean-based verification replicated the Erdős successes but proved costlier on the hardest problems. These findings demonstrate the power of AI-aided formal proof search and shed light on the agent designs that enable it.

Recent grants

SHF: Medium: Collaborative Research: Chorus: Dynamic Isolation in Shared-Memory Parallelism
NSF · $510k · 2011–2015
SHF: Medium: Collaborative Research: Bridging Automated Formal Reasoning and Continuous Optimization for Provably Safe Deep Learning
NSF · $500k · 2020–2024
CAREER: Robustness Analysis of Uncertain Programs: Theory, Algorithms, and Tools
NSF · $345k · 2011–2017
SHF: Medium: Collaborative Research: Marrying Program Analysis and Numerical Search
NSF · $600k · 2012–2018
CAREER: Robustness Analysis of Uncertain Programs: Theory, Algorithms, and Tools
NSF · $160k · 2010–2012

Frequent coauthors

Lydia E. Kavraki
23 shared
Işıl Dillig
The University of Texas at Austin
20 shared
Chris Jermaine
19 shared
Abhinav Verma
18 shared
Neil T. Dantam
Colorado School of Mines
15 shared
Thomas Reps
14 shared
Yisong Yue
California Institute of Technology
13 shared
Greg Durrett
12 shared

Education

Ph.D., Computer Science
University of California, Berkeley
2002
M.S., Computer Science
University of California, Berkeley
1998
B.S., Electrical Engineering
University of Calcutta
1996

Awards & honors

NSF CAREER award
Google Research Award
ACM SIGPLAN John C. Reynolds Doctoral Dissertation Award
Multiple distinguished paper awards
