Mahesh Viswanathan

· ProfessorVerified

University of Illinois Urbana-Champaign · Computer Science

Active 1975–2026

h-index39

Citations6.4k

Papers27148 last 5y

Funding$2.0M

Faculty page

See your match with Mahesh Viswanathan — sign in to PhdFit.Sign in

About

Mahesh Viswanathan is a professor at the Siebel School of Computing and Data Science at the University of Illinois Urbana-Champaign. He holds a Ph.D. in Computer and Information Science from the University of Pennsylvania, obtained in 2000. His research areas include Programming Languages, Formal Methods, Software Engineering, and Theory and Algorithms. He has taught a variety of courses related to computer science, such as Discrete Structures, Intro to Computer Systems, Algorithms, Logic in Computer Science, Formal Models of Computation, and Applied Machine Learning. His work focuses on advancing understanding in these fields, contributing to the development of computing education and research in formal methods and algorithms.

Research topics

Computer Science
Computer Security
Data Mining
Mathematics
Statistics
Theoretical computer science
Algorithm

Selected publications

Efficient Dynamic Algorithms to Predict Short Races
ArXiv.org · 2026-03-03
articleOpen accessSenior author
We introduce and study the problem of detecting short races in an observed trace. Specifically, for a race type $R$, given a trace $σ$ and window size $w$, the task is to determine whether there exists an $R$-race $(e_1, e_2)$ in $σ$ such that the subtrace starting with $e_1$ and ending with $e_2$ contains at most $w$ events. We present a monitoring framework for short-race prediction and instantiate the framework for happens-before and sync-preserving races, yielding efficient detection algorithms. Our happens-before algorithm runs in the same time as FastTrack but uses space that scales with $\log w$ as opposed to $\log |σ|$. For sync-preserving races, our algorithm runs faster and consumes significantly less space than SyncP. Our experiments validate the effectiveness of these short-race detection algorithms: they run more efficiently, use less memory, and detect significantly more races under the same budget, offering a reasonable balance between resource usage and predictive power.
Publisher OA PDF
Correcting Selection Bias in Sparse User Feedback for Large Language Model Quality Estimation: A Multi-Agent Hierarchical Bayesian Approach
arXiv (Cornell University) · 2026-05-12
preprintOpen accessSenior author
[Abridged] Production LLM deployments receive feedback from a non-random fraction of users: thumbs sit mostly in the tails of the satisfaction distribution, and a naive average over them can land 40-50 percentage points away from true system quality. We treat this as a topic- and sentiment- stratified selection-bias problem and propose a three-agent hierarchical Bayesian pipeline that does not require ground-truth labels on individual interactions. A Topic Clustering Agent partitions the stream via UMAP + HDBSCAN over text embeddings; a Bias Modeling Agent fits a two-stage hierarchical Beta-Binomial under NUTS, inferring per-topic selection rates $s_c$ and quality $q_c$ with partial pooling; a Synthesis Agent reweights $q_c$ by true topic prevalence $\hatπ_c = n_c/N$ to report a bias-corrected aggregate posterior $\bar Q = \sum_c \hatπ_c q_c$ with credible interval, plus drift signals for online recalibration. Validation uses UltraFeedback (N=10,232 retained interactions, $C=18$ clusters, $Q^\star=0.6249$) with simulated topic- and sentiment-dependent selection biases. We compare five Bayesian variants against Naive and IPW baselines. A mild prior on the feedback channel (typical positive-feedback rate and negative-to-positive ratio, both readable from any production dashboard without labels) keeps Hierarchical-Informed within 4-13 pp of $Q^\star$ as the bias ratio sweeps from 1:1 to 30:1, with 95% credible intervals covering $Q^\star$ in 50/50 random-seed replicates at $κ_{\max}=10$. Without channel-side priors, every weak-prior variant misses $Q^\star$ by 22-33 pp: the per-cluster sufficient statistics admit a one-parameter family of equally good fits, and the prior on the bias channel (not on latent quality) is what breaks the degeneracy.
Publisher DOI
Correcting Selection Bias in Sparse User Feedback for Large Language Model Quality Estimation: A Multi-Agent Hierarchical Bayesian Approach
ArXiv.org · 2026-05-12
articleOpen accessSenior author
[Abridged] Production LLM deployments receive feedback from a non-random fraction of users: thumbs sit mostly in the tails of the satisfaction distribution, and a naive average over them can land 40-50 percentage points away from true system quality. We treat this as a topic- and sentiment- stratified selection-bias problem and propose a three-agent hierarchical Bayesian pipeline that does not require ground-truth labels on individual interactions. A Topic Clustering Agent partitions the stream via UMAP + HDBSCAN over text embeddings; a Bias Modeling Agent fits a two-stage hierarchical Beta-Binomial under NUTS, inferring per-topic selection rates $s_c$ and quality $q_c$ with partial pooling; a Synthesis Agent reweights $q_c$ by true topic prevalence $\hatπ_c = n_c/N$ to report a bias-corrected aggregate posterior $\bar Q = \sum_c \hatπ_c q_c$ with credible interval, plus drift signals for online recalibration. Validation uses UltraFeedback (N=10,232 retained interactions, $C=18$ clusters, $Q^\star=0.6249$) with simulated topic- and sentiment-dependent selection biases. We compare five Bayesian variants against Naive and IPW baselines. A mild prior on the feedback channel (typical positive-feedback rate and negative-to-positive ratio, both readable from any production dashboard without labels) keeps Hierarchical-Informed within 4-13 pp of $Q^\star$ as the bias ratio sweeps from 1:1 to 30:1, with 95% credible intervals covering $Q^\star$ in 50/50 random-seed replicates at $κ_{\max}=10$. Without channel-side priors, every weak-prior variant misses $Q^\star$ by 22-33 pp: the per-cluster sufficient statistics admit a one-parameter family of equally good fits, and the prior on the bias channel (not on latent quality) is what breaks the degeneracy.
Publisher OA PDF
Efficient Dynamic Algorithms to Predict Short Races
arXiv (Cornell University) · 2026-03-03
preprintOpen accessSenior author
We introduce and study the problem of detecting short races in an observed trace. Specifically, for a race type $R$, given a trace $σ$ and window size $w$, the task is to determine whether there exists an $R$-race $(e_1, e_2)$ in $σ$ such that the subtrace starting with $e_1$ and ending with $e_2$ contains at most $w$ events. We present a monitoring framework for short-race prediction and instantiate the framework for happens-before and sync-preserving races, yielding efficient detection algorithms. Our happens-before algorithm runs in the same time as FastTrack but uses space that scales with $\log w$ as opposed to $\log |σ|$. For sync-preserving races, our algorithm runs faster and consumes significantly less space than SyncP. Our experiments validate the effectiveness of these short-race detection algorithms: they run more efficiently, use less memory, and detect significantly more races under the same budget, offering a reasonable balance between resource usage and predictive power.
Publisher DOI
Efficient Timestamping for Sampling-based Race Detection
ArXiv.org · 2025-04-09
preprintOpen accessSenior author
Dynamic race detection based on the happens before (HB) partial order has now become the de facto approach to quickly identify data races in multi-threaded software. Most practical implementations for detecting these races use timestamps to infer causality between events and detect races based on these timestamps. Such an algorithm updates timestamps (stored in vector clocks) at every event in the execution, and is known to induce excessive overhead. Random sampling has emerged as a promising algorithmic paradigm to offset this overhead. It offers the promise of making sound race detection scalable. In this work we consider the task of designing an efficient sampling based race detector with low overhead for timestamping when the number of sampled events is much smaller than the total events in an execution. To solve this problem, we propose (1) a new notion of freshness timestamp, (2) a new data structure to store timestamps, and (3) an algorithm that uses a combination of them to reduce the cost of timestamping in sampling based race detection. Further, we prove that our algorithm is close to optimal -- the number of vector clock traversals is bounded by the number of sampled events and number of threads, and further, on any given dynamic execution, the cost of timestamping due to our algorithm is close to the amount of work any timestamping-based algorithm must perform on that execution, that is it is instance optimal. Our evaluation on real world benchmarks demonstrates the effectiveness of our proposed algorithm over prior timestamping algorithms that are agnostic to sampling.
Publisher OA PDF DOI
Approximate Algorithms for Verifying Differential Privacy with Gaussian Distributions
2025-11-19
articleOpen accessSenior author
The verification of differential privacy algorithms that employ Gaussian distributions is little understood. This paper tackles the challenge of verifying such programs by introducing a novel approach to approximating probability distributions of loop-free programs that sample from both discrete and continuous distributions with computable probability density functions, including Gaussian and Laplace. We establish that verifying $(ε,δ)$-differential privacy for these programs is \emph{almost decidable}, meaning the problem is decidable for all values of $δ$ except those in a finite set. Our verification algorithm is based on computing probabilities to any desired precision by combining integral approximations, and tail probability bounds. The proposed methods are implemented in the tool, DipApprox, using the FLINT library for high-precision integral computations, and incorporate optimizations to enhance scalability. We validate {\ourtool} on fundamental privacy-preserving algorithms, such as Gaussian variants of the Sparse Vector Technique and Noisy Max, demonstrating its effectiveness in both confirming privacy guarantees and detecting violations.
Publisher OA PDF DOI
Checking δ-Satisfiability of Reals with Integrals
Proceedings of the ACM on Programming Languages · 2025-04-09 · 1 citations
articleOpen accessSenior author
Many synthesis and verification problems can be reduced to determining the truth of formulas over the real numbers. These formulas often involve constraints with integrals in them. To this end, we extend the framework of δ-decision procedures with techniques for handling integrals of user-specified real functions. We implement this decision procedure in the tool ∫dReal, which is built on top of dReal. We evaluate ∫dReal on a suite of problems that include formulas verifying the fairness of algorithms and the privacy and the utility of privacy mechanisms and formulas that synthesize parameters for the desired utility of privacy mechanisms. The performance of the tool in these experiments demonstrates the effectiveness of ∫dReal.
Publisher DOI
The Decision Problem for Regular First Order Theories
Proceedings of the ACM on Programming Languages · 2025-01-07
articleOpen accessSenior author
The Entscheidungsproblem , or the classical decision problem, asks whether a given formula of first-order logic is satisfiable. In this work, we consider an extension of this problem to regular first-order theories , i.e., (infinite) regular sets of formulae. Building on the elegant classification of syntactic classes as decidable or undecidable for the classical decision problem, we show that some classes (specifically, the EPR and Gurevich classes), which are decidable in the classical setting, become undecidable for regular theories. On the other hand, for each of these classes, we identify a subclass that remains decidable in our setting, leaving a complete classification as a challenge for future work. Finally, we observe that our problem generalises prior work on automata-theoretic verification of uninterpreted programs and propose a semantic class of existential formulae for which the problem is decidable.
Publisher DOI
Optimized Cloud Storage Framework Utilizing Key Aggregate Searchable Encryption for Collaborative Data Sharing
International Journal of Science and Research (IJSR) · 2025-09-25
articleOpen accessSenior author
Distributing encrypted data through cloud storage significantly enhances security in remote communication services. Symmetric-key encryption ensures that only users possessing valid keys can access and decrypt the information. Given the diversity of subscription models, efficient key management is essential for effective access control in broadcast services. The proposed solution utilizes a key tree structure (KTR) to manage key distribution, accommodating complex subscription preferences and user behaviors. This approach supports all subscription operations in wireless broadcast environments. Instead of maintaining separate key sets for each application, users need only a single key set for all their subscribed services. The KTR mechanism determines the minimum number of keys that must be updated to maintain broadcast security while minimizing rekeying overhead.
Publisher OA PDF DOI
Efficient Timestamping for Sampling-Based Race Detection
Proceedings of the ACM on Programming Languages · 2025-06-10
articleOpen accessSenior author
Dynamic race detection based on the happens before (HB) partial order has now become the de facto approach to quickly identify data races in multi-threaded software. Most practical implementations for detecting these races use timestamps to infer causality between events and detect races based on these timestamps. Such an algorithm updates timestamps (stored in vector clocks) at every event in the execution, and is known to induce excessive overhead. Random sampling has emerged as a promising algorithmic paradigm to offset this overhead. It offers the promise of making sound race detection scalable. In this work we consider the task of designing an efficient sampling based race detector with low overhead for timestamping when the number of sampled events is much smaller than the total events in an execution. To solve this problem, we propose (1) a new notion of freshness timestamp, (2) a new data structure to store timestamps, and (3) an algorithm that uses a combination of them to reduce the cost of timestamping in sampling based race detection. Further, we prove that our algorithm is close to optimal --- the number of vector clock traversals is bounded by the number of sampled events and number of threads, and further, on any given dynamic execution, the cost of timestamping due to our algorithm is close to the amount of work any timestamping-based algorithm must perform on that execution, that is it is instance optimal. Our evaluation on real world benchmarks demonstrates the effectiveness of our proposed algorithm over prior timestamping algorithms that are agnostic to sampling.
Publisher DOI

Recent grants

SHF: Small: New Algorithmic Paradigms in Dynamic Analysis of Multithreaded Software
NSF · $250k · 2020–2025
SHF: Small: Verifying Open Concurrent Real Time Systems
NSF · $476k · 2010–2014
Monitoring and Checking of Distributed Systems with respect to Formal Specifications
NSF · $270k · 2004–2008
CAREER: Next Generation Model Checking
NSF · $400k · 2005–2011
TWC: Medium: Collaborative: Automated Formal Analysis of Security Protocols with Private Coin Tosses
NSF · $595k · 2013–2018

Frequent coauthors

Rohit Chadha
University of Missouri
59 shared
Umang Mathur
36 shared
A. Prasad Sistla
University of Illinois Chicago
29 shared
Pavithra Prabhakar
26 shared
Nima Roohi
Amazon (United States)
24 shared
Geir E. Dullerud
University of Illinois Urbana-Champaign
23 shared
Sayan Mitra
University of Illinois Urbana-Champaign
20 shared
Dileep Kini
Capital University
17 shared

Labs

Siebel School of Computing and Data SciencePI

Education

Ph.D., Computer Science
University of Illinois at Urbana-Champaign
2000
M.S., Computer Science
University of Illinois at Urbana-Champaign
1996
B.S., Electrical and Electronics Engineering
University of Madras
1992

Awards & honors

Celebration of Excellence 2021
Celebration of Excellence 2022
Celebration of Excellence 2023
Celebration of Excellence 2024
Celebration of Excellence 2025

Resume-aware match score
Save to shortlist
AI-drafted outreach

See your match with Mahesh Viswanathan

PhdFit ranks faculty by your research interests, methods, and publications — grounded in their actual work, not templates.

Join the waitlist How it works

Free to start
No credit card
30-second signup

Find professors who actually fit you