Anil Vullikanti

Verified

University of Virginia · Computer Science

Active 2008–2026

h-index: 35

Citations: 5.2k

Papers: 316 (188 in the last 5 years)

Funding: $5.9M


About

Anil Vullikanti is a Professor in the Department of Computer Science and the Biocomplexity Institute at the University of Virginia. His research interests include randomized algorithms, combinatorial optimization, distributed computing, dynamical systems, network science, machine learning, and artificial intelligence, which he applies to epidemiology, public health, and the modeling, analysis, and protection of critical infrastructures. He completed his B.Tech at the Indian Institute of Technology, Kanpur, in 1993 and his PhD at the Indian Institute of Science, Bangalore, in 1999. He did postdoctoral work at the Max Planck Institute for Computer Science and at Los Alamos National Laboratory, where he was also a technical staff member from 2003 to 2005. Prior to his current position, he was at Virginia Tech from 2005 to 2018. His work was nominated for best paper awards at Supercomputing 2016 and AAAI 2013. His awards include the College of Engineering Faculty Fellow Award at Virginia Tech, the Excellence in Research Award from the Biocomplexity Institute of Virginia Tech, the DOE Early Career Award, and the NSF CAREER Award.

Research topics

Computer Science
Medicine
Virology
Political Science
Machine Learning
Computer Security
Data Mining
Artificial Intelligence
Economics
Development economics
Algorithm
Mathematical optimization
Data science
Economic growth
Mathematics
Internal medicine
Environmental health
Demography

Selected publications

NIMMGen: Learning Neural-Integrated Mechanistic Digital Twins with LLMs
arXiv (Cornell University) · 2026-02-20
Article · Open access · Senior author
Mechanistic models encode scientific knowledge about dynamical systems and are widely used in downstream scientific and policy applications. Recent work has explored LLM-based agentic frameworks to automatically construct mechanistic models from data; however, existing problem settings substantially oversimplify real-world conditions, leaving it unclear whether LLM-generated mechanistic models are reliable in practice. To address this gap, we introduce the Neural-Integrated Mechanistic Modeling (NIMM) evaluation framework, which evaluates LLM-generated mechanistic models under realistic settings with partial observations and diversified task objectives. Our evaluation reveals fundamental challenges in current baselines, ranging from model effectiveness to code-level correctness. Motivated by these findings, we design NIMMGen, an agentic framework for neural-integrated mechanistic modeling that enhances code correctness and practical validity through iterative refinement. Experiments across three datasets from diversified scientific domains demonstrate its strong performance. We also show that the learned mechanistic models support counterfactual intervention simulation.
Scenario-aware control of multipathway spread processes: Application to biological invasions
PNAS Nexus · 2026-02-23
Article · Open access
Optimal control of spread processes over networks is a challenging problem, even for simple diffusion models. Real-world processes, such as infectious disease outbreaks and biological invasions, often involve multiple spread pathways and time-varying network dynamics. In this work, we address the problem of region-wide interventions, where the goal is to select an optimal set of regions (groups of nodes) in a network to minimize spread, subject to budget constraints, intervention delays, and a given spread scenario that reflects prior knowledge of the process, such as initial infection locations, parameter estimates, and other context-specific assumptions. We present a general approach based on integer linear programming and sample average approximation, applicable across a broad class of diffusion models. We also establish theoretical performance guarantees for our method within the bicriteria approximation framework. To demonstrate its effectiveness, we apply the approach to model the spread of a representative agricultural pest. Our method yields near-optimal solutions and consistently outperforms standard baselines. The results emphasize the value of scenario-specific intervention strategies, showing that early action can significantly reduce spread under limited budgets and produce stable outcomes even under model uncertainty.
Prediction of Hospital Associated Infections During Continuous Hospital Stays
Proceedings of the AAAI Conference on Artificial Intelligence · 2026-03-14
Article · Open access
The US Centers for Disease Control and Prevention (CDC) designated Methicillin-resistant Staphylococcus aureus (MRSA) a serious antimicrobial resistance threat in 2019. The risk of acquiring MRSA and suffering life-threatening consequences from it remains especially high for hospitalized patients because of a combination of factors, including comorbid conditions, immunosuppression, antibiotic use, and the risk of contact with contaminated hospital workers and equipment. In this paper, we present a novel generative probabilistic model, GenHAI, for modeling sequences of MRSA test outcomes for patients during a single hospitalization. This model can be used to answer many questions that matter to hospital administrators seeking to mitigate the risk of MRSA infections. Our model is based on the probabilistic programming paradigm and can be used to approximately answer a variety of predictive, causal, and counterfactual questions. We demonstrate the efficacy of our model by comparing it against discriminative and generative machine learning models using two real-world datasets.
NIMMGen: Learning Neural-Integrated Mechanistic Digital Twins with LLMs
Open MIND · 2026-02-20
Preprint · Senior author
A health and economic evaluation of the spatial spillover effect from measles resurgence
Scientific Reports · 2025-10-14
Article · Open access · Senior author
The administration of the Measles, Mumps, and Rubella (MMR) vaccine has had a substantial impact on controlling the spread of measles on a global scale. Nevertheless, the COVID-19 pandemic caused major disruptions to normal immunization schedules, causing the omission or delay of routine immunizations. Expanding on previous research that simulated measles outbreaks using a detailed agent-based model, this study integrates epidemiological forecasts with spatial econometric analysis. Our objective is to quantify the household-level direct and indirect health and economic impact of measles outbreaks caused by reduced MMR vaccine uptake. A network-based SEIR (susceptible-exposed-infected-recovered) model is used to simulate the transmission of measles over a synthetic social contact network of Virginia under various scenarios. Household-level costs of a measles outbreak, encompassing MMR vaccine expenses, treatment costs, and productivity losses, are estimated from the simulation results. A Generalized Spatial Autoregressive (GSAR) model is used to estimate the spatial 'spillover effect' on neighboring counties. Our findings indicate that reduced MMR vaccination rates are associated with increased measles cases and related economic costs, which are intensified by disease transmissibility and moderated by home quarantine. The GSAR model, with spatial lag coefficients, shows significant spatial interdependencies. A small decrease in the vaccination rate in an urban region like Richmond, Virginia, has a significant economic and epidemiological spillover effect, while similar reductions in rural regions like Highland County, Virginia, have a negligible impact. A decline in the MMR vaccination rate has ramifications for both disease incidence and the economy, with consequences that vary by region. Policymakers should acknowledge the interconnectedness of health and economic outcomes across regions. This research underscores the necessity of implementing broad, region-wide policy measures in response to fluctuations in vaccination rates, prioritizing overarching strategies over localized interventions.
Identifying and forecasting importation and asymptomatic spreaders of multi-drug resistant organisms in hospital settings
npj Digital Medicine · 2025-03-07 · 3 citations
Article · Open access
Healthcare-associated infections (HAIs) from multi-drug resistant organisms (MDROs) pose a significant challenge for healthcare systems. Patients can arrive at hospitals already infected ("importation") or acquire infections during their stay ("nosocomial infection"). Many cases, often asymptomatic, complicate rapid identification due to testing limitations and delays. Although recent advancements in mathematical modeling and machine learning have aimed to identify at-risk patients, these methods face challenges: transmission models often overlook valuable electronic health record (EHR) data, while machine learning approaches typically lack mechanistic insights into underlying processes. To address these issues, we propose NeurABM, a novel framework that integrates neural networks and agent-based models (ABM) to leverage the strengths of both methods. NeurABM simultaneously learns a neural network for patient-level importation predictions and an ABM for infection identification. Our findings show that NeurABM significantly outperforms existing methods, marking a breakthrough in accurately identifying importation cases and forecasting future nosocomial infections in clinical practice.
Knowledge-Augmented Large Language Model for Multimodal EHR-Based Risk Prediction: Development and Validation Study (Preprint)
JMIR AI · 2025-11-24
Article · Open access · Senior author
Background: Accurate clinical outcome prediction using Electronic Health Records (EHRs) is crucial for patient care and resource allocation. EHRs include both structured data and rich, unstructured clinical notes. However, prior machine learning methods struggle with the multi-modality, long context of notes, and severe class imbalance in clinical tasks. Objective: To introduce and evaluate KAMELEON (Knowledge-Augmented Multimodal EHR LEarning for Outcome predictioN), a unified, two-stage hybrid framework that integrates diverse EHR modalities and external biomedical knowledge to enhance clinical risk prediction. Methods: This study used the publicly available, de-identified MIMIC-III dataset, which includes structured and unstructured data for over 40,000 Intensive Care Unit (ICU) patients. The two tasks studied were 30-day readmission (approximately 4% positive rate) and in-hospital mortality prediction (approximately 13% positive rate). Train-test splits were patient-disjoint (80:20). Performance was evaluated against general and medical Large Language Models (LLMs) and structured baselines. Key metrics included the Area Under the Receiver Operating Characteristic curve (AUROC), Area Under the Precision-Recall Curve (AUPRC), and Macro F1-score. Results: The KAMELEON framework consistently outperformed all existing baselines. 30-day readmission: The KAMELEON-BalancedRF model achieved an AUROC of 0.845 and a Sensitivity (Recall) of 0.79. Ablation analysis demonstrated the critical role of the LLM-generated reasoning, with its removal causing the AUROC to drop from 0.844 to 0.7 and sensitivity to fall by over 80%. In-hospital mortality: The KAMELEON-XGBoost model achieved an AUROC of 0.92 and an AUPRC of 0.650. Unstructured-only models showed limited ability to discern mortality, with AUROC values near chance (around 0.51–0.53). Conclusions: KAMELEON is the first systematic framework to enhance LLMs for healthcare prediction through graph-guided knowledge retrieval combined with structured machine learning. The framework demonstrates superior performance across both prediction tasks, highlighting the synergistic value of combining diverse data modalities and LLM reasoning for robust clinical risk estimation.
UFID: A Unified Framework for Black-box Input-level Backdoor Detection on Diffusion Models
Proceedings of the AAAI Conference on Artificial Intelligence · 2025-04-11 · 8 citations
Article · Open access · Senior author
Diffusion models are vulnerable to backdoor attacks, where malicious attackers inject backdoors by poisoning certain training samples during the training stage. This poses a significant threat to real-world applications in the Model-as-a-Service (MaaS) scenario, where users query diffusion models through APIs or directly download them from the internet. To mitigate the threat of backdoor attacks under MaaS, black-box input-level backdoor detection has drawn recent interest, where defenders aim to build a firewall that filters out backdoor samples in the inference stage, with access only to input queries and the generated results from diffusion models. Despite some preliminary explorations on the traditional classification tasks, these methods cannot be directly applied to the generative tasks due to two major challenges: (1) more diverse failures and (2) a multi-modality attack surface. In this paper, we propose a black-box input-level backdoor detection framework on diffusion models, called UFID. Our defense is motivated by an insightful causal analysis: Backdoor attacks serve as the confounder, introducing a spurious path from input to target images, which remains consistent even when we perturb the input samples with Gaussian noise. We further validate the intuition with theoretical analysis. Extensive experiments across different datasets on both conditional and unconditional diffusion models show that our method achieves superb performance on detection effectiveness and run-time efficiency.
Sample Complexity of Linear Regression Models for Opinion Formation in Networks
Proceedings of the AAAI Conference on Artificial Intelligence · 2025-04-11
Article · Open access
Consider public health officials aiming to spread awareness about a new vaccine in a community interconnected by a social network. How can they distribute information with minimal resources, so as to avoid polarization and ensure community-wide convergence of opinion? To tackle such challenges, we initiate the study of sample complexity of opinion formation in networks. Our framework is built on the recognized opinion formation game, where we regard each agent’s opinion as a data-derived model, unlike previous works that treat opinions as data-independent scalars. The opinion model for every agent is initially learned from its local samples and evolves game-theoretically as all agents communicate with neighbors and revise their models towards an equilibrium. Our focus is on the sample complexity needed to ensure that the opinions converge to an equilibrium such that every agent’s final model has low generalization error. Our paper has two main technical results. First, we present a novel polynomial time optimization framework to quantify the total sample complexity for arbitrary networks, when the underlying learning problem is (generalized) linear regression. Second, we leverage this optimization to study the network gain which measures the improvement of sample complexity when learning over a network compared to that in isolation. Towards this end, we derive network gain bounds for various network classes including cliques, star graphs, and random regular graphs. Additionally, our framework provides a method to study sample distribution within the network, suggesting that it is sufficient to allocate samples inversely to the degree. Empirical results on both synthetic and real-world networks strongly support our theoretical findings.
CALYPSO: Forecasting and Analyzing MRSA Infection Patterns with Community and Healthcare Transmission Dynamics
ArXiv.org · 2025-08-19
Preprint · Open access · Senior author
Methicillin-resistant Staphylococcus aureus (MRSA) is a critical public health threat within hospitals as well as long-term care facilities. Better understanding of MRSA risks, evaluation of interventions and forecasting MRSA rates are important public health problems. Existing forecasting models rely on statistical or neural network approaches, which lack epidemiological interpretability, and have limited performance. Mechanistic epidemic models are difficult to calibrate and limited in incorporating diverse datasets. We present CALYPSO, a hybrid framework that integrates neural networks with mechanistic metapopulation models to capture the spread dynamics of infectious diseases (i.e., MRSA) across healthcare and community settings. Our model leverages patient-level insurance claims, commuting data, and healthcare transfer patterns to learn region- and time-specific parameters governing MRSA spread. This enables accurate, interpretable forecasts at multiple spatial resolutions (county, healthcare facility, region, state) and supports counterfactual analyses of infection control policies and outbreak risks. We also show that CALYPSO improves statewide forecasting performance by over 4.5% compared to machine learning baselines, while also identifying high-risk regions and cost-effective strategies for allocating infection prevention resources.

Recent grants

ICES: Large: Collaborative Research: The Role of Space, Time and Information in Controlling Epidemics
NSF · $295k · 2012–2016
BIGDATA: Collaborative Research: F: Efficient Distributed Computation of Large-Scale Graph Problems in Epidemiology and Contagion Dynamics
NSF · $736k · 2016–2019
Detection and characterization of critical under-immunized hotspots
NIH · $3.3M · 2014–2024
CAREER: Cross-layer optimization in Cognitive Radio Networks in the Physical interference model based on SINR constraints: Algorithmic Foundations
NSF · $466k · 2009–2015
Collaborative Research: NECO: A Market-Driven Approach to Dynamic Spectrum Sharing
NSF · $490k · 2008–2012

Frequent coauthors

Madhav Marathe
251 shared
Achla Marathe
133 shared
Henning Mortveit
University of Virginia
121 shared
Bryan Lewis
Biocom
114 shared
Jiangzhuo Chen
University of Virginia
110 shared
Srinivasan Venkatramanan
Biocom
92 shared
Samarth Swarup
University of Virginia
80 shared
Abhijin Adiga
Warwick Hospital
59 shared

Awards & honors

College of Engineering Faculty Fellow Award, Virginia Tech 2…
Excellence in Research Award, Biocomplexity Institute of Vir…
DOE Early Career Award 2010
NSF CAREER Award 2009
