
Anindya Bhadra
Professor of Statistics · Purdue University · Statistics
Active 2010–2025
About
Anindya Bhadra is a Professor of Statistics at Purdue University. His research interests include Bayesian methods for high-dimensional and complex data, computational statistics, and statistical applications in biology, specifically in genomics, infectious disease epidemiology, and nutrition. He holds a Ph.D. in Statistics from the University of Michigan, earned in 2010, and a B.Tech. (Honors) in Electronics and Electrical Communications Engineering from the Indian Institute of Technology, Kharagpur, obtained in 2004. Bhadra has received numerous awards and honors, including being named a Purdue University Faculty Scholar in 2023, serving as an associate editor for several prominent statistical journals, and receiving the Young Statistical Scientist Award from the International Indian Statistical Association in 2021. His professional contributions include advancing statistical methodologies and their applications in biological sciences, with a focus on high-dimensional data analysis and computational techniques.
Research topics
- Medicine
- Internal medicine
- Machine Learning
- Environmental health
- Gerontology
- Biology
- Endocrinology
- Computer Science
- Artificial Intelligence
- Food science
- Demography
- Physical medicine and rehabilitation
Selected publications
ArXiv.org · 2025-09-30
preprint · Open access · Senior author
We consider the problem of fully Bayesian posterior estimation and uncertainty quantification in undirected Gaussian graphical models via Markov chain Monte Carlo (MCMC) under recently developed element-wise graphical priors, such as the graphical horseshoe. Unlike the conjugate Wishart family, these priors are non-conjugate, but have the advantage that they naturally allow one to encode a prior belief of sparsity in the off-diagonal elements of the precision matrix, without imposing a structure on the entire matrix. Unfortunately, for a graph with $p$ nodes and with $n$ samples, the state-of-the-art MCMC approaches for the element-wise priors achieve a per iteration complexity of $O(p^4)$, which is prohibitive when $p \gg n$. In this regime, we develop a suitably reparameterized MCMC with per iteration complexity of $O(p^3)$, providing a one order of magnitude improvement, and consequently bringing the per iteration computational cost at par with the conjugate Wishart family, which is also $O(p^3)$ due to a use of the classical Bartlett decomposition, but this decomposition does not apply outside the Wishart family. Importantly, the proposed benefit is obtained solely due to our reparameterization in an MCMC scheme targeting the true posterior, that reverses the recently developed telescoping block decomposition of Bhadra et al. (2024), in a suitable sense. There is no variational or any other approximate Bayesian computation scheme considered in this paper that compromises targeting the true posterior. Simulations and the analysis of a breast cancer data set confirm both the correctness and better algorithmic scaling of the proposed reverse telescoping sampler.
Mathematical Geosciences · 2025-03-25
article · Senior author
Electronic Journal of Statistics · 2025-01-01
article · Open access · Senior author
In nonparametric Bayesian approaches, Gaussian stochastic processes can serve as priors on real-valued function spaces. Existing literature on the posterior convergence rates under Gaussian process priors shows that it is possible to achieve optimal or near-optimal posterior contraction rates if the smoothness of the Gaussian process matches that of the target function. Among those priors, Gaussian process with a parametric Matérn covariance function is particularly notable in that its degree of smoothness can be determined by a dedicated smoothness parameter. Ma and Bhadra (2023) recently introduced a new family of covariance functions called the Confluent Hypergeometric (CH) class that simultaneously possess two parameters: one controls the tail index of the polynomially decaying covariance function, and the other parameter controls the degree of mean-squared smoothness analogous to the Matérn class. In this paper, we show that with proper choice of rescaling parameters in the Matérn and CH covariance functions, it is possible to obtain the minimax optimal posterior contraction rate for η-regular functions for nonparametric regression model with fixed design. Unlike the previous results for unrescaled cases, the smoothness parameter of the covariance function need not equal η for achieving the optimal minimax rate, for either rescaled Matérn or rescaled CH covariances, illustrating a key benefit of rescaling. We also consider a fully Bayesian treatment of the rescaling parameters and show the resulting posterior distributions still contract at the minimax-optimal rate. The resultant hierarchical Bayesian procedure is fully adaptive to the unknown true smoothness. The theoretical properties of the rescaled and hierarchical Matérn and CH classes are further verified via extensive simulations and an illustration on a geospatial data set is presented.
ArXiv.org · 2025-10-04
preprint · Open access · Senior author
Bayesian inference for doubly-intractable pairwise exponential graphical models typically involves variations of the exchange algorithm or approximate Markov chain Monte Carlo (MCMC) samplers. However, existing methods for both classes of algorithms require either perfect samplers or sequential samplers for complex models, which are often either not available, or suffer from poor mixing, especially in high dimensions. We develop a method that does not require perfect or sequential sampling, and can be applied to both classes of methods: exact and approximate MCMC. The key to our approach is to utilize the tractable independence model underlying the intractable probabilistic graphical model for the purpose of constructing a finite sample unbiased Monte Carlo (and not MCMC) estimate of the Metropolis–Hastings ratio. This innovation turns out to be crucial for scalability in high dimensions. The method is demonstrated on the Ising model. Gradient-based alternatives to construct a proposal, such as Langevin and Hamiltonian Monte Carlo approaches, also arise as a natural corollary to our general procedure, and are demonstrated as well.
Robust Bayesian graphical regression models for assessing tumor heterogeneity in proteomic networks
Biometrics · 2025-01-07 · 1 citation
article
Graphical models are powerful tools to investigate complex dependency structures in high-throughput datasets. However, most existing graphical models make one of two canonical assumptions: (i) a homogeneous graph with a common network for all subjects or (ii) an assumption of normality, especially in the context of Gaussian graphical models. Both assumptions are restrictive and can fail to hold in certain applications such as proteomic networks in cancer. To this end, we propose an approach termed robust Bayesian graphical regression (rBGR) to estimate heterogeneous graphs for non-normally distributed data. rBGR is a flexible framework that accommodates non-normality through random marginal transformations and constructs covariate-dependent graphs to accommodate heterogeneity through graphical regression techniques. We formulate a new characterization of edge dependencies in such models called conditional sign independence with covariates, along with an efficient posterior sampling algorithm. In simulation studies, we demonstrate that rBGR outperforms existing graphical regression models for data generated under various levels of non-normality in both edge and covariate selection. We use rBGR to assess proteomic networks in lung and ovarian cancers to systematically investigate the effects of immunogenic heterogeneity within tumors. Our analyses reveal several important protein-protein interactions that are differentially associated with the immune cell abundance; some corroborate existing biological knowledge, whereas others are novel findings.
Precision matrix estimation under the horseshoe-like prior–penalty dual
Electronic Journal of Statistics · 2024-01-01 · 12 citations
article · Open access · Senior author
Precision matrix estimation in a multivariate Gaussian model is fundamental to network estimation. Although there exist both Bayesian and frequentist approaches to this, it is difficult to obtain good Bayesian and frequentist properties under the same prior–penalty dual. To bridge this gap, our contribution is a novel prior–penalty dual that closely approximates the graphical horseshoe prior and penalty, and performs well in both Bayesian and frequentist senses. A chief difficulty with the graphical horseshoe prior is a lack of closed form expression of the density function, which we overcome in this article. In terms of theory, we establish posterior convergence rate of the precision matrix that matches the convergence rate of the frequentist graphical lasso estimator, in addition to the frequentist consistency of the MAP estimator at the same rate. In addition, our results also provide theoretical justifications for previously developed approaches that have been unexplored so far, e.g. for the graphical horseshoe prior. Computationally efficient EM and MCMC algorithms are developed respectively for the penalized likelihood and fully Bayesian estimation problems. In numerical experiments, the horseshoe-based approaches echo their superior theoretical properties by comprehensively outperforming the competing methods. A protein–protein interaction network estimation in B-cell lymphoma is considered to validate the proposed methodology.
Deep Kernel Posterior Learning under Infinite Variance Prior Weights
arXiv (Cornell University) · 2024-10-02
preprint · Open access · Senior author
Neal (1996) proved that infinitely wide shallow Bayesian neural networks (BNN) converge to Gaussian processes (GP), when the network weights have bounded prior variance. Cho & Saul (2009) provided a useful recursive formula for deep kernel processes for relating the covariance kernel of each layer to the layer immediately below. Moreover, they worked out the form of the layer-wise covariance kernel in an explicit manner for several common activation functions. Recent works, including Aitchison et al. (2021), have highlighted that the covariance kernels obtained in this manner are deterministic and hence preclude any possibility of representation learning, which amounts to learning a non-degenerate posterior of a random kernel given the data. To address this, they propose adding artificial noise to the kernel to retain stochasticity, and develop deep kernel inverse Wishart processes. Nonetheless, this artificial noise injection could be critiqued in that it would not naturally emerge in a classic BNN architecture under an infinite-width limit. To address this, we show that a Bayesian deep neural network, where each layer width approaches infinity, and all network weights are elliptically distributed with infinite variance, converges to a process with $\alpha$-stable marginals in each layer that has a conditionally Gaussian representation. These conditional random covariance kernels could be recursively linked in the manner of Cho & Saul (2009), even though marginally the process exhibits stable behavior, and hence covariances are not even necessarily defined. We also provide useful generalizations of the recent results of Loría & Bhadra (2024) on shallow networks to multi-layer networks, and remedy the computational burden of their approach. The computational and statistical benefits over competing approaches stand out in simulations and in demonstrations on benchmark data sets.
Lifetime Data Analysis · 2024-05-28 · 3 citations
article · 1st author
Merging two cultures: Deep and statistical learning
Wiley Interdisciplinary Reviews Computational Statistics · 2024-03-01 · 3 citations
review · Open access · 1st author
Our goal is to provide a review of deep learning methods which provide insight into structured high‐dimensional data. Merging the two cultures of algorithmic and statistical learning sheds light on model construction and improved prediction and inference, leveraging the duality and trade‐off between the two. Prediction, interpolation, and uncertainty quantification can be achieved using probabilistic methods at the output layer of the model. Rather than using shallow additive architectures common to most statistical models, deep learning uses layers of semi‐affine input transformations to provide a predictive rule. Applying these layers of transformations leads to a set of attributes (or, features) to which probabilistic statistical methods can be applied. Thus, the best of both worlds can be achieved: scalable prediction rules fortified with uncertainty quantification where sparse regularization finds the features. We review the duality between shallow and wide models such as principal components regression and partial least squares, and deep but skinny architectures such as autoencoders, multilayer perceptrons, convolutional neural nets, and recurrent neural nets. The connection with data transformations is of practical importance for finding good network architectures. Incorporating probabilistic components at the output level allows for predictive uncertainty. We illustrate this idea by comparing plain Gaussian processes (GP) with partial least squares + Gaussian process (PLS + GP) and deep learning + Gaussian process (DL + GP). This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Deep Learning
Maximum a posteriori estimation in graphical models using local linear approximation
Stat · 2024-04-30 · 3 citations
article · Open access · Senior author
Sparse structure learning in high‐dimensional Gaussian graphical models is an important problem in multivariate statistical inference, since the sparsity pattern naturally encodes the conditional independence relationship among variables. However, maximum a posteriori (MAP) estimation is challenging under hierarchical prior models, and traditional numerical optimization routines or expectation–maximization algorithms are difficult to implement. To this end, our contribution is a novel local linear approximation scheme that circumvents this issue using a very simple computational algorithm. Most importantly, the condition under which our algorithm is guaranteed to converge to the MAP estimate is explicitly stated and is shown to cover a broad class of completely monotone priors, including the graphical horseshoe. Further, the resulting MAP estimate is shown to be sparse and consistent in the ‐norm. Numerical results validate the speed, scalability and statistical performance of the proposed method.
Recent grants
Bayesian Global-Local Shrinkage in High Dimensions
NSF · $100k · 2016–2019
Developments in Gaussian Processes and Beyond: Applications in Geostatistics and Deep Learning
NSF · $120k · 2020–2023
Frequent coauthors
- 67 shared
Edward L. Ionides
University of Michigan–Ann Arbor
- 55 shared
Mercedes Pascual
New York University
- 53 shared
Karina Laneri
Balseiro Institute
- 53 shared
Menno J. Bouma
- 49 shared
Ramesh C. Dhiman
- 45 shared
Jyotishka Datta
Virginia Tech
- 37 shared
Nicholas G. Polson
- 35 shared
Heather A. Eicher‐Miller
Purdue University West Lafayette
Education
- 2004
B.Tech. (Honors), Electronics and Electrical Communications Engineering
Indian Institute of Technology, Kharagpur
- 2010
Ph.D., Statistics
University of Michigan
Awards & honors
- University Faculty Scholar, Purdue University, 2023
- Young Statistical Scientist Award, International Indian Stat…
- Research Fellowship, Statistical and Applied Mathematical Sc…
- Seed for Success, Purdue University, 2017
- Department of Statistics Outstanding Assistant Professor Und…