
Arash Ali Amini
Associate Professor · University of California, Los Angeles · Statistics
Active 2010–2026
About
Arash Ali Amini is an Associate Professor in the Department of Statistics & Data Science at UCLA. His research focuses on high-dimensional inference, machine learning, optimization, and networks, where he develops statistical methods and computational techniques. He also teaches and mentors students and takes part in departmental activities and research initiatives at UCLA.
Research topics
- Computer Science
- Artificial Intelligence
- Mathematics
- Machine Learning
- Data Mining
- Statistics
- Astrophysics
- Medicine
- Astronomy
- Physics
- Algorithm
- Theoretical computer science
- Mathematical optimization
- Internal medicine
Selected publications
Exact Combinatorial Multi-Class Graph Cuts for Semi-Supervised Learning
Proceedings of the AAAI Conference on Artificial Intelligence · 2026-03-14
article · Open access · Senior author
Semi-supervised learning (SSL) on graphs is critical in applications where labeled data are scarce and costly, yet existing graph-based methods often degrade under extreme label sparsity or class imbalance, yielding trivial or unstable solutions. We introduce CombCut, the first exact combinatorial optimization framework for multi-class graph-based semi-supervised learning that operates directly on binary one-hot assignments, without any convex relaxation or heuristic volume constraints. By employing a minorization–maximization (MM) scheme, CombCut transforms each step into a structured linear assignment problem solved efficiently via network-flow algorithms. Total unimodularity guarantees integral iterates, and our theoretical analysis establishes both monotonic ascent of the true discrete objective and convergence of every limit point to a Karush–Kuhn–Tucker (KKT) stationary solution of the original combinatorial problem. Our approach requires no hyperparameter tuning and scales near-linearly in the number of vertices. Empirical evaluation on MNIST, Fashion-MNIST, and CIFAR-10 with as few as 1–5 labels per class shows that CombCut excels in worst-case labeling scenarios, significantly outperforming state-of-the-art graph-SSL baselines and yielding more stable and accurate label propagation under severe supervision constraints.
Bayesian Community Detection for Networks with Covariates
Bayesian Analysis · 2024-03-15 · 5 citations
article · Open access
The increasing prevalence of network data in a vast variety of fields and the need to extract useful information out of them have spurred fast developments in related models and algorithms. Among the various learning tasks with network data, community detection, the discovery of node clusters or “communities,” has arguably received the most attention in the scientific community. In many real-world applications, the network data often come with additional information in the form of node or edge covariates that should ideally be leveraged for inference. In this paper, we add to a limited literature on community detection for networks with covariates by proposing a Bayesian stochastic block model with a covariate-dependent random partition prior. Under our prior, the covariates are explicitly expressed in specifying the prior distribution on the cluster membership. Our model has the flexibility of modeling uncertainties of all the parameter estimates including the community membership. Importantly, and unlike the majority of existing methods, our model has the ability to learn the number of the communities via posterior inference without having to assume it to be known. Our model can be applied to community detection in both dense and sparse networks, with both categorical and continuous covariates, and our MCMC algorithm is very efficient with good mixing properties. We demonstrate the superior performance of our model over existing models in a comprehensive simulation study and an application to two real datasets.
Federated Learning of Generalized Linear Causal Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence · 2024-03-26 · 11 citations
article · Open access
Causal discovery, the inference of causal relations among variables from data, is a fundamental problem of science. Nowadays, due to an increased awareness of data privacy concerns, there has been a shift towards distributed data collection, processing and storage. To meet the pressing need for distributed causal discovery, we propose a novel federated DAG learning method called distributed annealing on regularized likelihood score (DARLS) to learn a causal graph from data stored on multiple clients. DARLS simulates an annealing process to search over the space of topological sorts, where the optimal graphical structure compatible with a sort is found by distributed optimization. This distributed optimization relies on multiple rounds of communication between local clients and a central server to estimate the graphical structure. We establish its convergence to the solution obtained by an oracle with access to all the data. To the best of our knowledge, DARLS is the first distributed method for learning causal graphs with such finite-sample oracle guarantees. To establish the consistency of DARLS, we also derive new identifiability results for causal graphs parameterized by generalized linear models, which could be of independent interest. Through extensive simulation studies and a real-world application, we show that DARLS outperforms existing federated learning methods and is comparable to oracle methods on pooled data, demonstrating its great advantages in estimating causal networks from distributed data.
Hierarchical Stochastic Block Model for Community Detection in Multiplex Networks
Bayesian Analysis · 2023-01-05 · 28 citations
article · Open access · 1st author · Corresponding
Multiplex networks have become increasingly more prevalent in many fields, and have emerged as a powerful tool for modeling the complexity of real networks. There is a critical need for developing inference models for multiplex networks that can take into account potential dependencies across different layers, particularly when the aim is community detection. We add to a limited literature by proposing a novel and efficient Bayesian model for community detection in multiplex networks. A key feature of our approach is the ability to model varying communities at different network layers. In contrast, many existing models assume the same communities for all layers. Moreover, our model automatically picks up the necessary number of communities at each layer (as validated by real data examples). This is appealing, since deciding the number of communities is a challenging aspect of community detection, and especially so in the multiplex setting, if one allows the communities to change across layers. Borrowing ideas from hierarchical Bayesian modeling, we use a hierarchical Dirichlet prior to model community labels across layers, allowing dependency in their structure. Given the community labels, a stochastic block model (SBM) is assumed for each layer. We develop an efficient slice sampler for sampling the posterior distribution of the community labels as well as the link probabilities between communities. In doing so, we address some unique challenges posed by coupling the complex likelihood of SBM with the hierarchical nature of the prior on the labels. An extensive empirical validation is performed on simulated and real data, demonstrating the superior performance of the model over single-layer alternatives, as well as the ability to uncover interesting structures in real networks.
Adjusted chi-square test for degree-corrected block models
The Annals of Statistics · 2023-12-01 · 5 citations
article · Senior author
We propose a goodness-of-fit test for degree-corrected stochastic block models (DCSBM). The test is based on an adjusted chi-square statistic for measuring equality of means among groups of $n$ multinomial distributions with $d_1,\dots,d_n$ observations. In the context of network models, the number of multinomials, $n$, grows much faster than the number of observations, $d_i$, corresponding to the degree of node $i$, hence the setting deviates from classical asymptotics. We show that a simple adjustment allows the statistic to converge in distribution, under null, as long as the harmonic mean of $\{d_i\}$ grows to infinity. When applied sequentially, the test can also be used to determine the number of communities. The test operates on a compressed version of the adjacency matrix, conditional on the degrees, and as a result is highly scalable to large sparse networks. We incorporate a novel idea of compressing the rows based on a $(K+1)$-community assignment when testing for $K$ communities. This approach increases the power in sequential applications without sacrificing computational efficiency, and we prove its consistency in recovering the number of communities. Since the test statistic does not rely on a specific alternative, its utility goes beyond sequential testing and can be used to simultaneously test against a wide range of alternatives outside the DCSBM family. In particular, we prove that the test is consistent against a general family of latent-variable network models with community structure. We show the effectiveness of the approach by extensive numerical experiments with simulated and real data. In particular, applying the test to the Facebook-100 data set, a collection of one hundred social networks, we find that a DCSBM with a small number of communities (say, fewer than 25) is far from a good fit in almost all cases.
Finding quadruply imaged quasars with machine learning – I. Methods
Monthly Notices of the Royal Astronomical Society · 2022 · 12 citations
- Artificial Intelligence
- Physics
- Astrophysics
Strongly lensed quadruply imaged quasars (quads) are extraordinary objects. They are very rare in the sky and yet they provide unique information about a wide range of topics, including the expansion history and the composition of the Universe, the distribution of stars and dark matter in galaxies, the host galaxies of quasars, and the stellar initial mass function. Finding them in astronomical images is a classic ‘needle in a haystack’ problem, as they are outnumbered by other (contaminant) sources by many orders of magnitude. To solve this problem, we develop state-of-the-art deep learning methods and train them on realistic simulated quads based on real images of galaxies taken from the Dark Energy Survey, with realistic source and deflector models, including the chromatic effects of microlensing. The performance of the best methods on a mixture of simulated and real objects is excellent, yielding area under the receiver operating curve in the range of 0.86–0.89. Recall is close to 100 per cent down to total magnitude i ∼ 21 indicating high completeness, while precision declines from 85 per cent to 70 per cent in the range i ∼ 17–21. The methods are extremely fast: training on 2 million samples takes 20 h on a GPU machine, and 10^8 multiband cut-outs can be evaluated per GPU-hour. The speed and performance of the method pave the way to apply it to large samples of astronomical sources, bypassing the need for photometric pre-selection that is likely to be a major cause of incompleteness in current samples of known quads.
Distributed Learning of Generalized Linear Causal Networks
arXiv (Cornell University) · 2022-01-23 · 2 citations
preprint · Open access
We consider the task of learning causal structures from data stored on multiple machines, and propose a novel structure learning method called distributed annealing on regularized likelihood score (DARLS) to solve this problem. We model causal structures by a directed acyclic graph that is parameterized with generalized linear models, so that our method is applicable to various types of data. To obtain a high-scoring causal graph, DARLS simulates an annealing process to search over the space of topological sorts, where the optimal graphical structure compatible with a sort is found by a distributed optimization method. This distributed optimization relies on multiple rounds of communication between local and central machines to estimate the optimal structure. We establish its convergence to a global optimizer of the overall score that is computed on all data across local machines. To the best of our knowledge, DARLS is the first distributed method for learning causal graphs with such theoretical guarantees. Through extensive simulation studies, DARLS has shown competing performance against existing methods on distributed data, and achieved comparable structure learning accuracy and test-data likelihood with competing methods applied to pooled data across all local machines. In a real-world application for modeling protein-DNA binding networks with distributed ChIP-Sequencing data, DARLS also exhibits higher predictive power than other methods, demonstrating a great advantage in estimating causal networks from distributed data.
nett: Network Analysis and Community Detection
2022-11-09 · 1 citation
dataset · Open access · 1st author · Corresponding
Features tools for network data analysis and community detection. Provides multiple methods for fitting, model selection and goodness-of-fit testing in degree-corrected stochastic block models. Most of the computations are fast and scalable for sparse networks, especially for Poisson versions of the models. Implements the following: Amini, Chen, Bickel and Levina (2013), doi:10.1214/13-AOS1138; Bickel and Sarkar (2015), doi:10.1111/rssb.12117; Lei (2016), doi:10.1214/15-AOS1370; Wang and Bickel (2017), doi:10.1214/16-AOS1457; Zhang and Amini (2020), doi:10.48550/arXiv.2012.15047; Le and Levina (2022), doi:10.1214/21-EJS1971.
Performance evaluation of automotive dealerships using grouped mixture of regressions
Expert Systems with Applications · 2022-11-17 · 2 citations
article · Senior author
Spectrally-truncated kernel ridge regression and its free lunch
Electronic Journal of Statistics · 2021-01-01
preprint · Open access · 1st author · Corresponding
Kernel ridge regression (KRR) is a well-known and popular nonparametric regression approach with many desirable properties, including minimax rate-optimality in estimating functions that belong to common reproducing kernel Hilbert spaces (RKHS). The approach, however, is computationally intensive for large data sets, due to the need to operate on a dense $n \times n$ kernel matrix, where $n$ is the sample size. Recently, various approximation schemes for solving KRR have been considered, and some analyzed. Some approaches such as Nyström approximation and sketching have been shown to preserve the rate optimality of KRR. In this paper, we consider the simplest approximation, namely, spectrally truncating the kernel matrix to its largest $r < n$ eigenvalues. We derive an exact expression for the maximum risk of this truncated KRR, over the unit ball of the RKHS. This result can be used to study the exact trade-off between the level of spectral truncation and the regularization parameter. We show that, as long as the RKHS is infinite-dimensional, there is a threshold on $r$, above which, the spectrally-truncated KRR surprisingly outperforms the full KRR in terms of the minimax risk, where the minimum is taken over the regularization parameter. This strengthens the existing results on approximation schemes, by showing that not only one does not lose in terms of the rates, truncation can in fact improve the performance, for all finite samples (above the threshold). Moreover, we show that the implicit regularization achieved by spectral truncation is not a substitute for Hilbert norm regularization. Both are needed to achieve the best performance.
Recent grants
CAREER: High-Dimensional Statistical Models for Unsupervised Learning
NSF · $400k · 2020–2026
Frequent coauthors
- Zahra S. Razaee · Cedars-Sinai Medical Center · 9 shared
- Qing Zhou · University of California, Los Angeles · 8 shared
- XuanLong Nguyen · 5 shared
- Elizaveta Levina · 5 shared
- Bryon Aragam · 5 shared
- Alyson K. Fletcher · University of California, Los Angeles · 4 shared
- Linfan Zhang · University of California, Los Angeles · 4 shared
- Parthe Pandit · 4 shared