Zaid Harchaoui
Professor · University of Washington · Statistics
Active 2004–2026
About
Zaid Harchaoui is a professor in the Department of Statistics at the University of Washington. His research focuses on statistical and machine learning methods. His honors include the Criteo Faculty Research Award (2017), the Google Faculty Research Award (2018), and an Outstanding Paper Award from the Neural Information Processing Systems (NeurIPS) conference. He is an elected member of the International Statistical Institute (ISI) and a council member of the French Machine Learning Society.
Research topics
- Artificial Intelligence
- Computer Science
- Natural Language Processing
- Data Mining
- Philosophy
- Linguistics
Selected publications
Stochastic optimization on matrices and a graphon McKean–Vlasov limit
The Annals of Applied Probability · 2026-02-01
article · 1st author · Corresponding
Langevin diffusion approximation to same marginal Schrödinger bridge
Journal of Functional Analysis · 2026-03-31
article
A Generalization Theory for Zero-Shot Prediction
ArXiv.org · 2025-07-12
preprint · Open access · Senior author
A modern paradigm for generalization in machine learning and AI consists of pre-training a task-agnostic foundation model, generally obtained using self-supervised and multimodal contrastive learning. The resulting representations can be used for prediction on a downstream task for which no labeled data is available. We present a theoretical framework to better understand this approach, called zero-shot prediction. We identify the target quantities that zero-shot prediction aims to learn, or learns in passing, and the key conditional independence relationships that enable its generalization ability.
Supervised Stochastic Gradient Algorithms for Multi-Trial Source Separation
ArXiv.org · 2025-08-28
preprint · Open access · Senior author
We develop a stochastic algorithm for independent component analysis that incorporates multi-trial supervision, which is available in many scientific contexts. The method blends a proximal gradient-type algorithm in the space of invertible matrices with joint learning of a prediction model through backpropagation. We illustrate the proposed algorithm on synthetic and real data experiments. In particular, owing to the additional supervision, we observe an increased success rate of the non-convex optimization and the improved interpretability of the independent components.
Leveraging machine learning and accelerometry to classify animal behaviours with uncertainty
Methods in Ecology and Evolution · 2025-12-08 · 2 citations
article · Open access · Senior author
Animal‐worn sensors have revolutionised the study of animal behaviour and ecology. Accelerometers, which measure changes in acceleration across planes of movement, are increasingly being used in conjunction with machine learning models to classify animal behaviours across taxa and research questions. However, the widespread adoption of these methods faces challenges from imbalanced training data, unquantified uncertainties in model outputs, shifts in model performance across contexts and noisy classifications in continuous data streams, where predicted behaviours change abruptly within a sequence. To address these challenges, we introduce an open‐source approach for classifying animal behaviour from raw acceleration data. Our approach integrates machine learning and statistical inference techniques to evaluate and mitigate class imbalances, changes in model performance across ecological settings and noisy classifications. Importantly, we extend predictions from single behaviour classifications to prediction sets: sets of behaviour labels guaranteed to contain the true behaviour with a pre‐specified probability, in a framework analogous to the use of prediction intervals in statistical analyses. We evaluate our approach via simulation and highlight its utility using data collected from a free‐ranging large carnivore, African wild dogs (Lycaon pictus), in the Okavango Delta, Botswana. We demonstrate significantly improved predictions along with associated uncertainty metrics in African wild dog behaviour classification, particularly for rare and ecologically important behaviours such as feeding, where correct classifications more than doubled following quality checks and data rebalancing introduced in our pipeline. Our approach is applicable across taxa and represents a key step towards advancing the burgeoning use of machine learning to remotely observe around‐the‐clock behaviours of free‐ranging animals.
Future work could include the integration of multiple data streams, such as accelerometer, audio and GPS data, for model training and could be incorporated directly into our pipeline.
Min-Max Optimization with Dual-Linear Coupling
ArXiv.org · 2025-07-08
preprint · Open access · Senior author
We study a class of convex-concave min-max problems in which the coupled component of the objective is linear in at least one of the two decision vectors. We identify such problem structure as interpolating between the bilinearly and nonbilinearly coupled problems, motivated by key applications in areas such as distributionally robust optimization and convex optimization with functional constraints. Leveraging the considered nonlinear-linear coupling of the primal and the dual decision vectors, we develop a general algorithmic framework leading to fine-grained complexity bounds exploiting separability properties of the problem, whenever present. The obtained complexity bounds offer potential improvements over state-of-the-art scaling with $\sqrt{n}$ or $n$ in some of the considered problem settings, which even include bilinearly coupled problems, where $n$ is the dimension of the dual decision vector. On the algorithmic front, our work provides novel strategies for combining randomization with extrapolation and multi-point anchoring in the mirror descent-style updates in the primal and the dual, which we hope will find further applications in addressing related optimization problems.
Generative AI as a tool to accelerate the field of ecology
Nature Ecology & Evolution · 2025-01-29 · 29 citations
review
Langevin Diffusion Approximation to Same Marginal Schrödinger Bridge
arXiv (Cornell University) · 2025-05-12
preprint · Open access
We introduce a novel approximation to the same marginal Schrödinger bridge using the Langevin diffusion. As $\varepsilon \downarrow 0$, it is known that the barycentric projection (also known as the entropic Brenier map) of the Schrödinger bridge converges to the Brenier map, which is the identity. Our diffusion approximation is leveraged to show that, under suitable assumptions, the difference between the two is $\varepsilon$ times the gradient of the marginal log density (i.e., the score function), in $\mathbf{L}^2$. More generally, we show that the family of Markov operators, indexed by $\varepsilon > 0$, derived from integrating test functions against the conditional density of the static Schrödinger bridge at temperature $\varepsilon$, admits a derivative at $\varepsilon=0$ given by the generator of the Langevin semigroup. Hence, these operators satisfy an approximate semigroup property at low temperatures.
2025-07-03
peer-review · Senior author
From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models
arXiv (Cornell University) · 2024-06-24 · 2 citations
preprint · Open access · Senior author
One of the most striking findings in modern research on large language models (LLMs) is that scaling up compute during training leads to better results. However, less attention has been given to the benefits of scaling compute during inference. This survey focuses on these inference-time approaches. We explore three areas under a unified mathematical formalism: token-level generation algorithms, meta-generation algorithms, and efficient generation. Token-level generation algorithms, often called decoding algorithms, operate by sampling a single token at a time or constructing a token-level search space and then selecting an output. These methods typically assume access to a language model's logits, next-token distributions, or probability scores. Meta-generation algorithms work on partial or full sequences, incorporating domain knowledge, enabling backtracking, and integrating external information. Efficient generation methods aim to reduce token costs and improve the speed of generation. Our survey unifies perspectives from three research communities: traditional natural language processing, modern LLMs, and machine learning systems.
Recent grants
TRIPODS+X:RES: Safe Imitation Learning for Robotics
NSF · $600k · 2018–2022
Frequent coauthors
- Cordelia Schmid · 131 shared
- Julien Mairal · 66 shared
- Jérôme Revaud · 54 shared
- Vincent Roulet · 45 shared
- Yury Maximov (Los Alamos National Laboratory) · 41 shared
- Massih-Reza Amini · 40 shared
- Philippe Weinzaepfel · 37 shared
- Jérôme Malick · 32 shared
Awards & honors
- Council Member, French Machine Learning Society (2022)
- Criteo Faculty Research Award, Criteo AI Lab (2017)
- Elected Member, International Statistical Institute (ISI) (2…
- Google Faculty Research Award, Google (2018)
- Outstanding Paper Award, Neural Information Processing Syste…