Zaid Harchaoui

Professor · Verified

University of Washington · Statistics

Active 2004–2026

h-index: 62
Citations: 15.9k
Papers: 285 (101 in the last 5 years)
Funding: $600k

About

Zaid Harchaoui is a professor in the Department of Statistics at the University of Washington. His research focuses on statistical and machine learning methods. His honors include the Criteo Faculty Research Award and the Google Faculty Research Award, as well as recognition from Neural Information Processing Systems (NeurIPS). He is an elected member of the International Statistical Institute (ISI) and serves as a council member of the French Machine Learning Society.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Natural Language Processing
  • Data Mining
  • Philosophy
  • Linguistics

Selected publications

  • Stochastic optimization on matrices and a graphon McKean–Vlasov limit

    The Annals of Applied Probability · 2026-02-01

    article · 1st author · Corresponding
  • Langevin diffusion approximation to same marginal Schrödinger bridge

    Journal of Functional Analysis · 2026-03-31

    article
  • A Generalization Theory for Zero-Shot Prediction

    ArXiv.org · 2025-07-12

    preprint · Open access · Senior author

    A modern paradigm for generalization in machine learning and AI consists of pre-training a task-agnostic foundation model, generally obtained using self-supervised and multimodal contrastive learning. The resulting representations can be used for prediction on a downstream task for which no labeled data is available. We present a theoretical framework to better understand this approach, called zero-shot prediction. We identify the target quantities that zero-shot prediction aims to learn, or learns in passing, and the key conditional independence relationships that enable its generalization ability.

  • Supervised Stochastic Gradient Algorithms for Multi-Trial Source Separation

    ArXiv.org · 2025-08-28

    preprint · Open access · Senior author

    We develop a stochastic algorithm for independent component analysis that incorporates multi-trial supervision, which is available in many scientific contexts. The method blends a proximal gradient-type algorithm in the space of invertible matrices with joint learning of a prediction model through backpropagation. We illustrate the proposed algorithm on synthetic and real data experiments. In particular, owing to the additional supervision, we observe an increased success rate of the non-convex optimization and the improved interpretability of the independent components.

  • Leveraging machine learning and accelerometry to classify animal behaviours with uncertainty

    Methods in Ecology and Evolution · 2025-12-08 · 2 citations

    article · Open access · Senior author

    Animal‐worn sensors have revolutionised the study of animal behaviour and ecology. Accelerometers, which measure changes in acceleration across planes of movement, are increasingly being used in conjunction with machine learning models to classify animal behaviours across taxa and research questions. However, the widespread adoption of these methods faces challenges from imbalanced training data, unquantified uncertainties in model outputs, shifts in model performance across contexts and noisy classifications in continuous data streams, where predicted behaviours change abruptly within a sequence. To address these challenges, we introduce an open‐source approach for classifying animal behaviour from raw acceleration data. Our approach integrates machine learning and statistical inference techniques to evaluate and mitigate class imbalances, changes in model performance across ecological settings and noisy classifications. Importantly, we extend predictions from single behaviour classifications to prediction sets: sets of behaviour labels guaranteed to contain the true behaviour with a pre‐specified probability, in a framework analogous to the use of prediction intervals in statistical analyses. We evaluate our approach via simulation and highlight its utility using data collected from a free‐ranging large carnivore, African wild dogs (Lycaon pictus), in the Okavango Delta, Botswana. We demonstrate significantly improved predictions along with associated uncertainty metrics in African wild dog behaviour classification, particularly for rare and ecologically important behaviours such as feeding, where correct classifications more than doubled following quality checks and data rebalancing introduced in our pipeline. Our approach is applicable across taxa and represents a key step towards advancing the burgeoning use of machine learning to remotely observe around‐the‐clock behaviours of free‐ranging animals. Future work could include the integration of multiple data streams, such as accelerometer, audio and GPS data, for model training and could be incorporated directly into our pipeline.

  • Min-Max Optimization with Dual-Linear Coupling

    ArXiv.org · 2025-07-08

    preprint · Open access · Senior author

    We study a class of convex-concave min-max problems in which the coupled component of the objective is linear in at least one of the two decision vectors. We identify such problem structure as interpolating between the bilinearly and nonbilinearly coupled problems, motivated by key applications in areas such as distributionally robust optimization and convex optimization with functional constraints. Leveraging the considered nonlinear-linear coupling of the primal and the dual decision vectors, we develop a general algorithmic framework leading to fine-grained complexity bounds exploiting separability properties of the problem, whenever present. The obtained complexity bounds offer potential improvements over state-of-the-art scaling with $\sqrt{n}$ or $n$ in some of the considered problem settings, which even include bilinearly coupled problems, where $n$ is the dimension of the dual decision vector. On the algorithmic front, our work provides novel strategies for combining randomization with extrapolation and multi-point anchoring in the mirror descent-style updates in the primal and the dual, which we hope will find further applications in addressing related optimization problems.

  • Generative AI as a tool to accelerate the field of ecology

    Nature Ecology & Evolution · 2025-01-29 · 29 citations

    review
  • Langevin Diffusion Approximation to Same Marginal Schrödinger Bridge

    arXiv (Cornell University) · 2025-05-12

    preprint · Open access

    We introduce a novel approximation to the same marginal Schrödinger bridge using the Langevin diffusion. As $\varepsilon \downarrow 0$, it is known that the barycentric projection (also known as the entropic Brenier map) of the Schrödinger bridge converges to the Brenier map, which is the identity. Our diffusion approximation is leveraged to show that, under suitable assumptions, the difference between the two is $\varepsilon$ times the gradient of the marginal log density (i.e., the score function), in $\mathbf{L}^2$. More generally, we show that the family of Markov operators, indexed by $\varepsilon > 0$, derived from integrating test functions against the conditional density of the static Schrödinger bridge at temperature $\varepsilon$, admits a derivative at $\varepsilon=0$ given by the generator of the Langevin semigroup. Hence, these operators satisfy an approximate semigroup property at low temperatures.

  • Author response for "Leveraging machine learning and accelerometry to classify animal behaviours with uncertainty"

    2025-07-03

    peer-review · Senior author
  • From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

    arXiv (Cornell University) · 2024-06-24 · 2 citations

    preprint · Open access · Senior author

    One of the most striking findings in modern research on large language models (LLMs) is that scaling up compute during training leads to better results. However, less attention has been given to the benefits of scaling compute during inference. This survey focuses on these inference-time approaches. We explore three areas under a unified mathematical formalism: token-level generation algorithms, meta-generation algorithms, and efficient generation. Token-level generation algorithms, often called decoding algorithms, operate by sampling a single token at a time or constructing a token-level search space and then selecting an output. These methods typically assume access to a language model's logits, next-token distributions, or probability scores. Meta-generation algorithms work on partial or full sequences, incorporating domain knowledge, enabling backtracking, and integrating external information. Efficient generation methods aim to reduce token costs and improve the speed of generation. Our survey unifies perspectives from three research communities: traditional natural language processing, modern LLMs, and machine learning systems.

Frequent coauthors

  • Cordelia Schmid

    131 shared
  • Julien Mairal

    66 shared
  • Jérôme Revaud

    54 shared
  • Vincent Roulet

    45 shared
  • Yury Maximov

    Los Alamos National Laboratory

    41 shared
  • Massih-Reza Amini

    40 shared
  • Philippe Weinzaepfel

    37 shared
  • Jérôme Malick

    32 shared

Awards & honors

  • Council Member, French Machine Learning Society (2022)
  • Criteo Faculty Research Award, Criteo AI Lab (2017)
  • Elected Member, International Statistical Institute (ISI) (2…
  • Google Faculty Research Award, Google (2018)
  • Outstanding Paper Award, Neural Information Processing Syste…