Nicholas Bambos

Richard W. Weiland Professor in the School of Engineering and Professor of Electrical Engineering

Stanford University · Management Science and Engineering

Active 1989–2026

h-index: 36
Citations: 5.5k
Papers: 384 (51 in the last 5 years)

About

Nicholas Bambos is the Richard W. Weiland Professor in the School of Engineering at Stanford University, with a joint appointment in the Department of Electrical Engineering and the Department of Management Science & Engineering. He served as the Fortinet Founders Department Chair of Management Science & Engineering from 2016 to 2020. He heads the Computer Systems Performance Engineering Lab (Perf-Lab) at Stanford, where doctoral students and industry visitors work on research projects. He was also Director of the Stanford Networking Research Center from 1999 to 2005, overseeing a research initiative of about $30 million.

His research interests encompass the architecture and high-performance engineering of computer systems and networks, as well as data analytics with a focus on medical and healthcare analytics. His contributions span networking and the Internet, cloud computing, data centers, multimedia streaming, computer security, and digital health. His methodological expertise includes network control, online task scheduling, routing, distributed processing, machine learning, and artificial intelligence.

Bambos earned his Ph.D. in Electrical Engineering & Computer Sciences from the University of California at Berkeley in 1989. Before joining Stanford, he was an assistant and then tenured associate professor at UCLA. He has authored over 300 peer-reviewed publications and graduated more than 40 doctoral students, who have moved into leadership roles across academia, industry, finance, and startups. His awards include best research paper awards, the Cisco Systems Faculty Development Chair, the David Morgenthaler Faculty Scholar, the IBM Faculty Award, and the National Young Investigator Award from the NSF.

He has served on editorial boards, scientific committees, and technical review panels, and has worked as a consultant, startup co-founder, and expert witness in legal cases involving information technologies.

Research topics

  • Computer Science
  • Machine Learning
  • Mathematics
  • Geometry
  • Chemistry
  • Engineering
  • Internal medicine
  • Applied mathematics
  • Mathematical analysis
  • Mathematical optimization
  • Surgery
  • Medicine
  • Process engineering
  • Chromatography

Selected publications

  • High-Probability Bounds for SGD under the Polyak-Lojasiewicz Condition with Markovian Noise

    arXiv (Cornell University) · 2026-03-15

    preprint · Open access · Senior author

    We present the first uniform-in-time high-probability bound for SGD under the PL condition, where the gradient noise contains both Markovian and martingale difference components. This significantly broadens the scope of finite-time guarantees, as the PL condition arises in many machine learning and deep learning models while Markovian noise naturally arises in decentralized optimization and online system identification problems. We further allow the magnitude of noise to grow with the function value, enabling the analysis of many practical sampling strategies. In addition to the high-probability guarantee, we establish a matching $1/k$ decay rate for the expected suboptimality. Our proof technique relies on the Poisson equation to handle the Markovian noise and a probabilistic induction argument to address the lack of almost-sure bounds on the objective. Finally, we demonstrate the applicability of our framework by analyzing three practical optimization problems: token-based decentralized linear regression, supervised learning with subsampling for privacy amplification, and online system identification.

  • Last-Iterate Guarantees for Learning in Co-coercive Games

    arXiv (Cornell University) · 2026-04-21

    preprint · Open access · Senior author

    We establish finite-time last-iterate guarantees for vanilla stochastic gradient descent in co-coercive games under noisy feedback. This is a broad class of games that is more general than strongly monotone games, allows for multiple Nash equilibria, and includes examples such as quadratic games with negative semidefinite interaction matrices and potential games with smooth concave potentials. Prior work in this setting has relied on relative noise models, where the noise vanishes as iterates approach equilibrium, an assumption that is often unrealistic in practice. We work instead under a substantially more general noise model in which the second moment of the noise is allowed to scale affinely with the squared norm of the iterates, an assumption natural in learning with unbounded action spaces. Under this model, we prove a last-iterate bound of order $O(\log(t)/t^{1/3})$, the first such bound for co-coercive games under non-vanishing noise. We additionally establish almost sure convergence of the iterates to the set of Nash equilibria and derive time-average convergence guarantees.

  • Social Federated Learning (SFL): Leveraging Shared Data to Boost Learning Performance

    SSRN Electronic Journal · 2026-01-01

    preprint · Open access
  • Regret and Sample Complexity of Online Q-Learning via Concentration of Stochastic Approximation with Time-Inhomogeneous Markov Chains

    Open MIND · 2026-02-18

    preprint · Senior author

    We present the first regret bound for classical online Q-learning in infinite-horizon discounted Markov decision processes (MDPs), without relying on optimism or bonus terms. We first analyze Boltzmann Q-learning with decaying temperature and show that its regret depends critically on the suboptimality gap of the MDP: for sufficiently large gaps, the regret is sublinear, while for small gaps it deteriorates and can approach linear growth. To address this limitation, we study a Smoothed $ε_n$-Greedy exploration scheme that combines $ε_n$-greedy and Boltzmann exploration, for which we prove a gap-robust regret bound of near-$\tilde{O}(N^{9/10})$. We also obtain sample complexity guarantees, with both regret and sample complexity bounds holding with high probability. To analyze these algorithms, we develop a high-probability concentration bound for contractive Markovian stochastic approximation with iterate- and time-dependent transition dynamics. This bound may be of independent interest as the contraction factor in our framework is allowed to converge to one asymptotically.

  • Heavy-Tailed and Long-Range Dependent Noise in Stochastic Approximation: A Finite-Time Analysis

    ArXiv.org · 2026-03-20

    article · Open access · Senior author

    Stochastic approximation (SA) is a fundamental iterative framework with broad applications in reinforcement learning and optimization. Classical analyses typically rely on martingale difference or Markov noise with bounded second moments, but many practical settings, including finance and communications, frequently encounter heavy-tailed and long-range dependent (LRD) noise. In this work, we study SA for finding the root of a strongly monotone operator under these non-classical noise models. We establish the first finite-time moment bounds in both settings, providing explicit convergence rates that quantify the impact of heavy tails and temporal dependence. Our analysis employs a noise-averaging argument that regularizes the impact of noise without modifying the iteration. Finally, we apply our general framework to stochastic gradient descent (SGD) and gradient play, and corroborate our finite-time analysis through numerical experiments.

  • Policy Gradient Methods for Non-Markovian Reinforcement Learning

    arXiv (Cornell University) · 2026-05-11

    preprint · Open access · Senior author

    We study policy gradient methods for reinforcement learning in non-Markovian decision processes (NMDPs), where observations and rewards depend on the entire interaction history. To handle this dependence, the agent maintains an internal state that is recursively updated to provide a compact summary of past observations and actions. In contrast to approaches that treat the agent state dynamics as fixed or learn it via predictive objectives, we propose a reward-centric formulation that jointly optimizes the agent state dynamics and the control policy to maximize the expected cumulative reward. To this end, we consider a class of Agent State-Markov (ASM) policies, comprising an agent state dynamics and a control policy that maps the agent state to actions. We establish a novel policy gradient theorem for ASM policies, extending the classical policy gradient results from the Markovian setting to episodic and infinite-horizon discounted NMDPs. Building on this gradient expression, we propose the Agent State-Markov Policy Gradient (ASMPG) algorithm, which leverages the recursive structure of the agent state dynamics for efficient optimization. We establish finite-time and almost sure convergence guarantees, and empirically demonstrate that, on a range of non-Markovian tasks, ASMPG outperforms baselines that learn state representations via predictive objectives.

Frequent coauthors

  • Zhengyuan Zhou

    Fu Wai Hospital

    56 shared
  • Panayotis Mertikopoulos

    Laboratoire d'Informatique de Grenoble

    45 shared
  • Neal Master

    35 shared
  • Ilai Bistritz

    Tel Aviv University

    27 shared
  • Aditya Dua

    Proteus Digital Health

    24 shared
  • Carri W. Chan

    24 shared
  • Peter W. Glynn

    24 shared
  • Daniel Miller

    Icahn School of Medicine at Mount Sinai

    22 shared

Education

  • Ph.D., Electrical Engineering & Computer Sciences

    University of California, Berkeley

    1989
  • M.S., Electrical Engineering

    Stanford University

    1985
  • B.S., Electrical Engineering and Computer Science

    University of California, Berkeley

    1983

Awards & honors

  • Cisco Systems Faculty Development Chair
  • David Morgenthaler Faculty Scholar
  • IBM Faculty Award
  • National Young Investigator Award
  • Research Initiation Award from the National Science Foundati…