Erhan Bayraktar

Professor · Verified

University of Michigan · Mathematics

Active 2003–2026

h-index: 36

Citations: 4.7k

Papers: 657 (146 in the last 5 years)

Funding: $1.7M (1 active grant)



About

Erhan Bayraktar is a full professor of Mathematics at the University of Michigan, where he has been since 2004, and has held the Susan Smith Chair since 2010. His research areas include stochastic analysis, control, applied probability, mean field games, machine learning, and mathematical finance. He has authored over 200 publications in leading journals in these fields.

Professor Bayraktar serves as a corresponding editor for the SIAM Journal on Control and Optimization and sits on the editorial boards of several other journals, including Applied Mathematics and Optimization, Frontiers in Mathematical Finance, Mathematics of Operations Research, and Mathematical Finance. His research has been continuously funded by the National Science Foundation, including a CAREER grant. He has been a plenary speaker at numerous conferences and workshops worldwide.

In addition to his research, Professor Bayraktar has directed the Risk Management and Quantitative Finance Masters program since 2015, and he organizes seminars and international workshops in stochastic analysis for finance and insurance. He has mentored 17 Ph.D. students, 13 of whom have graduated and now hold academic and industry positions, and has mentored over 40 postdoctoral researchers.

Research topics

Mathematics
Mathematical analysis
Humanities
Physics
Mathematical optimization
Computer Science
Artificial Intelligence
Mathematical physics
Pure mathematics
Philosophy
Economics
Mathematical economics
Combinatorics
Algorithm
Statistics
Applied mathematics

Selected publications

Reinforcement Learning for Discounted and Ergodic Control of Diffusion Processes
ArXiv.org · 2026-03-13
Article · Open access · 1st author · Corresponding
This paper develops a quantized Q-learning algorithm for the optimal control of controlled diffusion processes on $\mathbb{R}^d$ under both discounted and ergodic (average) cost criteria. We first establish near-optimality of finite-state MDP approximations to discrete-time discretizations of the diffusion, then introduce a quantized Q-learning scheme and prove its almost-sure convergence to near-optimal policies for the finite MDP. These policies, when interpolated to continuous time, are shown to be near-optimal for the original diffusion model under discounted costs and -- via a vanishing-discount argument -- also under ergodic costs for sufficiently small discount factors. The analysis applies under mild conditions (Lipschitz dynamics, non-degeneracy, bounded continuous costs, and Lyapunov stability for ergodic case) without requiring prior knowledge of the system dynamics or restrictions on control policies (beyond admissibility). Our results complement recent work on continuous-time reinforcement learning for diffusions by providing explicit near-optimality rates and extending rigorous guarantees both for discounted cost and ergodic cost criteria for diffusions with unbounded state space.
Publisher OA PDF
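The pipeline in the abstract above (discretize the diffusion in time, quantize the state, then run Q-learning on the resulting finite model) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-dimensional dynamics dX = u dt + sigma dW, the truncated quadratic cost, the clipping of the state to [-2, 2], the grid and action set, the uniform exploration, and the 1/visits learning rate. Only the discounted criterion is shown.

```python
import numpy as np


def quantize(x, grid):
    """Return the index of the grid point nearest to the continuous state x."""
    return int(np.argmin(np.abs(grid - x)))


def quantized_q_learning(n_steps=20000, h=0.05, sigma=0.5, beta=0.95, seed=0):
    """Tabular Q-learning on a quantized Euler-Maruyama discretization.

    Hypothetical model (not from the paper): dX = u dt + sigma dW with
    running cost min(x^2, 4), time step h, per-step discount factor beta.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(-2.0, 2.0, 21)      # quantized state space
    actions = np.array([-1.0, 0.0, 1.0])   # finite control set
    Q = np.zeros((grid.size, actions.size))
    visits = np.zeros_like(Q)

    x = 0.0
    for _ in range(n_steps):
        s = quantize(x, grid)
        a = rng.integers(actions.size)     # uniform random exploration
        cost = min(x * x, 4.0) * h         # bounded cost accrued over one step

        # Euler-Maruyama step; clipping to a compact set is a simplification.
        noise = sigma * np.sqrt(h) * rng.standard_normal()
        x_next = float(np.clip(x + actions[a] * h + noise, -2.0, 2.0))
        s_next = quantize(x_next, grid)

        # Q-learning update for cost minimization (min over next actions).
        visits[s, a] += 1
        lr = 1.0 / visits[s, a]
        Q[s, a] += lr * (cost + beta * Q[s_next].min() - Q[s, a])
        x = x_next
    return grid, actions, Q


grid, actions, Q = quantized_q_learning()
greedy_controls = actions[Q.argmin(axis=1)]  # piecewise-constant feedback on the grid
```

The greedy table can be read as a feedback policy that is constant on each quantization cell and held fixed over each time step, which is one simple way to interpolate a finite-model policy back to continuous time.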
When Diffusion Model Can Ignore Dimension: An Entropy-Based Theory
ArXiv.org · 2026-05-08
Article · Open access · Senior author
Diffusion models perform remarkably well on high-dimensional data such as images, often using only a modest number of reverse-time steps. Despite this practical success, existing convergence theory does not fully explain why such samplers remain efficient in high dimensions. Many prior KL guarantees bound the discretization error in terms of the ambient dimension, while other improved results replace this dependence using intrinsic-dimensional or geometric structure assumptions. In this work, we develop an alternative information-theoretic perspective on diffusion sampler convergence. We prove that, for Gaussian mixture targets, the discretization error is controlled by the Shannon entropy of the latent mixture component rather than by the ambient dimension. Consequently, the leading step complexity scales linearly with latent entropy and depends only logarithmically on the second moment of the data. Our analysis also extends to discrete target distributions, where the relevant complexity is the entropy of the target rather than the dimension of the embedding space. These results suggest that diffusion sampling can remain efficient in high-dimensional spaces when the data distribution admits a compact latent representation, as is widely believed to be the case for natural images.
Publisher OA PDF
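As a reading aid for the abstract above, the latent entropy of a Gaussian mixture can be written out explicitly. The notation ($K$, $\pi_k$, $\mu_k$, $\Sigma_k$, and the latent label $Z$) is assumed here for illustration and is not taken from the paper:

$$
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x;\, \mu_k, \Sigma_k), \qquad
H(Z) = -\sum_{k=1}^{K} \pi_k \log \pi_k \;\le\; \log K .
$$

With uniform weights $\pi_k = 1/K$, this gives $H(Z) = \log K$ whatever the ambient dimension $d$ of $x$ is, which is the sense in which an entropy-based bound can be free of $d$.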
Generalizing super/sub MOT using weak ℓ1 transport
Bernoulli · 2026-04-29
Article · 1st author · Corresponding
Publisher DOI
The Demand Externality of Automation
ArXiv.org · 2026-05-06
Article · Open access · 1st author · Corresponding
Automation raises productivity and reduces paid human labor, but it also reallocates income and ownership claims. This paper studies that tradeoff in a static benchmark and in a stationary heterogeneous-agent general equilibrium. Firms choose automation from a profit function. Households differ by skill and wealth, save in a capital/equity claim, and face incomplete insurance. Wages and returns are determined by market clearing from a Cobb--Douglas final-good firm, while the wealth distribution is pinned down by a Hamilton--Jacobi--Bellman (HJB) equation and a Kolmogorov forward equation (KFE). The paper is deliberately two-sided. With strong productivity growth, high-skill complementarity, low obsolescence, and broad ownership, automation raises output, capital, and consumption. With strong exposure of low-wealth, high-marginal-propensity-to-consume (high-MPC) households and concentrated ownership, privately chosen automation can be excessive even though it raises high-skilled labor income. The central object is the derivative of household consumption demand and collective wage bill with respect to automation. Fiscal policy is modeled as a government problem rather than as an abstract planner: a tax changes the firm's automation first-order condition, raises revenue only on the remaining automation base, and must specify rebates and administrative losses.
Publisher OA PDF
Equilibrium for Time-inconsistent Mean Field Games: A Systematic Analysis by Entropy Regularization
ArXiv.org · 2026-05-14
Article · Open access · 1st author · Corresponding
This paper studies the existence and approximation of equilibria for general time-inconsistent mean field game (MFG) problems in the continuous-time setting. To handle the intricate nonlocal equilibrium Hamilton-Jacobi-Bellman (EHJB) system arising from initial-time dependence, such as non-exponential discounting, we develop a vanishing entropy regularization approach for solving the MFG. With entropy regularization, we first characterize the regularized equilibrium via a coupled exploratory equilibrium HJB (EEHJB) equation and a law-dependent stochastic differential equation. By exploiting Schauder fixed-point arguments and tailored parabolic regularity estimates in a suitable functional space involving both value functions and measure flows, we establish the global existence of regularized equilibria under mild assumptions. We next analyze convergence as the entropy regularization vanishes. By employing compactness arguments, Young measure techniques, and a duality tool for divergence-form Fokker-Planck equations, we prove that the regularized equilibria converge, up to subsequences, to an equilibrium of the original time-inconsistent MFG. Furthermore, under entropy regularization, we propose a policy iteration algorithm and establish its convergence under a short time horizon and weak terminal interaction conditions.
Publisher OA PDF
Continuous-time Online Learning via Mean-Field Neural Networks: Regret Analysis in Diffusion Environments
arXiv (Cornell University) · 2026-04-13
Preprint · Open access · 1st author · Corresponding
We study continuous-time online learning where data are generated by a diffusion process with unknown coefficients. The learner employs a two-layer neural network, continuously updating its parameters in a non-anticipative manner. The mean-field limit of the learning dynamics corresponds to a stochastic Wasserstein gradient flow adapted to the data filtration. We establish regret bounds for both the mean-field limit and finite-particle system. Our analysis leverages the logarithmic Sobolev inequality, Polyak-Lojasiewicz condition, Malliavin calculus, and uniform-in-time propagation of chaos. Under displacement convexity, we obtain a constant static regret bound. In the general non-convex setting, we derive explicit linear regret bounds characterizing the effects of data variation, entropic exploration, and quadratic regularization. Finally, our simulations demonstrate the outperformance of the online approach and the impact of network width and regularization parameters.
Publisher DOI
Policy Gradient for Continuous-Time Mean-Field Control
arXiv (Cornell University) · 2026-05-20
Preprint · Open access · 1st author · Corresponding
This paper develops a policy gradient method for entropy-regularized mean-field control in the discounted infinite-horizon setting. We consider randomized feedback policies and a coupled representative-particle/population system, in which the representative state evolves jointly with a population law governed by a McKean--Vlasov equation. The resulting value function is therefore defined on the product space $\mathbb R^d \times \mathcal P_2(\mathbb R^d)$. A key distinction from existing policy gradient methods for mean-field control is that, after computing the value function under a fixed policy, our approach does not require solving an additional equation to obtain the policy gradient. Instead, we derive an explicit policy gradient formula directly in terms of the value function. The formulation is based on an instantaneous advantage function, which quantifies the gain of taking a given action relative to the current randomized policy. We establish a Gâteaux policy-gradient formula, which gives the first-order variation of the objective along arbitrary policy perturbations, and then derive the corresponding ascent direction under finite-dimensional policy parametrization. The resulting formula leads to a model-based actor--critic scheme. The critic is obtained by solving the associated linear stationary Hamilton--Jacobi--Bellman equation for the value function, using cylindrical functions to represent dependence on the population law. The actor is then updated according to the derived policy-gradient formula. We further analyze the well-posedness of the PDE in a polynomial-growth function class. Finally, we illustrate the proposed method through numerical experiments on an LQR model and a crowd-motion problem.
Publisher DOI
Analytical Approach to Continuous-Time Causal Optimal Transport
arXiv (Cornell University) · 2026-05-19
Preprint · Open access
We study causal optimal transport in continuous time, with Markovian cost, between a finite-state Markov source and a diffusion target. By replacing the source with its conditional law given the observation of the target, we characterize the value of this transport problem through a fully nonlinear parabolic master equation on an enlarged state space. We further show that this value coincides with those of two equivalent stochastic control problems on the simplex: a control of the Kushner--Stratonovich filtering equation with a zero-mean condition, and a state-constrained stochastic optimal control problem. Both formulations give rise to implementable numerical schemes that approximate the value from above and below.
Publisher DOI

Recent grants

New Developments in Mean Field Game Theory and Applications
NSF · $330k · 2021–2026
CAREER: Topics in Optimal Stopping and Control
NSF · $400k · 2010–2016
New Problems in Stochastic Control Motivated by Mathematical Finance
NSF · $339k · 2016–2020
Problems in Stochastic Control, Incomplete Markets, and Stochastic Limit Theorems
NSF · $89k · 2006–2009
AMC-SS: Problems in Mathematical Finance
NSF · $282k · 2009–2013

Frequent coauthors

Virginia R. Young
89 shared
Song Yao
University of Pittsburgh
45 shared
Yuchong Zhang
University of Toronto
35 shared
Xin Zhang
34 shared
Zhou Zhou
Chongqing University
28 shared
H. Vincent Poor
Princeton University
28 shared
Bahman Angoshtari
27 shared
Hao Xing
Citadel
23 shared

Labs

U-M LSA Mathematics · PI

Education

PhD, Electrical Engineering
Princeton University
2004
BS, Mathematics
Middle East Technical University
2000
BS, Electrical Engineering
Middle East Technical University
2000

Awards & honors

CAREER grant from the National Science Foundation
