Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to startNo credit cardCancel anytime
Top matches Balanced preset
Dr. Sarah Chen
Stanford · Interpretability · NLP
91
Dr. Marcus Holloway
MIT · Robotics · RL
84
Dr. Aisha Okonkwo
CMU · Fairness · HCI
82
Nova · Professor Researcher · re-ranking top 20…
Tengyuan Liang

Tengyuan Liang

· JP Gan Professor of Econometrics and Statistics, and Applied AI in the Wallman Society of Fellows

University of Chicago · Applied AI

Active 2014–2026

h-index20
Citations1.7k
Papers8439 last 5y
Funding$400k1 active
See your match with Tengyuan Liang — sign in to PhdFit.Sign in

About

Tengyuan Liang is the JP Gan Professor of Econometrics and Statistics, and Applied AI in the Wallman Society of Fellows at the University of Chicago Booth School of Business. His research builds mathematical foundations for modern AI, spanning statistical learning theory, generative models, and causal inference.

Research topics

  • Statistics
  • Applied mathematics
  • Computer Science
  • Mathematics
  • Artificial Intelligence
  • Algorithm
  • Econometrics
  • Combinatorics

Selected publications

  • Nonparametric Point Identification of Treatment Effect Distributions via Rank Stickiness

    arXiv (Cornell University) · 2026-04-23

    preprintOpen access1st authorCorresponding

    Treatment effect distributions are not identified without restrictions on the joint distribution of potential outcomes. Existing approaches either impose rank preservation -- a strong assumption -- or derive partial identification bounds that are often wide. We show that a single scalar parameter, rank stickiness, suffices for nonparametric point identification while permitting rank violations. The identified joint distribution -- the coupling that maximizes average rank correlation subject to a relative entropy constraint, which we call the Bregman-Sinkhorn copula -- is uniquely determined by the marginals and rank stickiness. Its conditional distribution is an exponential tilt of the marginal with a Bregman divergence as the exponent, yielding closed-form conditional moments and rank violation probabilities; the copula nests the comonotonic and Gaussian copulas as special cases. The empirical Bregman-Sinkhorn copula converges at the parametric $\sqrt{n}$-rate with a Gaussian process limit, despite the infinite-dimensional parameter space. We apply the framework to estimate the full treatment effect distribution, derive a variance estimator for the average treatment effect tighter than the Fréchet--Hoeffding and Neyman bounds, and extend to observational studies under unconfoundedness.

  • Learning When the Concept Shifts: Confounding, Invariance, and Dimension Reduction

    Journal of the American Statistical Association · 2026-04-22

    preprintOpen accessSenior author

    Practitioners often face the challenge of deploying prediction models in new environments with shifted distributions of covariates and responses. With observational data, such shifts are often driven by unobserved confounding, and can in fact alter the concept of which model is best. This paper studies distribution shifts in the domain adaptation problem with unobserved confounding. We postulate a linear structural causal model to account for endogeneity and unobserved confounding, and we leverage exogenous invariant covariate representations to cure concept shifts and improve target prediction. We propose a data-driven representation learning method that optimizes for a lower-dimensional linear subspace and a prediction model confined to that subspace. This method operates on a non-convex objective—that interpolates between predictability and stability—constrained to the Stiefel manifold, using an analog of projected gradient descent. We analyze the optimization landscape and prove that, provided sufficient regularization, nearly all local optima align with an invariant linear subspace resilient to distribution shifts. This method achieves a nearly ideal gap between target and source risk. We validate the method and theory with real-world data sets to illustrate the tradeoffs between predictability and stability.

  • Nonparametric Point Identification of Treatment Effect Distributions via Rank Stickiness

    arXiv (Cornell University) · 2026-04-23

    articleOpen access1st authorCorresponding

    Treatment effect distributions are not identified without restrictions on the joint distribution of potential outcomes. Existing approaches either impose rank preservation -- a strong assumption -- or derive partial identification bounds that are often wide. We show that a single scalar parameter, rank stickiness, suffices for nonparametric point identification while permitting rank violations. The identified joint distribution -- the coupling that maximizes average rank correlation subject to a relative entropy constraint, which we call the Bregman-Sinkhorn copula -- is uniquely determined by the marginals and rank stickiness. Its conditional distribution is an exponential tilt of the marginal with a Bregman divergence as the exponent, yielding closed-form conditional moments and rank violation probabilities; the copula nests the comonotonic and Gaussian copulas as special cases. The empirical Bregman-Sinkhorn copula converges at the parametric $\sqrt{n}$-rate with a Gaussian process limit, despite the infinite-dimensional parameter space. We apply the framework to estimate the full treatment effect distribution, derive a variance estimator for the average treatment effect tighter than the Fréchet--Hoeffding and Neyman bounds, and extend to observational studies under unconfoundedness.

  • Textual Factors: A Scalable, Interpretable, and Data-Driven Approach to Analyzing Unstructured Information

    Management Science · 2025-10-14 · 1 citations

    article

    We introduce a general approach for analyzing large-scale text-based data, combining the strengths of neural network language processing and generative statistical modeling to create a factor structure of unstructured data for downstream regressions typically used in social sciences. We generate textual factors by (i) representing texts using vector word embedding, (ii) clustering the vectors using locality-sensitive hashing to generate supports of topics, and (iii) identifying relatively interpretable spanning clusters (i.e., textual factors) through topic modeling. Our data-driven approach captures complex linguistic structures while ensuring computational scalability and economic interpretability, plausibly attaining certain advantages over and complementing other unstructured data analytics used by researchers, including emergent large language models. We conduct initial validation tests of the framework and discuss three types of its applications: (i) enhancing prediction and inference with texts, (ii) interpreting (non–text-based) models, and (iii) constructing new text-based metrics and explanatory variables. We illustrate each of these applications using examples in finance and economics such as macroeconomic forecasting from news articles, interpreting multifactor asset pricing models from corporate filings, and measuring theme-based technology breakthroughs from patents. Finally, we provide a flexible statistical package of textual factors for online distribution to facilitate future research and applications. This paper was accepted by David Simchi-Levi, finance. Funding: The authors gratefully acknowledge the financial support from the Ewing Marion Kauffman Foundation, the Becker Friedman Institute of Economics, the Fama-Miller Center for Research in Finance, INQUIRE Europe, the Kenan Institute of Private Enterprise, and the Risk Institute at OSU Fisher College of Business (while L. W. Cong was a fellow at the institute). W. Zhu acknowledges financial support from the Tsinghua University Initiative Scientific Research Program [Grant 2022Z04W02016], the Tsinghua University School of Economics and Management [Research Grant 2022051002], and the National Natural Science Foundation of China [Grant 72442014]. Supplemental Material: The online appendices and data files are available at https://doi.org/10.1287/mnsc.2020.01180 .

  • Gaussianized Design Optimization for Covariate Balance in Randomized Experiments

    ArXiv.org · 2025-02-22

    preprintOpen access

    Achieving covariate balance in randomized experiments enhances the precision of treatment effect estimation. However, existing methods often require heuristic adjustments based on domain knowledge and are primarily developed for binary treatments. This paper presents Gaussianized Design Optimization, a novel framework for optimally balancing covariates in experimental design. The core idea is to Gaussianize the treatment assignments: we model treatments as transformations of random variables drawn from a multivariate Gaussian distribution, converting the design problem into a nonlinear continuous optimization over Gaussian covariance matrices. Compared to existing methods, our approach offers significant flexibility in optimizing covariate balance across a diverse range of designs and covariate types. Adapting the Burer-Monteiro approach for solving semidefinite programs, we introduce first-order local algorithms for optimizing covariate balance, improving upon several widely used designs. Furthermore, we develop inferential procedures for constructing design-based confidence intervals under Gaussianization and extend the framework to accommodate continuous treatments. Simulations demonstrate the effectiveness of Gaussianization in multiple practical scenarios.

  • Randomization inference when N equals one

    Biometrika · 2025-01-01 · 3 citations

    article1st authorCorresponding

    Summary For decades, $ N $-of-1 experiments, where a unit serves as its own control and treatment in different time windows, have been used in certain medical contexts. However, due to effects that accumulate over long time windows and interventions that have complex evolution, a lack of robust inference tools has limited the widespread applicability of such $ N $-of-1 designs. This work combines techniques from experimental design in causal inference and system identification from control theory to provide such an inference framework. We derive a model of the dynamic interference effect that arises in linear time-invariant dynamical systems. We show that a family of causal estimands analogous to those studied in potential outcomes are estimable via a standard estimator derived from the method of moments. We derive formulae for higher moments of this estimator and describe conditions under which $ N $-of-1 designs may provide faster ways to estimate the effects of interventions in dynamical systems. We also provide conditions under which our estimator is asymptotically normal and derive valid confidence intervals for this setting.

  • Distributional Shrinkage II: Higher-Order Scores Encode Brenier Map

    ArXiv.org · 2025-12-10

    preprintOpen access1st authorCorresponding

    Consider the additive Gaussian model $Y = X + σZ$, where $X \sim P$ is an unknown signal, $Z \sim N(0,1)$ is independent of $X$, and $σ> 0$ is known. Let $Q$ denote the law of $Y$. We construct a hierarchy of denoisers $T_0, T_1, \ldots, T_\infty \colon \mathbb{R} \to \mathbb{R}$ that depend only on higher-order score functions $q^{(m)}/q$, $m \geq 1$, of $Q$ and require no knowledge of the law $P$. The $K$-th order denoiser $T_K$ involves scores up to order $2K{-}1$ and satisfies $W_r(T_K \sharp Q, P) = O(σ^{2(K+1)})$ for every $r \geq 1$; in the limit, $T_\infty$ recovers the monotone optimal transport map (Brenier map) pushing $Q$ onto $P$. We provide a complete characterization of the combinatorial structure governing this hierarchy through partial Bell polynomial recursions, making precise how higher-order score functions encode the Brenier map. We further establish rates of convergence for estimating these scores from $n$ i.i.d.\ draws from $Q$ under two complementary strategies: (i) plug-in kernel density estimation, and (ii) higher-order score matching. The construction reveals a precise interplay among higher-order Fisher-type information, optimal transport, and the combinatorics of integer partitions.

  • No-Regret Generative Modeling via Parabolic Monge-Ampère PDE

    ArXiv.org · 2025-04-12

    preprintOpen accessSenior author

    We introduce a novel generative modeling framework based on a discretized parabolic Monge-Ampère PDE, which emerges as a continuous limit of the Sinkhorn algorithm commonly used in optimal transport. Our method performs iterative refinement in the space of Brenier maps using a mirror gradient descent step. We establish theoretical guarantees for generative modeling through the lens of no-regret analysis, demonstrating that the iterates converge to the optimal Brenier map under a variety of step-size schedules. As a technical contribution, we derive a new Evolution Variational Inequality tailored to the parabolic Monge-Ampère PDE, connecting geometry, transportation cost, and regret. Our framework accommodates non-log-concave target distributions, constructs an optimal sampling process via the Brenier map, and integrates favorable learning techniques from generative adversarial networks and score-based diffusion models. As direct applications, we illustrate how our theory paves new pathways for generative modeling and variational inference.

  • Distributional Shrinkage I: Universal Denoiser Beyond Tweedie's Formula

    ArXiv.org · 2025-11-12

    preprintOpen access1st authorCorresponding

    We study the problem of denoising when only the noise level is known, not the noise distribution. Independent noise $Z$ corrupts a signal $X$, yielding the observation $Y = X + σZ$ with known $σ\in (0,1)$. We propose \emph{universal} denoisers, agnostic to both signal and noise distributions, that recover the signal distribution $P_X$ from $P_Y$. When the focus is on distributional recovery of $P_X$ rather than on individual realizations of $X$, our denoisers achieve order-of-magnitude improvements over the Bayes-optimal denoiser derived from Tweedie's formula, which achieves $O(σ^2)$ accuracy. They shrink $P_Y$ toward $P_X$ with $O(σ^4)$ and $O(σ^6)$ accuracy in matching generalized moments and densities. Drawing on optimal transport theory, our denoisers approximate the Monge--Ampère equation with higher-order accuracy and can be implemented efficiently via score matching. Let $q$ denote the density of $P_Y$. For distributional denoising, we propose replacing the Bayes-optimal denoiser, $$\mathbf{T}^*(y) = y + σ^2 \nabla \log q(y),$$ with denoisers exhibiting less-aggressive distributional shrinkage, $$\mathbf{T}_1(y) = y + \frac{σ^2}{2} \nabla \log q(y),$$ $$\mathbf{T}_2(y) = y + \frac{σ^2}{2} \nabla \log q(y) - \frac{σ^4}{8} \nabla \!\left( \frac{1}{2} \| \nabla \log q(y) \|^2 + \nabla \cdot \nabla \log q(y) \right)\!.$$

  • Textual Factors: A Scalable, Interpretable, and Data-Driven Approach to Analyzing Unstructured Information

    SSRN Electronic Journal · 2024-01-01 · 5 citations

    articleOpen access

Recent grants

Frequent coauthors

Awards & honors

  • J. Parker Bursk Memorial Prize
  • Winkelman Fellowship
  • National Science Foundation CAREER Grant
  • Resume-aware match score
  • Save to shortlist
  • AI-drafted outreach

See your match with Tengyuan Liang

PhdFit ranks faculty by your research interests, methods, and publications — grounded in their actual work, not templates.

  • Free to start
  • No credit card
  • 30-second signup