Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

James M. Robins

Mitchell L. and Robin LaFoley Dong Professor of Epidemiology · Verified

Harvard University · Epidemiology

Active 1981–2026

h-index: 118
Citations: 81.0k
Papers: 508 (73 in the last 5 years)
Funding: $15.5M

Research topics

  • Computer Science
  • Artificial Intelligence
  • Political Science
  • Mathematics
  • Econometrics
  • Statistics
  • Law
  • Sociology
  • Library science
  • Psychology
  • Machine Learning
  • Social psychology
  • Developmental psychology
  • Gender studies
  • Economics
  • Applied mathematics
  • Demographic economics
  • Mathematical optimization

Selected publications

  • Estimating the effect of hepatitis C infection on multidrug-resistant tuberculosis treatment outcomes under hypothetical interventions on regimen composition and adherence

    American Journal of Epidemiology · 2026-01-31

    article · Open access

    Hepatitis C virus (HCV) infection is associated with unfavorable multidrug- and rifampicin-resistant (MDR/RR) tuberculosis (TB) outcomes. We examined whether this association would decrease in settings where no participants were lost-to-follow-up or where all adhered to regimens comprised of priority TB drugs. We analyzed data from 1530 participants with HCV testing in the endTB observational cohort (NCT03259269). We estimated the relative risk of death, treatment failure, and loss-to-follow-up comparing participants with and without HCV, using inverse probability weighting to adjust for confounding. We then estimated relative risks of HCV on death and failure in weighted pseudopopulations representing hypothetical interventions eliminating loss-to-follow-up and ensuring adherence to strong MDR/RR-TB regimens. The unadjusted risk difference comparing participants with and without HCV was 14.1% (95% confidence interval [CI] 8.0%, 20.1%), decreasing to 11.0% (95%CI, 3.0%, 19.1%) after weighting. In pseudopopulations without loss-to-follow-up or with adequate adherence to strong regimens, the risk differences were 7.7% (95% CI, 0.8%, 16.2%) and 7.0% (95% CI, -1.6%, 17.3%), respectively. Adjustment for baseline confounders attenuated the association between HCV and unfavorable outcomes, suggesting these factors partly explain the disparity. Further attenuation after eliminating loss-to-follow-up suggests that improving treatment retention in MDR/RR-TB care may reduce outcome disparities among patients with HCV.

  • Causal Inference: A Tale of Three Frameworks

    Journal of Data Science · 2026-01-01 · 2 citations

    article · Open access · Senior author

    Causal inference is a central goal across many scientific disciplines. Over the past several decades, three major frameworks have emerged to formalize causal questions and guide their analysis: the potential outcomes framework, structural equation models, and directed acyclic graphs. Although these frameworks differ in language, assumptions, and philosophical orientation, they often lead to compatible or complementary insights. This paper provides a comparative introduction to the three frameworks, clarifying their connections, highlighting their distinct strengths and limitations, and illustrating how they can be used together in practice. The discussion is aimed at researchers and graduate students with some background in statistics or causal inference who are seeking a conceptual foundation for applying causal methods across a range of substantive domains.

  • On the asymptotic validity of confidence sets for linear functionals of solutions to integral equations

    ArXiv.org · 2025-02-23

    preprint · Open access

    This paper examines the construction of confidence sets for parameters defined as linear functionals of a function of W and X whose conditional mean given Z and X equals the conditional mean of another variable Y given Z and X. Many estimands of interest in causal inference can be expressed in this form, including the average treatment effect in proximal causal inference and treatment effect contrasts in instrumental variable models. We derive a necessary condition for a confidence set to be uniformly valid over a model that allows for the dependence between W and Z given X to be arbitrarily weak. Specifically, we show that for any such confidence set, there must exist some laws in the model under which, with high probability, the confidence set has a diameter greater than or equal to the diameter of the parameter's range. In particular, consistent with the weak instruments literature, Wald confidence intervals are not uniformly valid over the aforementioned model when the parameter's range is infinite. Furthermore, we argue that inverting the score test, a successful approach in that literature, generally fails for the broader class of parameters considered here. We present a method for constructing uniformly valid confidence sets in the special case where all variables, but possibly Y, are binary and discuss its limitations. Finally, we emphasize that developing uniformly valid confidence sets for the class of parameters considered in this paper remains an open problem.

  • On the asymptotic validity of confidence sets for linear functionals of solutions to integral equations

    Biometrika · 2025-01-01

    article · Open access

    This paper examines the construction of confidence sets for parameters defined as linear functionals of a function of W and X whose conditional mean given Z and X equals the conditional mean of another variable Y given Z and X. Many estimands of interest in causal inference can be expressed in this form, including the average treatment effect in proximal causal inference and treatment effect contrasts in instrumental variable models. We derive a necessary condition for a confidence set to be uniformly valid over a model that allows for the dependence between W and Z given X to be arbitrarily weak. We show that, for any such confidence set, there must exist some laws in the model under which, with high probability, the confidence set has a diameter greater than or equal to the diameter of the parameter's range. In particular, consistent with the weak instrument literature, Wald confidence intervals are not uniformly valid over the aforementioned model when the parameter's range is infinite. Furthermore, we argue that inverting the score test, a successful approach in that literature, generally fails for the broader class of parameters considered here. We present a method for constructing uniformly valid confidence sets when all variables, but possibly Y, are binary, discuss its limitations and emphasize that developing valid confidence sets for the class of parameters considered here remains an open problem.

  • Rejoinder: Nonparametric identification is not enough, but randomized controlled trials are

    Observational Studies · 2025-03-01

    article · Open access

    We thank the editor for organizing a diverse and wide-ranging discussion, and we thank the commentators for their detailed and thoughtful remarks. Most of the commentators provide broader perspectives on randomized experiments and their role in modern empirical practice. We believe this broader perspective is important, and the comments serve as complements to the somewhat narrow points we made in our paper. However, we believe these narrow points are of great consequence, and we find it useful to briefly recapitulate them here. When a practitioner aims to estimate averages of bounded potential outcomes (e.g., the average treatment effect on a binary outcome) in a setting where both ignorability and positivity are known to hold after adjusting for at least one continuous covariate, the following statements are true:

      • If the propensity score is known, such as in a randomized controlled trial (RCT), there exist simple estimators that are uniformly root-n consistent and asymptotically normal. Confidence intervals based on these estimators are finite-sample valid and their widths shrink at a root-n rate.
      • If the propensity score is not known, such as in an observational study, there exist neither uniformly consistent estimators nor uniform (i.e., honest) large-sample confidence intervals whose widths are shrinking with the sample size. To achieve these properties, the practitioner must impose untestable assumptions on either the propensity score function or the conditional expectation function of the outcomes.

  • Nonparametric identification is not enough, but randomized controlled trials are

    Observational Studies · 2025-03-01

    article · Open access

    We argue that randomized controlled trials (RCTs) are special even among studies for which a nonparametric unconfoundedness assumption is credible. This claim follows from two results of Robins and Ritov (1997). First, in settings with at least one continuous confounder, there exists no estimator of the average treatment effect that is uniformly consistent unless the propensity score is known or additional assumptions are made on the complexity of the propensity score function. Second, with binary outcomes, knowledge of the propensity score yields a uniformly consistent estimator and finite-sample valid confidence intervals that shrink at a parametric rate, regardless of how complicated the propensity score function might be. We emphasize the latter point, and note that a successfully executed RCT provides knowledge of the propensity score to the researcher. We conclude that statistical estimation and inference tend to be fundamentally more difficult in observational settings than in RCTs, even when all confounders are observed and measured without error.

  • Debiased Ill-Posed Regression

    ArXiv.org · 2025-05-27

    preprint · Open access

    In various statistical settings, the goal is to estimate a function which is restricted by the statistical model only through a conditional moment restriction. Prominent examples include the nonparametric instrumental variable framework for estimating the structural function of the outcome variable, and the proximal causal inference framework for estimating the bridge functions. A common strategy in the literature is to find the minimizer of the projected mean squared error. However, this approach can be sensitive to misspecification or slow convergence rate of the estimators of the involved nuisance components. In this work, we propose a debiased estimation strategy based on the influence function of a modification of the projected error and demonstrate its finite-sample convergence rate. Our proposed estimator possesses a second-order bias with respect to the involved nuisance functions and a desirable robustness property with respect to the misspecification of one of the nuisance functions. The proposed estimator involves a hyper-parameter, for which the optimal value depends on potentially unknown features of the underlying data-generating process. Hence, we further propose a hyper-parameter selection approach based on cross-validation and derive an error bound for the resulting estimator. This analysis highlights the potential rate loss due to hyper-parameter selection and underscores the importance and advantages of incorporating debiasing in this setting. We also study the application of our approach to the estimation of regular parameters in a specific parameter class, which are linear functionals of the solutions to the conditional moment restrictions, and provide sufficient conditions for achieving root-n consistency using our debiased estimator.

  • Grace periods in comparative effectiveness studies of sustained treatments

    Journal of the Royal Statistical Society Series A (Statistics in Society) · 2024-01-22 · 13 citations

    article · Open access

    Researchers are often interested in estimating the effect of sustained use of a treatment on a health outcome. However, adherence to strict treatment protocols can be challenging for individuals in practice and, when non-adherence is expected, estimates of the effect of sustained use may not be useful for decision making. As an alternative, more relaxed treatment protocols which allow for periods of time off treatment (i.e. grace periods) have been considered in pragmatic randomized trials and observational studies. In this article, we consider the interpretation, identification, and estimation of treatment strategies which include grace periods. We contrast natural grace period strategies which allow individuals the flexibility to take treatment as they would naturally do, with stochastic grace period strategies in which the investigator specifies the distribution of treatment utilization. We estimate the effect of initiation of a thiazide diuretic or an angiotensin-converting enzyme inhibitor in hypertensive individuals under various strategies which include grace periods.

  • Minimax rates for heterogeneous causal effect estimation

    The Annals of Statistics · 2024-04-01 · 8 citations

    article · Open access

    Estimation of heterogeneous causal effects - i.e., how effects of policies and treatments vary across subjects - is a fundamental task in causal inference. Many methods for estimating conditional average treatment effects (CATEs) have been proposed in recent years, but questions surrounding optimality have remained largely unanswered. In particular, a minimax theory of optimality has yet to be developed, with the minimax rate of convergence and construction of rate-optimal estimators remaining open problems. In this paper we derive the minimax rate for CATE estimation, in a Hölder-smooth nonparametric model, and present a new local polynomial estimator, giving high-level conditions under which it is minimax optimal. Our minimax lower bound is derived via a localized version of the method of fuzzy hypotheses, combining lower bound constructions for nonparametric regression and functional estimation. Our proposed estimator can be viewed as a local polynomial R-Learner, based on a localized modification of higher-order influence function methods. The minimax rate we find exhibits several interesting features, including a non-standard elbow phenomenon and an unusual interpolation between nonparametric regression and functional estimation rates. The latter quantifies how the CATE, as an estimand, can be viewed as a regression/functional hybrid.

  • Thomas S. Richardson and James M. Robins’ contribution to the Discussion of ‘Parameterizing and simulating from causal models’ by Evans and Didelez

    Journal of the Royal Statistical Society Series B (Statistical Methodology) · 2024-02-23 · 1 citation

    article · Open access · Senior author
