Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to startNo credit cardCancel anytime
Top matches Balanced preset
Dr. Sarah Chen
Stanford · Interpretability · NLP
91
Dr. Marcus Holloway
MIT · Robotics · RL
84
Dr. Aisha Okonkwo
CMU · Fairness · HCI
82
Nova · Professor Researcher · re-ranking top 20…
Ronald Parr

Ronald Parr

· Professor of Computer Science

Duke University · Computer Science

Active 1993–2025

h-index38
Citations6.1k
Papers8210 last 5y
Funding$1.3M
See your match with Ronald Parr — sign in to PhdFit.Sign in

About

Ronald Parr is a Professor of Computer Science at Duke University, where he has contributed extensively to the fields of machine learning, reinforcement learning, and decision processes. His research focuses on developing algorithms for Markov Decision Processes, reinforcement learning, and related areas, with particular attention to policy iteration, value function approximation, and the application of probabilistic models in robotics and autonomous systems. Parr has authored numerous influential papers, including work on non-myopic sensing, factored MDPs, and the integration of noise as a regularizer in decision-making models. Throughout his career, Parr has been recognized for his contributions to artificial intelligence and machine learning, including winning awards such as the IJCAI-JAIR best paper award. He has held leadership roles, such as department chair at Duke University, and has delivered multiple graduation speeches. His work has advanced the understanding of optimal strategies in stochastic environments and has contributed to the development of scalable algorithms for complex decision-making tasks in robotics, security games, and reinforcement learning. Parr's research is characterized by a rigorous approach to solving real-world problems through innovative computational methods and theoretical insights.

Research topics

  • Computer Science
  • Machine Learning
  • Artificial Intelligence
  • Mathematics
  • Algorithm
  • Mathematical optimization
  • Engineering
  • Mathematical analysis
  • Theoretical computer science
  • Mathematical economics

Selected publications

  • A Unifying View of Linear Function Approximation in Off-Policy RL Through Matrix Splitting and Preconditioning

    arXiv (Cornell University) · 2025-01-03

    preprintOpen accessSenior author

    In off-policy policy evaluation (OPE) tasks within reinforcement learning, Temporal Difference Learning(TD) and Fitted Q-Iteration (FQI) have traditionally been viewed as differing in the number of updates toward the target value function: TD makes one update, FQI makes an infinite number, and Partial Fitted Q-Iteration (PFQI) performs a finite number. We show that this view is not accurate, and provide a new mathematical perspective under linear value function approximation that unifies these methods as a single iterative method solving the same linear system, but using different matrix splitting schemes and preconditioners. We show that increasing the number of updates under the same target value function, i.e., the target network technique, is a transition from using a constant preconditioner to using a data-feature adaptive preconditioner. This elucidates, for the first time, why TD convergence does not necessarily imply FQI convergence, and establishes tight convergence connections among TD, PFQI, and FQI. Our framework enables sharper theoretical results than previous work and characterization of the convergence conditions for each algorithm, without relying on assumptions about the features (e.g., linear independence). We also provide an encoder-decoder perspective to better understand the convergence conditions of TD, and prove, for the first time, that when a large learning rate doesn't work, trying a smaller one may help. Our framework also leads to the discovery of new crucial conditions on features for convergence, and shows how common assumptions about features influence convergence, e.g., the assumption of linearly independent features can be dropped without compromising the convergence guarantees of stochastic TD in the on-policy setting. This paper is also the first to introduce matrix splitting into the convergence analysis of these algorithms.

  • Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy

    2024-01-01 · 2 citations

    article
  • Using Noise to Infer Aspects of Simplicity Without Learning

    2024-01-01

    article
  • Amazing Things Come From Having Many Good Models

    arXiv (Cornell University) · 2024-07-05 · 6 citations

    preprintOpen access

    The Rashomon Effect, coined by Leo Breiman, describes the phenomenon that there exist many equally good predictive models for the same dataset. This phenomenon happens for many real datasets and when it does, it sparks both magic and consternation, but mostly magic. In light of the Rashomon Effect, this perspective piece proposes reshaping the way we think about machine learning, particularly for tabular data problems in the nondeterministic (noisy) setting. We address how the Rashomon Effect impacts (1) the existence of simple-yet-accurate models, (2) flexibility to address user preferences, such as fairness and monotonicity, without losing performance, (3) uncertainty in predictions, fairness, and explanations, (4) reliable variable importance, (5) algorithm choice, specifically, providing advanced knowledge of which algorithms might be suitable for a given problem, and (6) public policy. We also discuss a theory of when the Rashomon Effect occurs and why. Our goal is to illustrate how the Rashomon Effect can have a massive impact on the use of machine learning for complex problems in society.

  • An Optimal Tightness Bound for the Simulation Lemma

    arXiv (Cornell University) · 2024-06-24

    preprintOpen accessSenior author

    We present a bound for value-prediction error with respect to model misspecification that is tight, including constant factors. This is a direct improvement of the "simulation lemma," a foundational result in reinforcement learning. We demonstrate that existing bounds are quite loose, becoming vacuous for large discount factors, due to the suboptimal treatment of compounding probability errors. By carefully considering this quantity on its own, instead of as a subcomponent of value error, we derive a bound that is sub-linear with respect to transition function misspecification. We then demonstrate broader applicability of this technique, improving a similar bound in the related subfield of hierarchical abstraction.

  • Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy

    arXiv (Cornell University) · 2024-07-10

    preprintOpen access

    Reinforcement learning algorithms typically rely on the assumption that the environment dynamics and value function can be expressed in terms of a Markovian state representation. However, when state information is only partially observable, how can an agent learn such a state representation, and how can it detect when it has found one? We introduce a metric that can accomplish both objectives, without requiring access to -- or knowledge of -- an underlying, unobservable state space. Our metric, the $λ$-discrepancy, is the difference between two distinct temporal difference (TD) value estimates, each computed using TD($λ$) with a different value of $λ$. Since TD($λ{=}0$) makes an implicit Markov assumption and TD($λ{=}1$) does not, a discrepancy between these estimates is a potential indicator of a non-Markovian state representation. Indeed, we prove that the $λ$-discrepancy is exactly zero for all Markov decision processes and almost always non-zero for a broad class of partially observable environments. We also demonstrate empirically that, once detected, minimizing the $λ$-discrepancy can help with learning a memory function to mitigate the corresponding partial observability. We then train a reinforcement learning agent that simultaneously constructs two recurrent value networks with different $λ$ parameters and minimizes the difference between them as an auxiliary loss. The approach scales to challenging partially observable domains, where the resulting agent frequently performs significantly better (and never performs worse) than a baseline recurrent agent with only a single value network.

  • A Path to Simpler Models Starts With Noise

    PubMed · 2023-10-30 · 1 citations

    preprintOpen access

    The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabular datasets in criminal justice, healthcare, lending, education, and in other areas, which has practical implications about whether simpler models can attain the same level of accuracy as more complex models. An open question is why Rashomon ratios often tend to be large. In this work, we propose and study a mechanism of the data generation process, coupled with choices usually made by the analyst during the learning process, that determines the size of the Rashomon ratio. Specifically, we demonstrate that noisier datasets lead to larger Rashomon ratios through the way that practitioners train models. Additionally, we introduce a measure called pattern diversity, which captures the average difference in predictions between distinct classification patterns in the Rashomon set, and motivate why it tends to increase with label noise. Our results explain a key aspect of why simpler models often tend to perform as well as black box models on complex, noisier datasets.

  • A Path to Simpler Models Starts With Noise

    2023-01-01

    article
  • On the Existence of Simpler Machine Learning Models

    2022 ACM Conference on Fairness, Accountability, and Transparency · 2022 · 66 citations

    Senior authorCorresponding
    • Computer Science
    • Computer Science
    • Artificial Intelligence

    It is almost always easier to find an accurate-but-complex model than an accurate-yet-simple model. Finding optimal, sparse, accurate models of various forms (linear models with integer coefficients, decision sets, rule lists, decision trees) is generally NP-hard. We often do not know whether the search for a simpler model will be worthwhile, and thus we do not go to the trouble of searching for one. In this work, we ask an important practical question: can accurate-yet-simple models be proven to exist, or shown likely to exist, before explicitly searching for them? We hypothesize that there is an important reason that simple-yet-accurate models often do exist. This hypothesis is that the size of the Rashomon set is often large, where the Rashomon set is the set of almost-equally-accurate models from a function class. If the Rashomon set is large, it contains numerous accurate models, and perhaps at least one of them is the simple model we desire. In this work, we formally present the Rashomon ratio as a new gauge of simplicity for a learning problem, depending on a function class and a data set. The Rashomon ratio is the ratio of the volume of the set of accurate models to the volume of the hypothesis space, and it is different from standard complexity measures from statistical learning theory. Insight from studying the Rashomon ratio provides an easy way to check whether a simpler model might exist for a problem before finding it, namely whether several different machine learning methods achieve similar performance on the data. In that sense, the Rashomon ratio is a powerful tool for understanding why and when an accurate-yet-simple model might exist. If, as we hypothesize in this work, many real-world data sets admit large Rashomon sets, the implications are vast: it means that simple or interpretable models may often be used for high-stakes decisions without losing accuracy.

  • Computing Optimal Strategies to Commit to in Stochastic Games

    Proceedings of the AAAI Conference on Artificial Intelligence · 2021 · 35 citations

    • Computer Science
    • Computer Science
    • Mathematical economics

    Significant progress has been made recently in the following two lines of research in the intersection of AI and game theory: (1) the computation of optimal strategies to commit to (Stackelberg strategies), and (2) the computation of correlated equilibria of stochastic games. In this paper, we unite these two lines of research by studying the computation of Stackelberg strategies in stochastic games. We provide theoretical results on the value of being able to commit and the value of being able to correlate, as well as complexity results about computing Stackelberg strategies in stochastic games. We then modify the QPACE algorithm (MacDermed et al. 2011) to compute Stackelberg strategies, and provide experimental results.

Recent grants

Frequent coauthors

  • Resume-aware match score
  • Save to shortlist
  • AI-drafted outreach

See your match with Ronald Parr

PhdFit ranks faculty by your research interests, methods, and publications — grounded in their actual work, not templates.

  • Free to start
  • No credit card
  • 30-second signup