Peter Spirtes

Marianna Brown Dietrich Professor and Head of Philosophy (Verified)

Carnegie Mellon University · Philosophy

Active 1982–2026

h-index: 46
Citations: 17.1k
Papers: 230 (19 last 5y)
Funding: $2.4M (1 active)
About

Peter Spirtes is the Marianna Brown Dietrich Professor and Head of Philosophy in the Dietrich College of Humanities and Social Sciences at Carnegie Mellon University. His research focuses on inferring causal relationships from statistical data, especially where fully controlled experiments are not feasible. He leads the TETRAD project, which aims to specify and prove the conditions under which reliable causal inferences can be made from background knowledge and observational data, and to develop practical computer programs for inferring causal structures. His work spans philosophy, statistics, graph theory, and computer science, with implications for disciplines that rely on causal inference from data. Spirtes's research explores the limits of causal inference, the relationship between probability and causality, and the development of tools that help scientists build causal models. His contributions include algorithms and software such as the TETRAD II program, and his work has advanced the understanding of causal inference, Markov equivalence, and the conditions under which causal conclusions can be reliably drawn from data.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Machine Learning
  • Data Mining
  • Mathematics
  • Statistics
  • Econometrics
  • Data science
  • Epistemology
  • Software engineering
  • Programming language
  • World Wide Web
  • Psychology

Selected publications

  • Distributional Equivalence in Linear Non-Gaussian Latent-Variable Cyclic Causal Models: Characterization and Learning

    arXiv (Cornell University) · 2026-03-05

    article · Open access

    Causal discovery with latent variables is a fundamental task. Yet most existing methods rely on strong structural assumptions, such as enforcing specific indicator patterns for latents or restricting how they can interact with others. We argue that a core obstacle to a general, structural-assumption-free approach is the lack of an equivalence characterization: without knowing what can be identified, one generally cannot design methods for how to identify it. In this work, we aim to close this gap for linear non-Gaussian models. We establish the graphical criterion for when two graphs with arbitrary latent structure and cycles are distributionally equivalent, that is, they induce the same observed distribution set. Key to our approach is a new tool, edge rank constraints, which fills a missing piece in the toolbox for latent-variable causal discovery in even broader settings. We further provide a procedure to traverse the whole equivalence class and develop an algorithm to recover models from data up to such equivalence. To our knowledge, this is the first equivalence characterization with latent variables in any parametric setting without structural assumptions, and hence the first structural-assumption-free discovery method. Code and an interactive demo are available at https://equiv.cc.

  • Causal Representation Learning from General Environments under Nonparametric Mixing

    arXiv (Cornell University) · 2026-04-26

    preprint · Open access

    Causal representation learning aims to recover the latent causal variables and their causal relations, typically represented by directed acyclic graphs (DAGs), from low-level observations such as image pixels. A prevailing line of research exploits multiple environments, which assume how data distributions change, including single-node interventions, coupled interventions, or hard interventions, or parametric constraints on the mixing function or the latent causal model, such as linearity. Despite the novelty and elegance of the results, they are often violated in real problems. Accordingly, we formalize a set of desiderata for causal representation learning that applies to a broader class of environments, referred to as general environments. Interestingly, we show that one can fully recover the latent DAG and identify the latent variables up to minor indeterminacies under a nonparametric mixing function and nonlinear latent causal models, such as additive (Gaussian) noise models or heteroscedastic noise models, by properly leveraging sufficient change conditions on the causal mechanisms up to third-order derivatives. These represent, to our knowledge, the first results to fully recover the latent DAG from general environments under nonparametric mixing. Notably, our results match or improve upon many existing works, but require less restrictive assumptions about changing environments.

  • Learning Hidden Causal Factors from Psychometrics Data Using Distributional Information

    Underline Science Inc. · 2025-06-18

    other · Open access

    Understanding latent variables and their causal mechanisms is central to psychological theory, yet most latent variable models in psychology have largely remained correlational. This work attempts to address three pivotal issues: identifying useful information from observational data that reveal latent causal factors, developing algorithms to leverage this distributional information, ensuring the identifiability of the recovered latent factors and their causal structure. We introduce a generalizable framework for discovering hidden causal structures from observed distributions in psychometric data. Applied to survey datasets on personality traits, teacher burnout, and multitasking behavior, our method uncovers hidden causal factors and their intricate interactions. Additionally, our findings offer an alternative perspective on psychometric scoring, grounded in the strength of the learned causal relations. These insights contribute to behavioral modeling and measurement and await further confirmatory studies to validate their implications for psychological science.

  • Permutation-Based Rank Test in the Presence of Discretization and Application in Causal Discovery with Mixed Data

    ArXiv.org · 2025-01-31

    preprint · Open access

    Recent advances have shown that statistical tests for the rank of cross-covariance matrices play an important role in causal discovery. These rank tests include partial correlation tests as special cases and provide further graphical information about latent variables. Existing rank tests typically assume that all the continuous variables can be perfectly measured, and yet, in practice many variables can only be measured after discretization. For example, in psychometric studies, the continuous level of certain personality dimensions of a person can only be measured after being discretized into order-preserving options such as disagree, neutral, and agree. Motivated by this, we propose Mixed data Permutation-based Rank Test (MPRT), which properly controls the statistical errors even when some or all variables are discretized. Theoretically, we establish the exchangeability and estimate the asymptotic null distribution by permutations; as a consequence, MPRT can effectively control the Type I error in the presence of discretization while previous methods cannot. Empirically, our method is validated by extensive experiments on synthetic data and real-world data to demonstrate its effectiveness as well as applicability in causal discovery.

  • Causal discovery and counterfactual reasoning to optimize persuasive dialogue policies

    Behaviour and Information Technology · 2025-03-20 · 4 citations

    article · Open access

  • Corrigendum to “Estimating bounds on causal effects in high-dimensional and possibly confounded systems” [Int. J. Approx. Reason. 88 (2017) 371–384]

    International Journal of Approximate Reasoning · 2025-05-22

    erratum · Open access · Senior author

  • Reflection-Window Decoding: Text Generation with Selective Refinement

    ArXiv.org · 2025-02-05

    preprint · Open access

    The autoregressive decoding for text generation in large language models (LLMs), while widely used, is inherently suboptimal due to the lack of a built-in mechanism to perform refinement and/or correction of the generated content. In this paper, we consider optimality in terms of the joint probability over the generated response, when jointly considering all tokens at the same time. We theoretically characterize the potential deviation of the autoregressively generated response from its globally optimal counterpart that is of the same length. Our analysis suggests that we need to be cautious when noticeable uncertainty arises during text generation, which may signal the sub-optimality of the generation history. To address the pitfall of autoregressive decoding for text generation, we propose an approach that incorporates a sliding reflection window and a pausing criterion, such that refinement and generation can be carried out interchangeably as the decoding proceeds. Our selective refinement framework strikes a balance between efficiency and optimality, and our extensive experimental results demonstrate the effectiveness of our approach.

  • Score-based Greedy Search for Structure Identification of Partially Observed Linear Causal Models

    arXiv (Cornell University) · 2025-10-05

    preprint · Open access

    Identifying the structure of a partially observed causal system is essential to various scientific fields. Recent advances have focused on constraint-based causal discovery to solve this problem, and yet in practice these methods often face challenges related to multiple testing and error propagation. These issues could be mitigated by a score-based method and thus it has raised great attention whether there exists a score-based greedy search method that can handle the partially observed scenario. In this work, we propose the first score-based greedy search method for the identification of structure involving latent variables with identifiability guarantees. Specifically, we propose Generalized N Factor Model and establish the global consistency: the true structure including latent variables can be identified up to the Markov equivalence class by using score. We then design Latent variable Greedy Equivalence Search (LGES), a greedy search algorithm for this class of model with well-defined operators, which search very efficiently over the graph space to find the optimal structure. Our experiments on both synthetic and real-life data validate the effectiveness of our method (code will be publicly available).
