Tomer D. Ullman

Morris Kahn Associate Professor

Harvard University · Human Development and Psychology

Active 2009–2026

h-index: 26
Citations: 5.6k
Papers: 150 · 84 last 5y

About

I am the Morris Kahn Associate Professor of Psychology in the Department of Psychology at Harvard University. I head the Computation, Cognition, and Development lab, with a focus on intuitive theories and people's common-sense reasoning about physics and psychology.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Cognitive science
  • Epistemology
  • Psychology
  • Machine Learning
  • Cognitive psychology
  • Social psychology
  • Data science

Selected publications

  • Agents of Chaos

    arXiv (Cornell University) · 2026-02-23 · 5 citations

    preprint · Open access

    We report an exploratory red-teaming study of autonomous language-model-powered agents deployed in a live laboratory environment with persistent memory, email accounts, Discord access, file systems, and shell execution. Over a two-week period, twenty AI researchers interacted with the agents under benign and adversarial conditions. Focusing on failures emerging from the integration of language models with autonomy, tool use, and multi-party communication, we document eleven representative case studies. Observed behaviors include unauthorized compliance with non-owners, disclosure of sensitive information, execution of destructive system-level actions, denial-of-service conditions, uncontrolled resource consumption, identity spoofing vulnerabilities, cross-agent propagation of unsafe practices, and partial system takeover. In several cases, agents reported task completion while the underlying system state contradicted those reports. We also report on some of the failed attempts. Our findings establish the existence of security-, privacy-, and governance-relevant vulnerabilities in realistic deployment settings. These behaviors raise unresolved questions regarding accountability, delegated authority, and responsibility for downstream harms, and warrant urgent attention from legal scholars, policymakers, and researchers across disciplines. This report serves as an initial empirical contribution to that broader conversation.

  • Emotion may indirectly link rendering and social reasoning

    Trends in Cognitive Sciences · 2026-01-21

    article · Senior author

  • Agents of Chaos

    Open MIND · 2026-01-01

    article

  • Human-Like Coarse Object Representations in Vision Models

    Open MIND · 2026-02-12

    preprint · Senior author

    Humans appear to represent objects for intuitive physics with coarse, volumetric "bodies" that smooth concavities - trading fine visual details for efficient physical predictions - yet their internal structure is largely unknown. Segmentation models, in contrast, optimize pixel-accurate masks that may misalign with such bodies. We ask whether and when these models nonetheless acquire human-like bodies. Using a time-to-collision (TTC) behavioral paradigm, we introduce a comparison pipeline and alignment metric, then vary model training time, size, and effective capacity via pruning. Across all manipulations, alignment with human behavior follows an inverse U-shaped curve: small/briefly trained/pruned models under-segment into blobs; large/fully trained models over-segment with boundary wiggles; and an intermediate "ideal body granularity" best matches humans. This suggests human-like coarse bodies emerge from resource constraints rather than bespoke biases, and points to simple knobs - early checkpoints, modest architectures, light pruning - for eliciting physics-efficient representations. We situate these results within resource-rational accounts balancing recognition detail against physical affordances.

  • Human-Like Coarse Object Representations in Vision Models

    ArXiv.org · 2026-02-12

    article · Open access · Senior author

  • Re-evaluating Theory of Mind evaluation in large language models

    Philosophical Transactions of the Royal Society B: Biological Sciences · 2025-08-14 · 4 citations

    article · Senior author

    The question of whether large language models (LLMs) possess Theory of Mind (ToM), often defined as the ability to reason about others' mental states, has sparked significant scientific and public interest. However, the evidence as to whether LLMs possess ToM is mixed, and the recent growth in evaluations has not resulted in a convergence. Here, we take inspiration from cognitive science to re-evaluate the state of ToM evaluation in LLMs. We argue that a major reason for the disagreement on whether LLMs have ToM is a lack of clarity on whether models should be expected to match human behaviours, or the computations underlying those behaviours. We also highlight ways in which current evaluations may be deviating from 'pure' measurements of ToM abilities, which also contributes to the confusion. We conclude by discussing several directions for future research, including the relationship between ToM and pragmatic communication, which could advance our understanding of artificial systems as well as human cognition. This article is part of the theme issue 'At the heart of human communication: new views on the complex relationship between pragmatics and Theory of Mind'.

  • Resource bounds on mental simulations: Evidence from a liquid-reasoning task.

    Journal of Experimental Psychology: General · 2025-06-09

    article · Open access · Senior author

    People are able to reason about the physical dynamics of everyday objects. But there are theoretical disagreements about the computations that underlie this ability. One proposal is that people are running an approximate mental simulation of their environment. However, such a simulation must be limited in its resources. We applied the notion of a resource-bound simulation to a task of reasoning about liquids and showed that people's changing behavior can be explained by an approximate simulation that hits a resource limit after some time elapses. In Experiments 1 and 2, people performed well on tasks that asked them to estimate the time-to-fill and water level of different containers when filled over short periods of time (1-7 s). Experiment 3 shows systematic biases in visual volume estimation, which further strengthens the proposal that people are using a simulation to solve the first two experiments. Experiment 4 extends the reasoning time for the time-to-fill task and shows the existence of a "switch point," as expected from a resource-bound simulation model. The model also accounts for individual differences: People who perform worse on a digit-span task have an earlier switch point. Our work argues for the theoretical proposal that people are using mental simulations to reason about intuitive physics but further informs the suggestion that these simulations are limited in resources.

  • Reverse-engineering the centered self

    2025-06-19

    preprint · Open access

    In certain problem solving contexts, people organize their domain through treating themselves as the perceptual and cognitive center of their world. They identify and solve a particular problem from their perspective as a particular agent, with a particular location, at a particular time, in a particular environment. When they do this, selecting and solving problems from their perspective as an agent, they engage in a distinctive kind of agent-centered problem solving. Partially Observable Markov Decision Processes (POMDPs), a framework for modeling decision-making in uncertain environments unfolding over time, have effectively become a "standard model" of intelligent agency. Yet, as these models are ordinarily interpreted, they do not explicitly represent agent-centered problem solving. Accordingly, to model this type of problem solving, we begin by extending the standard POMDP framework to define “ePOMDPs.” This formalism models how an agent, once it centers itself on a particular self-and-world representation, plans and acts rationally from its own perspective. To capture the way that such agents choose which problem to solve, we build on our ePOMDPs to develop a “meta-ePOMDP” agent within a hierarchical Bayesian framework. We implement our meta-ePOMDP agents for two different suites of “centering game” tasks which highlight different aspects of our theory. We find that our models explain signatures of agent-centered problem solving not captured by alternative models, in particular, the difficulty of navigating spaces of possible problem representations. We close by suggesting that our model could provide the beginnings of a computational framework for a person to have a self.

  • The Development of Sensitivity to Automatic Behavior

    2025-10-03

    article · Open access · Senior author

    People’s behavior can be roughly categorized into two modes: either reflective and thoughtful, or automatic and rote. Past work on Theory of Mind has focused on the first category. But do children notice when people are acting in an automatic way? This paper examined five- to ten-year-old children’s reasoning about others’ rote behavior, focusing on the consequences of this inference in teaching contexts (N = 660 across four studies, 327 girls). Children’s sensitivity to rote behavior increased with development, with consistent competence emerging around age 7. Rote behavior was also associated with worse teaching. These results indicate when and how reasoning about automatic behavior matters to children’s perception of others, and suggest novel extensions to models of Theory of Mind.
