
Robert Hawkins
Professor · Verified · Stanford University · Symbolic Systems
Active 1954–2026
About
Robert Hawkins is an Assistant Professor of Linguistics and, by courtesy, of Psychology at Stanford University. He holds a Bachelor of Science degree in Cognitive Science and Mathematics from Indiana University, obtained in 2014, and a PhD in Psychology from Stanford University, completed in 2019. His research focuses on the cognitive mechanisms that enable people to communicate, collaborate, and coordinate with one another in flexible ways. He directs the Social Interaction & Language (SoIL) Lab at Stanford University, where his team investigates these problems through large-scale, multi-player web experiments and computational models of language and social reasoning. Hawkins is a member of Bio-X and the Wu Tsai Neurosciences Institute, contributing to interdisciplinary efforts to understand social and cognitive processes.
Research topics
- Computer Science
- Artificial Intelligence
- Cognitive science
- Cognitive psychology
- Psychology
Selected publications
When More Words Say Less: Decoupling Length and Specificity in Image Description Evaluation
ArXiv.org · 2026-01-08
Article · Open access
Vision-language models (VLMs) are increasingly used to make visual content accessible via text-based descriptions. In current systems, however, description specificity is often conflated with their length. We argue that these two concepts must be disentangled: descriptions can be concise yet dense with information, or lengthy yet vacuous. We define specificity relative to a contrast set, where a description is more specific to the extent that it picks out the target image better than other possible images. We construct a dataset that controls for length while varying information content, and validate that people reliably prefer more specific descriptions regardless of length. We find that controlling for length alone cannot account for differences in specificity: how the length budget is allocated makes a difference. These results support evaluation approaches that directly prioritize specificity over verbosity.
Using LLMs to Advance the Cognitive Science of Collectives
ArXiv.org · 2025-05-28
Preprint · Open access · Senior author
LLMs are already transforming the study of individual cognition, but their application to studying collective cognition has been underexplored. We lay out how LLMs may be able to address the complexity that has hindered the study of collectives and raise possible risks that warrant new methods.
A Critical Crossroads in Child Welfare: Reform or Abolish?
Families in Society The Journal of Contemporary Social Services · 2025-05-26
Article
Minding the Politeness Gap in Cross-cultural Communication
ArXiv.org · 2025-06-18
Preprint · Open access · Senior author
Misunderstandings in cross-cultural communication often arise from subtle differences in interpretation, but it is unclear whether these differences arise from the literal meanings assigned to words or from more general pragmatic factors such as norms around politeness and brevity. In this paper, we report three experiments examining how speakers of British and American English interpret intensifiers like "quite" and "very." To better understand these cross-cultural differences, we developed a computational cognitive model where listeners recursively reason about speakers who balance informativity, politeness, and utterance cost. Our model comparisons suggested that cross-cultural differences in intensifier interpretation stem from a combination of (1) different literal meanings and (2) different weights on utterance cost. These findings challenge accounts based purely on semantic variation or politeness norms, demonstrating that cross-cultural differences in interpretation emerge from an intricate interplay between the two.
Signaling social identity in referential communication
2025-10-16
Article · Open access
Any choice of words simultaneously conveys information about the world and, at the same time, conveys information about the speaker, revealing aspects of their social identity. In this paper, we investigate how speakers strategically modify referential language to signal group membership. Across four experiments using a minimal referential communication paradigm, we find that speakers with the explicit goal of signaling social affiliation (1) choose more concise utterances, (2) preferentially select group-specific referents and descriptions, and (3) resist the otherwise strong tendency to be understood by everyone in the audience. Standard models of referential communication that focus on the trade-off between informativity and efficiency cannot explain these patterns; we argue instead for a model where speakers trade off the referential utility of being understood against the social utility of being identified as an in-group member.
Experimentology
The MIT Press eBooks · 2025-07-01 · 10 citations
Book · Open access
An engaging research methods text integrating a classic approach to conducting experiments in psychology with open science practices and values. How does a researcher run a high-quality psychology experiment? What time-tested methods should be used, and how can more robust and accurate results be achieved? A dynamic collaboration between groundbreaking cognitive scientist Michael Frank and a diverse cohort of researchers innovating in the field (Mika Braginsky, Julie Cachia, Nicholas Coles, Tom Hardwicke, Robert Hawkins, Maya Mathur, and Rondeline Williams), Experimentology introduces the art of the modern psychological experiment with an emphasis on open science values of accessibility and transparency. Experimentology follows the timeline of an experiment, with sections covering basic foundations, planning, execution, data-gathering and analysis, and reporting. Narrative examples from a range of subdisciplines, including cognitive, developmental, and social psychology, model each component and account for the pitfalls that can undermine the reliability, validity, and replicability of results. Through an embrace of open science strategies such as data sharing and preregistration, Experimentology shows how the challenges of the replication crisis can be met constructively and collaboratively. Written for a global audience, Experimentology updates a classic research methods textbook with a new focus on ethics and the benefits of open science.
Dynamics of topic exploration in conversation
2025-05-10 · 1 citation
Preprint · Open access · Senior author
Conversations are intricately structured forms of social interaction in which talkers move through interconnected topics with nested levels of semantic specificity. What principles govern how conversational partners jointly navigate an expansive topic space? To characterize these dynamics, we introduce a new dataset of annotated topic shifts from N=1,505 annotators on 200 distinct video call conversations between strangers (Reece et al., 2023). Conversational dyads made stochastic but systematic transitions between topics, and within individual topics, we find that dyads begin concentrated in semantic space before dispersing to more idiosyncratic regions as topics progress. The same dispersion pattern also holds over entire conversations, providing quantitative evidence for nested levels of increasing specificity over conversations. Overall, our findings suggest that strangers get to know one another through systematic exploration of topic space, revealing hierarchical structure in idle talk.
Comparing human and LLM politeness strategies in free production
ArXiv.org · 2025-06-11
Preprint · Open access · Senior author
Polite speech poses a fundamental alignment challenge for large language models (LLMs). Humans deploy a rich repertoire of linguistic strategies to balance informational and social goals -- from positive approaches that build rapport (compliments, expressions of interest) to negative strategies that minimize imposition (hedging, indirectness). We investigate whether LLMs employ a similarly context-sensitive repertoire by comparing human and LLM responses in both constrained and open-ended production tasks. We find that larger models (≥70B parameters) successfully replicate key preferences from the computational pragmatics literature, and human evaluators surprisingly prefer LLM-generated responses in open-ended contexts. However, further linguistic analyses reveal that models disproportionately rely on negative politeness strategies even in positive contexts, potentially leading to misinterpretations. While modern LLMs demonstrate an impressive handle on politeness strategies, these subtle differences raise important questions about pragmatic alignment in AI systems.
Relevant answers to polar questions
2025-04-17 · 1 citation
Preprint · Open access · 1st author · Corresponding
People often provide answers that go beyond what a question literally asks, but it has been difficult to pin down what makes some answers more relevant than others. Here we introduce PRIOR-PQ, a probabilistic cognitive model formalizing how people use theory of mind to produce and interpret relevantly overinformative answers to yes-no questions. Specifically, PRIOR-PQ grounds the pragmatics of question-answering in inferences about the underlying goal that motivated the questioner to ask the given question as opposed to a different question. We evaluate our probabilistic model against human answering behavior elicited in three case studies of increasing complexity, demonstrating its ability to predict nuanced patterns of relevance better than existing models, including state-of-the-art large language models. We also show how the goal-sensitive reasoning instantiated in our probabilistic model motivates a novel chain-of-thought prompting method allowing language models to approach more human-like performance. This work illuminates the mechanistic role of theory of mind in the pragmatics of question-answer exchanges, bridging formal semantics, cognitive science, and artificial intelligence. Our findings have implications for developing more socially grounded dialogue systems and highlight the importance of integrating normative cognitive models with machine learning approaches.
Recent grants
Neurotrophins and Consolidation of Learning-Related Synaptic Plasticity
NIH · $1.8M · 2020–2025
NIH · $23.4M · 2005
NIH · $1.6M · 2019
NIH · $967k · 2007
NIH · $2.8M · 2005
Frequent coauthors
- 329 shared
Eric R. Kandel
- 62 shared
Min Zhuo
Fujian Medical University
- 54 shared
Noah D. Goodman
- 53 shared
Craig H. Bailey
Columbia University
- 41 shared
Mary Elizabeth Bach
Columbia University
- 33 shared
Clifford G. Kentros
Norwegian University of Science and Technology
- 33 shared
Iksung Jin
Yonsei University
- 32 shared
Thomas L. Griffiths
Education
- 2019
PhD, Psychology
Stanford University
- 2014
BS, Cognitive Science & Mathematics
Indiana University
Awards & honors
- Stanford Honors Thesis Prizes - Symbolic Systems
- Glushko Prize for Excellence in Undergraduate Research in Sy…
- Barwise Award for Distinguished Contributions to Symbolic Sy…
- Symbolic Systems Distinguished Teaching Award