
Stacy C. Marsella
Professor, Jointly Appointed with College of Science · Northeastern University · Artificial Intelligence and Data Science
Active 1986–2026
About
Stacy C. Marsella is a professor in the Khoury College of Computer Sciences and the College of Science at Northeastern University, based in Boston. His multidisciplinary research is grounded in the computational modeling of human cognition, emotion, and social behavior, as well as the evaluation of those models. His work extends to applications in health interventions, social skills training, and planning operations. Marsella's applied work includes frameworks for large-scale social simulations of towns and a range of techniques and tools for creating virtual humans, facsimiles of people that can engage in face-to-face interactions. Prior to joining Northeastern, Marsella was a research professor at the University of Southern California (USC) and a research director at the Institute for Creative Technologies. He also held positions at USC’s Information Sciences Institute and at Bell Labs. He has served as a general chair of Autonomous Agents and Multiagent Systems and chair of Intelligent Virtual Agents. In 2010, he received an ACM SIGART career award for his contributions to agent research. Marsella is an associate editor of the IEEE Transactions on Affective Computing, a board member of the International Foundation for Autonomous Agents and Multiagent Systems, and a member of the steering committee for Intelligent Virtual Agents. He is a fellow of the Society of Experimental Social Psychologists and a member of the Association for the Advancement of Artificial Intelligence and the International Society for Research on Emotions.
Research topics
- Computer Science
- Psychology
- Business
- Human–computer interaction
- Engineering
- Computer Security
- Artificial Intelligence
- Political Science
- Economics
- Multimedia
- Internet privacy
- Risk analysis (engineering)
- World Wide Web
- Industrial organization
- Marketing
- Management science
- Social psychology
- Medicine
- Law
- Public relations
- Knowledge management
- Microeconomics
Selected publications
Computational Brain & Behavior · 2026-04-20
article · Open access
Proceedings of the ... Annual Hawaii International Conference on System Sciences/Proceedings of the Annual Hawaii International Conference on System Sciences · 2026-01-01
article · Open access
As school shootings increase in frequency, understanding behavior in response to active shooter threats is essential for emergency disaster preparedness. This study utilized a 3D Unity simulation to examine how social and informational cues influence evacuation and exposure to threat. A total of 842 participants were assigned to one of 27 conditions in a 3 (NPC behavior: run, hide, mixed) × 3 (proximal information: run, hide, none) × 3 (public address: run, hide, none) design. Participants were more likely to evacuate when cues encouraged running, particularly when proximal information was present. However, congruent run cues increased exposure to threat, likely due to impulsive crowd-following. Individual factors such as age, positive affect, and experience with first-person shooter games predicted outcomes, suggesting variability in how people process and respond to high-stress situations. These findings highlight the need for emergency protocols that integrate clear communication and account for differences in response and awareness.
Theory of mind on demand: do we prepare or react?
Frontiers in Psychology · 2026-02-03
article · Open access · Senior author
Reasoning about others' thoughts, emotions, or intentions is a sophisticated human ability. Modelling such a complex phenomenon with limited available resources is a challenging pursuit. This work proposes the hypothesis of demand-driven and reactive ToM in humans as an additional strategy to establish sufficient mental inferences in complex social spaces. The authors consider a perspective of bounded rationality and cognitive costs in conceptualising ToM and understanding how humans form, maintain, and reason with models of others efficiently and effectively. This study presents qualitative data exploring what patterns in human ToM may allow humans to quickly and seemingly effortlessly perform the complex task of inferring other people's mental states. The results consist of several themes, which point to various heuristics that may be employed in shaping tractable ToM mechanisms. In conclusion, this qualitative approach to understanding ToM efficiency shaped the hypothesis of reactive ToM mechanisms in human cognition, which needs to be tested in confirmatory quantitative studies. Study limitations, implications for modelling, and directions for future research are discussed.
ArXiv.org · 2026-01-01
article · Open access
Large Language Models (LLMs) are increasingly utilized for mental health support; however, current safety benchmarks often fail to detect the complex, longitudinal risks inherent in therapeutic dialogue. We introduce an evaluation framework that pairs AI psychotherapists with simulated patient agents equipped with dynamic cognitive-affective models and assesses therapy session simulations against a comprehensive quality of care and risk ontology. We apply this framework to a high-impact test case, Alcohol Use Disorder, evaluating six AI agents (including ChatGPT, Gemini, and Character AI) against a clinically-validated cohort of 15 patient personas representing diverse clinical phenotypes. Our large-scale simulation (N=369 sessions) reveals critical safety gaps in the use of AI for mental health support. We identify specific iatrogenic risks, including the validation of patient delusions ("AI Psychosis") and failure to de-escalate suicide risk. Finally, we validate an interactive data visualization dashboard with diverse stakeholders, including AI engineers and red teamers, mental health professionals, and policy experts (N=9), demonstrating that this framework effectively enables stakeholders to audit the "black box" of AI psychotherapy. These findings underscore the critical safety risks of AI-provided mental health support and the necessity of simulation-based clinical red teaming before deployment.
A Comparative Study of Large Language Models for Gesture Selection in Virtual Agents
Research Square · 2026-03-20
preprint · Open access · Senior author
Data in Brief · 2026-01-18 · 1 citation
article · Open access
We present datasets from three large-scale human-subject experiments involving red-team hacking in a cyber range in the Guarding Against Malicious Biased Threats (GAMBiT) project. Across Experiments 1-3 (July 2024-March 2025), 19-20 skilled attackers per experiment conducted two 8-hour days of self-paced operations in a simulated enterprise network (SimSpace Cyber Force Platform) while collecting multi-modal data: self-reports (background, demographics, psychometrics), operational notes, terminal histories, key logs, network packet captures (PCAP), and NIDS alerts (Suricata). Each participant began from a standardized Kali Linux VM and pursued realistic objectives (e.g., target discovery and data exfiltration) under controlled constraints. Derivative curated logs and labels are included. The combined data release supports research on attacker behavior modeling, bias-aware analytics, and method benchmarking. Data are available via IEEE DataPort entries for Experiments 1-3.
arXiv (Cornell University) · 2026-02-23
preprint · Open access
Large Language Models (LLMs) are increasingly utilized for mental health support; however, current safety benchmarks often fail to detect the complex, longitudinal risks inherent in therapeutic dialogue. We introduce an evaluation framework that pairs AI psychotherapists with simulated patient agents equipped with dynamic cognitive-affective models and assesses therapy session simulations against a comprehensive quality of care and risk ontology. We apply this framework to a high-impact test case, Alcohol Use Disorder, evaluating six AI agents (including ChatGPT, Gemini, and Character AI) against a clinically-validated cohort of 15 patient personas representing diverse clinical phenotypes. Our large-scale simulation (N=369 sessions) reveals critical safety gaps in the use of AI for mental health support. We identify specific iatrogenic risks, including the validation of patient delusions ("AI Psychosis") and failure to de-escalate suicide risk. Finally, we validate an interactive data visualization dashboard with diverse stakeholders, including AI engineers and red teamers, mental health professionals, and policy experts (N=9), demonstrating that this framework effectively enables stakeholders to audit the "black box" of AI psychotherapy. These findings underscore the critical safety risks of AI-provided mental health support and the necessity of simulation-based clinical red teaming before deployment.
A richly annotated dataset of in-context co-speech hand gestures across diverse speaker professions
2025-03-27
preprint · Open access · Senior author
Hand gestures form an integral part of human communication and their complexity makes their study and generation difficult. Here, we present a dataset comprising 2373 annotated gestures, designed to facilitate in-depth analysis of human communication. We captured these gestures from nine speakers across three distinct categories: University lecturers, Politicians, and Psychotherapists. The annotations encompass various aspects, including gesture types (e.g., metaphoric, iconic), descriptive terms characterizing each gesture (e.g., 'sweep', 'container'), and their corresponding verbal utterances. The dataset also includes detailed physical properties such as hand height, distance to the body, arm angle, hand configuration, palm orientation, repetitions, size, and speed, alongside 3D pose tracking data. Where possible, video recordings provide additional multimodal context. Notably, we identified several previously undocumented lexemes, expanding the current lexicon of gesture research. This dataset offers a valuable resource for studying human communication, training models for gesture recognition and generation, and designing socially intelligent virtual agents.
Detecting Ambiguity Aversion in Cyberattack Behavior to Inform Cognitive Defense Strategies
AHFE international · 2025-01-01
article · Open access
Adversaries (hackers) attempting to infiltrate networks frequently face uncertainty in their operational environments. This research explores the ability to model and detect when they exhibit ambiguity aversion, a cognitive bias reflecting a preference for known (versus unknown) probabilities. We introduce a novel methodological framework that (1) leverages rich, multi-modal data from human-subjects red-team experiments, (2) employs a large language model (LLM) pipeline to parse unstructured logs into MITRE ATT&CK-mapped action sequences, and (3) applies a new computational model to infer an attacker’s ambiguity aversion level in near-real time. By operationalizing this cognitive trait, our work provides a foundational component for developing adaptive cognitive defense strategies.
ArXiv.org · 2025-08-28
preprint · Open access
We present three large-scale human-subjects red-team cyber range datasets from the Guarding Against Malicious Biased Threats (GAMBiT) project. Across Experiments 1-3 (July 2024-March 2025), 19-20 skilled attackers per experiment conducted two 8-hour days of self-paced operations in a simulated enterprise network (SimSpace Cyber Force Platform) while we captured multi-modal data: self-reports (background, demographics, psychometrics), operational notes, terminal histories, keylogs, network packet captures (PCAP), and NIDS alerts (Suricata). Each participant began from a standardized Kali Linux VM and pursued realistic objectives (e.g., target discovery and data exfiltration) under controlled constraints. Derivative curated logs and labels are included. The combined release supports research on attacker behavior modeling, bias-aware analytics, and method benchmarking. Data are available via IEEE DataPort entries for Experiments 1-3.
Recent grants
CHS: Small: Narrative-Based Training Simulations Using Theory of Mind
NSF · $493k · 2015–2019
Frequent coauthors
- Jonathan Gratch · 162 shared
- David V. Pynadath · Creative Technologies (United States) · 82 shared
- David Traum · 65 shared
- Milind Tambe · 49 shared
- Mei Si · 41 shared
- Margaux Lhommet · Northeastern University · 32 shared
- Stefan Scherer · META Health · 30 shared
- Arno Hartholt · 29 shared
Labs
Cognitive Embodied Social Agents Research (CESAR) Lab · PI
Awards & honors
- ACM SIGART career award (2010)