Stacy C. Marsella

Professor, Jointly Appointed with College of Science · Verified

Northeastern University · Artificial Intelligence and Data Science

Active 1986–2026

h-index: 62

Citations: 17.5k

Papers: 411 (65 in the last 5 years)

Funding: $493k

Faculty page

About

Stacy C. Marsella is a professor in the Khoury College of Computer Sciences and the College of Science at Northeastern University, based in Boston. His multidisciplinary research is grounded in the computational modeling of human cognition, emotion, and social behavior, as well as the evaluation of those models. His work extends to applications in health interventions, social skills training, and planning operations. Marsella's applied work includes frameworks for large-scale social simulations of towns and a range of techniques and tools for creating virtual humans, facsimiles of people that can engage in face-to-face interactions. Prior to joining Northeastern, Marsella was a research professor at the University of Southern California (USC) and a research director at the Institute for Creative Technologies. He also held positions at USC’s Information Sciences Institute and at Bell Labs. He has served as a general chair of Autonomous Agents and Multiagent Systems and chair of Intelligent Virtual Agents. In 2010, he received an ACM SIGART career award for his contributions to agent research. Marsella is an associate editor of the IEEE Transactions on Affective Computing, a board member of the International Foundation for Autonomous Agents and Multiagent Systems, and a member of the steering committee for Intelligent Virtual Agents. He is a fellow of the Society of Experimental Social Psychologists and a member of the Association for the Advancement of Artificial Intelligence and the International Society for Research on Emotions.

Research topics

Computer Science
Psychology
Business
Human–computer interaction
Engineering
Computer Security
Artificial Intelligence
Political Science
Economics
Multimedia
Internet privacy
Risk analysis (engineering)
World Wide Web
Industrial organization
Marketing
Management science
Social psychology
Medicine
Law
Public relations
Knowledge management
Microeconomics

Selected publications

Guarding Against Malicious Biased Threats (GAMBiT): Experimental Design of Cognitive Sensors and Triggers with Behavioral Impact Analysis
Computational Brain & Behavior · 2026-04-20
article · Open access
Pumped Up Kicks: The Impact of Social Contagion and Informational Cues on Evacuation Behavior and Exposure to Threat in a Simulated School Crisis
Proceedings of the Annual Hawaii International Conference on System Sciences · 2026-01-01
article · Open access
As school shootings increase in frequency, understanding behavior in response to active shooter threats is essential for emergency disaster preparedness. This study utilized a 3D Unity simulation to examine how social and informational cues influence evacuation and exposure to threat. A total of 842 participants were assigned to one of 27 conditions in a 3 (NPC behavior: run, hide, mixed) × 3 (proximal information: run, hide, none) × 3 (public address: run, hide, none) design. Participants were more likely to evacuate when cues encouraged running, particularly when proximal information was present. However, congruent run cues increased exposure to threat, likely due to impulsive crowd-following. Individual factors such as age, positive affect, and experience with first-person shooter games predicted outcomes, suggesting variability in how people process and respond to high-stress situations. These findings highlight the need for emergency protocols that integrate clear communication and account for differences in response and awareness.
Theory of mind on demand: do we prepare or react?
Frontiers in Psychology · 2026-02-03
article · Open access · Senior author
Reasoning about others' thoughts, emotions, or intentions is a sophisticated human ability. Modelling such a complex phenomenon with limited available resources is a challenging pursuit. This work proposes the hypothesis of demand-driven and reactive ToM in humans as an additional strategy to establish sufficient mental inferences in complex social spaces. The authors consider a perspective of bounded rationality and cognitive costs in conceptualising ToM and understanding how humans form, maintain, and reason with models of others efficiently and effectively. This study presents qualitative data exploring what patterns in human ToM may allow humans to quickly and seemingly effortlessly perform the complex task of inferring other people's mental states. The results consist of several themes, which point to various heuristics that may be employed in shaping tractable ToM mechanisms. In conclusion, this qualitative approach to understanding ToM efficiency shaped the hypothesis of reactive ToM mechanisms in human cognition, which needs to be tested in confirmatory quantitative studies. Study limitations, implications for modelling, and directions for future research are discussed.
Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming
ArXiv.org · 2026-01-01
article · Open access
Large Language Models (LLMs) are increasingly utilized for mental health support; however, current safety benchmarks often fail to detect the complex, longitudinal risks inherent in therapeutic dialogue. We introduce an evaluation framework that pairs AI psychotherapists with simulated patient agents equipped with dynamic cognitive-affective models and assesses therapy session simulations against a comprehensive quality of care and risk ontology. We apply this framework to a high-impact test case, Alcohol Use Disorder, evaluating six AI agents (including ChatGPT, Gemini, and Character AI) against a clinically-validated cohort of 15 patient personas representing diverse clinical phenotypes. Our large-scale simulation (N=369 sessions) reveals critical safety gaps in the use of AI for mental health support. We identify specific iatrogenic risks, including the validation of patient delusions ("AI Psychosis") and failure to de-escalate suicide risk. Finally, we validate an interactive data visualization dashboard with diverse stakeholders, including AI engineers and red teamers, mental health professionals, and policy experts (N=9), demonstrating that this framework effectively enables stakeholders to audit the "black box" of AI psychotherapy. These findings underscore the critical safety risks of AI-provided mental health support and the necessity of simulation-based clinical red teaming before deployment.
A Comparative Study of Large Language Models for Gesture Selection in Virtual Agents
Research Square · 2026-03-20
preprint · Open access · Senior author
Guarding against malicious biased threats (GAMBiT) datasets: Revealing cognitive bias in human-subjects red-team cyber range operations
Data in Brief · 2026-01-18 · 1 citation
article · Open access
We present datasets from three large-scale human-subject experiments involving red-team hacking in a cyber range in the Guarding Against Malicious Biased Threats (GAMBiT) project. Across Experiments 1-3 (July 2024-March 2025), 19-20 skilled attackers per experiment conducted two 8-hour days of self-paced operations in a simulated enterprise network (SimSpace Cyber Force Platform) while we collected multi-modal data: self-reports (background, demographics, psychometrics), operational notes, terminal histories, key logs, network packet captures (PCAP), and NIDS alerts (Suricata). Each participant began from a standardized Kali Linux VM and pursued realistic objectives (e.g., target discovery and data exfiltration) under controlled constraints. Derivative curated logs and labels are included. The combined data release supports research on attacker behavior modeling, bias-aware analytics, and method benchmarking. Data are available via IEEE DataPort entries for Experiments 1-3.
Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming
arXiv (Cornell University) · 2026-02-23
preprint · Open access
A richly annotated dataset of in-context co-speech hand gestures across diverse speaker professions
2025-03-27
preprint · Open access · Senior author
Hand gestures form an integral part of human communication and their complexity makes their study and generation difficult. Here, we present a dataset comprising 2373 annotated gestures, designed to facilitate in-depth analysis of human communication. We captured these gestures from nine speakers across three distinct categories: University lecturers, Politicians, and Psychotherapists. The annotations encompass various aspects, including gesture types (e.g., metaphoric, iconic), descriptive terms characterizing each gesture (e.g., 'sweep', 'container'), and their corresponding verbal utterances. The dataset also includes detailed physical properties such as hand height, distance to the body, arm angle, hand configuration, palm orientation, repetitions, size, and speed, alongside 3D pose tracking data. Where possible, video recordings provide additional multimodal context. Notably, we identified several previously undocumented lexemes, expanding the current lexicon of gesture research. This dataset offers a valuable resource for studying human communication, training models for gesture recognition and generation, and designing socially intelligent virtual agents.
Detecting Ambiguity Aversion in Cyberattack Behavior to Inform Cognitive Defense Strategies
AHFE international · 2025-01-01
article · Open access
Adversaries (hackers) attempting to infiltrate networks frequently face uncertainty in their operational environments. This research explores the ability to model and detect when they exhibit ambiguity aversion, a cognitive bias reflecting a preference for known (versus unknown) probabilities. We introduce a novel methodological framework that (1) leverages rich, multi-modal data from human-subjects red-team experiments, (2) employs a large language model (LLM) pipeline to parse unstructured logs into MITRE ATT&CK-mapped action sequences, and (3) applies a new computational model to infer an attacker’s ambiguity aversion level in near-real time. By operationalizing this cognitive trait, our work provides a foundational component for developing adaptive cognitive defense strategies.
Guarding Against Malicious Biased Threats (GAMBiT) Experiments: Revealing Cognitive Bias in Human-Subjects Red-Team Cyber Range Operations
ArXiv.org · 2025-08-28
preprint · Open access
We present three large-scale human-subjects red-team cyber range datasets from the Guarding Against Malicious Biased Threats (GAMBiT) project. Across Experiments 1-3 (July 2024-March 2025), 19-20 skilled attackers per experiment conducted two 8-hour days of self-paced operations in a simulated enterprise network (SimSpace Cyber Force Platform) while we captured multi-modal data: self-reports (background, demographics, psychometrics), operational notes, terminal histories, keylogs, network packet captures (PCAP), and NIDS alerts (Suricata). Each participant began from a standardized Kali Linux VM and pursued realistic objectives (e.g., target discovery and data exfiltration) under controlled constraints. Derivative curated logs and labels are included. The combined release supports research on attacker behavior modeling, bias-aware analytics, and method benchmarking. Data are available via IEEE DataPort entries for Experiments 1-3.

Recent grants

CHS: Small: Narrative-Based Training Simulations Using Theory of Mind
NSF · $493k · 2015–2019

Frequent coauthors

Jonathan Gratch
162 shared
David V. Pynadath
Creative Technologies (United States)
82 shared
David Traum
65 shared
Milind Tambe
49 shared
Mei Si
41 shared
Margaux Lhommet
Northeastern University
32 shared
Stefan Scherer
META Health
30 shared
Arno Hartholt
29 shared

Labs

Cognitive Embodied Social Agents Research (CESAR) Lab · PI

Awards & honors

ACM SIGART career award (2010)
