
Research topics
- Computer Science
- Political Science
- Sociology
- Social Science
- Artificial Intelligence
- Engineering ethics
- Engineering
- Law
- Public relations
- Psychology
- Computer Security
- Internet privacy
- Business
- World Wide Web
- Economics
- Aesthetics
- Management science
- Human–computer interaction
- Art
- Management
- Data science
- Cognitive psychology
- Risk analysis (engineering)
Selected publications
Value Alignment of Social Media Ranking Algorithms
2026-04-13
article · Open access · Senior author
While social media feed rankings are primarily driven by engagement signals rather than any explicit value system, the resulting algorithmic feeds are not value-neutral: engagement may prioritize specific individualistic values. This paper presents an approach for social media feed value alignment. We adopt Schwartz's theory of Basic Human Values -- a broad set of human values that articulates complementary and opposing values forming the building blocks of many cultures -- and we implement an algorithmic approach that models and then ranks feeds by expressions of Schwartz's values in social media posts. Our approach enables controls where users can express weights on their desired values, combining these weights and post value expressions into a ranking that respects users' articulated trade-offs. Through controlled experiments (N=141 and N=250), we demonstrate that users can use these controls to architect feeds reflecting their desired values. Across users, value-ranked feeds align with personal values, diverging substantially from existing engagement-driven feeds.
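The ranking idea in this abstract can be illustrated with a minimal sketch. It assumes each post already carries a per-value expression score and that user weights combine with those scores as a simple weighted sum; the abstract does not state how the paper combines them, so the linear form, the `rank_feed` helper, the post structure, and the example scores are all hypothetical.

```python
from typing import Dict, List


def rank_feed(posts: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Sort posts by a weighted sum of their value-expression scores.

    Assumption: a linear combination of user weights and per-post value
    scores. The paper's actual scoring function may differ.
    """
    def score(post: dict) -> float:
        # Values the user did not weight contribute nothing to the score.
        return sum(weights.get(value, 0.0) * s
                   for value, s in post["values"].items())

    return sorted(posts, key=score, reverse=True)


# Hypothetical posts scored on two of Schwartz's values.
posts = [
    {"id": "a", "values": {"benevolence": 0.9, "achievement": 0.1}},
    {"id": "b", "values": {"benevolence": 0.1, "achievement": 0.8}},
    {"id": "c", "values": {"benevolence": 0.5, "achievement": 0.5}},
]

# A user who wants more benevolence and less achievement in their feed.
weights = {"benevolence": 1.0, "achievement": -0.5}

ranked_ids = [p["id"] for p in rank_feed(posts, weights)]
print(ranked_ids)
```

Negative weights push posts expressing an unwanted value down the feed, which is one way a user could state a trade-off between opposing values.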
Art Card Game (ACG): Embedding Illustration in Gameplay to Mitigate Artist Self-Criticism
ArXiv.org · 2026-05-19
article · Open access · Senior author
Persistent self-criticism--harsh evaluative self-talk--can undermine illustrators' performance and well-being. Traditional interventions draw on psychotherapeutic approaches (e.g., compassion training) but sit outside the illustration workflow, requiring time, facilitation, and skill transfer. We propose an in-workflow alternative: evaluative off-centering, a mechanism redirecting self-critical evaluation away from an inherently self-evaluative task (like illustration) by embedding it in an alternative activity. We instantiate evaluative off-centering in Art Card Game (ACG), which integrates illustration into a card customization game: players illustrate cards that become playable assets in a head-to-head battle. In a four-day randomized controlled study with hobbyist and professional illustrators (N=38), ACG outperformed a control condition with identical illustration constraints but no evaluative off-centering mechanisms (e.g., multiplayer, gameplay), yielding significantly higher pride in produced artwork and activity enjoyment. Pride and enjoyment--positive affect states linked to lower self-criticism--help explain how ACG reduces self-criticism. We discuss design implications for creativity support tools that apply evaluative off-centering across creative domains.
2026-04-13 · 1 citation
article · Open access · Senior author
Social media users have repeatedly advocated for control over the currently opaque operations of feed algorithms. Large language models (LLMs) now offer the promise of custom-defined feeds, but users often fail to foresee the gaps and edge cases in how they define their custom feed. We introduce feed elicitation interviews, an interactive method that guides users through identifying these gaps and articulating their preferences to better author custom social media feeds. We deploy this approach in an online study to create custom Bluesky feeds and find that participants significantly prefer the feeds produced from their elicited preferences to those produced by users manually describing their feeds. Through feed elicitation interviews, we advance users’ ability to control their social media experience, empowering them to describe and implement their desired feeds.
The MIT Press eBooks · 2025-10-07 · 2 citations
book · Senior author
A dramatic new future of work in which managers assemble exactly the expertise they need, within minutes. Gone are the days of static organizational charts and staffing based on the manager’s rolodex and intuition. Now you can recruit any expertise you need from a global online network within minutes: an on-demand, on-the-spot expert at the exact moment that you need their help. You can right-size their involvement, too; some of those experts give a second opinion or a moment of brainstorming, whereas others join as full-fledged team members for a sustained collaborative effort. This is the future promised by flash teams, a model that The New York Times has already praised for its “revolutionary potential”: a world where experts are available anytime and everywhere, where remote work has become a norm, and where AI is in the loop to guide team decisions. In Flash Teams, award-winning management scholar Melissa Valentine and computer scientist Michael Bernstein chart the opportunities of flash teams and navigate the challenges that teams and managers will face. They distill lessons from their own work assembling and managing flash teams on demand that every manager can learn from so they can successfully use flash teams in their own organizations. Drawing on original research and industry examples, this book will help readers to: Industries are already being transformed by this new approach to teaming. Flash Teams arms leaders, managers, and entrepreneurs with the tools they need to accomplish their goals with confidence, speed, and agility.
Finetuning LLMs for Human Behavior Prediction in Social Science Experiments
2025-01-01
article · Open access · Senior author
2025-01-22
book-chapter · 1st author · Corresponding
Balancing Producer Fairness and Efficiency via Prior-Weighted Rating System Design
Proceedings of the International AAAI Conference on Web and Social Media · 2025-06-07 · 1 citation
article · Open access
Online marketplaces use rating systems to promote the discovery of high-quality products. However, these systems also lead to high variance in producers' economic outcomes: a new producer who sells high-quality items may unluckily receive a low rating early, severely impacting their future popularity. We investigate the design of rating systems that balance the goals of identifying high-quality products ("efficiency") and minimizing the variance in outcomes of producers of similar quality (individual "producer fairness"). We show that there is a trade-off between these two goals: rating systems that promote efficiency are necessarily less individually fair to producers. We introduce prior-weighted rating systems as an approach to managing this trade-off. Informally, the system we propose sets a system-wide prior for the quality of an incoming product; subsequently, the system updates that prior to a posterior for each product's quality based on user-generated ratings over time. We show theoretically that in markets where products accrue reviews at an equal rate, the strength of the rating system's prior determines the operating point on the identified trade-off: the stronger the prior, the more the marketplace discounts early ratings data (increasing individual fairness), but the slower the platform is in learning about true item quality (so efficiency suffers). We further analyze this trade-off in a responsive market where customers make decisions based on historical ratings. Through calibrated simulations in 19 different real-world datasets sourced from large online platforms, we show that the choice of prior strength mediates the same efficiency-consistency trade-off in this setting. Overall, we demonstrate that by tuning the prior as a design choice in a prior-weighted rating system, platforms can be intentional about the balance between efficiency and producer fairness.
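The effect of prior strength described in this abstract can be seen in a small sketch. It assumes the displayed rating is a pseudo-count average, where the prior acts like `prior_strength` imaginary ratings at `prior_mean`; this is one common way to build a prior-weighted rating, not necessarily the model used in the paper, and the function name and numbers are hypothetical.

```python
from typing import Sequence


def prior_weighted_rating(ratings: Sequence[float],
                          prior_mean: float,
                          prior_strength: float) -> float:
    """Blend a system-wide prior with observed ratings.

    Assumption: the prior counts as `prior_strength` pseudo-ratings at
    `prior_mean`. The paper's exact update rule may differ.
    """
    total_weight = prior_strength + len(ratings)
    if total_weight <= 0:
        raise ValueError("need a positive prior strength or at least one rating")
    return (prior_strength * prior_mean + sum(ratings)) / total_weight


# A new high-quality producer who unluckily receives one early 1-star rating.
early_ratings = [1.0]

weak = prior_weighted_rating(early_ratings, prior_mean=4.0, prior_strength=1.0)
strong = prior_weighted_rating(early_ratings, prior_mean=4.0, prior_strength=9.0)

print(f"weak prior:   {weak:.2f}")
print(f"strong prior: {strong:.2f}")
```

With the stronger prior, the single unlucky rating moves the score much less, which matches the fairness side of the trade-off; the cost is that many more ratings are needed before the score reflects the product's true quality.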
More than Marketing? On the Information Value of AI Benchmarks for Practitioners
2025-03-19 · 1 citation
article
Finetuning LLMs for Human Behavior Prediction in Social Science Experiments
ArXiv.org · 2025-09-06
preprint · Open access · Senior author
Large language models (LLMs) offer a powerful opportunity to simulate the results of social science experiments. In this work, we demonstrate that finetuning LLMs directly on individual-level responses from past experiments meaningfully improves the accuracy of such simulations across diverse social science domains. Via an automatic pipeline, we construct SocSci210, a dataset comprising 2.9 million responses from 400,491 participants in 210 open-source social science experiments. Through finetuning, we achieve multiple levels of generalization. In completely unseen studies, our strongest model, Socrates-Qwen-14B, produces predictions that are 26% more aligned with distributions of human responses to diverse outcome questions under varying conditions relative to its base model (Qwen2.5-14B), outperforming GPT-4o by 13%. By finetuning on a subset of conditions in a study, generalization to new unseen conditions is particularly robust, improving by 71%. Since SocSci210 contains rich demographic information, we reduce demographic parity difference, a measure of bias, by 10.6% through finetuning. Because social sciences routinely generate rich, topic-specific datasets, our findings indicate that finetuning on such data could enable more accurate simulations for experimental hypothesis screening. We release our data, models and finetuning code at stanfordhci.github.io/socrates.
Recent grants
CAREER: Enabling expert crowdsourcing via coordination, targeted contribution and education
NSF · $550k · 2014–2019
Frequent coauthors
- Li Fei-Fei · 44 shared
- Ranjay Krishna · 37 shared
- William Freedman (Temple University) · 36 shared
- David S. Bell · 36 shared
- Seymour Ben‐Zvi (State University of New York) · 36 shared
- I.S. Tackel (Thomas Jefferson University Hospital) · 36 shared
- V. L. NEWHOUSE (Pennsylvania State University) · 36 shared
- Rf Aston (Labs) · 36 shared
Awards & honors
- Stanford Honors Thesis Prizes - Symbolic Systems
- Glushko Prize for Excellence in Undergraduate Research in Sy…
- Barwise Award for Distinguished Contributions to Symbolic Sy…
- Symbolic Systems Distinguished Teaching Award