
Mochen Yang
Associate Professor · University of Minnesota · Information and Decision Sciences
Active 2013–2026
About
I study algorithmic decision-making from both a “make” perspective and a “use” perspective. From the “make” perspective, I design theoretically robust and computationally efficient algorithms to support decision-making in information-intensive environments. From the “use” perspective, I examine the antecedents of algorithmic decision-making as well as its impact on decision quality, fairness, and privacy. My aspiration is to study real problems and develop practical solutions to support individual and organizational decision-making.
Research topics
- Business
- Computer Science
- Sociology
- Artificial Intelligence
- Economics
- Psychology
- Monetary economics
- Risk analysis (engineering)
- Epistemology
- Finance
- Art
- Mathematics
- Anthropology
- Philosophy
- Management science
- Data science
- Knowledge management
- Cognitive science
Selected publications
Regurgitative Training: The Value of Real Data in Training Large Language Models
Management Science · 2026-05-11 · 3 citations
preprint · Open access
What happens if we train a new large language model (LLM) using data at least partially generated by other LLMs? The explosive success of LLMs means that content online will increasingly be generated by LLMs rather than humans, which inevitably enters the training data sets of next-generation LLMs. In this paper, we study the implications of such “regurgitative training” on LLM performance. Starting with the machine translation task (a representative language task with well-established evaluation criteria), we fine-tune LLMs with data generated either by themselves or by other LLMs, and we find strong evidence that regurgitative training handicaps the performance of fine-tuned LLMs. A comparison between LLM-generated data and real data reveals suggestive evidence that higher error rates and lower lexical diversity in LLM-generated data may be at play. Accordingly, we propose and evaluate three strategies to mitigate the performance loss by (i) prioritizing high-quality LLM-generated data, (ii) mixing data generated by multiple LLMs, and (iii) prioritizing LLM-generated data that most resemble real data. All three strategies can improve the performance of regurgitative training to some extent but cannot fully close the gap from training with real data. This highlights that real, human-generated data cannot be easily substituted by LLM-generated data in training LLMs. Additionally, we investigate regurgitative training on a creative ideation task with human judgement-based evaluations. Interestingly, we find that preference-based fine-tuning with human feedback on LLM-generated ideas can actually improve ideation performance. This showcases that human preference data when combined with LLM-generated data can bring performance gains. This paper was accepted by Hemant Bhargava, information systems.
Funding: This work was supported by the National Natural Science Foundation of China [Grants 72421001 and 72172070] and the Singapore Ministry of Education Academic Research Fund Tier 2 A-8003504 [Robert Brown Promising Researcher Award MOE-T2EP40]. Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2024.07005 .
Agentic AI and Structured vs. Self-guided learning
AEA Randomized Controlled Trials · 2025-07-23
dataset
Robustness is Important: Limitations of LLMs for Data Fitting
ArXiv.org · 2025-08-27
preprint · Open access
Large Language Models (LLMs) are being applied in a wide array of settings, well beyond the typical language-oriented use cases. In particular, LLMs are increasingly used as a plug-and-play method for fitting data and generating predictions. Prior work has shown that LLMs, via in-context learning or supervised fine-tuning, can perform competitively with many tabular supervised learning techniques in terms of predictive performance. However, we identify a critical vulnerability of using LLMs for data fitting: making changes to data representation that are completely irrelevant to the underlying learning task can drastically alter LLMs' predictions on the same data. For example, simply changing variable names can sway the size of prediction error by as much as 82% in certain settings. Such prediction sensitivity with respect to task-irrelevant variations manifests under both in-context learning and supervised fine-tuning, for both closed-weight and open-weight general-purpose LLMs. Moreover, by examining the attention scores of an open-weight LLM, we discover a non-uniform attention pattern: training examples and variable names/values which happen to occupy certain positions in the prompt receive more attention when output tokens are generated, even though different positions are expected to receive roughly the same attention. This partially explains the sensitivity in the presence of task-irrelevant variations. We also consider a state-of-the-art tabular foundation model (TabPFN) trained specifically for data fitting. Despite being explicitly designed to achieve prediction robustness, TabPFN is still not immune to task-irrelevant variations. Overall, despite LLMs' impressive predictive capabilities, currently they lack even the basic level of robustness to be used as a principled data-fitting tool.
What, Why, and How: An Empiricist’s Guide to Double/Debiased Machine Learning
Information Systems Research · 2025-12-05 · 2 citations
article
We provide an introduction to double/debiased machine learning (DML), a framework that enables effect estimation when dealing with complex, high-dimensional data. In many empirical analyses, especially in fields such as information systems, researchers face difficult choices about which control variables to include and how to model their relationships with the outcome. These modeling decisions can significantly change results, leading to uncertainty about which findings are reliable. DML offers a practical solution by combining modern machine learning with rigorous statistical inference. The idea is to let flexible ML models (such as random forests or gradient boosting) capture complex relationships among control variables while still delivering reliable estimates for the key effect of interest. DML can be applied to many familiar research designs, including standard regression with controls, instrumental variables, difference in differences, and models that incorporate ML-generated features. Empirical studies and simulations show that DML is typically more robust to misspecification than traditional regression and more reliable than earlier semiparametric methods. However, DML is not automatic: it still requires sound research design and high-quality machine learning estimation. Used thoughtfully, DML provides a powerful, flexible, and statistically grounded approach for empirical research in modern data environments.
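The idea in the abstract above (flexible models absorb the controls, then the effect of interest is estimated from what is left over) can be illustrated with a minimal sketch of the partially linear DML estimator with cross-fitting. This is not the article's code: the function names `fit_predict_ols` and `dml_plr`, the simulated data, and the true effect of 1.5 are all assumptions made for illustration. To keep the example dependency-light, a least-squares fit on squared features stands in for the random forest or gradient boosting learner the abstract mentions.

```python
import numpy as np

def fit_predict_ols(X_train, y_train, X_test):
    # Stand-in nuisance learner (an assumption): least squares on [1, X, X^2].
    # In practice this would be a flexible ML model such as a random forest.
    def feats(X):
        return np.column_stack([np.ones(len(X)), X, X ** 2])
    coef, *_ = np.linalg.lstsq(feats(X_train), y_train, rcond=None)
    return feats(X_test) @ coef

def dml_plr(y, d, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + e."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Nuisance models are fit on the other folds, then used to
        # residualize the outcome and the treatment on the held-out fold.
        y_res[test_idx] = y[test_idx] - fit_predict_ols(X[train], y[train], X[test_idx])
        d_res[test_idx] = d[test_idx] - fit_predict_ols(X[train], d[train], X[test_idx])
    # Final stage: regress outcome residuals on treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Standard error from the orthogonal score (y_res - theta*d_res) * d_res.
    psi = (y_res - theta * d_res) * d_res
    se = np.sqrt(np.mean(psi ** 2) / n) / np.mean(d_res ** 2)
    return theta, se

# Simulated data (hypothetical): the true effect of d on y is 1.5.
rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 3))
d = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(size=n)
y = 1.5 * d + X[:, 0] ** 2 - X[:, 2] + rng.normal(size=n)

theta_hat, se_hat = dml_plr(y, d, X)
print(f"theta_hat = {theta_hat:.3f}, se = {se_hat:.3f}")
```

Cross-fitting keeps each observation's residuals free of its own noise, which is what lets the final-stage standard error remain usable even when the nuisance learner is highly flexible.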
Agentic AI and Managers' Analytics Capabilities: An Exploration
SSRN Electronic Journal · 2025-01-01
article · Open access · Senior author
Improving Convergence of Flexible Combinatorial Auctions with Rationality-Based Ask Prices
SSRN Electronic Journal · 2025-01-01
preprint · Open access · Senior author
Cost-Aware Calibration of Classifiers
INFORMS Journal on Data Science · 2024-12-09
article · 1st author · Corresponding
Most classification techniques in machine learning are able to produce probability predictions in addition to class predictions. However, these predicted probabilities are often not well calibrated in that they deviate from the actual outcome rates (i.e., the proportion of data instances that actually belong to a certain class). A lack of calibration can jeopardize downstream decision tasks that rely on accurate probability predictions. Although several post hoc calibration methods have been proposed, they generally do not consider the potentially asymmetric costs associated with overprediction versus underprediction. In this research, we formally define the problem of cost-aware calibration and propose a metric to quantify the cost of miscalibration for a given classifier. Next, we propose three approaches to achieve cost-aware calibration, two of which are cost-aware adaptations of existing calibration algorithms; the third one (named MetaCal) is a Bayes optimal learning algorithm inspired by prior work on cost-aware classification. We carry out systematic empirical evaluations on multiple public data sets to demonstrate the effectiveness of the proposed approaches in reducing the cost of miscalibration. Finally, we generalize the definition and metric as well as solution algorithms of cost-aware calibration to account for nonlinear cost structures that may arise in real-world decision tasks.
History: David Martens served as the senior editor for this article. Data Ethics & Reproducibility Note: There are no data ethics considerations. The code capsule is available on Code Ocean at https://doi.org/10.24433/CO.8552538.v1 and in the e-Companion to this article (available at https://doi.org/10.1287/ijds.2024.0038 ).
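To make the notion of asymmetric miscalibration cost concrete, here is a small hypothetical sketch. It is not the metric defined in the article and not MetaCal; the function name `cost_weighted_calibration_error`, the equal-width binning, and the unit costs `cost_over` and `cost_under` are all assumptions. It simply takes a standard binned calibration gap and charges a different unit cost depending on whether a bin overpredicts or underpredicts the actual outcome rate.

```python
def cost_weighted_calibration_error(probs, labels, n_bins=5,
                                    cost_over=1.0, cost_under=1.0):
    """Binned calibration gap with separate unit costs for over/underprediction.

    Hypothetical illustration only; not the article's metric.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # equal-width bins on [0, 1]
        bins[idx].append((p, y))
    total = 0.0
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        outcome_rate = sum(y for _, y in members) / len(members)
        gap = mean_pred - outcome_rate  # > 0 means the bin overpredicts
        unit_cost = cost_over if gap > 0 else cost_under
        total += len(members) / len(probs) * unit_cost * abs(gap)
    return total

# Toy predictions: the high bin overpredicts, the low bin underpredicts.
probs = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
labels = [1, 1, 0, 0, 0, 0, 0, 1]

symmetric = cost_weighted_calibration_error(probs, labels)
under_costly = cost_weighted_calibration_error(probs, labels, cost_under=5.0)
print(symmetric, under_costly)
```

With equal costs the score reduces to an ordinary expected calibration error; raising `cost_under` makes the same classifier look worse because its underpredicting bin is now penalized more heavily.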
Engaging Users on Social Media Business Pages: The Roles of User Comments and Firm Responses
MIS Quarterly · 2024-06-01 · 11 citations
article · Senior author
Firms must strategically manage their responses to user comments to keep users engaged on their social media business pages. The question of whether, how, and when a firm should respond to user comments to achieve favorable outcomes is of great interest to researchers and practitioners. We focus on these questions and study the effects of initial user comments and firm responses on subsequent user engagement on social media business pages. In particular, we theorize and examine how two features of initial user comments (i.e., sentiment and controversialness) and two features of firm responses (i.e., uniqueness and timeliness) jointly affect the volume and sentiment of subsequent user comments. By analyzing data from the Facebook business pages of multiple U.S. retailers (10,312 firm posts from 37 firms and over 1 million user comments), we found that firms are more likely to respond to negative comments (than positive or neutral comments) but less likely to respond to controversial comments (which evoke diverse opinions and emotions). Further, we found that engaging with negative and controversial comments and promptly responding to comments are linked to an increase in the volume of subsequent user comments but also to a more negative sentiment in these comments. We also found that providing unique responses improves the volume and sentiment of subsequent user comments. Our findings offer theoretical and practical insights into firms’ response management on social media.
What, Why, and How: An Empiricist's Guide to Double/Debiased Machine Learning
SSRN Electronic Journal · 2024-01-01
article · Open access
Frequent coauthors
- Gediminas Adomavičius · 52 shared
- Xuan Bi · 27 shared
- Gordon Burtch · Boston University · 16 shared
- Alok Gupta · 16 shared
- Yuqing Ren · University of Minnesota · 10 shared
- Edward McFowland · Harvard University · 6 shared
- Tianshu Sun · Cheung Kong Graduate School of Business · 4 shared
- Zachary J. Sheffler · 4 shared
Education
B.S., Information Systems and Information Management
Tsinghua University
Ph.D., Information and Decision Sciences
Carlson School of Management, University of Minnesota