Allison Koenecke

Assistant Professor of Information Science · Verified

Cornell University · Computer Science

Active 2012–2026

h-index: 8
Citations: 763
Papers: 38 (34 in the last 5 years)
Funding

About

Allison Koenecke is an assistant professor of information science at Cornell Tech and the Cornell Ann S. Bowers College of Computing and Information Science. Her research focuses on algorithmic fairness, applying computational methods such as machine learning and causal inference to study societal inequities in various domains, including online services and public health. She has held a postdoctoral researcher role at Microsoft Research and earned her Ph.D. from Stanford’s Institute for Computational and Mathematical Engineering. Koenecke has received several NSF grants and a Cornell CIS DEIB Faculty of the Year Award, and has been honored as a Sloan Fellow in Computer Science and a Forbes 30 Under 30 in Science. She is frequently quoted as an expert on disparities in automated speech-to-text systems and has been featured in prominent news outlets including the New York Times, the Atlantic, Forbes, Wired, and Scientific American. Her work has been published in venues such as Nature, PNAS, NeurIPS, and FAccT.

Research signals

Five dimensions sourced from public faculty and publication signals.

Research topics

  • Computer Science
  • Natural Language Processing
  • Internal medicine
  • Medicine
  • Artificial Intelligence
  • Speech recognition
  • Psychology
  • Immunology
  • Philosophy
  • Linguistics

Selected publications

  • A Critical Pragmatism Approach for Algorithmic Fairness: Lessons from Urban Planning Theory

    arXiv (Cornell University) · 2026-05-04

    preprint · Open access · Senior author

    As data scientists grapple with increasingly complex ethical decisions in machine learning (ML) and data science, the field of algorithmic fairness has offered multiple solutions, from formal mathematical definitions to holistic notions of fairness drawn from various academic disciplines. However, navigating and implementing these fairness approaches in practice remains an ongoing challenge. In this paper, we draw a parallel between the types of problems arising in algorithmic fairness and urban planning. We frame algorithmic fairness problems as "wicked problems," a term originating from the planning and policy space to describe the intractable, value-laden, and complex nature of this work. As such, we argue that the field of algorithmic fairness can learn from theoretical work in urban planning in ameliorating its own set of wicked problems. Urban planning is typically concerned with practical issues of governance, resource allocation, stakeholder engagement, and conflicts involving deep-seated differences. These are challenges that existing fairness frameworks can easily overlook. We present a flexible framework for designing fairer algorithms based on the urban planning theory approach of critical pragmatism, a reflective and deliberative approach to addressing wicked problems that considers what practitioners actually do in the face of conflict and power. We provide specific recommendations and apply them to several case studies in ML and algorithm design: automated mortgage lending, school choice, and feminicide counterdata collection. Researchers and practitioners can incorporate these recommendations derived from urban planning into their ongoing work to more holistically address practical problems arising in fair algorithm design.

  • Global energy and fertilizer dependencies via the Strait of Hormuz

    2026-04-18

    article · Open access · Senior author

    The Strait of Hormuz is a critical maritime chokepoint, with concerns of disruption arising from conflict. We quantify trade dependencies across 172 countries, finding 19.6% [95% Interpercentile Interval: 17.4, 21.8] of global fossil fuel trade transits the strait, and that lower income countries are disproportionately dependent on Hormuz-sourced fertilizer. Countries doubly dependent on Hormuz-sourced energy and fertilizer face pre-existing humanitarian crises, indicating global structural vulnerability.

  • Into the Unknown: Accounting for Missing Demographic Data when Mitigating Ad Delivery Skew

    arXiv (Cornell University) · 2026-05-12

    preprint · Open access · Senior author

    Online advertising platforms use algorithmic systems to power the process of matching ads to users, termed ad delivery. Prior audits have demonstrated that ad delivery can be skewed by demographic attributes, such that ads are systematically under-delivered to certain groups despite advertiser intent to reach groups proportionally. This under-delivery raises a serious concern in the context of ads promoting public services, which might prevent certain groups of individuals from accessing information about resources on the basis of their demographic identity. In the absence of platform-provided solutions to skewed ad delivery, advertisers can counteract skew by targeting demographic groups directly. However, direct targeting excludes users whose demographics the platform cannot infer ("unknown users") if advertising platforms do not provide a way to target unknown users directly, as is the case on Google Ads. We collaborate with a state-level government agency to reduce gender-based skew in ad delivery with an intervention that accounts for unknown users while incorporating gender-based targeting. In particular, we design a budget split intervention that directly incorporates unknown users and targets users with Google-inferred gender labels (i.e., male, female). We find that this intervention is a valuable approach to addressing ad delivery skew without excluding unknown users, and serves as a middle ground in the trade-off between higher costs (from more granular demographic targeting) and skew (from ignoring demographics entirely). This approach is responsive to the needs of real-world, resource-constrained advertisers who are committed to the equitable distribution of public service outreach via online advertising. We conclude with recommendations for government advertisers, online advertising platforms, and researchers.

  • Operationalizing Pluralistic Values in Large Language Model Alignment Reveals Trade-offs in Safety, Inclusivity, and Model Behavior

    Proceedings of the AAAI Conference on Artificial Intelligence · 2026-03-14

    article · Open access

    Although large language models (LLMs) are increasingly trained using human feedback for safety and alignment with human values, alignment decisions often overlook human social diversity. This study examines how incorporating pluralistic values affects LLM behavior by systematically evaluating demographic variation and design parameters in the alignment pipeline. We collect alignment data from US and German participants (N = 1,095 participants, 27,375 ratings) who rated LLM responses across five dimensions: Toxicity, Emotional Awareness (EA), Sensitivity, Stereotypical Bias, and Helpfulness. We fine-tuned multiple Large Language Models and Large Reasoning Models using preferences from different social groups while varying rating scales, disagreement handling methods, and optimization techniques. The results revealed systematic demographic effects: male participants rated responses 18% less toxic than female participants; conservative and Black participants rated responses 27.9% and 44% higher on EA than liberal and White participants, respectively. Models fine-tuned on group-specific preferences exhibited distinct behaviors. Technical design choices showed strong effects: the preservation of rater disagreement achieved roughly 53% greater toxicity reduction than majority voting, and 5-point scales yielded about 22% more reduction than binary formats; and Direct Preference Optimization (DPO) consistently outperformed Group Relative Policy Optimization (GRPO) in multi-value optimization. These findings represent a preliminary step in answering a critical question: How should alignment balance expert-driven and user-driven signals to ensure both safety and fair representation?

  • Global energy and fertilizer dependencies via the Strait of Hormuz

    SocArXiv (OSF Preprints) · 2026-04-18

    preprint · Open access · 1st author · Corresponding

    The Strait of Hormuz is a critical maritime chokepoint, with concerns of disruption arising from conflict. We quantify trade dependencies across 172 countries, finding 19.6% [95% Interpercentile Interval: 17.4, 21.8] of global fossil fuel trade transits the strait, and that lower income countries are disproportionately dependent on Hormuz-sourced fertilizer. Countries doubly dependent on Hormuz-sourced energy and fertilizer face pre-existing humanitarian crises, indicating global structural vulnerability.

  • Speech AI for All: The What, How, and Who of Measurement

    2026-04-13

    article

    Optimized for “typical” and fluent speech, today’s speech AI systems perform poorly for people with speech diversities, sometimes to an unusable or even harmful degree. These harms play out in daily life through household voice assistants and workplace meeting services, in higher stakes scenarios like medical transcription, and in emerging applications of AI in augmentative and alternative communication. Standard metrics aiming to quantify these inequities, however, fail to comprehensively understand the impact of speech AI on diverse user groups, and furthermore do not easily generalize to newer speech language and speech generation models. To address these social inequities and measurement limitations, this workshop brings academics, practitioners, and non-profit workers together in proactive dialogue to improve measurement of speech AI performance and user impact. Through a poster session and breakout group discussions, our workshop will extend current understanding on how to best leverage existing metrics, like Word Error Rate, within the HCI design ecosystem, and also explore new innovations in speech AI measurement. Key outcomes of this workshop include: a research agenda for CHI community to guide and contribute to speech AI development, groundwork for new papers on speech AI measurement, and a diversity-centered benchmark suite for external evaluators.

  • LLMs in social services: How does chatbot accuracy affect human accuracy?

    arXiv (Cornell University) · 2026-03-11

    preprint · Open access · Senior author

    Social service programs like the Supplemental Nutrition Assistance Program (SNAP, or food stamps) have eligibility rules that can be challenging to understand. For nonprofit caseworkers who often support clients in navigating a dozen or more complex programs, LLM-based chatbots may offer a means to provide better, faster help to clients whose situations may be less common. In this paper, we measure the potential effects of LLM-based chatbot suggestions on caseworkers' ability to provide accurate guidance. We first created a 770-question multiple-choice benchmark dataset of difficult, but realistic questions that a caseworker might receive. Next, using these benchmark questions and corresponding expert-verified answers, we conducted a randomized experiment with caseworkers recruited from nonprofit outreach organizations in Los Angeles. Caseworkers in the control condition did not see chatbot suggestions and had a mean accuracy of 49%. Caseworkers in the treatment condition saw chatbot suggestions that we artificially varied to range in aggregate accuracy from low (53%) to high (100%). Caseworker performance significantly improves as chatbot quality improves: high-quality chatbots (96-100% accurate) improved caseworker accuracy by 27 percentage points. At the question-level, incorrect chatbot suggestions substantially reduce caseworker accuracy, with a two-thirds reduction on easy questions where the control group performed best (without chatbot suggestions). Finally, improvements in caseworker accuracy level off as chatbot accuracy increases, a phenomenon that we call the "AI underreliance plateau," which is a concern for real-world deployment and highlights the importance of evaluating human-in-the-loop tools with their users.

Frequent coauthors

  • Maximilian F. Konig

    Johns Hopkins University

    28 shared
  • Chetan Bettegowda

    Johns Hopkins Medicine

    27 shared
  • Susan Athey

    15 shared
  • Michael Powell

    United States Military Academy

    15 shared
  • Ruoxuan Xiong

    15 shared
  • Joshua T Vogelstein

    Johns Hopkins University

    14 shared
  • Bert Vogelstein

    Howard Hughes Medical Institute

    14 shared
  • Kenneth W. Kinzler

    Johns Hopkins University

    11 shared

Labs

Awards & honors

  • Cornell CIS DEIB Faculty of the Year Award
  • Sloan Fellow in Computer Science
  • Forbes 30 Under 30 in Science