
Umang Bhatt
Assistant Professor and Fellow in Computer Science · Verified · New York University · Center for Data Science
Active 2009–2026
About
Umang Bhatt is an Assistant Professor and Fellow in the Department of Computer Science at the University of Cambridge, King's College, and a fellow at the NYU Center for Data Science. His research in data science and artificial intelligence centers on human-AI interaction, with recent publications on overreliance on large language models, personalized decision support, interactive evaluation of AI systems, and AI certification. His work appears in peer-reviewed venues and involves active collaborations across disciplines.
Research topics
- Computer Security
- Computer Science
- Artificial Intelligence
- Data science
- Risk analysis (engineering)
- Simulation
Selected publications
Revisiting Rogers' Paradox in the Context of Human-AI Interaction
Zenodo (CERN European Organization for Nuclear Research) · 2026-03-08
other · Open access
ai-rogers-paradox-main.zip from Revisiting Rogers' Paradox in the Context of Human-AI Interaction
Figshare · 2026-04-15
article · Open access
LaTeX files and figures.
Revisiting Rogers’ Paradox in the context of human–AI interaction
Philosophical Transactions of the Royal Society A Mathematical Physical and Engineering Sciences · 2026-05-14
preprint · Open access
People learn about the world in many ways: from conducting experiments to copying others' behaviour. The choices we make about how to learn can impact the collective understanding of a whole population, were others to learn from us. Alan Rogers developed simulations to study these phenomena, where agents could individually or socially learn amidst a dynamic, uncertain world, and uncovered a surprising result: the availability of cheap social learning yielded no benefit to population fitness over individual learning. Rogers' Paradox spawned decades of work to understand factors that favour social learning and better model human cultural development. But what happens when humans can learn from artificial intelligence (AI) systems that are themselves learning from us? We revisit Rogers' Paradox in the context of human-AI interaction and extend the simulations towards a simplified network of humans and AIs learning together about an uncertain world. We examine the impact of several learning strategies on the equilibrium of a society's 'collective world model' and assess levers available to stakeholders in human-AI interactions to change network dynamics. We then model negative feedback loops that may arise from humans learning socially from AI, and consider other open questions that could be explored in our simulation framework. This article is part of the theme issue 'World models in natural and artificial intelligence'.
Measuring and mitigating overreliance to build human-compatible AI
ArXiv.org · 2025-09-08 · 1 citation
preprint · Open access
Large language models (LLMs) distinguish themselves from previous technologies by functioning as collaborative "thought partners," capable of engaging more fluidly in natural language on a range of tasks. As LLMs increasingly influence consequential decisions across diverse domains from healthcare to personal advice, the risk of overreliance (relying on LLMs beyond their capabilities) grows. This paper argues that measuring and mitigating overreliance must become central to LLM research and deployment. First, we consolidate risks from overreliance at both the individual and societal levels, including high-stakes errors, governance challenges, and cognitive deskilling. Then, we explore LLM characteristics, system design features, and user cognitive biases that together raise serious and unique concerns about overreliance on LLMs in practice. We also examine historical approaches for measuring overreliance, identifying three important gaps and proposing three promising directions to improve measurement. Finally, we propose mitigation strategies that can be pursued to ensure LLMs augment rather than undermine human capabilities.
Faster, Cheaper, Better: Multi-Objective Hyperparameter Optimization for LLM and RAG Systems
ArXiv.org · 2025-02-25
preprint · Open access · Senior author
While Retrieval Augmented Generation (RAG) has emerged as a popular technique for improving Large Language Model (LLM) systems, it introduces a large number of choices, parameters and hyperparameters that must be made or tuned. This includes the LLM, embedding, and ranker models themselves, as well as hyperparameters governing individual RAG components. Yet, collectively optimizing the entire configuration in a RAG or LLM system remains under-explored, especially in multi-objective settings, due to intractably large solution spaces, noisy objective evaluations, and the high cost of evaluations. In this work, we introduce the first approach for multi-objective parameter optimization of cost, latency, safety and alignment over entire LLM and RAG systems. We find that Bayesian optimization methods significantly outperform baseline approaches, obtaining a superior Pareto front on two new RAG benchmark tasks. We conclude our work with important considerations for practitioners who are designing multi-objective RAG systems, highlighting nuances such as how optimal configurations may not generalize across tasks and objectives.
Context-specific certification of AI systems: a pilot in the financial industry
AI and Ethics · 2025-04-11 · 4 citations
article · Open access · Senior author
The rapid proliferation of artificial intelligence (AI) systems across diverse sectors underscores the fundamental need for regulatory frameworks that address ethical, legal, and social implications of their deployment. This article examines the inherent challenges AI poses to traditional regulatory approaches, particularly concerning key pillars of responsible AI (RAI): adherence to human rights, fairness, non-discrimination, explainability, and accountability. Recognizing the lag between technological advancement and regulatory development, we propose a third-party, system-level AI certification framework as an interim solution. This framework is designed to bridge the current regulatory gap and complement future legislation. Our work provides a comprehensive analysis of certification processes, detailing key actors and mechanisms involved in AI system auditing. Through a detailed case study of a pilot certification program in the financial industry, we offer insights into the practical implementation, challenges, and potential of such a framework. This research begins to establish a recognized and actionable AI certification system, aimed at guiding AI development towards alignment with global standards. By offering a path towards responsible AI implementation, this work addresses the urgent need for governance mechanisms that keep pace with rapid technological advancement and ensure the responsible development and deployment of AI systems.
Towards Interactive Evaluations for Interaction Harms in Human-AI Systems
Proceedings of the AAAI/ACM Conference on AI Ethics and Society · 2025-10-15 · 3 citations
article · Open access
Current AI evaluation methods, which rely on static, model-only tests, fail to account for harms that emerge through sustained human-AI interaction. As AI systems proliferate and are increasingly integrated into real-world applications, this disconnect between evaluation approaches and actual usage becomes more significant. In this paper, we propose a shift towards evaluation based on interactional ethics, which focuses on interaction harms: issues like inappropriate parasocial relationships, social manipulation, and cognitive overreliance that develop over time through repeated interaction, rather than through isolated outputs. First, we discuss the limitations of current evaluation methods, which (1) are static, (2) assume a universal user experience, and (3) have limited construct validity. Drawing on research from human-computer interaction, natural language processing, and the social sciences, we present practical principles for designing interactive evaluations. These include ecologically valid interaction scenarios, human impact metrics, and diverse human participation approaches. Finally, we explore implementation challenges and open research questions for researchers, practitioners, and regulators aiming to integrate interactive evaluations into AI governance frameworks. This work lays the groundwork for developing more effective evaluation methods that better capture the complex dynamics between humans and AI systems.
Learning Personalized Decision Support Policies
Proceedings of the AAAI Conference on Artificial Intelligence · 2025-04-11 · 1 citation
article · Open access · 1st author · Corresponding
Individual human decision-makers may benefit from different forms of support to improve decision outcomes, but when will each form of support yield better outcomes? In this work, we posit that personalizing access to decision support tools can be an effective mechanism for instantiating the appropriate use of AI assistance. Specifically, we propose the general problem of learning a decision support policy that, for a given input, chooses which form of support to provide to decision-makers for whom we initially have no prior information. We develop Modiste, an interactive tool to learn personalized decision support policies. Modiste leverages stochastic contextual bandit techniques to personalize a decision support policy for each decision-maker. In our computational experiments, we characterize the expertise profiles of decision-makers for whom personalized policies will outperform offline policies, including population-wide baselines. Our experiments include realistic forms of support (e.g., expert consensus and predictions from a large language model) on vision and language tasks. Our human subject experiments add nuance to and bolster our computational experiments, demonstrating the practical utility of personalized policies when real users benefit from accessing support across tasks.
Frequent coauthors
- Adrian Weller · 124 shared
- Katherine M. Collins · 25 shared
- José M. F. Moura · 16 shared
- Mateja Jamnik · 13 shared
- Alice Xiang · 10 shared
- W. T. Gowers (Collège de France) · 9 shared
- Valerie Chen (Carnegie Mellon University) · 8 shared
- Matthew L Barker · 8 shared
Awards & honors
- Assistant Professor and Fellow in Computer Science, Universi…