
Talia Gillis
Associate Professor of Law · Columbia University · Columbia Law School
Active 2015–2025
About
Talia Gillis is the Daniel G. Ross Professor of Law at Columbia Law School, where she joined the faculty in 2020. She holds an S.J.D. from Harvard Law School, a Ph.D. in Business Economics from Harvard University, an LL.B. from Hebrew University, and a B.C.L. from Oxford University. Her research focuses on the law and economics of consumer markets, with particular interest in household financial behavior and in how technological and legal changes influence consumer welfare. Gillis studies regulatory tools such as financial disclosures and fiduciary duties, and empirically examines how households manage their financial flows and mental accounting. Her recent work explores the impact of artificial intelligence and consumer fintech on consumers, with attention to the distributional concerns they raise. She has received multiple awards and grants, including the 2022 Junior Faculty Grant and the Richard Paul Richman Center Grant, and has been recognized for her scholarly contributions through awards such as the 2022 AALS Scholarly Papers Competition.
Research topics
- Computer Science
- Political Science
- Machine Learning
- Sociology
- Economics
- Business
- Artificial Intelligence
- Law and economics
- Engineering
- Psychology
- Law
- Software engineering
- Financial economics
- Actuarial science
- Operations research
- Risk analysis (engineering)
Selected publications
Bridging Prediction and Intervention Problems in Social Systems
ArXiv.org · 2025-07-07
preprint · Open access
Many automated decision systems (ADS) are designed to solve prediction problems -- where the goal is to learn patterns from a sample of the population and apply them to individuals from the same population. In reality, these prediction systems operationalize holistic policy interventions in deployment. Once deployed, ADS can shape impacted population outcomes through an effective policy change in how decision-makers operate, while also being defined by past and present interactions between stakeholders and the limitations of existing organizational, as well as societal, infrastructure and context. In this work, we consider the ways in which we must shift from a prediction-focused paradigm to an intervention-oriented paradigm when considering the impact of ADS within social systems. We argue this requires a new default problem setup for ADS beyond prediction, to instead consider predictions as decision support, final decisions, and outcomes. We highlight how this perspective unifies modern statistical frameworks and other tools to study the design, implementation, and evaluation of ADS systems, and point to the research directions necessary to operationalize this paradigm shift. Using these tools, we characterize the limitations of focusing on isolated prediction tasks, and lay the foundation for a more intervention-oriented approach to developing and deploying ADS.
Towards Effective Discrimination Testing for Generative AI
2025-06-23
article
Generative AI (GenAI) models present new challenges in regulating against discriminatory behavior. In this paper, we argue that GenAI fairness research still has not met these challenges; instead, a significant gap remains between existing bias assessment methods and regulatory goals. This leads to ineffective regulation that can allow deployment of reportedly fair, yet actually discriminatory, GenAI systems. Towards remedying this problem, we connect the legal and technical literature around GenAI bias evaluation and identify areas of misalignment. Through four case studies, we demonstrate how this misalignment between fairness testing techniques and regulatory goals can result in discriminatory outcomes in real-world deployments, especially in adaptive or complex environments. We offer practical recommendations for improving discrimination testing to better align with regulatory goals and enhance the reliability of fairness assessments in future deployments.
A Longitudinal Measurement of Privacy Policy Evolution for Large Language Models
ArXiv.org · 2025-11-24
preprint · Open access
Large language model (LLM) services have been rapidly integrated into people's daily lives as chatbots and agentic systems. They are nourished by collecting rich streams of data, raising privacy concerns around excessive collection of sensitive personal information. Privacy policies are the fundamental mechanism for informing users about data practices in the modern information privacy paradigm. Although traditional web and mobile policies are well studied, the privacy policies of LLM providers, their LLM-specific content, and their evolution over time remain largely underexplored. In this paper, we present the first longitudinal empirical study of privacy policies for mainstream LLM providers worldwide. We curate a chronological dataset of 74 historical privacy policies and 115 supplemental privacy documents from 11 LLM providers across 5 countries up to August 2025, and extract over 3,000 sentence-level edits between consecutive policy versions. We compare LLM privacy policies to those of other software formats, propose a taxonomy tailored to LLM privacy policies, annotate policy edits, and align them with a timeline of key LLM ecosystem events. Results show that LLM privacy policies are substantially longer, demand college-level reading ability, and remain highly vague. Our taxonomy analysis reveals patterns in how providers disclose LLM-specific practices and highlights regional disparities in coverage. Policy edits are concentrated in first-party data collection and international/specific-audience sections, and product releases and regulatory actions are their primary drivers, shedding light on the status quo and the evolution of LLM privacy policies.
2024-02-26 · 2 citations
article · 1st author · Corresponding
This paper examines an approach to algorithmic discrimination that seeks to blind predictions to protected characteristics by orthogonalizing inputs. The approach uses protected characteristics (such as race or sex) during the training phase of a model but masks these during deployment. The approach posits that including these characteristics in training prevents correlated features from acting as proxies, while assigning uniform values to them at deployment ensures decisions do not vary by group status.
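As a rough illustration of the train-with, mask-at-deployment idea described in this abstract, the sketch below fits an ordinary least squares model on synthetic data that includes a protected column, then overwrites that column with a single uniform value before scoring. The data, the column layout, and the helper names `fit_linear` and `predict_masked` are all hypothetical and are not taken from the paper; a linear model is assumed only to keep the example short.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict_masked(coef, X, protected_col, fill_value):
    """Score rows after overwriting the protected column with one uniform value."""
    X_masked = np.array(X, dtype=float, copy=True)
    X_masked[:, protected_col] = fill_value
    X1 = np.column_stack([np.ones(len(X_masked)), X_masked])
    return X1 @ coef

# Synthetic training data (assumption): column 0 is the protected characteristic,
# column 1 is a feature correlated with it, column 2 is an unrelated feature.
rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, size=n).astype(float)
proxy = group + rng.normal(0.0, 0.5, size=n)
other = rng.normal(0.0, 1.0, size=n)
y = 2.0 * group + 0.5 * proxy + 1.0 * other + rng.normal(0.0, 0.1, size=n)

# Training sees the protected column, so the proxy coefficient is fit alongside it.
X_train = np.column_stack([group, proxy, other])
coef = fit_linear(X_train, y)

# Deployment: two applicants who differ only in the protected column.
applicants = np.array([[0.0, 1.0, 0.3],
                       [1.0, 1.0, 0.3]])
scores = predict_masked(coef, applicants, protected_col=0, fill_value=group.mean())
print(scores)
```

Because both applicants receive the same fill value, their scores coincide whenever their remaining features do. The choice of the training mean as the fill value is an assumption made here for illustration.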
Operationalizing the Search for Less Discriminatory Alternatives in Fair Lending
2024-06-03 · 30 citations
article · Open access · 1st author · Corresponding
The Less Discriminatory Alternative is a key provision of the disparate impact doctrine in the United States. In fair lending, this provision mandates that lenders must adopt models that reduce discrimination when they do not compromise their business interests. In this paper, we develop practical methods to audit for less discriminatory alternatives. Our approach is designed to verify the existence of less discriminatory machine learning models – by returning an alternative model that can reduce discrimination without compromising performance (discovery) or by certifying that an alternative model does not exist (refutation). We develop a method to fit the least discriminatory linear classification model in a specific lending task – by minimizing an exact measure of disparity (e.g., the maximum gap in group FNR) and enforcing hard performance constraints for business necessity (e.g., on FNR and FPR). We apply our method to study the prevalence of less discriminatory alternatives on real-world datasets from consumer finance applications. Our results highlight how models may inadvertently lead to unnecessary discrimination across common deployment regimes, and demonstrate how our approach can support lenders, regulators, and plaintiffs by reliably detecting less discriminatory alternatives in such instances.
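The disparity measure and the performance constraints named in this abstract can be made concrete with a small check. The sketch below computes the maximum gap in group false negative rate (FNR) and tests whether a candidate set of predictions narrows that gap without worsening overall FNR or FPR relative to a baseline. It only compares two given sets of predictions; it does not implement the paper's model-fitting or certification procedure. The toy labels and the function names are hypothetical.

```python
import numpy as np

def fnr(y_true, y_pred):
    """False negative rate: share of actual positives predicted negative."""
    positives = y_true == 1
    return float(np.mean(y_pred[positives] == 0))

def fpr(y_true, y_pred):
    """False positive rate: share of actual negatives predicted positive."""
    negatives = y_true == 0
    return float(np.mean(y_pred[negatives] == 1))

def max_group_fnr_gap(y_true, y_pred, groups):
    """Largest difference in FNR between any two groups."""
    rates = [fnr(y_true[groups == g], y_pred[groups == g]) for g in np.unique(groups)]
    return max(rates) - min(rates)

def is_less_discriminatory_alternative(y_true, groups, baseline, candidate):
    """True if the candidate has a smaller FNR gap and no worse overall FNR and FPR."""
    smaller_gap = (max_group_fnr_gap(y_true, candidate, groups)
                   < max_group_fnr_gap(y_true, baseline, groups))
    fnr_ok = fnr(y_true, candidate) <= fnr(y_true, baseline)
    fpr_ok = fpr(y_true, candidate) <= fpr(y_true, baseline)
    return smaller_gap and fnr_ok and fpr_ok

# Toy data (assumption): two groups of six applicants, four positives in each.
y_true = np.array([1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0])
groups = np.array(["a"] * 6 + ["b"] * 6)
baseline = np.array([1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0])   # both misses fall on group b
candidate = np.array([1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0])  # one miss in each group

print(max_group_fnr_gap(y_true, baseline, groups))
print(max_group_fnr_gap(y_true, candidate, groups))
print(is_less_discriminatory_alternative(y_true, groups, baseline, candidate))
```

Both sets of predictions miss two of the eight positives overall, so the candidate meets the same aggregate constraints while spreading the errors evenly across groups.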
Jerusalem Review of Legal Studies · 2024-06-01 · 2 citations
article · 1st author · Corresponding
“Personalized Law” is a remarkable book in its scope and creativity, inviting readers to imagine a radically different world of customized legal rules while challenging our assumption that current legal rules are depersonalized. Whether taken as a practical guide for developing more effective and equitable legal rules or as a thought experiment questioning our current notions of legal commands, it provides insights into the relationship between legal design and the policies underlying those laws. In this Response, I address one type of first-stage prediction imperfection, the instability of intrapersonal predictions across model iterations, and discuss its implications for personalized law. While some prediction error is a necessary property of classification and prediction methods, I argue that this error, as it pertains to an individual’s prediction, may not be stable over iterations of the prediction model. As I will demonstrate in a particular setting below, small changes to the training set used to predict a borrower’s credit risk can produce different risk scores despite the stability of the overall model accuracy measure. If the prediction and classification functions we use to produce individual scores are unstable, this means that legal rules at the second stage, when tailored to reflect an individual’s score or classification, will also be unstable. Decisions made by model designers can produce varying legal rules for individuals even at the initial stage of model development; however, my focus is on the instability of predictions over time.
SSRN Electronic Journal · 2024-01-01
article · Open access
2024-06-03 · 9 citations
article · Open access
Recent regulatory efforts, including Executive Order 14110 and the AI Bill of Rights, have focused on mitigating discrimination in AI systems through novel and traditional application of anti-discrimination laws. While these initiatives rightly emphasize fairness testing and mitigation, we argue that they pay insufficient attention to robust bias measurement and mitigation, and that without doing so, the frameworks cannot effectively achieve the goal of reducing discrimination in deployed AI models. This oversight is particularly concerning given the instability and brittleness of current algorithmic bias mitigation and fairness optimization methods, as highlighted by growing evidence in the algorithmic fairness literature. This instability heightens the risk of what we term discrimination-hacking or d-hacking, a scenario where, inadvertently or deliberately, the selection of models based on favorable fairness metrics within specific samples could lead to misleading or non-generalizable fairness performance. We term this effect d-hacking because systematically selecting among numerous models to find the least discriminatory one parallels the concept of p-hacking in social science research of selectively reporting outcomes that appear statistically significant resulting in misleading conclusions. In light of these challenges, we argue that AI fairness regulation should not only call for fairness measurement and bias mitigation, but also specify methods to ensure robust solutions to discrimination in AI systems. Towards the goal of arguing for robust fairness assessment and bias mitigation in AI regulation, this paper (1) synthesizes evidence of d-hacking in the computer science literature and provides experimental demonstrations of d-hacking, (2) analyzes current legal frameworks to understand the treatment of robust fairness and non-discriminatory behavior, both in recent AI regulation proposals and traditional U.S. discrimination law, and (3) outlines policy recommendations for preventing d-hacking in high-stakes domains.
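The selection effect this abstract calls d-hacking can be illustrated with a toy simulation, which is not a reproduction of the paper's experiments. Every candidate "model" below approves applicants at random with the same probability in both groups, so none is truly fairer than another. Picking the candidate with the smallest measured gap on one sample still yields a flattering number that need not carry over to fresh data. The gap metric (difference in approval rates), sample sizes, and names are assumptions chosen for brevity.

```python
import numpy as np

def approval_rate_gap(approved, groups):
    """Absolute difference in approval rate between group 0 and group 1."""
    return abs(approved[groups == 0].mean() - approved[groups == 1].mean())

rng = np.random.default_rng(0)
n_models, n = 200, 300

# Group membership in the sample used for model selection and in a fresh sample.
groups_selection = rng.integers(0, 2, size=n)
groups_fresh = rng.integers(0, 2, size=n)

selection_gaps, fresh_gaps = [], []
for _ in range(n_models):
    # Each candidate approves at random with probability 0.5, independent of group.
    selection_gaps.append(approval_rate_gap(rng.random(n) < 0.5, groups_selection))
    fresh_gaps.append(approval_rate_gap(rng.random(n) < 0.5, groups_fresh))

# "D-hack": report the candidate that looks least discriminatory in-sample.
best = int(np.argmin(selection_gaps))
print("reported gap of selected model:", selection_gaps[best])
print("same model's gap on fresh data:", fresh_gaps[best])
print("average gap across all candidates:", float(np.mean(selection_gaps)))
```

The reported gap is by construction the minimum over all candidates, which is why it understates what a typical candidate shows; comparing it with the fresh-sample figure is one simple way to see whether the selection generalizes.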
Towards Effective Discrimination Testing for Generative AI
arXiv (Cornell University) · 2024-12-30
preprint · Open access
Incomplete Contracts and Future Data Usage
SSRN Electronic Journal · 2023-01-01
article · Open access
Frequent coauthors
- Jann Spiess · 5 shared
- Bryce McLaughlin · 3 shared
- Eric L. Talley · European Corporate Governance Institute · 2 shared
- Jens Frankenreiter · Washington University in St. Louis · 2 shared
- Berk Ustun · 1 shared
- Vitaly Meursault · Federal Reserve Bank of Philadelphia · 1 shared
- Howell E. Jackson · 1 shared
- Zara Yasmine Hall · Columbia University · 1 shared
Awards & honors
- The Input Fallacy (2022 AALS Scholarly Papers Competition)
- Junior Faculty Grant (2022)
- Richard Paul Richman Center for Business, Law, and Public Po…
- Empirical Law and Finance Fellow (Harvard Law School) (2018–…
- Harvard Law School Summer Academic Fellow (2019, 2013)