Kilian Weinberger

Professor of Computer Science

Cornell University · Computer Science

Active 1994–2025

h-index 84

Citations 77.9k

Papers 290 · 99 last 5y

Funding $2.7M

OpenAlex

About

Kilian Weinberger is a professor in the Department of Computer Science at Cornell University and a field member in statistics. He received his Ph.D. from the University of Pennsylvania in machine learning under the supervision of Lawrence Saul, and his undergraduate degree in mathematics and computing from the University of Oxford. His research focuses on machine learning and its applications, including learning under resource constraints, metric learning, AI in science, computer vision, autonomous vehicles, Gaussian processes, and deep learning. Weinberger has worked as an associate professor at Washington University in St. Louis and as a research scientist at Yahoo! Research in Santa Clara. He has received numerous awards, including the Outstanding AAAI Senior Program Chair Award in 2011, an NSF CAREER award in 2012, the Daniel M. Lazar '29 Excellence in Teaching Award in 2016, and the Ann S. Bowers Teaching and Advising Excellence Award in 2024. He is an ACM and AAAI fellow, a 2021 Blavatnik National Awards Finalist, and a member of the Sloan Research Fellowships Selection Committee since 2024.

Research topics

Artificial Intelligence
Computer Science
Machine Learning
Natural Language Processing
Mathematics
Geography
Remote sensing
Biology
Computational biology
Computer vision
Optics
Statistics
Genetics
Physics
Speech recognition

Selected publications

SpeechOp: Inference-Time Task Composition for Generative Speech Processing
ArXiv.org · 2025-09-17
preprint · Open access
While generative Text-to-Speech (TTS) systems leverage vast "in-the-wild" data to achieve remarkable success, speech-to-speech processing tasks like enhancement face data limitations, which lead data-hungry generative approaches to distort speech content and speaker identity. To bridge this gap, we present SpeechOp, a multi-task latent diffusion model that transforms pre-trained TTS models into a universal speech processor capable of performing a wide range of speech tasks and composing them in novel ways at inference time. By adapting a pre-trained TTS model, SpeechOp inherits a rich understanding of natural speech, accelerating training and improving S2S task quality, while simultaneously enhancing core TTS performance. Finally, we introduce Implicit Task Composition (ITC), a novel pipeline where ASR-derived transcripts (e.g., from Whisper) guide SpeechOp's enhancement via our principled inference-time task composition. ITC achieves state-of-the-art content preservation by robustly combining web-scale speech understanding with SpeechOp's generative capabilities. Audio samples are available at https://justinlovelace.github.io/projects/speechop
Towards high-accuracy bacterial taxonomy identification using phenotypic single-cell Raman spectroscopy data
ISME Communications · 2025-01-01 · 6 citations
article · Open access
Single-cell Raman Spectroscopy (SCRS) emerges as a promising tool for single-cell phenotyping in environmental ecological studies, offering non-intrusive, high-resolution, and high-throughput capabilities. In this study, we obtained a large and the first comprehensive SCRS dataset that captured phenotypic variations with cell growth status for 36 microbial strains, and we compared and optimized analysis techniques and classifiers for SCRS-based taxonomy identification. First, we benchmarked five dimensionality reduction (DR) methods, 10 classifiers, and the impact of cell growth variances using a SCRS dataset with both taxonomy and cellular growth stage labels. Unsupervised DR methods and non-neural network classifiers are recommended for at a balance between accuracy and time efficiency, achieved up to 96.1% taxonomy classification accuracy. Second, accuracy variances caused by cellular growth variance (<2.9% difference) was found less than the influence from model selection (up to 41.4% difference). Remarkably, simultaneous high accuracy in growth stage classification (93.3%) and taxonomy classification (94%) were achievable using an innovative two-step classifier model. Third, this study is the first to successfully apply models trained on pure culture SCRS data to achieve taxonomic identification of microbes in environmental samples at an accuracy of 79%, and with validation via Raman-FISH (fluorescence in situ hybridization). This study paves the groundwork for standardizing SCRS-based biotechnologies in single-cell phenotyping and taxonomic classification beyond laboratory pure culture to real environmental microorganisms and promises advances in SCRS applications for elucidating organismal functions, ecological adaptability, and environmental interactions.
INPROVF: Leveraging Large Language Models to Repair High-level Robot Controllers from Assumption Violations
2025-08-17
article
This paper presents INPROVF, an automatic framework that combines large language models (LLMs) and formal methods to speed up the repair process of high-level robot controllers. Previous approaches based solely on formal methods are computationally expensive and cannot scale to large state spaces. In contrast, INPROVF uses LLMs to generate repair candidates, and formal methods to verify their correctness. To improve the quality of these candidates, our framework first translates the symbolic representations of the environment and controllers into natural language descriptions. If a candidate fails the verification, INPROVF provides feedback on potential unsafe behaviors or unsatisfied tasks, and iteratively prompts LLMs to generate improved solutions. We demonstrate the effectiveness of INPROVF through 12 violations with various workspaces, tasks, and state space sizes.
Mixed Signals: A Diverse Point Cloud Dataset for Heterogeneous LiDAR V2X Collaboration
2025-10-19
preprint · Open access
Vehicle-to-everything (V2X) collaborative perception has emerged as a promising solution to address the limitations of single-vehicle perception systems. However, existing V2X datasets are limited in scope, diversity, and quality. To address these gaps, we present Mixed Signals, a comprehensive V2X dataset featuring 45.1k point clouds and 240.6k bounding boxes collected from three connected autonomous vehicles (CAVs) equipped with two different configurations of LiDAR sensors, plus a roadside unit with dual LiDARs. Our dataset provides point clouds and bounding box annotations across 10 classes, ensuring reliable data for perception training. We provide detailed statistical analysis on the quality of our dataset and extensively benchmark existing V2X methods on it. The Mixed Signals dataset is ready-to-use, with precise alignment and consistent annotations across time and viewpoints. Dataset website is available at https://mixedsignalsdataset.cs.cornell.edu/.
Improving Multislice Electron Ptychography with a Generative Prior
ArXiv.org · 2025-07-23
preprint · Open access
Multislice electron ptychography (MEP) is an inverse imaging technique that computationally reconstructs the highest-resolution images of atomic crystal structures from diffraction patterns. Available algorithms often solve this inverse problem iteratively but are both time consuming and produce suboptimal solutions due to their ill-posed nature. We develop MEP-Diffusion, a diffusion model trained on a large database of crystal structures specifically for MEP to augment existing iterative solvers. MEP-Diffusion is easily integrated as a generative prior into existing reconstruction methods via Diffusion Posterior Sampling (DPS). We find that this hybrid approach greatly enhances the quality of the reconstructed 3D volumes, achieving a 90.50% improvement in SSIM over existing methods.
Transfer Your Perspective: Controllable 3D Generation from Any Viewpoint in a Driving Scene
ArXiv.org · 2025-02-10
preprint · Open access
Self-driving cars relying solely on ego-centric perception face limitations in sensing, often failing to detect occluded, faraway objects. Collaborative autonomous driving (CAV) seems like a promising direction, but collecting data for development is non-trivial. It requires placing multiple sensor-equipped agents in a real-world driving scene, simultaneously! As such, existing datasets are limited in locations and agents. We introduce a novel surrogate to the rescue, which is to generate realistic perception from different viewpoints in a driving scene, conditioned on a real-world sample - the ego-car's sensory data. This surrogate has huge potential: it could potentially turn any ego-car dataset into a collaborative driving one to scale up the development of CAV. We present the very first solution, using a combination of simulated collaborative data and real ego-car data. Our method, Transfer Your Perspective (TYP), learns a conditioned diffusion model whose output samples are not only realistic but also consistent in both semantics and layouts with the given ego-car data. Empirical results demonstrate TYP's effectiveness in aiding in a CAV setting. In particular, TYP enables us to (pre-)train collaborative perception algorithms like early and late fusion with little or no real-world collaborative data, greatly facilitating downstream CAV applications.
3D Ptychographic Inverse Imaging with Generative Diffusion Models
Microscopy and Microanalysis · 2025-07-01
article · Open access
PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation
ArXiv.org · 2025-02-27
preprint · Open access · Senior author
High-quality benchmarks are essential for evaluating reasoning and retrieval capabilities of large language models (LLMs). However, curating datasets for this purpose is not a permanent solution as they are prone to data leakage and inflated performance results. To address these challenges, we propose PhantomWiki: a pipeline to generate unique, factually consistent document corpora with diverse question-answer pairs. Unlike prior work, PhantomWiki is neither a fixed dataset, nor is it based on any existing data. Instead, a new PhantomWiki instance is generated on demand for each evaluation. We vary the question difficulty and corpus size to disentangle reasoning and retrieval capabilities respectively, and find that PhantomWiki datasets are surprisingly challenging for frontier LLMs. Thus, we contribute a scalable and data leakage-resistant framework for disentangled evaluation of reasoning, retrieval, and tool-use abilities. Our code is available at https://github.com/kilian-group/phantom-wiki.
Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond
ArXiv.org · 2025-02-26
preprint · Open access · Senior author
Large language models (LLMs) should undergo rigorous audits to identify potential risks, such as copyright and privacy infringements. Once these risks emerge, timely updates are crucial to remove undesirable responses, ensuring legal and safe model usage. It has spurred recent research into LLM unlearning, focusing on erasing targeted undesirable knowledge without compromising the integrity of other, non-targeted responses. Existing studies have introduced various unlearning objectives to pursue LLM unlearning without necessitating complete retraining. However, each of these objectives has unique properties, and no unified framework is currently available to comprehend them thoroughly. To fill the gap, we propose a toolkit of the gradient effect (G-effect), quantifying the impacts of unlearning objectives on model performance from a gradient perspective. A notable advantage is its broad ability to detail the unlearning impacts from various aspects across instances, updating steps, and LLM layers. Accordingly, the G-effect offers new insights into identifying drawbacks of existing unlearning objectives, further motivating us to explore a series of new solutions for their mitigation and improvements. Finally, we outline promising directions that merit further studies, aiming at contributing to the community to advance this important field.
Improving Multislice Electron Ptychography with a Generative Prior
2025-10-19 · 1 citation
article
Multislice electron ptychography (MEP) is an inverse imaging technique that computationally reconstructs the highest-resolution images of atomic crystal structures from diffraction patterns. Available algorithms often solve this inverse problem iteratively but are both time consuming and produce suboptimal solutions due to their ill-posed nature. We develop MEP-DIFFUSION, a diffusion model trained on a large database of crystal structures specifically for MEP to augment existing iterative solvers. MEP-DIFFUSION is easily integrated as a generative prior into existing reconstruction methods via Diffusion Posterior Sampling (DPS). We find that this hybrid approach greatly enhances the quality of the reconstructed 3D volumes, achieving a 90.50% improvement in SSIM over existing methods.

Recent grants

RI: AF: Small: Collaborative Research: Differentially Private Learning: From Theory to Applications
NSF · $250k · 2016–2021
III: Small: Collaborative Research: Towards Interpretable Machine Learning
NSF · $250k · 2015–2021
CAREER: New Directions for Metric Learning
NSF · $478k · 2012–2015
TRIPODS: Data Science for Improved Decision-Making: Learning in the Context of Uncertainty, Causality, Privacy, and Network Structures
NSF · $1.5M · 2017–2023
CAREER: New Directions for Metric Learning
NSF · $211k · 2015–2018

Frequent coauthors

Bharath Hariharan
37 shared
Mark Campbell
36 shared
Jacob R. Gardner
33 shared
Wei‐Lun Chao
33 shared
Geoff Pleiss
32 shared
Yurong You
31 shared
Gao Huang
26 shared
Felix Wu
Columbia University
24 shared

Labs

Kilian Weinberger Lab · PI

Education

Ph.D., machine learning
University of Pennsylvania
B.A., mathematics and computing
University of Oxford

Awards & honors

Outstanding AAAI Senior Program Chair Award (2011)
NSF CAREER award (2012)
Daniel M. Lazar '29 Excellence in Teaching Award (2016)
Ann S. Bowers Teaching and Advising Excellence Award (2024)
ACM Fellow
