
Jure Leskovec
Associate Professor of Computer Science · Stanford University · Biomedical Data Science
Active 1977–2026
About
Jure Leskovec is a Professor of Computer Science at Stanford University. His general research area is applied machine learning for large interconnected systems, with a focus on modeling complex, richly-labeled relational structures, graphs, and networks across systems at all scales. These scales range from interactions of proteins within a cell to interactions between humans in society. His research applications include commonsense reasoning, recommender systems, computational social science, and computational biology, with a particular emphasis on drug discovery.
Research topics
- Computer Science
- Artificial Intelligence
- Data Mining
- Machine Learning
- Biology
- Sociology
- Engineering
- Mathematics
- Geography
- Cell biology
- Economics
- Demography
- Data science
- Demographic economics
- Political Science
- Economic growth
- Anatomy
- Socioeconomics
- Psychology
- Econometrics
- Internet privacy
- Law
- Operating system
- Economic geography
Selected publications
TxConformal: Controlling False Discoveries in AI-Driven Therapeutic Discovery
bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-30
Article
Artificial Intelligence (AI) is transforming therapeutic discovery by scoring a large set of promising candidates and prioritizing a shortlist for further investigation. Quantifying the reliability of AI scores and preventing false positives among selected candidates is key to the efficiency of the discovery process. Conformal prediction (CP) has emerged as a popular tool for guiding such prioritization, especially via the conformal selection framework to control false discovery rates (FDR) in selecting top-ranked candidates under distributional shift [1, 2]. However, deploying these advances in real-world therapeutic discovery remains challenging: distribution shifts are difficult to quantify and correct in high-dimensional biomedical data, and practical workflows often require flexible error metrics. Here, we present TxConformal, a general framework for trustworthy decision making when building shortlists using AI scores. TxConformal adjusts for distribution shift by balancing the hidden representations in AI models and then provides confidence measures for true discoveries of target biological properties. These confidence measures, interpretable as p-values, can be used in conjunction with statistical multiple testing procedures to derive selection decisions with limited false positives or to estimate the errors in given selection decisions. TxConformal controls the false positive rate in six real-world tasks spanning various therapeutic discovery stages, modalities, and AI models with realistic data splits. When selecting promising combinatorial genetic perturbations, TxConformal nearly halves false-positive selections compared to baseline methods, substantially reducing unnecessary experimental costs by tens of thousands of dollars. When selecting stable protein structures under mutant shifts, TxConformal identifies about 10 times more proteins than baseline methods at stringent thresholds when running at a target FDR level of 10%, recovering over 90% of valuable candidates that baseline methods miss due to unaccounted distribution shifts. Furthermore, we demonstrate that TxConformal robustly supports various alternative error metrics suitable for resource-constrained settings. Finally, in a prospective fixed-budget virtual screening campaign for novel antibiotic discovery, TxConformal predicted false positives in close agreement with experimental outcomes, with substantial improvements over simple baselines.
Bio-BLIP: A Multimodal Architecture for Transferable Reasoning in Genomic Variant Interpretation
bioRxiv (Cold Spring Harbor Laboratory) · 2026-05-15
Article · Open access · Senior author
Developing scientific hypotheses in biology requires integrating heterogeneous evidence across DNA sequence, gene context, protein function, and prior literature. Existing multimodal AI systems expose biological evidence to reasoning models through textification or by projecting biological embeddings into fine-tuned language models. However, these models are typically highly optimized for the specific set of tasks for which they are fine-tuned. Here we present Bio-BLIP, a multimodal Q-former based architecture which leverages biological embeddings and an LLM to generalize to complex reasoning tasks without task-specific fine-tuning. The key to Bio-BLIP is a new neural network architecture that integrates four data modalities – DNA, genes, proteins, and text – through a master Q-former model, which integrates the modality-specific information into a fixed-length prefix for the LLM backbone. Bio-BLIP is pretrained on the task of human genetic variant annotation and achieves a 29.8% increase in generating accurate variant features over frontier LLMs. We evaluate Bio-BLIP zero-shot on downstream genomic tasks of variant prioritization and target gene prediction. Bio-BLIP outperforms two alignment-free genomic language models on regulatory variant prioritization for Mendelian disease. Across the target gene prediction task, Bio-BLIP improves accuracy over LLMs by leveraging learned genomic variant knowledge in difficult cases. Our model produces rich, transparent reasoning traces. In biological domains characterized by multiple scales of data and varied downstream tasks, Bio-BLIP offers a step toward natively multimodal, generalizable reasoning.
Are Current AI Virtual Cell Models Useful for Scientific Discovery?
bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-25
Article · Senior author
AI models are increasingly developed to predict the effect of perturbations on gene expression, but current benchmarks fail to reliably measure model performance. Here, we argue that new benchmarks that directly measure the value of model predictions for specific scientific discovery outcomes are needed to address this gap. We present PerturbHD, an evaluation framework for AI-enabled hit discovery, to demonstrate the benefits of our proposed approach.
Data for Universal Cell Embeddings: A Foundation Model for Cell Biology
Zenodo (CERN European Organization for Nuclear Research) · 2026-04-01
Dataset · Open access · Senior author
Data for replicating the Universal Cell Embeddings paper code. See preprint here: https://www.biorxiv.org/content/10.1101/2023.11.28.568918v1
Proteo-R1: Reasoning Foundation Models for De Novo Protein Design
arXiv (Cornell University) · 2026-05-01
Preprint · Open access
Deep learning in de novo protein design has achieved atomic-level fidelity. However, existing models remain largely non-deliberative: they directly synthesize molecular geometries without explicitly reasoning about which residues or interactions are functionally essential. As a result, design decisions are entangled with continuous sampling dynamics, limiting interpretability, controllability, and systematic reuse of biochemical knowledge. We introduce Proteo-R1, a reasoning-guided protein design framework that explicitly decouples molecular understanding from geometric generation. Proteo-R1 adopts a dual-expert architecture in which a multimodal large language model (MLLM) serves as an understanding expert, analyzing protein sequences, structures, and textual context to identify key functional residues that govern binding and specificity. These residue-level decisions are then passed as hard constraints to a separate diffusion-based generation expert, which performs conditional co-design while respecting the fixed interaction anchors. This factorization mirrors how human experts approach molecular engineering: first, reasoning about critical interactions, then optimizing geometry subject to those constraints. By operationalizing reasoning as explicit residue-level commitments rather than latent textual guidance, Proteo-R1 achieves stable, interpretable, and modular integration of LLM reasoning with state-of-the-art geometric generative models. Code, data, and demos are available at https://smiles724.github.io/r1/.
LLMs Generate Structurally Realistic Social Networks but Overestimate Political Homophily
Proceedings of the International AAAI Conference on Web and Social Media · 2025-06-07 · 8 citations
Article · Open access · Senior author
Generating social networks is essential for many applications, such as epidemic modeling and social simulations. The emergence of generative AI, especially large language models (LLMs), offers new possibilities for social network generation: LLMs can generate networks without additional training or the need to define network parameters, and users can flexibly define individuals in the network using natural language. However, this potential raises two critical questions: 1) are the social networks generated by LLMs realistic, and 2) what are the risks of bias, given the importance of demographics in forming social ties? To answer these questions, we develop three prompting methods for network generation and compare the generated networks to a suite of real social networks. We find that more realistic networks are generated with “local” methods, where the LLM constructs relations for one persona at a time, compared to “global” methods that construct the entire network at once. We also find that the generated networks match real networks on many characteristics, including density, clustering, connectivity, and degree distribution. However, we find that LLMs emphasize political homophily over all other types of homophily and significantly overestimate political homophily compared to real social networks.
Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards
ArXiv.org · 2025-07-03
Preprint · Open access · Senior author
Compound AI systems integrating multiple components, such as Large Language Models, specialized tools, and traditional machine learning models, are increasingly deployed to solve complex real-world tasks. However, optimizing compound systems remains challenging due to their non-differentiable structures and diverse configuration types across components, including prompts, hyperparameters, and model parameters. To address this challenge, we propose Optimas, a unified framework for effective optimization of compound systems. The core idea of Optimas is to maintain one Local Reward Function (LRF) per component, each satisfying a local-global alignment property, i.e., each component's local reward correlates with the global system performance. In each iteration, Optimas efficiently adapts the LRFs to maintain this property while simultaneously maximizing each component's local reward. This approach enables independent updates of heterogeneous configurations using the designated optimization method, while ensuring that local improvements consistently lead to performance gains. We present extensive evaluations across five real-world compound systems to demonstrate that Optimas outperforms strong baselines by an average improvement of 11.92%, offering a general and effective approach for improving compound systems. Our website is at https://optimas.stanford.edu.
Surface-based Molecular Design with Multi-modal Flow Matching
2025-08-03 · 1 citation
Article · Open access
Therapeutic peptides show promise in targeting previously undruggable binding sites, with recent advancements in deep generative models enabling full-atom peptide co-design for specific protein receptors. However, the critical role of molecular surfaces in protein-protein interactions (PPIs) has been underexplored. To bridge this gap, we propose an omni-design peptide generation paradigm, called SurfFlow, a novel surface-based generative algorithm that enables comprehensive co-design of sequence, structure, and surface for peptides. SurfFlow employs a multi-modality conditional flow matching (CFM) architecture to learn distributions of surface geometries and biochemical properties, enhancing peptide binding accuracy. Evaluated on the comprehensive PepMerge benchmark, SurfFlow consistently outperforms full-atom baselines across all metrics. These results highlight the advantages of considering molecular surfaces in de novo peptide discovery and demonstrate the potential of integrating multiple protein modalities for more effective therapeutic peptide discovery.
Recent grants
Expeditions: Collaborative Research: Global Pervasive Computational Epidemiology
NSF · $1.4M · 2020–2026
NSF · $540k · 2018–2024
CAREER: Mining structure and dynamics of groups of nodes in real-world networks
NSF · $541k · 2012–2017
III: Small: Collaborative Research: Mining Information Propagation on the Web
NSF · $419k · 2010–2015
RAPID: Collaborative Research: Computational Drug Repurposing for COVID-19
NSF · $100k · 2020–2021
Frequent coauthors
- 64 shared
Marinka Žitnik
- 55 shared
Jon Kleinberg
Cornell University
- 40 shared
Rex Ying
Yale University
- 38 shared
Jiaxuan You
- 37 shared
Rok Sosič
- 31 shared
Maria Brbić
- 29 shared
Tim Althoff
- 28 shared
David Hallac
Education
- 2009
Postdoc, Computer Science Department
Cornell University
- 2008
PhD, Machine Learning Department
Carnegie Mellon University