
Jason Buenrostro
Assistant Professor, Department of Stem Cell and Regenerative Biology (SCRB) · Harvard University · Molecular and Cellular Biology
Active 2010–2026
About
Jason Buenrostro, PhD, is a professor at Harvard University and is affiliated with the Broad Institute, focusing on epigenomics. His research involves studying the epigenetic mechanisms that regulate gene expression and contribute to cellular identity and function. As a leader in the field, he has contributed to advancing our understanding of chromatin accessibility and its role in development and disease. His work is characterized by developing innovative methods and tools for analyzing epigenomic data, which have broad applications in biomedical research. Dr. Buenrostro's research aims to uncover the molecular underpinnings of cellular states and transitions, providing insights that are crucial for understanding complex biological processes and for developing targeted therapies.
Research topics
- Biology
- Genetics
- Cell biology
- Computational biology
- Computer Science
- Immunology
- Philosophy
- Literature
- Virology
- Medicine
- Art
- Cancer research
- Neuroscience
- Endocrinology
- Remote sensing
- Internal medicine
- Environmental ethics
Selected publications
Transcription factor collaboration enables precise T cell state engineering
bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-22
article · Open access · Senior author · Corresponding author
Transcription factors (TFs) collaborate to regulate gene expression programs that define cell fate. In CD8+ T cells, this coordinated regulation underlies exhaustion, a dysfunctional state that constrains immunity in chronic infection and cancer. Here, we screen for cell state-specific TFs by performing pooled overexpression screens of 3,548 TF and TF isoforms in primary T cells across multiple CD8+ T cell states. We identify 82 regulators that collaborate with exhaustion-specific programs and profile their effects using perturb-SHARE-seq, connecting perturbations to changes in chromatin accessibility and gene expression across 702,314 single cells. We identify 38 reproducible regulatory programs and construct a map of 12,616 TF-program connections that shape CD8+ T cell states, nominating KLF2 as predictive of positive response to CAR-T therapy. Using seq2PRINT, a deep learning framework that predicts functional TF interactions, we identify RUNX as a “master collaborator”, a TF that broadly collaborates with other factors, and uncover a RUNX2:KLF2 interaction that specifies exhaustion-associated programs. Mutation of the RUNX2:KLF2 protein interface attenuates KLF2-mediated repression of exhaustion, while synthetic tethering of RUNX2 to KLF2 leads to an amplification of the phenotype. More broadly, we identify the collaborative action of RUNX as a driver in CD8+ T cell states, and show that tethering TFs enables the rational engineering of cell state identity for cell and gene therapies.
Refining sequence-to-activity models by increasing model resolution
Bioinformatics Advances · 2026-05-01
article · Open access
Decoding the cis-regulatory syntax that controls gene expression is essential for improving our understanding of cell differentiation and disease. Deep learning based sequence-to-activity (S2A) models learn to identify regulatory motifs and their syntax through modeling chromatin accessibility. Previously, we developed AI-TAC, a S2A model that predicts chromatin accessibility across various immune cell types in multi-task fashion, effectively decoding the regulatory syntax underlying immune cell differentiation. While ATAC-seq is commonly used to measure regional accessibility, it also provides high-resolution profiles, the distribution of Tn5 insertion sites, that offer additional insights into the precise location and strength of TF binding sites. Here we present bpAI-TAC, a base-pair resolution ATAC-seq model, and demonstrate that modeling ATAC-seq profiles alongside accessibility consistently improves predictions of differential chromatin accessibility across cell types. Moreover, we find that multi-task learning across related immune cell types consistently outperforms single-task models. To understand what additional information bpAI-TAC learns from ATAC-seq profiles, we systematically compare sequence attributions from models trained with and without ATAC-seq profiles. We identify novel motifs with strong effect sizes that emerge when profile data is included. Our findings suggest that modeling ATAC-seq at base-pair resolution enables the model to learn a more nuanced and sensitive representation of the cis-regulatory syntax driving immune cell-specific chromatin landscapes.
Epigenetic memory of colitis promotes tumour growth
Nature · 2026-03-25 · 6 citations
article · Open access · Senior author
Chronic inflammation is a well-established risk factor for cancer, but the underlying molecular mechanisms remain unclear [1,2]. Using a mouse model of colitis, we demonstrate that colonic stem cells retain an epigenetic memory of inflammation following disease resolution that persists for more than 100 days. Here we find that memory of colitis is characterized by a cumulative gain of activator protein 1 (AP-1) transcription factor activity, with durable changes to chromatin accessibility. Further, we develop SHARE-TRACE, a method that enables simultaneous profiling of gene expression, chromatin accessibility and clonal history in single cells, enabling high-resolution tracking of epigenomic memory. This approach reveals that memory of colitis is propagated cell-intrinsically and inherited through stem cell divisions, with some clones demonstrating stronger memory than others. Finally, we show that colitis primes stem cells for increased expression of an AP-1-regulated gene program following oncogenic mutation that accelerates tumour growth, a phenotype dependent on AP-1 activity. Together, our findings provide a mechanistic link between chronic inflammation and malignancy, revealing how long-lived epigenetic alterations in regenerative tissues may contribute to disease susceptibility and suggesting potential diagnostic and therapeutic strategies to mitigate cancer risk in patients with chronic inflammatory conditions. Colonic stem cells retain a memory of inflammation following disease resolution and there is a mechanistic link between chronic inflammation and malignancy, suggesting potential strategies to mitigate cancer risk in patients with chronic inflammatory conditions.
Toward AI-Powered Cancer Etiology Research
Cancer Discovery · 2026-04-13
article
Advances in multimodal longitudinal data and artificial intelligence (AI) create new opportunities for cancer etiology research. We envision an AI-powered discovery workflow integrating an interoperable epidemiologic data ecosystem and causal inference frameworks to accelerate the identification of both cancer causes and the converging biological states for prevention.
Refining sequence-to-activity models by increasing model resolution
bioRxiv (Cold Spring Harbor Laboratory) · 2025-01-27 · 1 citation
preprint · Open access
Decoding the cis-regulatory syntax that controls gene expression is essential for improving our understanding of cell differentiation and disease. To identify regulatory motifs and their regulatory syntax, deep learning based sequence-to-activity (S2A) models learn transcription factor binding motifs and their combinations from DNA sequence by modeling measured chromatin accessibility. Previously, we developed AI-TAC, a S2A model that predicts chromatin accessibility across various immune cell types in multi-task fashion, effectively decoding the regulatory syntax underlying immune cell differentiation. While ATAC-seq is commonly used to measure regional accessibility, it also provides high-resolution profiles, the distribution of Tn5 insertion sites, that offer additional insights into the precise location and strength of TF binding sites. Here we demonstrate that modeling ATAC-seq profiles alongside accessibility consistently improves predictions of differential chromatin accessibility across cell types. Moreover, we also find that multi-task learning across related immune cell types consistently outperforms single-task models. To understand what additional information bpAITAC learns from ATAC-seq profiles, we systematically compare sequence attributions from models trained with and without ATAC-seq profiles. We identify novel motifs with strong effect sizes that emerge only when profile data is included. Our findings suggest that modeling ATAC-seq at base-pair resolution enables the model to learn a more nuanced and sensitive representation of the cis-regulatory syntax driving immune cell-specific chromatin landscapes.
Unified molecular approach for spatial epigenome, transcriptome, and cell lineages
Proceedings of the National Academy of Sciences · 2025-04-18 · 12 citations
article · Open access
Spatial epigenomics and multiomics can provide fine-grained insights into cellular states but their widespread adoption is limited by the requirement for bespoke slides and capture chemistries for each data modality. Here, we present SPatial assay for Accessible chromatin, Cell lineages, and gene Expression with sequencing (SPACE-seq), a method that utilizes polyadenine-tailed epigenomic libraries to enable facile spatial multiomics using standard whole transcriptome reagents. Applying SPACE-seq to a human glioblastoma specimen, we reveal the state of the tumor microenvironment, extrachromosomal DNA copy numbers, and identify putative mitochondrial DNA variants.
Zenodo (CERN European Organization for Nuclear Research) · 2025-02-13 · 1 citation
dataset · Open access · Senior author
Data associated with the multi-scale footprinting project.
- (1) Tn5_NN_model.h5: Pre-trained CNN-based Tn5 bias model implemented with Keras. Takes local DNA sequence context as input and predicts Tn5 insertion bias. See the tutorial for how to use this model.
- (2) Tn5ModelTutorial.ipynb: Tutorial showing how to use the pre-trained Tn5 bias model to score input sequences.
- (3) hg38Tn5Bias.tar.gz, hg19Tn5Bias.tar.gz, mm10Tn5Bias.tar.gz, mm39_bias_v2.h5, panTro6Tn5Bias.tar.gz, sacCer3Tn5Bias.tar.gz, dm6Tn5Bias.tar.gz, danRer11Tn5Bias.tar.gz, ce11Tn5Bias.tar.gz: h5 files containing the genome-wide Tn5 bias pre-computed using our convolutional neural net model.
- (4) dispModel.tar.gz: Zipped folder containing Tn5 cutting dispersion models for each footprint window radius. The footprint window size in our paper refers to the diameter of the footprint window, which is twice the number listed here. During footprinting, these models are loaded into the footprintingProject object and then used for footprinting.
- (5) cisBP_mouse_pwms_2021.rds, cisBP_human_pwms_2021.rds: Motif PWMs used in our study.
- (6) TFBS_model.h5: Pre-trained footprint-to-TF binding prediction models. The models take local multi-scale footprints as input and predict whether a genomic position is bound by a TF if the corresponding motif is present. This is obsolete. For the best performance of TF binding prediction, please use our seq2PRINT-based TF binding prediction.
- (7) clusterLabels.txt, clusterLabelsAllTFs.txt: Cluster labels of TFs. clusterLabels.txt is the clustering result directly obtained from clustering multi-scale footprints of all TFs with ChIP data. clusterLabelsAllTFs.txt includes other TFs without ChIP data. The cluster membership of these TFs was assigned based on motif homology among TFs.
- (8) BMMCTutorial.tar.gz: Data needed for our R version tutorial. The contents of this folder can be put into the /data/BMMCTutorial folder.
- (9) PBMC_bulk_ATAC_tutorial fragments files: Files used by our PBMC bulk ATAC tutorial for scPrinter. See https://github.com/buenrostrolab/scPrinter for details.
- (10) PBMC_bulk_ATAC_tutorial example result TFBS bigwigs (Bcell_0_TFBS.bigwig, Bcell_1_TFBS.bigwig, Monocyte_0_TFBS.bigwig, Monocyte_1_TFBS.bigwig, Tcell_0_TFBS.bigwig, Tcell_1_TFBS.bigwig): Example result files generated by our PBMC bulk ATAC tutorial for scPrinter. See https://github.com/buenrostrolab/scPrinter for details. Here we filtered ATAC-seq peaks based on accessibility, keeping ~70k highly accessible peaks.
- (11) CTCF_degron.tar.gz: Input data used for the CTCF degron analysis. See https://github.com/buenrostrolab/PRINT/blob/main/analyses/degron/ENCODE_CTCF_degron.ipynb for details.
- (12) obsBias.tsv: Input data used for training the Tn5 bias model. For more details see https://github.com/buenrostrolab/PRINT/blob/main/code/predictBias.py (line 84).
Myeloid progenitor dysregulation fuels immunosuppressive macrophages in tumours
Nature · 2025-09-10 · 19 citations
article · Open access
Early transcriptional effects of inflammatory cytokines reveal highly redundant cytokine networks
The Journal of Experimental Medicine · 2025-01-28 · 9 citations
article · Open access
Inflammatory cytokines are fundamental mediators of the organismal response to injury, infection, or other harmful stimuli. To elucidate the early and mostly direct transcriptional signatures of inflammatory cytokines, we profiled all immunologic cell types by RNAseq after systemic exposure to IL1β, IL6, and TNFα. Our results revealed a significant overlap in the responses, with broad divergence between myeloid and lymphoid cells, but with very few cell-type-specific responses. Pathway and motif analysis identified several main controllers (NF-κB, IRF8, and PU.1), but the largest portion of the response appears to be mediated by MYC, which was also implicated in the response to γc cytokines. Indeed, inflammatory and γc cytokines elicited surprisingly similar responses (∼50% overlap in NK cells). Significant overlap with interferon-induced responses was observed, paradoxically in lymphoid but not myeloid cell types. These results point to a highly redundant cytokine network, with intertwined effects between disparate cytokines and cell types.
Cell stem cell · 2025-05-21 · 3 citations
article
Recent grants
Single-cell epigenomic and cellular plasticity
NIH · $2.5M · 2019–2024
Frequent coauthors
- Sai Ma · Huaibei Normal University · 147 shared
- Aviv Regev · Broad Institute · 115 shared
- Caleb A. Lareau · Memorial Sloan Kettering Cancer Center · 112 shared
- Vinay K. Kartha · VIR Biotechnology (United States) · 74 shared
- William J. Greenleaf · 72 shared
- Fabiana M. Duarte · 60 shared
- Leif S. Ludwig · 46 shared
- Andrew Earl · Broad Institute · 46 shared