Casey S. Greene
Verified · University of Pennsylvania · Rehabilitation Medicine
Active 1973–2026
Research topics
- Medicine
- Computer Science
- Oncology
- Internal medicine
- Pathology
- Genetics
- Cancer research
- Psychology
- Biology
- Cartography
- Environmental health
- Geography
- Bioinformatics
- Demography
- Gerontology
Selected publications
HaihuaWang-hub/2020-workflows-paper: 2020 workflow paper
Zenodo (CERN European Organization for Nuclear Research) · 2026-01-20
other · Open access
2020 workflow paper
Transcriptomic subtypes in high-grade serous ovarian cancer are driven by tumor cellular composition
bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-21
article · Open access
High-grade serous ovarian carcinoma (HGSC) is an aggressive malignancy for which bulk transcriptomic subtypes are used to stratify tumors, interpret biology, and guide biomarker development. The four TCGA-derived subtypes, mesenchymal (C1.MES), immunoreactive (C2.IMM), proliferative (C5.PRO), and differentiated (C4.DIF), are consistently observed across cohorts. However, despite their prominence, these subtypes have not translated into therapeutic utility, and their biological basis remains unresolved. Here, we show that HGSC transcriptomic subtypes are largely determined by tumor cellular composition rather than intrinsic malignant transcriptional programs. By integrating controlled single-cell-derived pseudobulk simulations with deconvolution-based analysis of 1,834 primary HGSC tumors across RNA-seq and microarray cohorts, we demonstrate that subtype probabilities align along a composition-driven axis of stromal and immune variation. Cellular composition alone predicted subtype labels with high accuracy (ROC-AUC = 0.81-0.95) and explained a substantial fraction of subtype-associated transcriptomic variation, with the mesenchymal (C1.MES) subtype representing the most robust and reproducible example of composition-driven signal. Although a secondary, composition-independent expression signal is detectable, it does not define the dominant structure of subtype classification. These findings redefine HGSC transcriptomic subtypes as features of the tumor ecosystem rather than discrete malignant states. This reinterpretation has immediate implications for studies that use subtype labels to infer tumor-intrinsic biology and provides a generalizable framework for separating composition-driven and intrinsic signals in bulk tumor data.
Advances in Protein Function Prediction from the Fifth CAFA Challenge
bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-30 · 1 citation
article · Open access
The Critical Assessment of Functional Annotation (CAFA) is a long-standing community effort to independently assess computational methods for protein function prediction, to highlight well-performing methodologies, to identify bottlenecks in the field, and to provide a forum for the dissemination of results and exchange of ideas. In its fifth round (CAFA5) of triennial challenges, a partnership with Kaggle Inc. facilitated participation from a large community of data scientists and computational biologists through a competitive prospective challenge on the crowdsourcing platform. In this work, we present an in-depth analysis of the submitted predictions and report improvements in accuracy over all methods from the previous CAFA challenges. We further introduce a new evaluation setting for proteins with pre-existing (incomplete) annotations and identify the need for methods that better leverage existing annotations to predict those that will be discovered later. Finally, we characterize the prospective evaluation framework by examining performance on a strict set of unpublished annotations and across intermediate database releases. Our results indicate that recent developments in the field, such as the availability of protein language models and accurately predicted 3D structures, as well as the growth of experimental annotations through biocuration, have all contributed to performance improvements.
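Improving the utility of the single-cell pediatric cancer atlas through updated cell type annotations, CNV inference, and visualization tools
Cancer Research · 2026-04-03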
article
The Single-cell Pediatric Cancer Atlas (ScPCA) Portal (https://scpca.alexslemonade.org/), developed and maintained by the Childhood Cancer Data Lab, is a data resource for uniformly processed single-cell and single-nuclei RNA sequencing data, as well as de-identified metadata from pediatric tumor samples. Originally comprised of data from 10 projects funded by Alex’s Lemonade Stand Foundation (ALSF), the Portal currently contains summarized gene expression data for over 700 samples across more than 50 cancer types drawn from ALSF-funded and community-contributed datasets. Downloads include gene expression data as SingleCellExperiment or AnnData objects containing raw and normalized counts, PCA and UMAP coordinates, and summary reports. Some samples have additional data from bulk RNA-seq, spatial transcriptomics, and/or feature barcoding (e.g., CITE-seq and cell hashing) included in the download. All data on the Portal were uniformly processed using scpca-nf, an efficient and open-source Nextflow workflow written and maintained by the Data Lab, which utilizes alevin-fry to quantify gene expression. Since presenting the ScPCA Portal at the 2024 AACR Annual Meeting, several new features have been added to the available data. Automated cell type annotation is now performed using three unique methods: SingleR, CellAssign, and SCimilarity. If two of the three methods agree, an ontology-aware consensus cell type label is assigned. The individual annotations and the consensus cell types are included in the cell metadata of the downloaded objects. Some projects also include manually-curated cell type annotations generated as part of the OpenScPCA project (https://openscpca.readthedocs.io). In addition, copy-number variation (CNV) inference is now performed on each sample using the InferCNV package, specifying the i6 HMM to quantify specific CNV events.
Since InferCNV quantifies CNV events using a designated set of normal, or non-malignant, reference cells, consensus cell types are used to identify a diagnosis-appropriate normal cell reference for each sample. The total CNVs observed and the full HMM metadata table are stored in the processed SingleCellExperiment and AnnData objects. The updated cell type annotation and implementation of InferCNV are included as part of the open-source workflow, scpca-nf. The workflow and associated documentation are freely available at https://github.com/AlexsLemonade/scpca-nf. Finally, the ScPCA Portal hosts an instance of the UCSC Cell Browser, enabling users to visualize and interact with the gene expression data for all samples without needing to download the data. Comprehensive documentation about data processing and the contents of files on the portal, including a guide to getting started working with an ScPCA dataset, can be found at https://scpca.readthedocs.io.
Citation Format: Allegra G. Hawkins, Joshua A. Shapiro, Stephanie J. Spielman, David S. Mejia, Deepashree Venkatesh Prasad, Nozomi Ichihara, Arkadii Yakovets, Avrohom M. Gottlieb, Kurt G. Wheeler, Chanté J. Bethell, Steven M. Foltz, Jennifer O'Malley, Casey S. Greene, Jaclyn N. Taroni. Improving the utility of the single-cell pediatric cancer atlas through updated cell type annotations, CNV inference, and visualization tools [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 3498.
Integrating single-cell and single-nucleus datasets improves bulk RNA-seq deconvolution
Cell Reports Methods · 2026-03-26
article · Open access · Senior author
Bulk RNA sequencing (RNA-seq) deconvolution typically uses single-cell RNA sequencing (scRNA-seq) references, but some cells are only detectable through single-nucleus RNA sequencing (snRNA-seq). Because snRNA-seq captures nuclear, not cytoplasmic, transcripts, its direct use as a reference could reduce deconvolution accuracy. We benchmarked integration strategies across four tissues, comparing principal component (PC)-based latent shifts, conditional and non-conditional scVI (single cell variational inference), and cross-modality differentially expressed gene (DEG) filtering. All approaches improved over raw snRNA-seq, but pruning cross-modality DEGs produced the largest gains, often matching or exceeding scRNA-only references. Conditional scVI performed comparably and was effective when matched scRNA-snRNA cell types were unavailable. In real adipose bulk samples, DEG pruning and conditional scVI provided the most robust cell-fraction estimates across donors and transformations. These results demonstrate that scRNA-seq should be prioritized as a reference when available, and we recommend appending snRNA-seq only after removing cross-modality DEGs; when DEG information is limited, conditional scVI is a practical alternative.
The Common Fund Data Ecosystem (CFDE)
bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-12
article · Open access · Senior author
The NIH Common Fund Data Ecosystem (CFDE) integrates data resources from 18 NIH Common Fund programs for discovery and integrative analysis. These programs generate valuable but heterogeneous datasets that can be difficult to discover, access, and reuse. CFDE aims to provide a collaborative, community-built infrastructure that links and enriches Common Fund programs. We describe the evolution, structure, and core technologies of CFDE, including practical approaches that support submission, integration, visualization, and public release of multimodal data. Training programs and workforce initiatives lower barriers to adoption. CFDE has devised solutions to critical issues facing cross-program initiatives, including data scale and heterogeneity, dataset integration, and long-term sustainability. We demonstrate the utility of linking Common Fund resources through integrative tools and cross-dataset queries to yield insights that would otherwise be infeasible. Collectively, CFDE shows that a standards-driven, federated approach enhances and unifies cross-disciplinary resources, fostering collaboration and data-driven discovery.
slolab/aac-manuscript: First preprint version
Open MIND · 2026-02-16
other
The first version of the preprint for the AAC manuscript.
Deconvolved tumor adipocyte proportions and high grade serous ovarian carcinoma survival
bioRxiv (Cold Spring Harbor Laboratory) · 2026-01-19
article · Open access · Senior author · Corresponding
Background: Single-cell-based analyses of high-grade serous ovarian carcinoma (HGSOC) survival have largely ignored adipocytes, which are fragile and under-represented in single-cell references. Adipocytes are known active components of the tumor microenvironment in many cancers, and HGSOC tumors frequently metastasize to the omentum, a lining of adipose tissue. Methods: cohorts. We used stage-stratified Cox models to quantify the association between intratumoural adipocyte fractions and overall survival while adjusting for age, body mass index (BMI), race, and residual disease. We also evaluated associations with deconvolved immune, stromal, and epithelial cell groups. Results: A 10% increase in estimated tumor adipocyte content was associated with a 41% increase in the hazard of death (HR = 1.41, 95% CI 1.18-1.70, p = 0.0002) after adjusting for age, BMI and race (n=566). A 10% increase in immune cell proportion was associated with favorable survival (HR = 0.82, 95% CI 0.69-0.97, p = 0.024). Stromal and epithelial macro-fractions were not associated with survival. Associations with adipocyte and immune cell type proportions were unchanged in models additionally controlling the other cell type proportions. Results were similar after additionally adjusting for residual disease after debulking surgery. Conclusions: Adipocytes may be a tumor-intrinsic factor associated with adverse outcomes in HGSOC. Quantifying adipocyte burden using bulk RNA-seq could enhance risk stratification and guide the development of adipocyte-targeted therapies.
2025-11-01
article
Background: Chronic obstructive pulmonary disease (COPD) is the third leading cause of global mortality. Therapeutic options that effectively attenuate airway remodelling and disease progression remain limited. Epstein-Barr virus (EBV) has been implicated in immune dysregulation and lymphoid neogenesis in COPD, yet the immunological consequences of antiviral suppression in the airway are poorly defined.
Aims: To determine whether antiviral therapy alters cellular composition and inflammatory signalling in the bronchial airway microenvironment of COPD patients.
Methods: Spatial transcriptomics (10x Genomics Visium SD) was performed on paired bronchial biopsies (n=5) collected at baseline and following 8 weeks of treatment from a subset of patients enrolled in the EViSCO trial (NCT03699904), a randomised, placebo-controlled study of valaciclovir (1 g TID). Computational tools including Scanpy, Scanorama, and Cell2Location were used for cell type deconvolution and differential gene expression analysis.
Results: Unsupervised clustering identified distinct epithelial and immune populations with region-specific localisation. Enrichment analysis revealed global downregulation of prostaglandin transport and Toll-like receptor (TLR) pathways, indicative of reduced inflammatory signalling, particularly in T and NK cell-enriched regions. At individual patient level, we observed downregulation of genes associated with extracellular matrix components (e.g., collagen trimer, fibrillar matrix), immune activation pathways (e.g., leukocyte migration, chemotaxis), and epithelial remodelling processes (e.g., EMT, cell adhesion, proliferation). These changes suggest a transition from active inflammation and tissue remodelling toward a more quiescent, structurally resolved epithelial state. Reactome pathway analysis additionally showed downregulation of surfactant metabolism and related disease-associated genes post-treatment, potentially reflecting a resolution of inflammation-driven expression, restoring baseline levels of epithelial surfactant genes such as SFTPA-D, in alignment with broader suppression of immune and tissue remodelling pathways.
Conclusions: This study provides the first spatial transcriptomic characterisation of the airway response to EBV-targeted antiviral therapy in COPD. The findings suggest that treatment suppresses key inflammatory and remodelling pathways, offering mechanistic insight into virus-associated airway pathology and its resolution.
Recent grants
Network-based algorithms for target identification and drug repositioning from genetic associations
NIH · $3.2M · 2021–2024
NSF · $499k · 2015–2019
Characterization of high-grade serous ovarian cancer subtypes via single-cell profiling
NIH · $3.0M · 2019–2026
Frequent coauthors
- Gregory P. Way · Broad Institute · 221 shared
- Daniel Himmelstein · 192 shared
- Halie M. Rando · Smith College · 189 shared
- Anthony Gitter · 152 shared
- Alexandra Lee · 151 shared
- Jaclyn Taroni · Alex's Lemonade Stand Foundation · 135 shared
- Ronan Lordan · University of Pennsylvania · 118 shared
- Jennifer A. Doherty · 108 shared
Education
- Ph.D., Genetics · Dartmouth College Geisel School of Medicine · 2009