Joseph Chen

Verified

University of California, San Diego · Astronomy and Astrophysics

Active 1991–2026

h-index 96

Citations 33.3k

Papers 414 · 71 last 5y

Funding $128.4M · 1 active

Faculty page


Research topics

Immunology
Cell biology
Genetics
Biology
Endocrinology

Selected publications

CellMaster: Collaborative Cell Type Annotation in Single-Cell Analysis
arXiv (Cornell University) · 2026-02-12
article · Open access
Single-cell RNA-seq (scRNA-seq) enables atlas-scale profiling of complex tissues, revealing rare lineages and transient states. Yet, assigning biologically valid cell identities remains a bottleneck because markers are tissue- and state-dependent, and novel states lack references. We present CellMaster, an AI agent that mimics expert practice for zero-shot cell-type annotation. Unlike existing automated tools, CellMaster leverages LLM-encoded knowledge (e.g., GPT-4o) to perform on-the-fly annotation with interpretable rationales, without pre-training or fixed marker databases. Across 9 datasets spanning 8 tissues, CellMaster improved accuracy by 7.1% over best-performing baselines (including CellTypist and scTab) in automatic mode. With human-in-the-loop refinement, this advantage increased to 18.6%, with a 22.1% gain on subtype populations. The system demonstrates particular strength in rare and novel cell states where baselines often fail. Source code and the web application are available at https://github.com/AnonymousGym/CellMaster.
scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery
arXiv (Cornell University) · 2026-02-12
article · Open access
We present scPilot, the first systematic framework to practice omics-native reasoning: a large language model (LLM) converses in natural language while directly inspecting single-cell RNA-seq data and on-demand bioinformatics tools. scPilot converts core single-cell analyses, i.e., cell-type annotation, developmental-trajectory reconstruction, and transcription-factor targeting, into step-by-step reasoning problems that the model must solve, justify, and, when needed, revise with new evidence. To measure progress, we release scBench, a suite of 9 expertly curated datasets and graders that faithfully evaluate the omics-native reasoning capability of scPilot with respect to various LLMs. Experiments with o1 show that iterative omics-native reasoning lifts average accuracy by 11% for cell-type annotation, and Gemini-2.5-Pro cuts trajectory graph-edit distance by 30% versus one-shot prompting, while generating transparent reasoning traces that explain marker gene ambiguity and regulatory logic. By grounding LLMs in raw omics data, scPilot enables auditable, interpretable, and diagnostically informative single-cell analyses. Code, data, and package are available at https://github.com/maitrix-org/scPilot
CellMaster: Collaborative Cell Type Annotation in Single-Cell Analysis
Open MIND · 2026-02-12
preprint
scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery
Open MIND · 2026-02-12
preprint
Transcriptional Readthrough at Atf4 Locus Suppresses Rps19bp1 and Impairs Heart Development
bioRxiv (Cold Spring Harbor Laboratory) · 2025-08-29
preprint · Open access · Senior author · Corresponding
BACKGROUND: Activating Transcription Factor 4 (ATF4) functions as a transcriptional regulator in various cell types and tissues under both physiological and pathological conditions. While previous studies have linked ATF4 activation with promoting cardiomyocyte (CM) death in dilated cardiomyopathy (DCM), atrial fibrillation, and heart failure, its role in developing CMs remains unexplored. METHODS: We generated multiple distinct CM-specific (Atf4 cKO(e2/3/pA) and Atf4 cKO(e2)) and global Atf4 knockout (Atf4 7del/7del and Atf4 1ins/1ins) mouse models targeting different Atf4 regions, as well as cardiomyocyte-specific deletion of Rps19bp1 to study cardiac phenotypes. Detailed morphological and molecular analyses were performed. RESULTS: Atf4 cKO(e2/3/pA) [targeting exon 2-3 including the polyadenylation signal (polyA)] mice exhibited severe cardiac defects and died before E17.5, likely due to ectopic activation of the p53 signaling pathway resulting from Rps19bp1 downregulation, a potent suppressor of p53. Further investigation revealed that deleting the polyA signal of Atf4 in Atf4 cKO(e2/3/pA) mice led to transcriptional readthrough, resulting in the formation of an Atf4-Cacna1i fusion transcript and Rps19bp1 downregulation. To avoid readthrough while abolishing ATF4 function, we introduced small indels into exon 3 of Atf4 in mice (Atf4 7del/7del and Atf4 1ins/1ins), which showed normal Rps19bp1 expression and cardiac morphology. Importantly, CM-specific deletion of Rps19bp1 recapitulated the cardiac defects and transcriptional change seen in Atf4 cKO(e2/3/pA) mice. CONCLUSIONS: We found that the downregulation of Rps19bp1, not loss of ATF4 function, underlies the cardiac phenotypes in Atf4 cKO(e2/3/pA) mice. The reduced expression of Rps19bp1 in Atf4 cKO(e2/3/pA) mice is likely due to the unintentional deletion of Atf4 polyA signal and subsequent transcriptional readthrough, underscoring the essential role of RPS19BP1, not ATF4, in cardiac development.
Consistent Rps19bp1 downregulation has been observed in other tissue-specific Atf4 knockout models utilizing the Atf4 fl(e2/3/pA) allele, suggesting that previously reported Atf4 KO phenotypes may result from Atf4 transcriptional readthrough effects. These findings reveal a locus-dependent transcriptional interference mechanism and emphasize the importance of avoiding confounding cis effects in genetically engineered models. TRANSLATIONAL PERSPECTIVE: Our findings clarify ATF4’s role in heart development by showing that cardiac defects in cardiomyocyte-specific ATF4 knockout mice (using a widely employed floxed ATF4 line) result from unintended downregulation of RPS19BP1 caused by transcriptional readthrough. This shifts the focus from ATF4 to RPS19BP1, a key regulator of p53 activity, as a potential driver of cardiac developmental abnormalities. Clinically, these insights caution against misinterpretation of genetic knockout models and highlight RPS19BP1 as a promising target for congenital heart disease and related cardiac dysfunctions, with potential implications for future therapies.
Transcriptional readthrough at Atf4 locus suppresses Rps19bp1 and impairs heart development
Cardiovascular Research · 2025-11-14 · 2 citations
article · Open access · Senior author
AIMS: Activating transcription factor 4 (ATF4) functions as a transcriptional regulator in various cell types and tissues under both physiological and pathological conditions. While previous studies have linked ATF4 activation with promoting cardiomyocyte (CM) death in dilated cardiomyopathy (DCM), atrial fibrillation, and heart failure, its role in developing CMs remains unexplored. METHODS AND RESULTS: We generated multiple distinct CM-specific (Atf4 cKO(e2/3/pA) and Atf4 cKO(e2)) and global Atf4 knockout (KO; Atf4 7del/7del and Atf4 1ins/1ins) mouse models targeting different Atf4 regions, as well as CM-specific deletion of Rps19bp1 to study cardiac phenotypes. Detailed morphological and molecular analyses were performed. Atf4 cKO(e2/3/pA) [targeting exon 2-3 including the polyadenylation signal (polyA)] mice exhibited severe cardiac defects and died before E17.5, likely due to ectopic activation of the p53 signaling pathway resulting from Rps19bp1 downregulation, a potent suppressor of p53. Further investigation revealed that deleting the polyA signal of Atf4 in Atf4 cKO(e2/3/pA) mice led to transcriptional readthrough, resulting in the formation of an Atf4-Cacna1i fusion transcript and Rps19bp1 downregulation. To avoid readthrough while abolishing ATF4 function, we introduced small indels into exon 3 of Atf4 in mice (Atf4 7del/7del and Atf4 1ins/1ins), which showed normal Rps19bp1 expression and cardiac morphology. Importantly, CM-specific deletion of Rps19bp1 recapitulated the cardiac defects and transcriptional change seen in Atf4 cKO(e2/3/pA) mice. CONCLUSION: We found that the downregulation of Rps19bp1, not the loss of ATF4 function, underlies the cardiac phenotypes in Atf4 cKO(e2/3/pA) mice. The reduced expression of Rps19bp1 in Atf4 cKO(e2/3/pA) mice is likely due to the unintentional deletion of Atf4 polyA signal and subsequent transcriptional readthrough, underscoring the essential role of RPS19BP1, not ATF4, in cardiac development.
Consistent Rps19bp1 downregulation has been observed in other tissue-specific Atf4 KO models utilizing the Atf4 fl(e2/3/pA) allele, suggesting that previously reported Atf4 KO phenotypes may result from Atf4 transcriptional readthrough effects. These findings reveal a locus-dependent transcriptional interference mechanism and emphasize the importance of avoiding confounding cis effects in genetically engineered models.
Design of a mechanically flexible membrane electrode based on TPAE binder for lithium extraction from high Mg/Li ratio brines
Separation and Purification Technology · 2025-11-06 · 3 citations
article · 1st author
FOLDER: Accelerating Multi-Modal Large Language Models with Enhanced Performance
2025-10-19
preprint · Open access
Recently, Multi-modal Large Language Models (MLLMs) have shown remarkable effectiveness for multi-modal tasks due to their abilities to generate and understand cross-modal data. However, processing long sequences of visual tokens extracted from visual backbones poses a challenge for deployment in real-time applications. To address this issue, we introduce FOLDER, a simple yet effective plug-and-play module designed to reduce the length of the visual token sequence, mitigating both computational and memory demands during training and inference. Through a comprehensive analysis of the token reduction process, we analyze the information loss introduced by different reduction strategies and develop FOLDER to preserve key information while removing visual redundancy. We showcase the effectiveness of FOLDER by integrating it into the visual backbone of several MLLMs, significantly accelerating the inference phase. Furthermore, we evaluate its utility as a training accelerator or even performance booster for MLLMs. In both contexts, FOLDER achieves comparable or even better performance than the original models, while dramatically reducing complexity by removing up to 70% of visual tokens.
Telitacicept plus low-dose mycophenolate mofetil in the treatment of IgA nephropathy: a retrospective study
Clinical and Experimental Medicine · 2025-08-11 · 3 citations
article · Open access
Presently, no specific therapies have been recognized for immunoglobulin A nephropathy (IgAN). Mycophenolate mofetil (MMF) has been verified as effective for Chinese patients with IgAN. Telitacicept is a fully human TACI-Fc fusion protein that prevents B-cell maturation and activation, and it has been proven to be beneficial for IgAN in a phase II clinical trial. This study was designed to observe the efficacy and safety of telitacicept plus low-dose MMF for IgAN treatment. This retrospective cohort study included 24 patients with IgAN who were treated with telitacicept plus MMF. The primary outcome was defined as the change in proteinuria and estimated glomerular filtration rate (eGFR). The secondary outcome was defined as the change in hematuria. The mean follow-up time was 23 months. The median baseline proteinuria was 2.5 (1.74, 6.58) g/d, and eGFR was 94.97 (56.8, 120.67) mL/min/1.73 m². There were noteworthy reductions in proteinuria at 3, 6, 9, 12, 15, 18, 21 and 24 months when compared to the baseline levels [1.45 (0.78, 1.8) g/d [p = 0.0122], 0.505 (0.26, 0.99) g/d [p < 0.0001], 0.48 (0.28, 0.76) g/d [p < 0.0001], 0.3 (0.17, 0.85) g/d [p < 0.0001], 0.23 (0.18, 0.575) g/d [p < 0.0001], 0.18 (0.12, 0.325) g/d [p < 0.0001], 0.14 (0.105, 0.22) g/d [p < 0.0001] and 0.14 (0.103, 0.278) g/d [p < 0.0001]]. All patients maintained stable eGFR during follow-up. In addition, telitacicept plus MMF remarkably alleviated hematuria. Telitacicept plus MMF treatment led not only to a clinically significant reduction in proteinuria and hematuria, but also to stable serum creatinine values in patients with IgAN, without adverse side effects.
Rotational stiffness and moment-rotation response for embedded transfer connections in steel-reinforced concrete vertically irregular structures
Journal of Building Engineering · 2025-08-05 · 1 citation
article · Corresponding

Recent grants

BAG3 in Cardiac function and disease
NIH · $1.6M · 2016–2019
Training in Cardiovascular Physiology and Pharmacology
NIH · $13.3M · 1979–2028
NIH Grant P01HL046345
NIH · $36.2M · 2014
NIH Grant R01HL066100
NIH · $5.3M · 2016
Phenotyping Core
NIH · $1.3M · 2017

Frequent coauthors

Sylvia M. Evans
University of California, San Diego
119 shared
Nancy D. Dalton
117 shared
Kirk L. Peterson
University of California, San Diego
106 shared
Yusu Gu
University of California, San Diego
85 shared
Julius Bogomolovas
62 shared
Kenneth R. Chien
Karolinska Institutet
57 shared
Kirk U. Knowlton
Intermountain Medical Center
54 shared
Paola Cattaneo
53 shared
