
Charles Lee
Moghadam Family Professor, Emeritus · Stanford University · Korean Studies
Active 1958–2024
Research topics
- Biology
- Genetics
- Evolutionary biology
- Machine Learning
- Computer Science
- Computational biology
- Zoology
Selected publications
The complete sequence and comparative analysis of ape sex chromosomes
Nature · 2024 · 122 citations
- Biology
- Evolutionary biology
- Genetics
Variation in mating patterns and brain function among apes suggests corresponding differences in their sex chromosomes. However, owing to their repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the methodology developed for the telomere-to-telomere (T2T) human genome, we produced gapless assemblies of the X and Y chromosomes for five great apes (bonobo (Pan paniscus), chimpanzee (Pan troglodytes), western lowland gorilla (Gorilla gorilla gorilla), Bornean orangutan (Pongo pygmaeus) and Sumatran orangutan (Pongo abelii)) and a lesser ape (the siamang gibbon (Symphalangus syndactylus)), and untangled the intricacies of their evolution. Compared with the X chromosomes, the ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements, owing to the accumulation of lineage-specific ampliconic regions, palindromes, transposable elements and satellites. Many Y chromosome genes expand in multi-copy families and some evolve under purifying selection. Thus, the Y chromosome exhibits dynamic evolution, whereas the X chromosome is more stable. Mapping short-read sequencing data to these assemblies revealed diversity and selection patterns on sex chromosomes of more than 100 individual great apes. These reference assemblies are expected to inform human evolution and conservation genetics of non-human apes, all of which are endangered species.
The Complete Sequence and Comparative Analysis of Ape Sex Chromosomes
bioRxiv (Cold Spring Harbor Laboratory) · 2023 · 24 citations
- Biology
- Evolutionary biology
- Genetics
Apes possess two sex chromosomes: the male-specific Y and the X shared by males and females. The Y chromosome is crucial for male reproduction, with deletions linked to infertility. The X chromosome carries genes vital for reproduction and cognition. Variation in mating patterns and brain function among great apes suggests corresponding differences in their sex chromosome structure and evolution. However, due to their highly repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the state-of-the-art experimental and computational methods developed for the telomere-to-telomere (T2T) human genome, we produced gapless, complete assemblies of the X and Y chromosomes for five great apes (chimpanzee, bonobo, gorilla, Bornean and Sumatran orangutans) and a lesser ape, the siamang gibbon. These assemblies completely resolved ampliconic, palindromic, and satellite sequences, including the entire centromeres, allowing us to untangle the intricacies of ape sex chromosome evolution. We found that, compared to the X, ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements. This divergence on the Y arises from the accumulation of lineage-specific ampliconic regions and palindromes (which are shared more broadly among species on the X) and from the abundance of transposable elements and satellites (which have a lower representation on the X). Our analysis of Y chromosome genes revealed lineage-specific expansions of multi-copy gene families and signatures of purifying selection. In summary, the Y exhibits dynamic evolution, while the X is more stable. Finally, mapping short-read sequencing data from >100 great ape individuals revealed the patterns of diversity and selection on their sex chromosomes, demonstrating the utility of these reference assemblies for studies of great ape evolution. These complete sex chromosome assemblies are expected to further inform conservation genetics of nonhuman apes, all of which are endangered species.
Nature Genetics · 2022 · 354 citations
- Biology
- Genetics
Cell · 2022 · 1009 citations
- Computer Science
- Biology
- Machine Learning
The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina. We performed single-nucleotide variant (SNV) and short insertion and deletion (INDEL) discovery and generated a comprehensive set of structural variants (SVs) by integrating multiple analytic methods through a machine learning model. We show gains in sensitivity and precision of variant calls compared to phase 3, especially among rare SNVs as well as INDELs and SVs spanning the frequency spectrum. We also generated an improved reference imputation panel, making variants discovered here accessible for association studies.
Haplotype-resolved diverse human genomes and integrated analysis of structural variation
Science · 2021 · 795 citations
- Biology
- Genetics
- Evolutionary biology
Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
Recent grants
NIH · $13.0M · 2019
Cancer Genetics Research Program
NIH · $80.2M · 1997–2027
NIH · $1.6M · 2011
NIH · $1.7M · 2013
NIH · $1.6M · 2012
Frequent coauthors
- 302 shared
Richard A. Gibbs
Baylor College of Medicine
- 275 shared
Eric Boerwinkle
The University of Texas Health Science Center at Houston
- 245 shared
Christiane Reitz
New York Hospital Queens
- 245 shared
Richard Mayeux
Columbia University
- 245 shared
Badri N. Vardarajan
Columbia University Irving Medical Center
- 245 shared
Sandra Barral
NewYork–Presbyterian Hospital
- 244 shared
Kara L. Hamilton‐Nelson
Dr. John T. Macdonald Foundation
- 244 shared
Sudha Seshadri
Framingham Heart Study
Education
- 1996
Ph.D. Medical Sciences
University of Alberta
- 1993
M.S. Experimental Pathology, Pathology
University of Alberta
- 1990
B.S., Genetics
University of Alberta