
About
Dr. Janet Song is an Assistant Professor in the Department of Human Evolutionary Biology at Harvard University. Her research focuses on identifying and characterizing the genetic variants that contributed to human-specific traits, with a particular emphasis on the evolution of the human brain. Her work aims to understand how human-specific changes to the brain impact neurodevelopmental and neuropsychiatric diseases in modern humans. She is associated with the Song Lab at Harvard University, which investigates the genetic basis of human evolution and brain development.
Research topics
- Biology
- Neuroscience
- Genetics
Selected publications
bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-23
article · Open access · 1st author · Corresponding
Length variation in tandem repeats is a well-established driver of disease risk and is commonly assumed to arise from persistent genomic instability. Here, we characterize TRACT, a 30-bp variable number tandem repeat (VNTR) intronic to the calcium channel gene CACNA1C. TRACT exhibits extreme length variation (3-30+ kb) and has been previously linked to risk for bipolar disorder and schizophrenia. By examining multiple human cohorts, we find that TRACT alleles are strikingly bimodal in both length and sequence composition. Short alleles (TRACT S, ∼6 kb) and long alleles (TRACT L, ∼24 kb) are enriched for distinct 30-bp variants and are found on separate haplotypes that arose prior to the human migration out of Africa. Our data suggest that these ancient alleles expanded via perfect repeat tracts that were disrupted by accumulated mutations to result in relative length stability in extant humans, where there is no evidence for overt germline or somatic instability. Interestingly, neuropsychiatric disease risk is associated with specific 30-bp variants within TRACT S alleles, but not with overall TRACT length or with 30-bp variants enriched in TRACT L alleles. Instead, TRACT L alleles are associated with decreased gene expression in fibroblasts and testis. Together, these findings motivate joint examination of both sequence composition and length variation to fully understand the effects of VNTRs on evolution, trait variation, and disease risk.
Genomic approaches for understanding the evolution of the human brain
Nature Neuroscience · 2026-04-21
article · 1st author · Corresponding
Human-specific tandem repeat in CACNA1C modulates responses to neuronal stimulation
bioRxiv (Cold Spring Harbor Laboratory) · 2025-09-16 · 2 citations
preprint · Open access · 1st author
The recent development of long-read sequencing has made it possible to catalog variable number tandem repeats (VNTRs) in the human genome. However, little is known about their functional consequences. Here, we characterized the effect of TRACT, a VNTR that is unique to humans and that has sequence variants linked to risk for bipolar disorder and schizophrenia. By adding or removing this VNTR in both mouse models and human neural organoids, we find that TRACT, which is intronic to the L-type voltage-gated calcium channel gene CACNA1C, increases intracellular calcium after neuronal stimulation and leads to widespread changes in activity-dependent transcription programs in neurons. TRACT-dependent changes are enriched for genes associated with synapse formation and plasticity, and partially recapitulate evolutionary changes in activity-dependent transcription between species. These findings demonstrate that a single, human-specific, non-coding element can strongly affect the neuronal response to stimulation, and motivate the study of VNTRs as a genetic source of phenotypic variation in both evolution and disease.
bioRxiv (Cold Spring Harbor Laboratory) · 2025-04-01 · 4 citations
preprint · Open access
After birth, sensory inputs to neurons trigger the induction of activity-dependent genes (ADGs) that mediate many aspects of neuronal maturation and plasticity. To identify human-specific ADGs, we characterized these genes in human-chimpanzee tetraploid neurons. We identified 235 ADGs that are differentially expressed between human and chimpanzee neurons and found that their nearby regulatory sites are species-biased in their binding of the transcription factor FOS. An assessment of these sites revealed that many are enriched for single nucleotide variants that promote or eliminate FOS binding in human neurons. Disrupting the function of individual species-biased FOS-bound enhancers diminishes expression of nearby genes and affects the firing dynamics of human neurons. Our findings indicate that FOS-bound enhancers are frequent sites of evolution and that they regulate human-specific ADGs that may contribute to the unusually protracted and complex process of postnatal human brain development.
Transcriptomic Convergence and the Female Protective Effect in Autism
bioRxiv (Cold Spring Harbor Laboratory) · 2025-01-22 · 6 citations
preprint · Open access
Autism spectrum disorder (ASD) is a common neurodevelopmental condition characterized by deficits in social communication as well as restricted and/or repetitive behaviors. ASD is highly heritable [1], with a complex genetic architecture: large-scale studies have identified dosage-altering copy number variants (CNV) and single nucleotide variants (SNV) that implicate hundreds of genes as individually rare causes of ASD (ASD genes) [2–4], with common variation at multiple loci also contributing substantially to risk [5]. Understanding how disruptions to these functionally diverse genes lead to the shared core features of ASD remains a major challenge [6]. Moreover, ASD is three- to four-fold more common in males than females [7,8], and autistic females tend to carry more autosomal risk alleles for ASD compared to autistic males [9–13], but the biological basis of this “female protective effect” (FPE) is unknown [14,15]. Here we show that individual perturbations of 18 ASD genes converge on shared effects on gene expression, including widespread downregulation of other ASD genes. De novo reconstruction of a gene regulatory network (GRN) enabled the identification of central transcriptional regulators, including the prominent ASD gene CHD8 as well as novel candidates such as REST, that drive this transcriptomic convergence in ASD. Furthermore, the X-linked transcription factor ZFX, which is expressed from both the active and the “inactive” X chromosomes in females [16], emerged as a key activator of many ASD genes: we propose that the higher ZFX expression level observed in female brain can buffer damaging mutations in diverse ASD genes, contributing to the FPE. Together, these results reveal how key GRNs can become broadly and similarly dysregulated upon disruption of individual ASD genes and provide molecular insight into the female protective effect in ASD.
2025-05-01
article · Open access
Human-chimpanzee tetraploid system defines mechanisms of species-specific neural gene regulation
bioRxiv (Cold Spring Harbor Laboratory) · 2025-03-31 · 7 citations
preprint · Open access · 1st author
A major challenge in human evolutionary biology is to pinpoint genetic differences that underlie human-specific traits, such as increased neuron number and differences in cognitive behaviors. We used human-chimpanzee tetraploid cells to distinguish gene expression changes due to cis-acting sequence variants that change local gene regulation, from trans expression changes due to species differences in the cellular environment. In neural progenitor cells, examination of both cis and trans changes, combined with CRISPR inhibition and transcription factor motif analyses, identified cis-acting, species-specific gene regulatory changes, including to TNIK, FOSL2, and MAZ, with widespread trans effects on neurogenesis-related gene programs. In excitatory neurons, we identified POU3F2 as a key cis-regulated gene with trans effects on synaptic gene expression and neuronal firing. This study identifies cis-acting genomic changes that cause cascading trans gene regulatory effects to contribute to human neural specializations, and provides a general framework for discovering genetic differences underlying human traits.
Pretraining Improves Prediction of Genomic Datasets Across Species
bioRxiv (Cold Spring Harbor Laboratory) · 2025-08-24
preprint · Open access
Recent studies suggest that deep neural network models trained on thousands of human genomic datasets can accurately predict genomic features, including gene expression and chromatin accessibility. However, training these models is computation- and time-intensive, and datasets of comparable size do not exist for most other organisms. Here, we identify modifications to an existing state-of-the-art model that improve model accuracy while reducing training time and computational cost. Using this streamlined model architecture, we investigate the ability of models pretrained on human genomic datasets to transfer performance to a variety of different tasks. Models pretrained on human data but fine-tuned on genomic datasets from diverse tissues and species achieved significantly higher prediction accuracy while significantly reducing training time compared to models trained from scratch, with Pearson correlation coefficients between experimental results and predictions as high as 0.8. Further, we found that including excessive training tasks decreased model performance and that this compromised performance could be partially but not completely rescued by fine-tuning. Thus, simplifying model architecture, applying pretrained models, and carefully considering the number of training tasks may be effective and economical techniques for building new models across data types, tissues, and species.
Cell Genomics · 2024-07-16 · 31 citations
article · Open access
Little is known about the role of non-coding regions in the etiology of autism spectrum disorder (ASD). We examined three classes of non-coding regions: human accelerated regions (HARs), which show signatures of positive selection in humans; experimentally validated neural VISTA enhancers (VEs); and conserved regions predicted to act as neural enhancers (CNEs). Targeted and whole-genome analysis of >16,600 samples and >4,900 ASD probands revealed that likely recessive, rare, inherited variants in HARs, VEs, and CNEs substantially contribute to ASD risk in probands whose parents share ancestry, which enriches for recessive contributions, but modestly contribute, if at all, in simplex family structures. We identified multiple patient variants in HARs near IL1RAPL1 and in VEs near OTX1 and SIM1 and showed that they change enhancer activity. Our results implicate both human-evolved and evolutionarily conserved non-coding regions in ASD risk and suggest potential mechanisms of how regulatory changes can modulate social behavior.
Communications Medicine · 2024-02-21 · 8 citations
article · Open access · 1st author · Corresponding
BACKGROUND: Geographical variations in mood and psychotic disorders have been found in upper-income countries. We looked for geographic variation in these disorders in Colombia, a middle-income country. We analyzed electronic health records from the Clínica San Juan de Dios Manizales (CSJDM), which provides comprehensive mental healthcare for the one million inhabitants of Caldas. METHODS: We constructed a friction surface map of Caldas and used it to calculate the travel-time to the CSJDM for 16,295 patients who had received an initial diagnosis of mood or psychotic disorder. Using a zero-inflated negative binomial regression model, we determined the relationship between travel-time and incidence, stratified by disease severity. We employed spatial scan statistics to look for patient clusters. RESULTS: We show that travel-times (for driving) to the CSJDM are less than 1 h for ~50% of the population and more than 4 h for ~10%. We find a distance-decay relationship for outpatients, but not for inpatients: for every hour increase in travel-time, the number of expected outpatient cases decreases by 20% (RR = 0.80, 95% confidence interval [0.71, 0.89], p = 5.67E-05). We find nine clusters/hotspots of inpatients. CONCLUSIONS: Our results reveal inequities in access to healthcare: many individuals requiring only outpatient treatment may live too far from the CSJDM to access healthcare. Targeting of resources to comprehensively identify severely ill individuals living in the observed hotspots could further address treatment inequities and enable investigations to determine factors generating these hotspots.
Frequent coauthors
- Christopher A. Walsh · Mount Sinai Hospital · 77 shared
- Taehwan Shin · Howard Hughes Medical Institute · 52 shared
- Xuyu Qian · Howard Hughes Medical Institute · 48 shared
- Connor Kenny · Boston Children's Museum · 40 shared
- Dilenny M. Gonzalez · Howard Hughes Medical Institute · 36 shared
- Ellen M. DeGennaro · Boston Children's Hospital · 36 shared
- Ryan N. Doan · Harvard University · 36 shared
- Samantha G. Beck · Harvard University · 32 shared
Labs
SONG LAB · PI
Education
- 2020 · Ph.D., Genetics · Stanford University
- 2013 · A.B. in Chemical and Physical Biology, Secondary in Computer Science · Harvard University