New insights into the variation and evolution of human centromeres


Date
Location
501 Wartik
Event
Seminar

Abstract

Advances in long-read sequencing technologies and genome assembly algorithms have enabled the complete sequence assembly of centromeres in several human genomes. However, the variation in centromeres across individuals, populations, and evolutionary contexts remains largely unexplored, leaving the global landscape of centromere diversity poorly understood. Here, we sought to expand our understanding of centromere variation and evolution by generating complete sequence assemblies of thousands of human centromeres from individuals spanning 5 continental and 28 population groups. To do this, we developed a suite of tools to identify, repair, and characterize human centromeres in near-complete genome assemblies generated by the Human Genome Structural Variation Consortium (HGSVC). We discovered 226 new major centromere haplotypes and 1,870 new α-satellite higher-order repeat (HOR) variants, which we validated in a broader set of complete centromeres generated by the Human Pangenome Reference Consortium (HPRC). Additionally, we uncovered complex patterns of genetic and epigenetic variation, including large-scale duplications, inversions, deletions, mobile element insertions, and multi-kinetochores, that shape centromeric architecture. Comparing 2,110 complete human centromeres to 5,747 complete centromeres recently assembled from the HPRC, we show that centromeres have different mutation rates that likely influence their sequence organization and the location of the kinetochore. We validate these mutation rates in a four-generation family, spanning 28 family members and 473 accurately assembled centromeres, showing that centromeres mutate in different ways depending on their sequence identity and repeat organization, with the kinetochore being the most mutable. We propose a model that integrates centromere genetic, epigenetic, and evolutionary variation, revealing the unique evolutionary trajectory of each centromere in the human genome.

Bio

Glennis Logsdon, Ph.D., is an Assistant Professor in the Department of Genetics and a Core Member of the Epigenetics Institute at the University of Pennsylvania Perelman School of Medicine, where she studies the sequence, variation, evolution, and function of human centromeres. Using long-read sequencing technologies and computational methods, Dr. Logsdon was the first to determine the complete sequence of a human autosome (chromosome 8), and she helped to determine the sequence of all centromeres in the first complete human genome, CHM13. Dr. Logsdon’s lab is involved in several national consortia, including the Telomere-to-Telomere (T2T) Consortium, the Human Pangenome Reference Consortium (HPRC), and the Human Genome Structural Variation Consortium (HGSVC), where they aim to build complete T2T genomes and pangenomes that better represent human diversity. Her lab also works with non-profit, patient-led organizations, such as Project 8p, to better understand complex structural variation in the human genome.