The Human Pangenome Reference Consortium (HPRC) Release 2 represents a major milestone in reference genomics, comprising 470 high-quality, phased haplotypes from 232 globally diverse individuals. These telomere-to-telomere assemblies achieve unprecedented completeness in previously inaccessible regions, including centromeres, subtelomeres, and segmental duplications, providing a foundation for studying complex genomic regions that have been systematically excluded from population-scale analyses. In this talk, I will first present key improvements and data from HPRC Release 2, then focus on two complementary advances that unlock centromere biology.
Centromeres, despite their critical roles in chromosome segregation and genome stability, remain among the least characterized regions of the human genome. Leveraging HPRC assemblies, we define centromere-spanning haplotypes (cenhaps), which are extended linkage blocks preserved by suppressed recombination across constitutive heterochromatin. Analysis of cenhaps across 470 haplotypes reveals that many coalesce more than a million years ago, with evidence of Neanderthal and Denisovan introgression. These ancient haplotypes are strongly associated with specific satellite array structures and contain hundreds of genes in pericentromeric regions, enabling new applications in GWAS and rare variant discovery.
Building on this work, we developed Centrolign, a graph-based alignment tool specifically designed to handle the extreme repetitiveness of centromeric tandem arrays. By combining uniqueness-driven alignment with progressive multiple alignment strategies, Centrolign accurately aligns alpha satellite higher-order repeats, revealing phylogenetic structure and enabling precise estimation of mutation rates and structural variant frequencies within satellite arrays.
Together, cenhaps and Centrolign establish a comprehensive framework for incorporating centromeres into population genomics and advancing our understanding of human genome evolution and diversity.
Karen Miga, Associate Professor, Biomolecular Engineering, UC Santa Cruz
Dr. Miga is an Associate Professor in the Biomolecular Engineering Department at UCSC, Director of the UCSC Sequence Technology Center, and an Associate Director of the UCSC Genomics Institute. In 2019, she co-founded the Telomere-to-Telomere (T2T) Consortium, an open, community-based effort to generate the first complete assembly of a human genome. Additionally, Dr. Miga is the PI and Director of the Genome Center for the Human Pangenome Reference Consortium (HPRC). Central to Dr. Miga’s research program is an emphasis on satellite DNA biology and the use of long-read and new genome technologies to construct high-quality genetic and epigenetic maps of human peri/centromeric regions.