Non-canonical (non-B) DNA motifs are genomic sequences capable of folding into three-dimensional structures distinct from the canonical right-handed helix. These structures regulate gene expression but also serve as mutation hotspots and are linked to cancer. Because non-B DNA is difficult to sequence, its annotations have been incomplete in most genome assemblies. Telomere-to-telomere (T2T) assemblies now overcome this limitation. Here, we provide a comprehensive analysis of seven types of non-B DNA motifs (e.g., G-quadruplexes and Z-DNA) in the zebra finch T2T genome. Motif content varied strongly by chromosome category: gene-rich dot chromosomes showed the highest motif levels (16.1-31.7%), microchromosomes intermediate levels (6.5-18.9%), and macrochromosomes the lowest (6.0-7.1%). Within chromosomes, Z-DNA was enriched at some centromeres, and G-quadruplexes were enriched at promoters and 5′ UTRs. Low methylation at G-quadruplexes suggests they can form and contribute to gene regulation in these regions. Comparable patterns of non-B DNA distribution were observed in the near-T2T chicken genome and in four other diverse bird species with high-quality genomes. Overall, our findings indicate that the non-B DNA distribution reflects the distinctive architecture of avian genomes, implicating non-canonical DNA in gene expression and centromere organization. The unusually high motif density on dot chromosomes is negatively correlated with PacBio sequencing depth, and thus helps explain why these chromosomes have posed exceptional challenges for sequencing.
Linnéa Smeds is a postdoctoral researcher in the Makova Lab at Penn State University. She is a trained bioinformatician and worked for several years in a research group at Uppsala University, Sweden, focusing on speciation genomics, sex chromosome evolution and mutation rates, before pursuing a PhD in evolutionary genetics. Her thesis focused on conservation genomics in the highly inbred Scandinavian wolf population and included analyses of Y chromosome haplotypes, admixture, genetic load and structural variation. In the Makova Lab, Linnéa is investigating non-canonical DNA in humans and other great apes using T2T genomes.