Professor, University of Memphis, Memphis, Tennessee
As long-read sequencing and scaffolding technologies have become more affordable, the number of high-quality beetle (Coleoptera) genome assemblies available for study has grown rapidly. However, the sequenced species still do not reflect the higher-level taxonomic diversity of Coleoptera, which may bias our understanding of genome evolution. In this talk, we examine the current status of Coleoptera de novo genomics. We report comparative analyses of publicly available genome assemblies, assessing them for taxonomic representation, assembly quality, annotation quality, and completeness. The 310 reference-quality genomes publicly available as of 01 May 2024 represented 47 families of Coleoptera, with a large fraction belonging to Chrysomelidae and Staphylinidae (48 species each). Nearly half of these genomes (49.03%) were scaffolded to chromosome level; however, only 12% had an annotation available. Approximately 56% contained more than 98% of the arthropod universal single-copy genes as complete copies. About 63% had a scaffold N50 greater than 1 Mb, while nearly 10% were highly fragmented. Despite their taxonomic skew, these genomes offer valuable insights into Coleoptera genome evolution. For example, many of them harbour large numbers of recent transposable element insertions. We also observed several lineage-specific expansions of repetitive sequences. We aim to use these data to further study Coleoptera genome architecture and to identify lineage-specific features and genomic innovations underlying key aspects of beetle biology.
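For readers less familiar with the contiguity metric cited above, the following is a minimal sketch of how a scaffold N50 can be computed from a list of scaffold lengths. It uses the standard definition of N50 (the length of the scaffold at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total assembly size). The function name `scaffold_n50` and the toy lengths are hypothetical and are not taken from the analyses reported in the talk.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Standard definition: sort scaffolds from longest to shortest and
    return the length of the scaffold at which the running total first
    reaches at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, lengths in bp (illustrative only, not data from the talk).
toy_lengths = [4_000_000, 2_500_000, 1_200_000, 800_000, 300_000, 200_000]
n50 = scaffold_n50(toy_lengths)
print(n50, n50 > 1_000_000)  # prints: 2500000 True
```

Under this definition, the toy assembly would count among the genomes with a scaffold N50 greater than 1 Mb.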