The Balearic shearwater (Puffinus mauretanicus) is the most threatened seabird in Europe. The fossil record suggests that human colonisation of the Balearic Islands resulted in a sharp decrease in population size. Currently, populations continue to be decimated, mainly due to predation by introduced mammals and bycatch in longline fisheries, and some studies predict extinction of the species by 2070. We present the first high-quality reference genome for the species, obtained by combining short- and long-read sequencing. Our hybrid assembly comprises 4,169 scaffolds, with a scaffold N50 of 2.1 Mbp, a genome length of 1.2 Gbp, and a BUSCO completeness of 96%, which is amongst the highest across sequenced avian species. This reference genome allowed us to study critical aspects relevant to the conservation status of the species, such as overall heterozygosity levels and its historical demography. Our phylogenetic analysis using whole-genome information resolves current uncertainties in the systematics of the order Procellariiformes. Comparative genomics analyses uncover a set of candidate genes that may have played an important role in the adaptation of Procellariiformes to a pelagic lifestyle, including genes involved in the enhancement of fishing capabilities, night vision and the development of natriuresis. This reference genome will be the keystone for the future development of genetic tools in conservation efforts for this Critically Endangered species.
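For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Hypothetical helper for illustration only: scaffolds are sorted from
    longest to shortest and accumulated until at least half of the total
    assembly length is covered; the length reached at that point is the N50.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length


# Toy example (made-up lengths, in arbitrary units).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

A higher N50 for the same total length indicates that more of the assembly sits in long scaffolds, which is why it is commonly reported alongside scaffold count and total genome length.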

Johnma Rondón

and 5 more

Paula Escuer

and 7 more

We present the chromosome-level genome assembly of Dysdera silvatica Schmidt, 1981, a nocturnal ground-dwelling spider endemic to the Canary Islands. The genus Dysdera has undergone a remarkable diversification in this archipelago, mostly associated with shifts in the level of trophic specialization, making it an excellent model to study the genomic drivers of adaptive radiations. The new assembly (1.37 Gb; scaffold N50 of 174.2 Mb), produced using chromosome conformation capture scaffolding, represents a more than 4,500-fold improvement in continuity over the previous version. The seven largest scaffolds, or pseudochromosomes, cover 87% of the total assembly size and consistently match the seven chromosomes of the karyotype of this species, including the characteristic large X chromosome. To illustrate the value of this new resource, we performed a comprehensive analysis of the two major arthropod chemoreceptor gene families (i.e., gustatory and ionotropic receptors). We identified 545 chemoreceptor sequences distributed across all pseudochromosomes, with a notable underrepresentation on the X chromosome. At least 54% of them localize in 83 genomic clusters with significantly lower evolutionary distances between them than the family average, suggesting a recent origin for many of them. This chromosome-level assembly is the first high-quality genome representative of the Synspermiata clade, and just the third among spiders, representing a valuable new resource to gain insights into the structure and organization of chelicerate genomes, including the role that structural variants, repetitive elements and large gene families have played in the extraordinary biology of spiders.

Joan Ferrer Obiol

and 7 more

Joel Vizueta

and 2 more

Gene annotation is a critical bottleneck in genomic research, especially for the comprehensive study of very large gene families in the genomes of non-model organisms. Despite recent progress in automatic methods, state-of-the-art tools used for this task often produce inaccurate annotations, such as fused, chimeric, partial or even completely absent gene models for many family copies, errors that require considerable extra effort to correct. Here we present BITACORA, a bioinformatics solution that integrates popular sequence similarity-based search tools and Perl scripts to facilitate both the curation of these inaccurate annotations and the identification of previously undetected gene family copies directly in genomic DNA sequences. We tested the performance of BITACORA in annotating the members of two chemosensory gene families with different repertoire sizes in seven available genome sequences, and compared its performance with that of Augustus-PPX, a tool also designed to improve automatic annotations using a sequence similarity-based approach. Despite the relatively high fragmentation of some of these draft genomes, BITACORA was able to improve the annotation of many members of these families and detected thousands of new chemoreceptors encoded in the genome sequences. The program creates general feature format (GFF) files, with both curated and newly identified gene models, and FASTA files with the predicted proteins. These outputs can be easily integrated into genomic annotation editors, greatly facilitating subsequent manual annotation and downstream evolutionary analyses.
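As an illustration of the kind of downstream use that GFF output allows, here is a minimal sketch that counts features of a given type per sequence in tab-delimited GFF text. It assumes the standard nine-column GFF layout; the function name `count_features_per_sequence`, the default feature type `gene`, and the toy records are hypothetical and do not describe BITACORA's actual output files.

```python
from collections import Counter


def count_features_per_sequence(gff_text, feature_type="gene"):
    """Count features of one type per sequence ID in GFF-formatted text.

    Illustrative helper (not part of BITACORA): assumes the standard
    nine tab-separated GFF columns, with the sequence ID in column 1
    and the feature type in column 3.
    """
    counts = Counter()
    for line in gff_text.splitlines():
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # ignore malformed records in this sketch
        if fields[2] == feature_type:
            counts[fields[0]] += 1
    return counts


# Toy records with made-up scaffold names and coordinates.
toy_gff = "\n".join([
    "##gff-version 3",
    "\t".join(["scaffold_1", "example", "gene", "100", "900", ".", "+", ".", "ID=g1"]),
    "\t".join(["scaffold_1", "example", "CDS", "100", "900", ".", "+", "0", "Parent=g1"]),
    "\t".join(["scaffold_1", "example", "gene", "2000", "2600", ".", "-", ".", "ID=g2"]),
    "\t".join(["scaffold_2", "example", "gene", "50", "450", ".", "+", ".", "ID=g3"]),
])
print(dict(count_features_per_sequence(toy_gff)))
```

Per-sequence counts of this kind are one simple way to compare the number of gene models before and after curation, although the appropriate feature type depends on how a given GFF file is structured.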