High-Throughput Sequencing
High-Throughput Sequencing (HTS) has enabled the sequencing of thousands to millions of sequences at one time and has changed our view of Earth’s biodiversity at all organismal levels (Deiner et al. 2017). Because of constantly declining costs, HTS technologies provide new tools to address questions previously tackled through labor-intensive (and often less efficient) cloning steps, and amplicon sequencing of loci with high information content (target sequencing) is probably the most straightforward application (Ekblom and Galindo 2010). A further key contribution of HTS lies in the possibility of exploring datasets that cover wide taxonomic and/or geographic breadths. Good inter- and intra-specific taxon sampling is usually required to address the processes underlying speciation, diversification, distribution and species assembly, especially when taxonomic uncertainties and high diversity are involved. HTS is therefore especially cost-effective, as many individuals can be combined (multiplexed) in the same sequencing run and rare variants can be readily detected (Babik et al. 2009; Glenn 2011).
Studies characterizing the abundance and patterns of intragenomic nrDNA polymorphisms in different organisms are increasing (Stage and Eickbush 2007; Ganley and Kobayashi 2007; Bik et al. 2013; Straub et al. 2012; Mahelka et al. 2013; Wang et al. 2016; Symonová 2019). The full sequence of the 35S cistrons has been determined in different plant groups (e.g. Malè et al. 2014; Turner et al. 2016; Ji et al. 2019), although so far these data have been little used to address open phylogenetic questions. This is partly due to limited sampling (usually a single individual per species or higher taxonomic unit). Simon et al. (2012) used a deep sequencing approach to detect intragenomic ITS polymorphisms among populations of Arabidopsis. However, phylogenetic studies investigating intragenomic nrDNA polymorphism patterns across many species within the same genus are still scarce (e.g. Song et al. 2012; Weitemeier et al. 2015), and the full extent of the divergence of the 5S-IGS intragenomic variants in plants has not yet been adequately explored (cf. Galián et al. 2014).
In this pilot study, we generated HTS amplicon data of the intergenic spacer of the 5S nuclear ribosomal DNA cistron (5S-IGS) from six geographic samples of different composition: pure samples, including only material of a single target species, and mixed samples, including all species found at a certain place. The investigated species cover all common lineages of western Eurasian oaks (sections Cerris, Ilex and Quercus). Amplicon data were analyzed using our clone-sequence data as reference, and the taxonomic resolution of the 5S-IGS region was assessed by comparing the performance of a similarity algorithm (Basic Local Alignment Search Tool, BLAST) with that of evolutionary approaches (Maximum Likelihood trees, ML; Evolutionary Placement Algorithm, EPA).
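As a rough illustration of the similarity-based side of this comparison, the following minimal sketch assigns amplicon reads to labelled reference sequences by the fraction of shared k-mers. It is a simplified stand-in, not BLAST and not the pipeline used in this study; the function names, the k-mer size and the threshold are hypothetical choices made only for the example.

```python
def kmers(seq, k=8):
    """Return the set of all k-mers in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_reads(reads, references, k=8, min_shared=0.5):
    """Assign each read to the reference sharing the largest fraction of its k-mers.

    reads:      dict mapping read id -> sequence
    references: dict mapping taxon label -> reference sequence
    Returns a dict mapping read id -> taxon label, or None when no
    reference reaches the (assumed) min_shared threshold.
    """
    ref_kmers = {label: kmers(seq, k) for label, seq in references.items()}
    assignments = {}
    for read_id, seq in reads.items():
        read_kmers = kmers(seq, k)
        best_label, best_score = None, 0.0
        for label, rk in ref_kmers.items():
            # Fraction of the read's k-mers that also occur in this reference
            score = len(read_kmers & rk) / len(read_kmers) if read_kmers else 0.0
            if score > best_score:
                best_label, best_score = label, score
        assignments[read_id] = best_label if best_score >= min_shared else None
    return assignments
```

A similarity score of this kind only ranks references by resemblance. Tree-based approaches such as ML or EPA instead place each read within an explicit evolutionary framework, which is why the two kinds of method are compared.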
To our knowledge, this work is the first to thoroughly analyze intragenomic variation of the nuclear ribosomal 5S region in plants (see also Heitkam et al. 2015, who identified and developed probes for 5S genes while screening Illumina-generated sequences for repetitive genome elements). The potential applications of our approach are manifold: they range from the delineation of oak species, the assessment of intra- and inter-species diversity, the detection of hybridization/introgression patterns and the identification of cryptic lineages to gaining genetic insights into the structure, assembly, function and evolution of oak communities.