Introduction
Diatoms (Bacillariophyceae) rank among the most important components of aquatic food webs and play an important role in carbon fixation (Mann 1999). Because of their fast response and narrow optima for multiple environmental variables, diatoms are excellent indicators of ecosystem health (Dixit, Smol, Kingston & Charles 1992; Pan, Stevenson, Hill, Herlihy & Collins 1996), and may provide early warning signals for aquatic ecosystem changes in the face of anthropogenic pressures such as eutrophication (Wang et al. 2012) or heavy metal contamination (Chen et al. 2015). The standard methods for assessing diatom communities rely on counting and identifying their silicified cell walls (valves), mostly using light microscopy (e.g. European-Committee-for-Standardization 2014). However, with the rapid development and continuously decreasing costs of high-throughput sequencing (HTS) technologies, the metabarcoding approach, which allows simultaneous identification of multiple species from environmental samples, has become an alternative tool for fast biodiversity assessment (Ruppert, Kline & Rahman 2019). Because morphological analyses of diatoms (and other microorganisms) are labor-intensive, require expertise and are prone to inter-investigator variation, metabarcoding, referred to as ‘Biomonitoring 2.0’ (Baird & Hajibabaei 2012), has the potential to outperform the traditional, low-throughput monitoring methods.
Metabarcoding-based biodiversity studies, however, may face various difficulties, ranging from DNA extraction to data processing in complex bioinformatics pipelines (Sinha et al. 2017; Anslan et al. 2018; Hardge et al. 2018). The suitability of the metabarcoding approach for assessing diatom communities has therefore been the focus of several studies. Although the DNA barcoding library for accurate species-level detection is still incomplete for diatoms, metabarcoding is a promising tool for biomonitoring of community assemblages of these organisms, as it has been shown to produce results similar to those of morphological analyses (Zimmermann, Glöckner, Jahn, Enke & Gemeinholzer 2015; Apotheloz-Perret-Gentil et al. 2017; Vasselon, Rimet, Tapolczai & Bouchez 2017; Keck, Vasselon, Rimet, Bouchez & Kahlert 2018; Rimet et al. 2018; Rimet, Vasselon, Barbara & Bouchez 2018; Rivera et al. 2018). The majority of diatom community studies are applied to biofilms of epilithic diatom species from rivers and lakes, with the goal of assessing current-state water quality. Because diatom silicified valves are usually well preserved in sediments, they also constitute important indicators for inferring paleo-environmental conditions such as water pH, nutrient dynamics, and temperature (Douglas & Smol 2010). However, only a few studies have evaluated the suitability of metabarcoding for identifying diatom communities directly from sediment samples and have assessed its consistency with microscopy (Dulias, Stoof-Leichsenring, Pestryakova & Herzschuh 2017; Piredda et al. 2017). Although the morphological and metabarcoding data sets in these studies were highly correlated, it is not clear how this pattern is related to the quantity of sediment used for DNA extraction or affected by the use of different bioinformatics pipelines.
The quantity of sediment used strongly depends on the approach taken for DNA extraction; it is common to use DNA isolation kits that allow input of ‘large’ quantities (usually up to 10 g) of environmental sample, to potentially capture the complete community represented in the sample. However, DNA extraction methods that process much less material and thus use fewer chemicals, for example the ‘universal’ Power Soil Kit (Hermans, Buckley & Lear 2018), cost only a fraction as much and may represent attractive alternatives for DNA metabarcoding of large numbers of samples. Multiple publicly available tools exist for bioinformatics processing of large sets of sequencing data, amongst which QIIME (Caporaso et al. 2010) and mothur (Schloss et al. 2009) are the most commonly used, but some studies have highlighted that an inappropriate choice of software and settings may heavily affect the final results (Majaneva, Hyytiäinen, Varvio, Nagai & Blomster 2015; Anslan et al. 2018). For diatom communities as well, recent studies have suggested that the choice of bioinformatics pipeline may affect the outcome of metabarcoding studies (Tapolczai, Keck, Bouchez, Rimet & Vasselon 2019; Rivera, Vasselon, Bouchez & Rimet 2020). Here, we investigate diatom communities from Nam Co, a saline lake on the Tibetan Plateau, and from nearby ponds and tributaries. Our aim is to explore whether the characterization of diatom community structure via metabarcoding depends on the quantity of sediment used for DNA extraction by comparing the two most commonly used DNA isolation kits, PowerMax Soil and Power Soil (Qiagen, Germany), applied to 10 g and 0.5 g (wet weight) of surface sediment samples, respectively. We further test the consistency of the metabarcoding results obtained via three different bioinformatics pipelines, one based on exact sequence variants (ESV) and two on OTU clustering approaches. In addition, we assess how the metabarcoding data sets (from 10 g vs. 0.5 g of sediments) compare with the morphological analyses of diatoms from the same samples, and how these data sets relate to environmental variables.