Bioinformatic treatment
The bioinformatic treatment of sequence data was performed using the
OBITools software suite (Boyer et al., 2016). First, forward and reverse
reads were assembled using the illuminapairedend program, keeping
only sequences with an alignment score higher than 40. Aligned sequences
were assigned to the corresponding PCR replicate using the ngsfilter program, allowing two and zero mismatches on primers and
tags, respectively. After sequence dereplication using obiuniq,
low-quality sequences (i.e. those containing “N”), sequences shorter than
the expected minimum length (45 bp for Bact02, 68 bp for Fung02 and
36 bp for Euka02) and singletons were filtered
out. The obiclean program was run to detect potential PCR or
sequencing errors with the -r option set at 0.5: in a PCR reaction,
sequences are tagged as “heads” when they are at least twice as
abundant as other related sequences differing by one base. Only the
sequences tagged as “heads” in at least one PCR were kept.
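
The head-tagging rule can be illustrated with a minimal Python sketch. It is a simplified reading of the rule stated above, not the obiclean implementation: only single-base substitutions are treated as differences of one base (indels are ignored), and the function names and toy data layout are hypothetical. The “internal” and “singleton” labels for non-head sequences follow our understanding of obiclean's terminology and should be checked against its documentation.

```python
def differs_by_one_base(a, b):
    """True if a and b have equal length and exactly one mismatching position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def tag_sequences(counts, ratio=0.5):
    """Tag each sequence of a single PCR as 'head', 'internal' or 'singleton'.

    counts: dict mapping sequence -> read count within one PCR.
    ratio:  a sequence is treated as an error variant of a neighbour when
            its count is at most ratio * the neighbour's count (0.5 means
            the neighbour is at least twice as abundant).
    Note: 'singleton' here means "no related sequence", which is distinct
    from the singletons (sequences observed once) removed earlier.
    """
    seqs = list(counts)
    is_variant = dict.fromkeys(seqs, False)   # likely error of a neighbour
    has_variant = dict.fromkeys(seqs, False)  # has at least one such error
    for i, a in enumerate(seqs):
        for b in seqs[i + 1:]:
            if not differs_by_one_base(a, b):
                continue
            rare, abundant = sorted((a, b), key=counts.get)
            if counts[rare] <= ratio * counts[abundant]:
                is_variant[rare] = True
                has_variant[abundant] = True
    tags = {}
    for s in seqs:
        if is_variant[s]:
            tags[s] = "internal"
        elif has_variant[s]:
            tags[s] = "head"
        else:
            tags[s] = "singleton"
    return tags


def keep_heads(counts_per_pcr, ratio=0.5):
    """Return the sequences tagged 'head' in at least one PCR."""
    kept = set()
    for counts in counts_per_pcr.values():
        tags = tag_sequences(counts, ratio)
        kept.update(s for s, t in tags.items() if t == "head")
    return kept


# Toy example with two PCRs (made-up sequences and counts).
toy = {
    "pcr1": {"ACGT": 100, "ACGA": 10, "ACTA": 3, "GGGG": 5},
    "pcr2": {"ACGA": 40, "ACTA": 4},
}
heads = keep_heads(toy)
```

In this toy example, ACGA is an error variant of ACGT in the first PCR but a head in the second, so it is retained; this is why the filter is applied across PCRs rather than within each one.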
Taxonomic assignment was conducted using the ecotag program based
on a reference database constructed from EMBL (version 136) by running
the ecoPCR program (Ficetola et al., 2010). More specifically, ecoPCR carried out an in silico PCR with the primer pair
used in the experiment, allowing up to three mismatches per primer. The
resulting reference databases were further curated by keeping only
sequences assigned at the species, genus and family levels.
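
The principle of a mismatch-tolerant in silico PCR can be sketched in Python as follows. This is a deliberately naive illustration, not ecoPCR's algorithm: it scans one strand only, takes the first matching primer site, ignores IUPAC ambiguity codes and amplicon length limits, and uses made-up primer and template sequences. All function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_primer(template, primer, max_mismatch, start=0):
    """First position >= start where primer matches within max_mismatch, else -1."""
    for i in range(start, len(template) - len(primer) + 1):
        if mismatches(template[i:i + len(primer)], primer) <= max_mismatch:
            return i
    return -1


def in_silico_pcr(template, forward, reverse, max_mismatch=3):
    """Return the barcode between the two primer sites (primers excluded), or None."""
    fwd_pos = find_primer(template, forward, max_mismatch)
    if fwd_pos < 0:
        return None
    barcode_start = fwd_pos + len(forward)
    # The reverse primer anneals to the opposite strand, so on the scanned
    # strand we look for its reverse complement downstream of the forward site.
    rev_site = reverse_complement(reverse)
    rev_pos = find_primer(template, rev_site, max_mismatch, start=barcode_start)
    if rev_pos < 0:
        return None
    return template[barcode_start:rev_pos]


# Made-up primers and template, for illustration only.
forward = "ACGTACGTAC"
reverse = "TTGGCCAATT"
template = "GG" + "ACGTACGTAA" + "TTTTTCCCCC" + "AATTGGCCAA" + "GG"
barcode = in_silico_pcr(template, forward, reverse, max_mismatch=3)
```

Here the forward primer site in the template carries one mismatch and is still recovered, which mirrors the tolerance of three mismatches per primer used when building the reference databases.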
Further data filtering was performed in R version 3.6.1 (R Core Team,
2018) to remove spurious sequences that can bias ecological conclusions
drawn from DNA metabarcoding data (Calderón‐Sanou et al., 2020). More
specifically, we discarded from our dataset MOTUs with a best identity
<85% (Fung02, Bact02) or <80% (Euka02), MOTUs observed
fewer than five times overall, and MOTUs detected in more than one extraction or PCR
negative control (Zinger, Bonin, et al., 2019a). Furthermore, we removed
all MOTUs that were detected in fewer than two PCR replicates of the same
sample, as they often represent false positives (Ficetola et al., 2015).
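
The filtering criteria above can be summarised in a short Python sketch (the analysis itself was run in R; this is an illustrative re-expression, not the original script). Several points are assumptions: the data layout and function name are hypothetical, best identity is expressed as a proportion between 0 and 1, “observed fewer than five times” is read as a total read count below five, and the replicate rule is applied per sample, so that detections supported by a single PCR replicate are discarded and a MOTU is dropped when no supported detection remains.

```python
from collections import defaultdict

# Minimum best identity per marker, as proportions (thresholds from the text).
MIN_IDENTITY = {"Bact02": 0.85, "Fung02": 0.85, "Euka02": 0.80}


def filter_motus(motus, pcrs, marker, min_reads=5, max_controls=1, min_replicates=2):
    """Return {motu_id: {sample: total_reads}} for MOTUs passing all filters.

    motus: motu_id -> {"best_identity": float, "reads": {pcr_id: count}}
    pcrs:  pcr_id  -> {"sample": str, "control": bool}
    """
    kept = {}
    for motu_id, motu in motus.items():
        reads = {p: n for p, n in motu["reads"].items() if n > 0}
        # 1. Poorly matched sequences.
        if motu["best_identity"] < MIN_IDENTITY[marker]:
            continue
        # 2. Rare MOTUs (assumed to mean total read count below min_reads).
        if sum(reads.values()) < min_reads:
            continue
        # 3. MOTUs present in more than max_controls negative controls.
        if sum(1 for p in reads if pcrs[p]["control"]) > max_controls:
            continue
        # 4. Keep only detections supported by enough replicates of a sample.
        per_sample = defaultdict(list)
        for p, n in reads.items():
            if not pcrs[p]["control"]:
                per_sample[pcrs[p]["sample"]].append(n)
        confirmed = {s: sum(ns) for s, ns in per_sample.items()
                     if len(ns) >= min_replicates}
        if confirmed:
            kept[motu_id] = confirmed
    return kept


# Toy data (made up): two samples with two replicates each, two negative controls.
pcrs = {
    "s1_a": {"sample": "s1", "control": False},
    "s1_b": {"sample": "s1", "control": False},
    "s2_a": {"sample": "s2", "control": False},
    "s2_b": {"sample": "s2", "control": False},
    "neg1": {"sample": "neg1", "control": True},
    "neg2": {"sample": "neg2", "control": True},
}
motus = {
    "m1": {"best_identity": 0.98, "reads": {"s1_a": 10, "s1_b": 7, "s2_a": 3}},
    "m2": {"best_identity": 0.70, "reads": {"s1_a": 50, "s1_b": 50}},
    "m3": {"best_identity": 0.95, "reads": {"s1_a": 2, "s1_b": 2}},
    "m4": {"best_identity": 0.95, "reads": {"s1_a": 20, "s1_b": 20, "neg1": 5, "neg2": 3}},
    "m5": {"best_identity": 0.95, "reads": {"s1_a": 30}},
}
result = filter_motus(motus, pcrs, marker="Fung02")
```

In this toy dataset only m1 survives, and only in sample s1, where it is supported by two replicates; m2 to m5 each fail exactly one of the four criteria.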