Statistical analyses
For all taxonomic groups, we used Generalized Linear Mixed Models
(GLMMs) to test if the different treatment lead to differences in the
observed MOTU richness. In GLMMs, the number of MOTUs per sample was
calculated and used as a dependent effect, the five treatments were used
as predictors, and sample identity was used as a random factor. The
model was performed with the generalized poisson distribution error
using the R package glmmTMB (Brooks et al., 2017), in order to
take into account overdispersion (Consul & Famoye, 1992). If GLMM
detected significant differences among treatments, we used treatment
contrasts to test if each treatment led to communities significantly
different from those unraveled by the “control” condition. Treatment
contrasts are standard non-orthogonal contrasts, in which each category
(treatment) is compared to a user-defined reference category, and are
appropriate to compare multiple treatments against one single control
category (in this case, immediate extraction; (Field, Miles, & Field,
2015). The uncorrected number of MOTUs tends to overestimate the actual
taxonomic richness (Calderón‐Sanou et al., 2020). Therefore, we repeated
this analysis twice: considering all the observed MOTUs, and excluding
the rare MOTUs (i.e. MOTUs with frequency < 1% in each
sample).
Subsequently, we used multivariate analyses to assess the variation of
bacteria, fungi and eukaryotic communities across habitats and
treatments. Before running multivariate analyses, we calculated the
proportion of reads of each MOTU in each sample. Relative abundance
values were then transformed using the Box-Cox transformation, which
simultaneously solves the double-zero problem and improves the
multivariate normality of data (Legendre & Borcard, 2018).
First, we used Nonmetric MultiDimensional Scaling (NMDS) to describe
differences in communities among the three habitats, and check whether
different treatments yield different interpretations of ecological
relationships among samples. NMDS uses an optimization process to find a
configuration of points (samples) in a space with a small number of
dimensions, and is suitable for metabarcoding analyses that aim to
reconstruct variation in community composition as well as possible,
without preserving any particular distance measure among objects
(Borcard, Gillet, & Legendre, 2011; Chen & Ficetola, 2020; Paliy &
Shankar, 2016). Given its robustness and flexibility, NMDS is often used
as the first step to characterize the similarity of communities in
metabarcoding studies (Chen & Ficetola, 2020; Paliy & Shankar, 2016).
NMDS was run on the Euclidean distance computed on
Box–Cox-chord-transformed data (Legendre & Borcard, 2018), by building
1,000 ordinations.
Second, we used ProcMod , a Procrustes-based analysis (Coissac &
Gonindard-Melodelima, 2019), to measure the multivariate correlations
between the communities obtained using the different treatments.ProcMod can be used to measure the shared variation between
matrices, and is particularly appropriate to test relationships between
datasets obtained through DNA metabarcoding and metagenomics (Coissac &
Gonindard-Melodelima, 2019). Procrustes analyses tend to overfit the
data, therefore we used a modified version of Procrustes correlation
that is robust to highly-dimensional data and allows a correct
estimation of the shared variation between data sets (Coissac &
Gonindard-Melodelima, 2019). The Procrustes-based correlation tests were
performed using the corls function in the R packageProcMod, using 1,000 randomizations to test the mean covariance
between random matrices (Coissac & Gonindard-Melodelima, 2019).
Third, we used redundancy analysis (RDA) to measure the amount of
variation among communities that is explained by differences in habitat
and treatments (Legendre & Legendre, 2012; Ter Braak, 1986). With
habitat typology and treatment as constraining matrices, we used
treatment contrasts to test if each treatment led to communities
significantly different from those unraveled by the control treatment.
Thus, significant treatment contrasts indicate that results between
control and experimental treatments differ in an important way, while
non-significant results mean that deviation from ideal conditions is not
specifically pronounced. Significance of RDA and treatment contrasts was
tested through 10,000 permutations using the vegan package in R
(Borcard et al., 2011; Oksanen et al., 2019).
For bacteria only, RDA detected significant differences between the
control treatment and some of the treatments. We thus ran a similarity
percentage analysis with the simper R function (Clarke, 1993)
from vegan to identify the taxa contributing to the overall
pairwise treatment difference (Geyer et al., 2014). Significance was
tested using 50,000 permutations. Given the large number of tests
performed, the significance of tests was corrected using the False
Discovery Rate (FDR) method with the fdrtool package (Strimmer,
2008). FDR has greater power than traditional approaches (e.g.
Bonferroni correction) when performing multiple comparisons (Benjamini
& Hochberg, 1995). All statistical analyses were performed in the R
environment.