Investigation of traits that impact the probability of detection
of macrobenthos in bulk DNA and eDNA
To determine whether the detection of macrobenthos species in the bulk DNA
and eDNA datasets could be linked to morphological and biological traits
of the species, four categorical traits were scored: body size
(<20 mm, 20-100 mm, >100 mm), body skeleton (chitin, CaCO3 or soft
tissue), larval stage (benthic or pelagic) and longevity (<1, 1-3, 3-10,
or >10 years). Body size, larval stage and longevity were scored from the
species-trait dataset of Breine et al. (2018); body skeleton was scored
following Brusca and Brusca (2002) and Motokawa (1984). To
investigate how these four traits affect the probability of detection
(presence/absence) of a species in bulk DNA and eDNA samples, we fitted
a generalized linear mixed-effects model with the glmer
function in the lme4 package (Bates, Maechler, Bolker, & Walker, 2015).
The mixed logistic regression model included sample type (bulk DNA
or eDNA) and the four traits as fixed effects, and sample ID as a
random effect, because the eDNA from the ethanol preservative
and the bulk DNA came from the same sample. We used a logit link and assumed
a binomially distributed error for the presence/absence response
variable. We started with simple models that always included sample type
and the interaction of sample type with one of the four traits. Only traits
with significant terms were then carried forward into more complex models.
The final model was chosen based on the lowest AIC and the fewest
parameters. This model included sample type, body size and body skeleton,
along with the two-way interactions of each trait with sample type. The
significance of the fixed model terms was assessed with Type III tests
using the Anova function of the car package (Fox & Weisberg, 2019).
To visualize the effects of the model terms, least-squares means
of the fitted model were calculated using the lsmeans function of the
emmeans package v1.4.7 (Lenth, 2020), and the resulting detection
probabilities were plotted using ggplot. Post hoc pairwise tests for
differences among body size categories within sample type, and among body
skeleton categories within sample type, were conducted using the same
lsmeans function. We restricted the trait analyses to the dataset of primer set
A, because this primer set detected most of the morphological species
(see results).
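
The modelling steps described above can be illustrated with a minimal R
sketch. It is not the analysis script used in this study: the data are
simulated, and the data frame and column names (dat, detected, sample_type,
body_size, skeleton, sample_id) are hypothetical. The use of sum-to-zero
contrasts for the Type III tests is likewise an assumption of the sketch,
and emmeans() is called directly where the text refers to the lsmeans
wrapper provided by the same package.

```r
# Hypothetical sketch with simulated data; all names and values are invented.
library(lme4)     # glmer()
library(car)      # Anova()
library(emmeans)  # emmeans(); lsmeans() is a wrapper around it

set.seed(1)

# Assumption of this sketch: sum-to-zero contrasts, often recommended
# so that Type III tests of main effects are interpretable.
options(contrasts = c("contr.sum", "contr.poly"))

size_levels <- c("<20 mm", "20-100 mm", ">100 mm")
skel_levels <- c("chitin", "CaCO3", "soft")

# 30 simulated species, each with one body size and one skeleton category.
species <- data.frame(
  species   = paste0("sp", 1:30),
  body_size = factor(rep(size_levels, each = 10), levels = size_levels),
  skeleton  = factor(rep(skel_levels, times = 10), levels = skel_levels)
)

# One row per species x sample x sample type (30 x 12 x 2 = 720 rows).
grid <- expand.grid(
  species     = species$species,
  sample_id   = paste0("S", 1:12),
  sample_type = c("bulk", "eDNA")
)
dat <- merge(grid, species, by = "species")

# Simulate presence/absence on the logit scale with a per-sample random
# intercept, a sample type effect and a body size effect.
sample_effect <- setNames(rnorm(12, sd = 0.5), paste0("S", 1:12))
eta <- -0.5 +
  1.0 * (dat$sample_type == "bulk") +
  0.5 * (as.integer(dat$body_size) - 2) +
  sample_effect[as.character(dat$sample_id)]
dat$detected <- rbinom(nrow(dat), size = 1, prob = plogis(eta))

# Mixed logistic regression: sample type, two traits and their two-way
# interactions with sample type as fixed effects; sample ID as random effect.
fit <- glmer(
  detected ~ sample_type * body_size + sample_type * skeleton +
    (1 | sample_id),
  data = dat, family = binomial(link = "logit")
)

# Type III Wald chi-square tests of the fixed model terms.
type3 <- Anova(fit, type = "III")
print(type3)

# Estimated detection probabilities and pairwise comparisons of trait
# categories within each sample type.
emm_size <- emmeans(fit, pairwise ~ body_size | sample_type, type = "response")
emm_skel <- emmeans(fit, pairwise ~ skeleton | sample_type, type = "response")
print(emm_size$contrasts)
print(emm_skel$contrasts)
```

Candidate models with different trait sets could be compared by calling
AIC() on the fitted objects, and as.data.frame(emm_size$emmeans) returns the
estimated probabilities in a form that can be passed to a plotting function.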