Investigation of traits that impact the probability of detection of macrobenthos in bulk DNA and eDNA
To determine whether the detection of macrobenthos species in the bulk DNA and eDNA datasets was linked to morphological and biological traits of the species, four categorical traits were scored: body size (<20 mm, 20-100 mm, >100 mm), body skeleton (chitin, CaCO3 or soft tissue), larval stage (benthic or pelagic) and longevity (<1, 1-3, 3-10, >10 years). Trait scores were based on the species-trait dataset of Breine et al. (2018); body skeleton was scored following Brusca and Brusca (2002) and Motokawa (1984). To investigate how these four traits affect the probability of detection (presence/absence) of a species in bulk DNA and eDNA samples, we fitted generalized linear mixed-effects models with the glmer function of the lme4 package (Bates, Maechler, Bolker, & Walker, 2015). The mixed logistic regression model included sample type (bulk DNA or eDNA) and the four traits as fixed effects, and sample ID as a random effect, because the eDNA from the ethanol preservative and the bulk DNA came from the same sample. We used a logit link and assumed a binomially distributed error for the presence/absence response variable. We started with simple models that always included sample type and its interaction with one of the four traits. Only traits with significant terms were then used to build more complex models. The final model was chosen based on the lowest Akaike information criterion (AIC) and the lowest number of parameters. This model included sample type, body size and body skeleton, along with the two-way interactions of each trait with sample type. The significance of the fixed-effect terms was assessed with Type III tests using the Anova function of the car package (Fox & Weisberg, 2019). To visualize the effects of the model terms, least-squares means of the fitted model were calculated using the lsmeans function of the emmeans package v 1.4.7 (Lenth, 2020). The resulting probabilities were plotted using ggplot2.
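Written out, the final model described above can be sketched as follows. The notation is illustrative and not taken from the original analysis: $y_{ijt}$ is the presence (1) or absence (0) of species $i$ in sample $j$ for sample type $t$ (bulk DNA or eDNA), $p_{ijt}$ is the corresponding detection probability, and $s(i)$ and $k(i)$ are the body size and body skeleton categories of species $i$. The random effect of sample ID is assumed here to be a random intercept, which the text does not state explicitly.

$$
\begin{aligned}
y_{ijt} &\sim \mathrm{Bernoulli}(p_{ijt}), \\
\operatorname{logit}(p_{ijt}) = \log\frac{p_{ijt}}{1 - p_{ijt}} &= \beta_0 + \alpha_t + \gamma_{s(i)} + \delta_{k(i)} + (\alpha\gamma)_{t,\,s(i)} + (\alpha\delta)_{t,\,k(i)} + u_j, \\
u_j &\sim \mathcal{N}(0, \sigma_u^2).
\end{aligned}
$$

A linear predictor $\eta$ on the logit scale maps back to a detection probability through the inverse logit, $p = 1 / (1 + e^{-\eta})$, which is the scale on which the model-based means are reported as probabilities.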
Post hoc pairwise tests for differences between body size categories within sample type and between body skeleton categories within sample type were conducted using the lsmeans function. We restricted the trait analyses to the dataset of primer set A, because this primer set detected most of the morphologically identified species (see Results).
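The pairwise comparisons within sample type can be read as log odds ratios. As a sketch in illustrative notation (not from the original analysis), let $\hat{\eta}_{t,a}$ and $\hat{\eta}_{t,b}$ be the model-based means on the logit scale for two categories $a$ and $b$ of a trait within sample type $t$, with $\hat{p}_{t,a}$ and $\hat{p}_{t,b}$ the corresponding detection probabilities:

$$
\hat{\eta}_{t,a} - \hat{\eta}_{t,b}
= \log\frac{\hat{p}_{t,a}}{1 - \hat{p}_{t,a}} - \log\frac{\hat{p}_{t,b}}{1 - \hat{p}_{t,b}}
= \log\left(\frac{\hat{p}_{t,a} / (1 - \hat{p}_{t,a})}{\hat{p}_{t,b} / (1 - \hat{p}_{t,b})}\right).
$$

Testing whether this difference is zero is therefore equivalent to testing whether the odds ratio of detection between the two categories equals 1.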