The absence of morphological species in DNA metabarcoding
datasets is not necessarily linked to the reference databases used for
taxonomic assignment
Importantly, within the metazoan diversity, DNA metabarcoding was not
able to detect all morphological species, even though our
reference database included 51 out of the 57 morphospecies. Our results
illustrate that the reference database is not the only cause of the low
number of ASVs with taxonomic assignments. When including all biological
replicates, our best performing primer set (A) was unable to detect 19
out of the 59 morphological species. Only six of these 19 species did
not have a reference sequence, indicating that factors other than the
reference database influence species detection in metabarcoding
datasets. First, the wet-lab procedure can lead to missing species
because DNA extraction efficiency differs between species
or because the primers insufficiently match the COI gene of those
species. All 19 species are relatively large animals, and for several
species more than one individual was found in a particular sample.
Several species have soft tissues. It is therefore unlikely that DNA
extraction would be problematic for these species. Eleven species
belonged to the Polychaeta, a class characterized by high species and
sequence diversity in the COI gene (Carr, 2012). It is therefore likely
that the primers were a poor match for these species. This
interpretation is further supported by the fact that for six polychaete
species voucher specimens were available and good-quality DNA was
extracted, yet either no PCR product or only poor-quality Sanger
sequences were obtained. This illustrates the great benefit of
primer-free methods for biodiversity assessment, although these are
currently more expensive than DNA metabarcoding
(Giebner et al., 2020). Second, taxonomic assignment through the RDP
classifier may be inefficient. One species, Acrocnida brachiata,
was identified using the Midori reference dataset, which contains 151 COI
sequences of this species, including a sequence that is identical to our
own reference sequence. It is known that the content and size of the
training set strongly impact taxonomic assignment with the RDP classifier
(Ritari et al., 2015). Finally, the morphological identification by our
experts may have been incorrect. However, the morpho-taxonomic analysis
of macrobenthos is under accreditation (accreditation certificate nr.
315-TEST, following NBN EN ISO/IEC 17025:2005) and the two experts who
conducted the morphological identification have very low
misidentification rates (at most 3 taxa have been misidentified in a
sample that underwent quality control over the last 9 years), suggesting
that misidentification can have only a minor impact on our results.
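
The reasoning above separates undetected morphospecies into those that
lack a reference sequence and those that have one but were still missed.
As a minimal sketch of that bookkeeping, assuming species lists are
available as plain name lists, the split can be computed with set
operations. The function name and the toy species labels below are
hypothetical placeholders, not data from this study.

```python
def partition_undetected(morpho_species, detected_species, reference_species):
    """Split morphospecies missed by metabarcoding by reference availability.

    All three arguments are iterables of species names (hypothetical inputs).
    """
    morpho = set(morpho_species)
    reference = set(reference_species)
    # Morphospecies with no matching detection in the metabarcoding data.
    undetected = morpho - set(detected_species)
    return {
        "undetected": sorted(undetected),
        # Missed species that the reference database cannot explain away.
        "with_reference": sorted(undetected & reference),
        # Missed species that could not have been assigned at all.
        "no_reference": sorted(undetected - reference),
    }


# Toy placeholder labels, for illustration only.
morpho = ["sp_A", "sp_B", "sp_C", "sp_D", "sp_E"]
detected = ["sp_A", "sp_B"]
reference = ["sp_A", "sp_B", "sp_C", "sp_D"]

result = partition_undetected(morpho, detected, reference)
print(result)
```

Applied to the counts reported here, 19 undetected species of which six
lack a reference sequence leaves 13 undetected species that do have a
reference sequence, which is the group for which primer mismatch or
classifier behaviour must be considered.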