Multiplex barcoding and whole organism community metabarcoding
While classical barcode sequencing involves the individualisation of
both the PCR and sequencing reactions, HTS platforms now offer the
opportunity to pool thousands of amplicons from individual specimens via
tagged amplicon sequencing (Creedy et al., 2020; Hebert et al., 2018;
Shokralla et al., 2014; Srivathsan et al., 2019, 2021; de Kerdrel et al.,
2020). This can be scaled up to 10,000 multiplexed individuals within a
single MinION flow cell (Srivathsan et al., 2019, 2021) or several
hundred thousand for one lane of NovaSeq 6000 when a reduced length
“mini barcode” is used (Yeo, Srivathsan & Meier, 2020). HTS multiplex
barcoding provides a direct link between DNA sequences and the
individuals from which they were amplified. This has several advantages.
It allows one to sort physical specimens to putative species and to
resolve taxonomic disagreements between barcodes and other data (Wang,
Srivathsan, Foo, Yamane, & Meier, 2018). This is necessary when the
associated sequence appears unusual (e.g. unexpectedly high sequence
divergence) or species delimitation approaches with different algorithms
return conflicting results (Meier et al., 2021). It also allows one to
return to the DNA extract, should there be interest in further exploring
the nuclear genome, diet content or microbiome of specific specimens
(Kennedy et al., 2020). A further advantage is that abundance
estimates can be extracted directly from the DNA sequence data, because
each barcode corresponds to a single specimen.
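The link between tags, reads and specimens described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline of any of the cited studies: the tag sequences, tag length, specimen names and function names are all hypothetical, tags are assumed to be error-free and to appear exactly as listed at both ends of each read, and the most frequent read per specimen stands in for a proper consensus barcode.

```python
from collections import Counter

# Hypothetical tag table: each specimen's amplicon carries a unique
# (forward tag, reverse tag) combination. All values are illustrative.
TAG_TO_SPECIMEN = {
    ("ACGTAC", "TGCATG"): "specimen_001",
    ("ACGTAC", "GATCGA"): "specimen_002",
    ("CTAGCT", "TGCATG"): "specimen_003",
}
TAG_LEN = 6  # assumed fixed tag length


def demultiplex(reads):
    """Group reads by specimen using the tags at both read ends.

    Assumes error-free tags written in read orientation; reads with an
    unknown tag combination are discarded.
    """
    by_specimen = {}
    for read in reads:
        fwd, rev = read[:TAG_LEN], read[-TAG_LEN:]
        specimen = TAG_TO_SPECIMEN.get((fwd, rev))
        if specimen is None:
            continue
        # Strip the tags and keep only the amplicon sequence.
        by_specimen.setdefault(specimen, []).append(read[TAG_LEN:-TAG_LEN])
    return by_specimen


def barcodes_per_specimen(reads):
    """Return one barcode per specimen: the most frequent read,
    used here as a simple stand-in for a consensus sequence."""
    return {
        specimen: Counter(seqs).most_common(1)[0][0]
        for specimen, seqs in demultiplex(reads).items()
    }


def abundance(barcodes):
    """Count how many specimens share each barcode sequence."""
    return Counter(barcodes.values())
```

Because every barcode is tied to one specimen, `abundance(barcodes_per_specimen(reads))` counts specimens rather than reads. In practice, barcodes would first be clustered into putative species instead of being compared for exact identity.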
In contrast to multiplex barcoding, whole organism community DNA
(wocDNA) metabarcoding (Andújar et al., 2018; Creedy et al., 2021; Yu et
al., 2012) involves a single DNA extraction from multiple individuals
of multiple species; the extract is subsequently PCR amplified and
sequenced, typically using the Illumina platform. This reduces the
individualised processing of specimens, which is particularly relevant
for hyperdiverse arthropod assemblages with minute specimens (e.g.
Arribas et al., 2016;
Creedy et al., 2019) and/or high numbers of community samples (e.g. for
long-term or broad-scale approaches). However, there are a number of
ways in which the information content of wocDNA metabarcode data is
different from multiplex barcode data, either requiring additional data
processing or placing limits on the inferences that can be derived. An
important feature of wocDNA metabarcode sequence output is the
difficulty of distinguishing spurious sequences (PCR and DNA sequencing
artefacts, contamination, nuclear copies, or different combinations of
these) from real (but low abundance) sequences in the community sample.
With appropriate laboratory protocols, experimental design and bioinformatic
processing, contamination issues and PCR and DNA sequencing artefacts
can be substantially reduced (e.g. Alberdi, Aizpurua, Gilbert, &
Bohmann, 2018; Creedy et al., 2021). It has also recently become
possible to effectively remove nuclear copies of mtDNA sequences,
providing for haplotype-level resolution from wocDNA metabarcode data
(Andújar et al., 2021). Within wocDNA metabarcode data, there is no
correspondence between sequences and the individuals from which they are
derived. While biodiversity patterns can still be explored without
taxonomic assignment, species-level taxonomic assignment is generally
desirable, and this can only be achieved with taxonomically assigned
barcode reference sequences. Even without
species-specific reference libraries, arthropod sequence assignment to
some taxonomic level can be achieved using public repositories (e.g.
GenBank or BOLD). Finally, the extrapolation of abundance data from
metabarcode sequence output is complicated, but several promising
approaches for deriving abundance data from standardised samples have
been developed (e.g. Ji et al., 2020; Krehenwinkel et al., 2017; Luo,
Ji, & Yu, 2022).
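Reference-based taxonomic assignment, as discussed above, can be illustrated with a minimal Python sketch. It rests on several simplifying assumptions that are not taken from the cited studies: query and reference sequences are already aligned and of equal length, similarity is plain percent identity, and the 97% threshold, species names and function names are purely illustrative. Real workflows would typically use alignment-based searches against a curated reference library or public repositories such as GenBank or BOLD.

```python
def identity(a, b):
    """Fraction of matching positions between two aligned,
    equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def assign(query, references, min_identity=0.97):
    """Assign a query to the best-matching reference species.

    `references` maps species name -> reference barcode. Returns
    (species, identity); species is None when the best match falls
    below `min_identity` (an illustrative threshold, not a standard).
    """
    best_name, best_score = None, 0.0
    for name, ref in references.items():
        score = identity(query, ref)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= min_identity:
        return best_name, best_score
    return None, best_score
```

A query left unassigned at species level could still be placed at a higher taxonomic rank by relaxing the threshold, which mirrors the coarser assignments obtainable from public repositories when species-specific references are missing.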
The choice of HTS barcoding approach to catalogue arthropod biodiversity
will depend on the specific objectives to be addressed. However,
there are potential synergies from combining high-throughput generation
of community-level data by wocDNA metabarcoding with vouchered
sequencing by multiplex barcoding. Vouchering may be considered
unnecessary when well-parameterized reference libraries are available,
but it is otherwise an essential consideration for future taxonomic
assignment of metabarcoding reads and for completing reference barcode
databases. Individualised and validated barcodes generated by multiplex
barcoding are also of particular relevance for the bioinformatic
processing of metabarcode reads (Andújar et al., 2021).