Sequencing metrics
The pilot sequencing (N = 79 including sequencing blanks) yielded a total
of 412 million paired-end raw reads, which initial processing steps
(pairing, filtering by length, quality and ambiguity, and
demultiplexing) reduced to 342 million (“Pilot”, Table 1). Removal of
chimeric, erroneous and rare sequences further reduced the dataset, and
after taxonomic identification, 284 million reads (69% of raw) distributed
over 49 697 zOTUs remained. Most of the reads that were
subsequently filtered out were assigned to the consumer taxon
Maxillopoda (98% of assigned reads), whereas reads identified as
contaminants or symbionts accounted for 1.6% and 0.2%, respectively.
The final dataset of putative prey counted 1.2 million reads (0.4% of
the assigned reads) in 1500 zOTUs. Distributed over 75 real samples, the
pilot averaged 16 000 prey reads per copepod consumer.
The final sequencing, with an increased number of samples (N = 456
including sequencing blanks), yielded 5.4 billion paired-end raw reads
(“Full”, Table 1). Of these, approximately 4.3 billion reads (79% of
raw) in 130 000 zOTUs were subsequently assigned to taxonomy. After
discarding zOTUs assigned to Maxillopoda (98% of assigned reads),
contaminants (1.1%) and symbionts (0.2%), the putative prey counted
52.2 million reads in 22 391 zOTUs. This corresponded to 1.2% of the
assigned reads, or 1.0% of the raw reads, and a mean depth of
~120 000 prey reads per copepod consumer. Compared to prey read
yields reported in relevant literature using dissection or blocking primers,
the average number of prey reads per sample in both sequencing runs was
more than twice as high (Table 2).
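The retention percentages quoted above can be re-derived from the reported
read counts. The following is a minimal sketch in Python, not part of the
original analysis pipeline. The names RUNS, percent and summarize are
hypothetical and introduced only for illustration. The inputs are the rounded
figures from the text, so the results can differ from the reported values in
the last digit. Mean prey depth is computed for the pilot only, since only the
pilot's number of real samples (75) is stated in the text.

```python
# Read counts as reported in the text (rounded, so derived values are approximate).
RUNS = {
    "pilot": {"raw": 412e6, "assigned": 284e6, "prey": 1.2e6},
    "full": {"raw": 5.4e9, "assigned": 4.3e9, "prey": 52.2e6},
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def summarize(run):
    """Derive the retention percentages quoted in the text for one run."""
    return {
        "assigned_pct_of_raw": percent(run["assigned"], run["raw"]),
        "prey_pct_of_assigned": percent(run["prey"], run["assigned"]),
        "prey_pct_of_raw": percent(run["prey"], run["raw"]),
    }


summary = {name: summarize(run) for name, run in RUNS.items()}

# Mean prey depth for the pilot, using the 75 real samples stated in the text.
pilot_mean_prey_depth = RUNS["pilot"]["prey"] / 75

for name, stats in summary.items():
    print(name, {key: round(value, 1) for key, value in stats.items()})
print("pilot mean prey reads per sample:", round(pilot_mean_prey_depth))
```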
Table 1: Summary of read and zOTU abundances before and during
bioinformatic processing (Steps 1-4), according to sample type (real
samples or extraction negatives), and according to taxonomic identity
(consumer, symbiont, contamination, prey). The total number of samples
(N) is presented for both sequencing runs, and the number of extraction
negatives and real samples are indicated in parentheses for the pilot
(np) and for the full sequencing (nf).
Sample types and identified taxa are also presented with their percentage
contributions to the total of assigned reads (percentage of assigned;
POA) or to assigned reads from real samples (POA†).