Bioinformatics analysis
Raw sequence reads from the Illumina MiSeq were evaluated using FASTQC
(http://www.bioinformatics.babraham.ac.uk/projects/fastqc). Raw
sequence reads were demultiplexed using a custom Python script before
bioinformatic processing using metaBEAT v0.97.11
(https://github.com/HullUni-bioinformatics/metaBEAT), a custom
pipeline that incorporates commonly used open-source software. The
program Trimmomatic 0.32 (Bolger et al., 2014) was used for quality
trimming and removal of locus primers from the raw sequence reads.
Average read quality was assessed in 5-bp sliding windows starting from
the 3' end of the read, and reads were clipped until the average quality
per window was above Phred 30. All reads shorter than the minimum length
of 90 bp were discarded. Sequence pairs were subsequently merged
into single high-quality reads using the program FLASH 1.2.11 (Magoč &
Salzberg, 2011). Reads surviving quality filtering and trimming were
screened for chimeric sequences against a custom, curated reference
database using the uchime_ref function implemented in vsearch
1.1 (https://github.com/torognes/vsearch). The reference database was
developed at the University of Hull (Hänfling et al., 2016) and
supplemented with asp (GenBank accession numbers: MT163435, MT163450,
MT163449) and maraena whitefish (Coregonus maraena) (GenBank
accession numbers: MT163451, MT163458, MT163460) to represent all fish
species in the study catchment. Sequences were clustered at 100%
identity using VSEARCH v1.1. Clusters represented by fewer than three
sequences were considered sequencing errors and were omitted from further
analysis. Nonredundant sets of query sequences were then compared to the
reference database using BLAST (Zhang et al., 2000). BLAST output was
interpreted using a custom Python function, which implements a lowest
common ancestor approach for taxonomic assignment, similar to the
strategy used by MEGAN 5.10.6 (Huson et al., 2007). BLAST hits were
considered only if they had at least 99% identity and at least 90% query
coverage.
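
The hit filtering and lowest common ancestor (LCA) assignment described above can be illustrated with a minimal Python sketch. It is not the metaBEAT implementation. The `Hit` record, the function names, and the representation of each lineage as a root-to-species tuple are assumptions made for illustration only. The thresholds (99% identity, 90% query coverage) are taken from the text, and the treatment of them as inclusive is likewise an assumption.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """Hypothetical record for one BLAST hit (not a metaBEAT type)."""
    query: str
    identity: float            # percent identity of the alignment
    coverage: float            # percent of the query covered by the alignment
    lineage: Tuple[str, ...]   # taxonomic path, ordered root -> species


def filter_hits(hits: List[Hit],
                min_identity: float = 99.0,
                min_coverage: float = 90.0) -> List[Hit]:
    """Keep hits that meet both thresholds (assumed inclusive)."""
    return [h for h in hits
            if h.identity >= min_identity and h.coverage >= min_coverage]


def lowest_common_ancestor(lineages: List[Tuple[str, ...]]) -> Tuple[str, ...]:
    """Return the longest leading path shared by all lineages."""
    if not lineages:
        return ()
    shared = []
    # zip(*lineages) walks the lineages rank by rank, from the root down.
    for names_at_rank in zip(*lineages):
        if len(set(names_at_rank)) != 1:
            break  # lineages diverge at this rank
        shared.append(names_at_rank[0])
    return tuple(shared)


def assign_taxon(hits: List[Hit]) -> str:
    """Assign a query to the LCA of its retained hits, or 'unassigned'."""
    kept = filter_hits(hits)
    lca = lowest_common_ancestor([h.lineage for h in kept])
    return lca[-1] if lca else "unassigned"


# Illustrative data only: identities and coverages are invented.
example_hits = [
    Hit("query_1", 99.5, 100.0,
        ("Actinopterygii", "Salmoniformes", "Salmonidae",
         "Coregonus", "Coregonus maraena")),
    Hit("query_1", 99.2, 95.0,
        ("Actinopterygii", "Salmoniformes", "Salmonidae",
         "Coregonus", "Coregonus albula")),
    Hit("query_1", 97.0, 100.0,
        ("Actinopterygii", "Cypriniformes", "Cyprinidae",
         "Rutilus", "Rutilus rutilus")),
]

print(assign_taxon(example_hits))
```

In this example the third hit falls below the identity threshold and is discarded. The two retained hits belong to different Coregonus species, so the query is assigned at genus level rather than to either species. A query with a single retained hit would be assigned to that species, and a query with no retained hits would remain unassigned.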